Description
Hello, and sorry if this is a dumb question or if I missed the solution in the existing issues, but searching and experimenting on my own didn't get me anywhere.
Details:
The problem is that I have a low-end PC that can run Alpaca and Vicuna (both 7B), but only quite slowly. On the other hand, while trying different models I found that models under 1B parameters run quite well, mainly ones based on Flan-T5. They give good results for my machine and are fast enough (about 3-5 tokens per second). They work even better when given text to work with: for example, prompting "based on this text, answer ..." gives me an almost perfect answer. But pasting the text in by hand every time is bad practice, in my view, because of the time it takes.
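For reference, this is roughly how I run these small models now, a minimal sketch using the Hugging Face `transformers` pipeline API. The checkpoint id and prompt wording are assumptions on my part; any seq2seq checkpoint that fits the hardware should work the same way.

```python
# Minimal sketch: context-grounded Q&A with a small Flan-T5 model via
# Hugging Face transformers. The checkpoint id below is assumed --
# swap in any seq2seq model that fits your hardware.
from transformers import pipeline

MODEL_NAME = "MBZUAI/LaMini-Flan-T5-783M"  # assumed checkpoint id


def build_prompt(context: str, question: str) -> str:
    """Pack the source text and the question into one instruction,
    mirroring the 'based on this text, answer ...' pattern."""
    return (
        "Answer the question based on the text below.\n\n"
        f"Text: {context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    # text2text-generation is the pipeline task for encoder-decoder
    # models like Flan-T5.
    generator = pipeline("text2text-generation", model=MODEL_NAME)
    context = "Flan-T5 is an instruction-tuned encoder-decoder model."
    question = "What kind of model is Flan-T5?"
    out = generator(build_prompt(context, question), max_new_tokens=64)
    print(out[0]["generated_text"])
```

The pain point is exactly that the `context` string has to be supplied manually on every call, which is what I'm hoping this tool can automate.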
Short question:
Is there any way to use this tool with any of these models?
LaMini-Flan-T5-783M
Flan-T5-Alpaca (770M or something)
RWKV (under 1.5B)
(any other good small models, under 1B parameters)
If you could provide detailed instructions, I would be very grateful! Solutions other than OwnGPT, privateGPT, etc. are also welcome!
Thank you for your understanding and your answers, and sorry for any inconvenience!