Support starcoder family architectures (1B/3B/7B/13B) #3076
Comments
I was also looking for a small coding model. Someone on reddit recommended stablecode 3b, which is based on the gpt-neox architecture. I only just noticed that the model card says it's not supported by llama.cpp, but this repo does have a convert script for gpt-neox, so it might still be possible. A 1B model would of course be amazing too!
Yes, we can add more architectures. The examples in https://github.com/ggerganov/ggml/tree/master/examples are a good starting point for bringing it here.
Great - will start working on it.
Done in #3187.
Probably not related to which model it is, since I found the same problem in #4530.
Related Issues:
#1901
#1441
#1326
Previously, it wasn't recommended to incorporate non-llama architectures into llama.cpp. However, in light of the recent addition of the Falcon architecture (see Pull Request #2717), it might be worth reconsidering this stance.
One distinguishing feature of StarCoder is that it provides a complete series of models ranging from 1B to 13B. This can be highly beneficial for speculative decoding, and for making coding models available on edge devices (e.g., M1/M2 Macs).
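As context for why a matched family of sizes helps: speculative decoding uses a small draft model to cheaply propose a window of tokens, which the large target model then verifies in one pass, accepting the longest agreeing prefix. The sketch below illustrates only the accept/reject loop with toy deterministic stand-ins; the `draft_next` and `target_next` functions are hypothetical placeholders, not llama.cpp APIs.

```python
# Toy sketch of greedy speculative decoding. A cheap "draft" model
# proposes n_draft tokens; the expensive "target" model verifies them
# and keeps the longest prefix it agrees with, plus one correction.
# Both models here are hypothetical stand-ins over a 5-token vocab.

def draft_next(tokens):
    # Hypothetical draft model: next token = (last + 1) mod 5.
    return (tokens[-1] + 1) % 5

def target_next(tokens):
    # Hypothetical target model: agrees with the draft except after
    # token 3, where it emits 0 instead of 4.
    if tokens[-1] == 3:
        return 0
    return (tokens[-1] + 1) % 5

def speculative_decode(prompt, n_new, n_draft=4):
    tokens = list(prompt)
    while len(tokens) < len(prompt) + n_new:
        # 1) Draft model proposes n_draft tokens autoregressively.
        proposed = []
        ctx = list(tokens)
        for _ in range(n_draft):
            t = draft_next(ctx)
            proposed.append(t)
            ctx.append(t)
        # 2) Target model checks each proposal in order; on the first
        #    disagreement it appends its own token and stops.
        ctx = list(tokens)
        for t in proposed:
            expected = target_next(ctx)
            ctx.append(expected)
            if expected != t:
                break  # rest of the draft window is discarded
        tokens = ctx
    return tokens[len(prompt):][:n_new]
```

Because the target model verifies every accepted token, the output is identical to decoding with the target model alone; the speedup comes from verifying a whole draft window in one target pass. A family of sizes (1B draft, 13B target) is what makes the draft step cheap.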
I can contribute the PR if it matches llama.cpp's roadmap.