GGUF support #441
Conversation
```go
case ModelFamilyLlama:
	switch mf.Name() {
	case "gguf":
		opts.NumGQA = 0 // TODO: remove this when llama.cpp runners differ enough to need separate newLlama functions
```
This is the one difference in the interface between the current mainline llama.cpp and the ggml runner. I'm keeping them on the same runner logic in the hope that the interface will stay roughly the same. We'll see if that changes.
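For readers following along, here is a minimal sketch of the idea: isolate the single GGUF-specific difference in one place so both runners share the same startup path. The type and function names below are illustrative stand-ins, not the PR's actual code.

```go
package main

import "fmt"

// Hypothetical stand-ins for the PR's types; the names are assumptions.
type ModelFamily string

func (mf ModelFamily) Name() string { return string(mf) }

type Options struct {
	NumGQA int
}

// applyFamilyDefaults isolates the one GGUF-specific difference so both
// runners can share the same startup logic.
func applyFamilyDefaults(mf ModelFamily, opts Options) Options {
	switch mf.Name() {
	case "gguf":
		// GGUF files carry GQA metadata themselves, so drop the override
		// and let the runner read it from the model.
		opts.NumGQA = 0
	}
	return opts
}

func main() {
	opts := applyFamilyDefaults("gguf", Options{NumGQA: 8})
	fmt.Println(opts.NumGQA) // 0
}
```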
Looks good... it was slightly confusing trying to figure out the differences between GGUF and GGML. I'm guessing there aren't a lot of differences between the two though.
```go
)

type ModelFamily string

const ModelFamilyUnknown ModelFamily = "unknown"
```
Are these (and the constants below here) generic, or specific to GGML?
https://github.com/ggerganov/llama.cpp/blob/d59bd97065cd7ded6c4ecab54b1d5e0b1b11e318/llama.cpp#L1661
It's the number of layers; I don't believe this is GGML-specific.
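To make the "number of layers" point concrete: the LLaMA sizes have well-known layer counts (7B has 32 layers, 13B has 40, 30B has 60, 65B has 80), so a layer count read from any container format can map onto a generic model size. The sketch below assumes hypothetical names; it is not the PR's code.

```go
package main

import "fmt"

type ModelType string

// modelTypeFromLayers maps a layer count from the model file onto a model
// size, independent of whether the container is GGML, GGJT, or GGUF.
func modelTypeFromLayers(nLayer uint32) ModelType {
	switch nLayer {
	case 32:
		return "7B"
	case 40:
		return "13B"
	case 60:
		return "30B"
	case 80:
		return "65B"
	default:
		return "unknown"
	}
}

func main() {
	fmt.Println(modelTypeFromLayers(40)) // 13B
}
```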
```diff
 }

 	c.version = version
-	return nil
+	return nil, nil
 }

 type containerGGJT struct {
```
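The two-value return suggests Decode now hands back an inner container (or nil) alongside the error. Here is a hedged guess at what that abstraction might look like; the interface, field names, and endianness choice are assumptions, not confirmed by the diff.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// container is a guess at the abstraction implied by the two-value return:
// Decode reads a header and may hand back an inner container to decode next.
type container interface {
	Decode(r io.Reader) (container, error)
}

type containerGGJT struct {
	version uint32
}

func (c *containerGGJT) Decode(r io.Reader) (container, error) {
	var version uint32
	if err := binary.Read(r, binary.LittleEndian, &version); err != nil {
		return nil, err
	}
	c.version = version
	return nil, nil // GGJT wraps no inner container
}

func main() {
	c := &containerGGJT{}
	if _, err := c.Decode(bytes.NewReader([]byte{3, 0, 0, 0})); err != nil {
		panic(err)
	}
	fmt.Println(c.version) // 3
}
```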
I just went down the rabbit hole on the GGJT controversy. Wow.
spill the tea?
🚀
Overall looks great! Left a small comment on the …
This change adds support for running GGUF models, which are currently in beta with llama.cpp, as mentioned in #423. We will continue to run GGML models, and this transition will be seamless to users.
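One way the two formats can coexist seamlessly is dispatch on the file's four-byte magic, which llama.cpp defines for each container. The magic constants below come from llama.cpp; the function name and the returned family strings are illustrative assumptions.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
	"io"
)

// Magic values as defined by llama.cpp's file formats.
const (
	magicGGML uint32 = 0x67676d6c // "ggml"
	magicGGMF uint32 = 0x67676d66 // "ggmf"
	magicGGJT uint32 = 0x67676a74 // "ggjt"
	magicGGUF uint32 = 0x46554747 // "GGUF" read as a little-endian uint32
)

// detectFormat reads the leading four bytes and reports which runner family
// should handle the file.
func detectFormat(r io.Reader) (string, error) {
	var magic uint32
	if err := binary.Read(r, binary.LittleEndian, &magic); err != nil {
		return "", err
	}
	switch magic {
	case magicGGUF:
		return "gguf", nil
	case magicGGML, magicGGMF, magicGGJT:
		return "ggml", nil
	}
	return "", fmt.Errorf("unknown magic: %#x", magic)
}

func main() {
	format, err := detectFormat(bytes.NewReader([]byte("GGUF\x02\x00\x00\x00")))
	if err != nil {
		panic(err)
	}
	fmt.Println(format) // gguf
}
```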