Introduce enum llama_ftype #709
Conversation
llama.cpp (Outdated)

```diff
@@ -100,7 +102,7 @@ struct llama_hparams {
     int32_t n_head = 32;
     int32_t n_layer = 32;
     int32_t n_rot = 64;
-    int32_t f16 = 1;
+    int32_t f16 = LLAMA_FTYPE_MOSTLY_F16;
```
How hard is it to change the name of this field? This is a private struct, so it should be easy, no?
Shouldn't be a problem; I just didn't touch any variable names because I didn't know the intended meaning, e.g. of the `i` in `itype`. Shall I rename `f16` to `ftype` and print a user-friendly string?

```c
fprintf(stderr, "%s: ftype = %s\n", __func__, ftype_str[hparams.ftype]);
```

Hopefully no-one tries to parse the stderr output, otherwise this might break that. I might do the same for `type` 3 lines further down.
Thumbs-up from me on adding names. But I’d argue that the file type field (…). In the long run I’d like to see llama.cpp scrap the file type field entirely, in favor of just using the tensor type and size fields to initialize the ggml tensor metadata.
@comex: You're right. So actually … So there would be another enum, but should that be public?

```c
enum llama_ttype {
    LLAMA_TTYPE_F32  = 0,
    LLAMA_TTYPE_F16  = 1,
    LLAMA_TTYPE_Q4_0 = 2,
    LLAMA_TTYPE_Q4_1 = 3,
};
```

We might want to make … I'm not sure about the naming - the …
It's not written to the file, so that might be a good idea. No strong opinions.
@comex: Will this conflict with your plans in #711 (comment)? You said you want to get rid of …
Even after my change makes the core loading path not care about file type, there will still be some uses, like in the quantization code (which has to accept a file type as an argument and then write it to the file header). So a change like this would still be useful, and I’ll deal with any merge conflicts. However, I still wouldn’t recommend merging this as-is, for the reason I mentioned before.
I decided to just use …
As discussed in #678, this introduces `enum llama_ftype` to llama.h, which represents the various file types used by llama.cpp. The goal is to improve maintainability of the code by not having integer literals in various places. I hope I didn't miss one.

I used the names and comments from @comex's pending changes to the Python scripts in #545. I have not touched the scripts so as not to cause conflicts with their work.