-
Notifications
You must be signed in to change notification settings - Fork 12.1k
gguf : enforce that tensor names are unique #6905
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A linear search is definitely not optimal, but we don't really have a good alternative at the moment, since we don't have implementations of the data structures that would be necessary to prevent this in ggml. Currently the models are small enough that it is not likely to cause a significant performance issues, and fixing the immediate issue is more important. So I think it is ok to use |
@slaren FYI, here is the model that I used for testing: https://huggingface.co/ngxson/test_gguf_models/blob/main/stories260K_duplicated_tensors.gguf I edited it with hex editor to have 2 times |
Co-authored-by: slaren <slarengh@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I verified that this prevents both loading existing models with duplicated tensors, and creating models with duplicated tensors. The GPT-2 arch conversion in convert-hf-to-gguf.py
will need to be updated to avoid duplicating output.weight
.
@slaren Thanks for the confirmation. I'm not very familiar with |
Yes, that's a different issue, I was just pointing that's the reason why some models have duplicated tensors. Merge at will. |
* not allow adding duplicated tensor name * no duplicated tensor while reading gguf * typo * throw exception inside llama_model_loader Co-authored-by: slaren <slarengh@gmail.com> --------- Co-authored-by: slaren <slarengh@gmail.com>
Resolve #6836
For now, only adding duplicated tensor is prevented:
ggml.c
usesgguf_find_tensor
to check for duplicated tensor. This is a linear search so may not be a suitable implementation when loading the gguf (but acceptable when writing gguf, since writing is not happen very often). What do you think about this point @slaren ?GGUFWriter
from gguf-py uses a set to keep track of tensor names when writing a gguf. We can use the same logic forGGUFReader