Convert fixes #3967
Conversation
Fixes for two issues in the convert.py script:

In recent downloads of the LLaMA-2 dataset, norm_eps is set to 1e-06, which leads convert.py to erroneously treat the model as LLaMA-1 and set the context to 2k tokens. Fix this by extending the existing hack to also check for the 1e-06 value.

When vocab_size is detected to be -1, simply remove the bogus value from the parsed params.json and fall back to using tok_embeddings.weight. Fixes ggerganov#3900
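A minimal sketch of the intended vocab_size fallback (simplified; load_params and the plain-dict model below are stand-ins for convert.py's loadOriginalParamsJson and LazyModel, not the actual implementation):

    import json
    from pathlib import Path

    def load_params(model: dict, config_path: Path) -> dict:
        config = json.loads(config_path.read_text())

        # FB writes -1 as the vocab size in params.json for LLaMA-2 exports;
        # drop the bogus value so the embedding-based fallback can take over.
        if config.get("vocab_size") == -1:
            del config["vocab_size"]

        # The first dimension of the token embedding matrix is the real
        # vocabulary size.
        config["vocab_size"] = config.get(
            "vocab_size", model["tok_embeddings.weight"].shape[0])
        return config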
@@ -250,9 +250,14 @@ def loadOriginalParamsJson(model: LazyModel, config_path: Path) -> Params:
     if config.get("rope_theta") == 1000000:
         # CodeLlama
         n_ctx = 16384
-    elif config["norm_eps"] == 1e-05:
+    elif config["norm_eps"] in (1e-05, 1e-06):
Then how do we detect LLaMA-1? See #2384
Well, LLaMA-2 files come with norm_eps equal to 1e-06.
Could you tell me where to get an original LLaMA-2 model with norm_eps=1e-6?
Well, I got it via the LLaMA-2 repository's download script, using the link that was sent to my email. Since it's tied to my identity and to accepting the EULA, I'd say it's not prudent to share it. I guess if you create a new registration right now and try to download LLaMA 7B, you'd see it?
I just downloaded LLaMA-2 7B and got this:
{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-05, "vocab_size": -1}
So you'll have to be more specific.
OK, so I have actually downloaded llama-2-7b-chat:
cat llama-2-7b-chat/params.json
{"dim": 4096, "multiple_of": 256, "n_heads": 32, "n_layers": 32, "norm_eps": 1e-06, "vocab_size": -1}
cat llama-2-7b-chat/checklist.chk
0c4837f3ef97f648452f91faed308a07 consolidated.00.pth
1c39bc3c6b51079fd807cc105b86c9df params.json
I can confirm this.
@ggerganov Our "hack to determine LLaMA v1 vs v2" is apparently insufficient. Maybe we need a CLI arg to disambiguate?
Ugh, this is quite messy. I guess we do need to add a CLI arg that would be required in cases where the vocab_size is -1. In such cases we abort and ask the user to explicitly specify LLaMA v1 or LLaMA v2.
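A rough sketch of such a guard (hypothetical: the --llama-version flag name and messages are placeholders, not an existing convert.py option; config is assumed to be the parsed params.json):

    import argparse

    parser = argparse.ArgumentParser()
    # Hypothetical flag for disambiguating the model generation.
    parser.add_argument("--llama-version", choices=["1", "2"],
                        help="model generation, required when params.json "
                             "has vocab_size == -1 and norm_eps is ambiguous")
    args = parser.parse_args()

    if config.get("vocab_size") == -1 and args.llama_version is None:
        raise SystemExit("cannot tell LLaMA v1 from v2; "
                         "please pass --llama-version 1 or --llama-version 2")
    if args.llama_version is not None:
        n_ctx = 2048 if args.llama_version == "1" else 4096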
+        # LLaMA v2
+        n_ctx = 4096
+    # For some reason FB writes -1 to vocab size for their LLAMA2 models
+    # simply remove this bogus value and let the return statement belo
spelling
Partially obsoleted by #4258