LLAMA 2 70B convert fails with: failed to find n_mult for (n_ff=28672, n_embd=8192) #2286

Closed
Melbar666 opened this issue Jul 20, 2023 · 8 comments · Fixed by #2427

@Melbar666

python C:\src\llama.cpp\convert.py --outfile ggml-model-f16.bin --outtype f16 .
Loading model file model-00001-of-00015.safetensors
Loading model file model-00001-of-00015.safetensors
Loading model file model-00002-of-00015.safetensors
Loading model file model-00003-of-00015.safetensors
Loading model file model-00004-of-00015.safetensors
Loading model file model-00005-of-00015.safetensors
Loading model file model-00006-of-00015.safetensors
Loading model file model-00007-of-00015.safetensors
Loading model file model-00008-of-00015.safetensors
Loading model file model-00009-of-00015.safetensors
Loading model file model-00010-of-00015.safetensors
Loading model file model-00011-of-00015.safetensors
Loading model file model-00012-of-00015.safetensors
Loading model file model-00013-of-00015.safetensors
Loading model file model-00014-of-00015.safetensors
Loading model file model-00015-of-00015.safetensors
Loading vocab file tokenizer.model
Traceback (most recent call last):
  File "C:\src\llama.cpp\convert.py", line 1264, in <module>
    main()
  File "C:\src\llama.cpp\convert.py", line 1253, in main
    params = Params.load(model_plus)
  File "C:\src\llama.cpp\convert.py", line 203, in load
    params = Params.loadHFTransformerJson(model_plus.model, hf_transformer_config_path)
  File "C:\src\llama.cpp\convert.py", line 187, in loadHFTransformerJson
    n_mult = find_n_mult(n_ff, n_embd);
  File "C:\src\llama.cpp\convert.py", line 140, in find_n_mult
    raise Exception(f"failed to find n_mult for (n_ff={n_ff}, n_embd={n_embd}).")
Exception: failed to find n_mult for (n_ff=28672, n_embd=8192).
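
For context, a minimal sketch of why the search fails (the rounding rule matches the commented-out llama.cpp line quoted further down in this thread; the search range here is an assumption, not the exact convert.py code). LLaMA 1 derived n_ff by rounding 2/3 * 4 * n_embd up to a multiple of n_mult, but Llama 2 70B additionally applies an ffn_dim_multiplier of 1.3 in its params.json, so no small n_mult reproduces n_ff = 28672:

```python
import math

n_embd, n_ff = 8192, 28672  # values from the error message

def calc_ff(n_mult: int) -> int:
    # LLaMA-1 rule: round 2/3 * (4 * n_embd) up to the next multiple of n_mult.
    return ((2 * (4 * n_embd) // 3 + n_mult - 1) // n_mult) * n_mult

# No candidate in a small search range reproduces 28672:
print(any(calc_ff(m) == n_ff for m in range(1, 257)))  # False

# With the 70B model's ffn_dim_multiplier = 1.3 and multiple_of = 4096
# (see the params.json linked below), the numbers do line up:
print(math.ceil(1.3 * 2 * (4 * n_embd) / 3 / 4096) * 4096)  # 28672
```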

@Green-Sky
Collaborator

see #2276

@klosax
Contributor

klosax commented Jul 20, 2023

Workaround: #2276 (comment)

@Green-Sky
Collaborator

can you try 4096?
(taken from https://huggingface.co/meta-llama/Llama-2-70b/blob/main/params.json)
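
For reference, the relevant fields of that params.json (quoted from memory here, so verify against the linked file):

```python
# Llama-2-70b params.json fields relevant to this issue (values recalled,
# not re-checked against the file):
params = {
    "dim": 8192,                # n_embd
    "multiple_of": 4096,        # the 4096 suggested above
    "ffn_dim_multiplier": 1.3,  # extra FFN width factor used by the 70B model
}
```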

@klosax
Contributor

klosax commented Jul 20, 2023

> can you try 4096?

It does not matter as the n_ff parameter gets hardcoded by the PR:

llama.cpp/llama.cpp, lines 1029 to 1030 at commit 2d2bb6b:

//const uint32_t n_ff = ((2*(4*hparams.n_embd)/3 + hparams.n_mult - 1)/hparams.n_mult)*hparams.n_mult;
const uint32_t n_ff = 28672;

@FNsi
Contributor

FNsi commented Jul 20, 2023

I'd think you just need to delete that exception around line 140 in convert.py.
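
That suggestion would amount to something like this patch to find_n_mult (a sketch assuming a search range of 256 downward, not the actual #2276 workaround; the fallback value barely matters since llama.cpp hardcodes n_ff for 70B anyway):

```python
def find_n_mult(n_ff: int, n_embd: int) -> int:
    # Same rounding rule as in the sketch above, but fall back instead of raising.
    for n_mult in range(256, 1, -1):  # assumed search range
        calc_ff = ((2 * (4 * n_embd) // 3 + n_mult - 1) // n_mult) * n_mult
        if calc_ff == n_ff:
            return n_mult
    return 256  # fallback instead of the Exception; overridden by the hardcode
```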

@Green-Sky
Collaborator

Green-Sky commented Jul 20, 2023

> It does not matter as the n_ff parameter gets hardcoded by the PR:

ah well, it does not matter, since both are workarounds.

GGUF will have an n_ff hyperparameter.

@nalzok

nalzok commented Jul 27, 2023

> Workaround: #2276 (comment)

Looks like it should be n_mult = 4096 instead of n_mult = 256? #2276 (comment)
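
Under the old rounding rule neither value reproduces n_ff, so it only matters if something downstream recomputes n_ff from n_mult (a quick check, reusing the formula from the sketch further up):

```python
def calc_ff(n_mult: int, n_embd: int = 8192) -> int:
    # Old LLaMA-1 rounding rule, as in the sketch above.
    return ((2 * (4 * n_embd) // 3 + n_mult - 1) // n_mult) * n_mult

print(calc_ff(256), calc_ff(4096))  # 22016 24576 -- neither is 28672
```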

@Green-Sky
Collaborator

Green-Sky commented Jul 27, 2023

the continuation of this issue is tracked here: #2376

after #2276 solved most issues
