Hey!
When running the Llama 3 8B Instruct export
python -m examples.models.llama2.export_llama --checkpoint <consolidated.00.pth> -p <params.json> -kv --use_sdpa_with_kv_cache -X -qmode 8da4w --group_size 128 -d fp32 --metadata '{"get_bos_id":128000, "get_eos_id":128001}' --embedding-quantize 4,32 --output_name="llama3_kv_sdpa_xnn_qe_4_32.pte"
as instructed here, it fails with
...
size mismatch for layers.4.attention_norm.weight: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for layers.4.ffn_norm.weight: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for norm.weight: copying a param with shape torch.Size([4096]) from checkpoint, the shape in current model is torch.Size([64]).
size mismatch for output.weight: copying a param with shape torch.Size([128256, 4096]) from checkpoint, the shape in current model is torch.Size([512, 64]).
Is there an example params.json file for Llama 3, or should a different export script be used?
Thanks in advance!
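For reference, the size mismatches above (model built with dim 64 vs. checkpoint weights of shape 4096 and 128256×4096) suggest the file passed via -p describes a much smaller model (e.g. a small demo config) rather than Llama 3 8B. A params.json consistent with the checkpoint shapes would look roughly like the one shipped with the Meta Llama 3 8B download; only dim and vocab_size are directly confirmed by the error output, the remaining values are the published 8B settings, so double-check against the file included in your model download:

```json
{
  "dim": 4096,
  "n_layers": 32,
  "n_heads": 32,
  "n_kv_heads": 8,
  "vocab_size": 128256,
  "multiple_of": 1024,
  "ffn_dim_multiplier": 1.3,
  "norm_eps": 1e-05,
  "rope_theta": 500000.0
}
```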