
Llama 2 70B: Update needed to convert.py to support 70B HF format model files #2376

Closed
Description

@TheBloke

Following on from the discussion in the Llama 2 70B PR (#2276):

Since that PR, converting Llama 2 70B models from Meta's original PTH format files works great.

But it is not possible to make usable Llama 2 70B GGML models from HF format files. The models convert and quantise without errors, but always produce gibberish at inference, as in this example:

 ### Human: write a story about llamas\n### Assistant:20 300202000 B00A0

@klosax reports:

It looks like the tensors get transformed by the new permute, which somehow uses the GQA parameters num_local_key_value_heads and num_key_value_heads:
https://github.com/huggingface/transformers/blob/b257c46a075419c09e5ce5c5aa39bc346ecdb9a5/src/transformers/models/llama/convert_llama_weights_to_hf.py#L173-L195

For reference, here are all the changes that happened in Transformers' convert_llama_weights_to_hf.py for the Llama 2 release: huggingface/transformers@07360b6#diff-110a445233a8b15a0875998eeaf75cb8607b38a5daa736291dd058766879bbdd
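
To illustrate what @klosax describes: for GQA models the HF conversion applies its rotary permute to wk per key/value head rather than per attention head, so the inverse in convert.py would also have to group by num_key_value_heads. Below is a minimal numpy sketch of that idea. The permute mirrors the Transformers code linked above; unpermute and the variable names are hypothetical for illustration, not actual convert.py code:

```python
import numpy as np

# Llama 2 70B attention dimensions (from the model's config.json)
n_head    = 64                      # num_attention_heads
n_head_kv = 8                       # num_key_value_heads (GQA)
head_dim  = 128                     # hidden_size / num_attention_heads
dim       = n_head * head_dim       # 8192: columns of wk
kv_dim    = n_head_kv * head_dim    # 1024: rows of wk

def permute(w, n, d1, d2):
    # The rotary permute applied by convert_llama_weights_to_hf.py
    # (linked above); for wk it is called with n = num_key_value_heads.
    return w.reshape(n, d1 // n // 2, 2, d2).swapaxes(1, 2).reshape(d1, d2)

def unpermute(w, n, d1, d2):
    # Hypothetical inverse a GGML converter would need: swap the same
    # two axes back, grouped by the same head count.
    return w.reshape(n, 2, d1 // n // 2, d2).swapaxes(1, 2).reshape(d1, d2)

wk = np.arange(kv_dim * dim, dtype=np.float32).reshape(kv_dim, dim)
wk_hf = permute(wk, n_head_kv, kv_dim, dim)  # what the HF checkpoint stores

# Undoing with num_attention_heads (the pre-GQA assumption) reshapes
# cleanly but scrambles the values -> gibberish output:
assert not np.array_equal(unpermute(wk_hf, n_head, kv_dim, dim), wk)
# Undoing with num_key_value_heads round-trips exactly:
assert np.array_equal(unpermute(wk_hf, n_head_kv, kv_dim, dim), wk)
```

If that diagnosis is right, the fix would presumably be for convert.py to read num_key_value_heads from config.json and pass it to its permute, rather than assuming the full attention head count.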

Would anyone be able to look into this? It's a bit beyond my experience.

I'm getting multiple requests a day for 70B fine-tune quants for FreeWilly 2, Llama2-Guanaco, and the newly released Airoboros 1.4.1 70B, and would love to be able to provide them for people.

Thanks in advance.
