Error when converting "state-spaces/mamba2-130m" weights to huggingface-compatible format #32496

Closed
learning-chip opened this issue Aug 7, 2024 · 1 comment · Fixed by #32580
Labels: bug, Good Second Issue

Comments


learning-chip commented Aug 7, 2024

System Info

  • Transformers version: 4.40.0

Who can help?

@molbap @ArthurZucker

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

I tried to load https://huggingface.co/state-spaces/mamba2-130m into the HF-compatible Mamba-2 implementation (#32080) using the convert_mamba2_ssm_checkpoint_to_pytorch.py script, but the script assumes the model weights are in safetensors format:

with safe_open(mamba2_checkpoint_path, framework="pt") as f:
    for k in f.keys():
        newk = k.removeprefix("model.")
        original_state_dict[newk] = f.get_tensor(k).clone()

but the weight file is in torch .bin format and cannot be opened this way.
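For reference, a minimal sketch of what a .bin fallback could look like, assuming the checkpoint is a plain torch state dict and that the same "model." prefix handling applies (this is not part of the current script):

```python
import torch

# Hypothetical fallback for .bin checkpoints: load the raw state dict with torch
# and apply the same "model." prefix stripping as the safetensors branch above.
original_state_dict = {}
state_dict = torch.load(mamba2_checkpoint_path, map_location="cpu")
for k, v in state_dict.items():
    newk = k.removeprefix("model.")
    original_state_dict[newk] = v.clone()
```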

Also, the script requires a tokenizer path:

parser.add_argument(
    "-c",
    "--tokenizer_model_path",
    type=str,
    required=True,
    help="Path to a `config.json` file corresponding to a Mamba2Config of the original mamba2_ssm model.",
)

but state-spaces/mamba2-130m reuses the EleutherAI/gpt-neox-20b tokenizer instead of shipping its own.
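A possible workaround would be to let the script fall back to the GPT-NeoX tokenizer from the Hub when no local tokenizer path is given. A rough sketch (the output directory name is a placeholder, not the script's actual argument):

```python
from transformers import AutoTokenizer

# Assumption: fall back to the tokenizer that state-spaces/mamba2-130m reuses
# and save it alongside the converted weights.
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")
tokenizer.save_pretrained(output_dir)  # output_dir: placeholder for the conversion target
```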

Expected behavior

convert_mamba2_ssm_checkpoint_to_pytorch.py should be able to convert these Mamba-2 weights as well.
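Concretely, the expected end state is that the converted directory loads with the HF Mamba2 classes from #32080, along the lines of this sketch (paths are placeholders):

```python
from transformers import Mamba2ForCausalLM, AutoTokenizer

# Assumed end-to-end check after a successful conversion.
model = Mamba2ForCausalLM.from_pretrained("/path/to/converted/mamba2-130m-hf")
tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-neox-20b")

inputs = tokenizer("Hey how are you doing?", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=10)
print(tokenizer.decode(out[0]))
```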

molbap (Contributor) commented Aug 7, 2024

Thanks for the issue! Yes, the current conversion script was written for the Mistral/Codestral Mamba 2 release, which uses safetensors plus its own tokenizer. If you want to work on a modification to the conversion script in a new PR, feel free to do so and we can help! Otherwise, I'll take a look at it later :)

ArthurZucker added the Good Second Issue label on Aug 8, 2024