Model saved from deepspeed and accelerate cannot be loaded or is incomplete #7490

@manitadayon

I fine-tuned the Mistral Small 24B model (mistralai/Mistral-Small-24B-Instruct-2501) using DeepSpeed and Accelerate,
and saved the model with the following code:

if accelerator.is_main_process:
    model = accelerator.unwrap_model(trainer.model)
    model.save_pretrained(model_path)
    tokenizer.save_pretrained(model_path)

However, when I try to load the model using
AutoModelForCausalLM.from_pretrained(model_path)
I get a weight size mismatch error.

This is what the saved folder looks like:

folder/
         chat_template.jinja
         config.json
         generate_config.json
         model.safetensors
         special_tokens_map.json
         tokenizer_config.json
         tokenizer.json

Can anyone tell me what is going on here and how to solve the problem?
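
A possible explanation, assuming the run used DeepSpeed ZeRO stage 3 (not stated above), is that each process holds only a slice of every parameter, so calling save_pretrained on the unwrapped model from the main process alone writes placeholder weights. The sketch below follows the pattern from the Accelerate documentation: gather the full state dict on all processes with accelerator.get_state_dict, then pass it to save_pretrained. The function name save_full_model and its parameters are hypothetical. wrapped_model is assumed to be the DeepSpeed engine (what accelerator.prepare returns, or trainer.model_wrapped when using the Trainer), and under ZeRO-3 the DeepSpeed config is assumed to set stage3_gather_16bit_weights_on_model_save to true.

```python
def save_full_model(accelerator, wrapped_model, tokenizer, model_path):
    """Save a complete checkpoint from a DeepSpeed-wrapped model.

    Call this on every process, not only inside an
    `if accelerator.is_main_process:` block.
    """
    # Collective call: all ranks must take part so the partitioned
    # parameters can be gathered into one full state dict.
    state_dict = accelerator.get_state_dict(wrapped_model)

    unwrapped = accelerator.unwrap_model(wrapped_model)
    # save_pretrained only writes files when is_main_process is True,
    # so it is safe to call from every rank.
    unwrapped.save_pretrained(
        model_path,
        is_main_process=accelerator.is_main_process,
        save_function=accelerator.save,
        state_dict=state_dict,
    )

    # The tokenizer is not sharded, so one process is enough here.
    if accelerator.is_main_process:
        tokenizer.save_pretrained(model_path)
```

With the snippet from the question, the call would replace the whole `if accelerator.is_main_process:` block, for example `save_full_model(accelerator, trainer.model_wrapped, tokenizer, model_path)`. When training through the Trainer, trainer.save_model(model_path) may be a simpler alternative.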
