The pre-trained weight file (checkpoint) does not match the model architecture defined by your current code. #6

@LTC232

Thank you for your research contributions and for open-sourcing the code. When we ran "python run.py --cfg configs/liger_gla.yaml" and "python run.py --cfg configs/liger_gsa.yaml" (https://huggingface.co/linear-moe-hub/Liger-GLA-8B), both commands failed with the following error:

raise RuntimeError(
RuntimeError: Error(s) in loading state_dict for Embedding:
size mismatch for weight: copying a param with shape torch.Size([128256, 4096]) from checkpoint, the shape in current model is torch.Size([32000, 4096]).

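This seems to indicate a discrepancy between the model configuration provided by the project and the checkpoint weights: the embedding in the checkpoint has 128256 rows, while the model built from the config expects a vocabulary of 32000. Could you please provide a solution?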
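
For reference, here is a minimal sketch of how the mismatch could be listed before loading. It only compares shapes by parameter name. The helper name `find_shape_mismatches` and the parameter name `model.embeddings.weight` are hypothetical and not taken from the repository; the two shapes are copied from the error message. The inputs can be plain tuples, as here, or presumably the values of a PyTorch `state_dict()`, since the helper just reads a `.shape` attribute when one exists.

```python
def find_shape_mismatches(checkpoint_state, model_state):
    """Return (name, checkpoint_shape, model_shape) for parameters whose shapes differ."""
    mismatches = []
    for name, ckpt_value in checkpoint_state.items():
        if name not in model_state:
            continue  # missing or unexpected keys are a separate problem
        # Accept either objects with a .shape attribute or plain shape tuples.
        ckpt_shape = tuple(getattr(ckpt_value, "shape", ckpt_value))
        model_value = model_state[name]
        model_shape = tuple(getattr(model_value, "shape", model_value))
        if ckpt_shape != model_shape:
            mismatches.append((name, ckpt_shape, model_shape))
    return mismatches


# Shapes from the error message; the parameter name is a hypothetical placeholder.
checkpoint_shapes = {"model.embeddings.weight": (128256, 4096)}
model_shapes = {"model.embeddings.weight": (32000, 4096)}

for name, ckpt_shape, model_shape in find_shape_mismatches(checkpoint_shapes, model_shapes):
    print(f"{name}: checkpoint {ckpt_shape} vs model {model_shape}")
```

If only the first dimension of the embedding differs, as in our case, the vocabulary size used to build the model is the likely cause rather than the rest of the architecture.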

Looking forward to your reply. Best wishes for your work!
