
Question about the shape of lm_head.weight #1850

Open
@Jay-zzcoder

Description


Question

I found that if I don't pass the BitsAndBytesConfig args to LlavaLlamaForCausalLM.from_pretrained(), like this:

```python
model = LlavaLlamaForCausalLM.from_pretrained(
    tokenizer_path,
    torch_dtype=torch.bfloat16,
    # **bnb_model_from_pretrained_args
)
```
then the shape of lm_head.weight is torch.Size([32000, 5120]).

But if I do pass the BitsAndBytesConfig args:

```python
model = LlavaLlamaForCausalLM.from_pretrained(
    tokenizer_path,
    torch_dtype=torch.bfloat16,
    **bnb_model_from_pretrained_args,
)
```
then the shape of lm_head.weight changes to torch.Size([81920000, 1]).
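For context, bnb_model_from_pretrained_args is not shown in the snippets above, but it presumably carries a BitsAndBytesConfig that enables quantized loading. A minimal sketch of what such a dict might contain (the exact flags here are an assumption, following the standard transformers/bitsandbytes API, not the actual contents of the variable):

```python
import torch
from transformers import BitsAndBytesConfig

# Hypothetical reconstruction -- the real bnb_model_from_pretrained_args
# is not shown in this issue; these are common 4-bit (QLoRA-style) settings.
bnb_model_from_pretrained_args = dict(
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,                      # quantize weights to 4 bit on load
        bnb_4bit_compute_dtype=torch.bfloat16,  # dtype used for matmuls
        bnb_4bit_use_double_quant=True,         # also quantize the quantization constants
        bnb_4bit_quant_type="nf4",              # NormalFloat4 quantization
    ),
)
```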

Why does the shape of lm_head.weight change?
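For what it's worth, the numbers are consistent with 4-bit quantization: bitsandbytes' Linear4bit stores its weight as a packed Params4bit tensor with two 4-bit values per uint8 byte, flattened to shape [numel // 2, 1]. A quick sanity check of the arithmetic (assuming the args enable load_in_4bit, which the snippets above don't show):

```python
rows, cols = 32000, 5120    # original lm_head.weight shape (bf16)
numel = rows * cols         # 163,840,000 weight values
packed_bytes = numel // 2   # two 4-bit weights packed per uint8 byte
print(packed_bytes)         # 81920000 -> matches torch.Size([81920000, 1])
```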
