Question
I found that if I don't pass the BitsAndBytesConfig args to `LlavaLlamaForCausalLM.from_pretrained()`, like this:

```python
model = LlavaLlamaForCausalLM.from_pretrained(
    tokenizer_path,
    torch_dtype=torch.bfloat16,
    # **bnb_model_from_pretrained_args,
)
```

then the shape of `lm_head.weight` is `torch.Size([32000, 5120])`.
But if I do pass the BitsAndBytesConfig args, like this:

```python
model = LlavaLlamaForCausalLM.from_pretrained(
    tokenizer_path,
    torch_dtype=torch.bfloat16,
    **bnb_model_from_pretrained_args,
)
```

then the shape of `lm_head.weight` changes to `torch.Size([81920000, 1])`.
Why does the shape of `lm_head.weight` change?
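
For reference, here is a minimal sketch of what I mean by `bnb_model_from_pretrained_args`, assuming a 4-bit `BitsAndBytesConfig` roughly like the one built in LLaVA's training script; the exact values in my run may differ, and the checkpoint path is just a placeholder:

```python
import torch
from transformers import BitsAndBytesConfig
from llava.model import LlavaLlamaForCausalLM

tokenizer_path = "liuhaotian/llava-v1.5-13b"  # placeholder 13B checkpoint path

# Hypothetical kwargs, assuming 4-bit loading; my actual dict may differ slightly.
bnb_model_from_pretrained_args = dict(
    device_map={"": 0},
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
        bnb_4bit_use_double_quant=True,
        bnb_4bit_quant_type="nf4",
    ),
)

model = LlavaLlamaForCausalLM.from_pretrained(
    tokenizer_path,
    torch_dtype=torch.bfloat16,
    **bnb_model_from_pretrained_args,
)

# Without the kwargs above I see torch.Size([32000, 5120]);
# with them I see torch.Size([81920000, 1]).
print(type(model.lm_head.weight), model.lm_head.weight.shape, model.lm_head.weight.dtype)
```

For what it's worth, 32000 × 5120 = 163,840,000, which is exactly 2 × 81,920,000, so the quantized tensor seems to hold the same parameters packed two per element. I'd just like to confirm whether this is the expected behavior with 4-bit loading or a sign that something is wrong in my setup.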