[Question] Reason behind removing lm_head in modules #13

Open
NanoCode012 opened this issue May 25, 2023 · 4 comments

NanoCode012 commented May 25, 2023

Hello,

Thank you for the amazing repo. I was curious about this code below.

qlora/qlora.py

Lines 221 to 222 in e381744

if 'lm_head' in lora_module_names: # needed for 16-bit
    lora_module_names.remove('lm_head')

Why is lm_head removed? What does the comment "needed for 16-bit" mean? Does it mean that targeting this module when training in fp16 (or similar) is incorrect?
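For context, here is a rough sketch (paraphrased, not the exact body at the pinned commit) of how a qlora-style module-name collector ends up with lm_head only in the 16-bit case: the 4-/8-bit paths match only the bitsandbytes quantized Linear classes, which lm_head is not, while a full 16-bit run matches every plain nn.Linear, including lm_head.

```python
import bitsandbytes as bnb
import torch

def find_linear_module_names(model, bits):
    # Collect the short names of all Linear layers so they can be used
    # as LoRA target_modules (sketch of the qlora.py pattern).
    if bits == 4:
        cls = bnb.nn.Linear4bit     # only quantized layers match; lm_head does not
    elif bits == 8:
        cls = bnb.nn.Linear8bitLt
    else:
        cls = torch.nn.Linear       # 16-bit run: every Linear matches, lm_head included
    names = set()
    for name, module in model.named_modules():
        if isinstance(module, cls):
            names.add(name.split('.')[-1])
    # The removal is therefore only "needed for 16-bit": that is the only
    # case where lm_head lands in the set at all.
    names.discard('lm_head')
    return list(names)
```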

@mallorbc

Had the same thought. Have you figured it out? I didn't see anything in the paper either. If you want to add new tokens, you need to target the lm_head anyways.
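
To make the new-token case concrete, here is a hedged sketch (model id and hyperparameters are placeholders, not taken from this repo): resizing the vocabulary re-initializes the new rows of both the input embedding and lm_head, and one common way to train them with PEFT is modules_to_save rather than listing lm_head in target_modules.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "huggyllama/llama-7b"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Adding tokens grows both embed_tokens and lm_head; the new rows are
# randomly initialized and only learn if those modules are trained.
tokenizer.add_tokens(["<my_new_token>"])
model.resize_token_embeddings(len(tokenizer))

config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    # Fully train the resized layers instead of adapting them with LoRA.
    modules_to_save=["embed_tokens", "lm_head"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()
```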

@mallorbc

@artidoro @TimDettmers some insight on this would be greatly appreciated.

@anhdungitvn

Could someone share any observations or evaluations regarding the use of 'lm_head' as a target module?
Thanks!

@dandingsky

dandingsky commented Dec 9, 2023

Not sure if this is related to this issue, but I found that when applying LoRA to Llama-2 with target_modules=['lm_head', 'q_proj', 'v_proj'] in LoraConfig, the LoRA adapter on lm_head is removed when the model is distributed across GPUs. I was trying to apply LoRA to both embed_tokens and lm_head, but PEFT or DeepSpeed seems to forbid LoRA on these two modules without any explicit warning.
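
For reference, a minimal sketch of the setup described above (model id and LoRA ranks are placeholders); listing the lora_* parameters attached to lm_head before and after the model is sharded across GPUs (e.g. by DeepSpeed) is one way to check whether the adapter is silently dropped.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-7b-hf")

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["lm_head", "q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)

# Print every LoRA parameter attached to lm_head; run this again after
# the model has been wrapped/sharded by the distributed engine to see
# whether these parameters are still present.
for name, _ in model.named_parameters():
    if "lm_head" in name and "lora_" in name:
        print(name)
```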
