FIX Make DoRA work with Conv1D layers #1588
Conversation
Previously, the code was not transposing the weights when calculating the weight norm. In addition, this PR updates some DoRA tests to use BitsAndBytesConfig.
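Roughly, the norm computation has to account for the fan_in_fan_out layout of Conv1D layers. The following is a minimal, hypothetical sketch of that idea; the function names are illustrative, not the exact PEFT code:

import torch

def transpose(weight, fan_in_fan_out):
    # transformers' Conv1D stores weights as (in_features, out_features),
    # i.e. transposed relative to nn.Linear, so flip them first
    return weight.T if fan_in_fan_out else weight

def get_weight_norm(weight, lora_weight, scaling, fan_in_fan_out):
    # DoRA needs the column-wise L2 norm of W + scaling * (B @ A);
    # without the transpose, Conv1D weights have mismatched shapes here
    weight = transpose(weight, fan_in_fan_out)
    weight = weight + scaling * lora_weight
    return torch.linalg.norm(weight, dim=1)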
Lines 399 and 405 also need to be updated. I needed to transpose the delta_weight in weight_norm = self._get_weight_norm(orig_weights, delta_weight, scaling=1) and transpose dora_factor.view(-1, 1) to match the dimensions.
The same applies to the unsafe merge.
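For reference, the change being described amounts roughly to the following standalone sketch; the function and argument names are illustrative, not the exact lines in peft/tuners/lora/layer.py:

import torch

def dora_merge(orig_weight, delta_weight, magnitude, fan_in_fan_out):
    # For Conv1D layers (fan_in_fan_out=True), both the delta used for the
    # norm and the per-output-column scaling factor need to be transposed
    # so their shapes line up with the (in_features, out_features) weight.
    def t(w):
        return w.T if fan_in_fan_out else w

    weight_norm = torch.linalg.norm(t(orig_weight) + t(delta_weight), dim=1).detach()
    dora_factor = magnitude / weight_norm
    return t(dora_factor.view(-1, 1)) * (orig_weight + delta_weight)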
Thanks for the feedback. Do you have a code snippet that results in an error without these additional changes?
Nothing that I can share, unfortunately. To replicate, it should be enough to initialize a DoRA model on gpt2 and call merge_and_unload() on it.
There you go:
import transformers, peft
model = transformers.AutoModelForCausalLM.from_pretrained('gpt2')
dora_config = peft.LoraConfig(r=4, use_dora=True)
model = peft.get_peft_model(model, dora_config)
model.merge_and_unload('/tmp/m/')
Unloading and merging model: 5%|▌ | 9/176 [00:00<00:01, 113.93it/s]
Traceback (most recent call last):
File "...python3.9/site-packages/IPython/core/interactiveshell.py", line 3526, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-4-3672d1da59ca>", line 1, in <module>
model.merge_and_unload('/tmp/m/')
File "...python3.9/site-packages/peft/tuners/lora/model.py", line 784, in merge_and_unload
return self._unload_and_optionally_merge(
File "...python3.9/site-packages/peft/tuners/lora/model.py", line 438, in _unload_and_optionally_merge
target.merge(safe_merge=safe_merge, adapter_names=adapter_names)
File "...python3.9/site-packages/peft/tuners/lora/layer.py", line 420, in merge
weight_norm = self._get_weight_norm(base_layer.weight, delta_weight, scaling=1).detach()
File "...python3.9/site-packages/peft/tuners/lora/layer.py", line 176, in _get_weight_norm
weight = weight + scaling * lora_weight
RuntimeError: The size of tensor a (768) must match the size of tensor b (2304) at non-singleton dimension 1
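The 768 vs. 2304 mismatch comes from the weight layout: gpt2's attention projections are transformers Conv1D modules, which store their weight as (in_features, out_features), the transpose of nn.Linear. A quick, hypothetical way to see the two conventions:

import torch.nn as nn
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('gpt2')
c_attn = model.transformer.h[0].attn.c_attn
print(type(c_attn).__name__, tuple(c_attn.weight.shape))  # Conv1D (768, 2304)
print(tuple(nn.Linear(768, 2304).weight.shape))           # nn.Linear: (2304, 768)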
Thanks a lot, I added a fix and tests for merge_and_unload. If you could check again, that would be great.
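If anyone wants to verify locally, a hedged sketch of such a check (not necessarily the exact test that was added) could compare model outputs before and after merging:

import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained('gpt2')
# init_lora_weights=False makes the adapter non-trivial so the check is meaningful
config = LoraConfig(r=4, use_dora=True, init_lora_weights=False)
model = get_peft_model(model, config).eval()

input_ids = torch.tensor([[1, 2, 3, 4]])
with torch.no_grad():
    logits_before = model(input_ids).logits
    merged = model.merge_and_unload()
    logits_after = merged(input_ids).logits

assert torch.allclose(logits_before, logits_after, atol=1e-4)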
Works on my end. Thank you!
Thanks @BenjaminBossan !