[Core] LoRA improvements pt. 3 #4842

Merged: 81 commits merged into main on Sep 4, 2023

Conversation

sayakpaul
Member

What does this PR do?

Currently, using different scales during LoRA fusion is supported, but how best to support them in non-fusion workflows still needs to be worked out.

From here:

Make sure that "lora_scale" is applied to all LoRA layers. Instead of passing a cross_attention_kwargs through the attention processors, it's probably better to work with a "setter"-method here that iterates through all layers and if the layer is a LoRA layer it just sets the "lora_scale" for the specific class

Do we do this when a pipeline call is made, i.e., if a LoRACompatible* layer has its lora_layer set, we multiply the scale with the output of the lora_layer? The scale, I think, would still need to be retrieved from cross_attention_kwargs.

Or do we set a lora_scale attribute in the LoRALinear and LoRAConv layers when they're initialized? But doing so would not allow different scale values during inference, IIUC.
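
For concreteness, here is a minimal sketch of the first option (the class below is an illustrative stand-in, not the actual diffusers layer): the scale is passed at call time and multiplied with the LoRA branch's output, so nothing is baked in at initialization.

import torch
import torch.nn as nn

class LoRACompatibleLinearSketch(nn.Linear):
    # Illustrative stand-in: the scale is supplied per forward call
    # instead of being fixed when the layer is constructed.
    def __init__(self, in_features, out_features, lora_layer=None, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.lora_layer = lora_layer  # a small low-rank module, or None

    def forward(self, hidden_states, scale: float = 1.0):
        base_out = super().forward(hidden_states)
        if self.lora_layer is None:
            return base_out
        # Multiplying the scale with the LoRA output keeps it adjustable per call.
        return base_out + scale * self.lora_layer(hidden_states)

# Example: a rank-4 LoRA branch attached to a 16 -> 16 projection.
lora_branch = nn.Sequential(nn.Linear(16, 4, bias=False), nn.Linear(4, 16, bias=False))
layer = LoRACompatibleLinearSketch(16, 16, lora_layer=lora_branch)
x = torch.randn(2, 16)
out_half = layer(x, scale=0.5)  # different scale values at inference time
out_full = layer(x, scale=1.0)

In the pipeline, the scale would then be pulled from cross_attention_kwargs and threaded through to such layers at call time.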

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Aug 30, 2023

The documentation is not available anymore as the PR was closed or merged.

@apolinario
Collaborator

apolinario commented Aug 30, 2023

But doing so would not allow different scale values during inference, IIUC.

I think that is not ideal. Loading the pipe and changing the LoRA scale at inference time to find the sweet spot is a common use case in the community: many LoRA authors suggest a range, and users usually try to find the best scale within that range for their prompt.

Also, slider LoRAs are a growing use case, and they require different values during inference: users are expected to play with different scale values to see how the scale influences the results.

E.g., the curly hair slider LoRA will render subjects with super straight hair if the LoRA scale is -8 and super curly hair if it is 8.

Such LoRAs have more than 250K downloads on CivitAI in total, and this technique is growing, so I would take this use case seriously in the design of this API.
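
For reference, this is the kind of workflow being described, sketched with the per-call scale passed through cross_attention_kwargs (the model ID and LoRA path below are placeholders):

import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_lora_weights("path/to/slider-lora")  # placeholder checkpoint

# Sweep the LoRA scale at inference time to find the sweet spot
# (slider LoRAs are typically used with both negative and positive scales).
for scale in (-8.0, -4.0, 0.0, 4.0, 8.0):
    image = pipe(
        "portrait photo of a person",
        cross_attention_kwargs={"scale": scale},
        num_inference_steps=30,
    ).images[0]
    image.save(f"lora_scale_{scale}.png")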

@sayakpaul
Member Author

Do we do this when a pipeline call is made, i.e., if a LoRACompatible* layer has its lora_layer set, we multiply the scale with the output of the lora_layer? The scale, I think, would still need to be retrieved from cross_attention_kwargs.

Let me proceed with this one and see how it pans out.

@sayakpaul
Member Author

I am still working on it. Please don't commit anything to it yet.

@patrickvonplaten patrickvonplaten (Contributor) left a comment

Super nice! Left mostly nits.

  • It would be nice to call the _adjust_lora_scale_text_encoder part directly in the encode_prompt function.
  • In addition to the tests that have been added, can we also test that fusing LoRA with scales gives the same result as the normal LoRA path with scales?
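
A sketch of the equivalence test suggested in the second bullet might look like this (the helper name and tolerances are assumptions, not the test that was actually added):

import numpy as np
import torch

def check_fusion_equivalence(pipe, prompt, lora_scale=0.5, seed=0):
    # Run once with the scale passed at call time (non-fusion path) ...
    generator = torch.manual_seed(seed)
    unfused_images = pipe(
        prompt,
        generator=generator,
        num_inference_steps=2,
        cross_attention_kwargs={"scale": lora_scale},
        output_type="np",
    ).images

    # ... then fuse the LoRA with the same scale and run again.
    pipe.fuse_lora(lora_scale=lora_scale)
    generator = torch.manual_seed(seed)
    fused_images = pipe(
        prompt, generator=generator, num_inference_steps=2, output_type="np"
    ).images
    pipe.unfuse_lora()

    # Both paths should produce (numerically) the same result.
    assert np.allclose(unfused_images, fused_images, atol=1e-3)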

@patrickvonplaten patrickvonplaten (Contributor) left a comment

Very nice! Good to merge for me whenever :-)

@sayakpaul
Member Author

@patrickvonplaten there are still some problems (specifically with the test case you mentioned here), and I am currently debugging them. As for the failing tests, I think they're flaky: even without the lora_scale-related changes, they sometimes fail and sometimes don't (cc: @DN6).

I will request another review from you when this PR is ready. Thanks for your reviews so far!

@sayakpaul
Member Author

sayakpaul commented Sep 4, 2023

@patrickvonplaten in e60f450, I added a test to check for the equivalence (as mentioned in #4842). However, it currently fails. I looked into the potential suspects, but they seem alright to me (as indicated by the print statements). If you have some time, it would be great to get some suggestions here.

To clear the air, I am shifting to #4855.

@sayakpaul
Member Author

Back at it.

@sayakpaul
Member Author

sayakpaul commented Sep 4, 2023

There's likely something wrong with our LoRA attention processor (maybe due to the changes introduced in #4833).

I say this because if we load non-diffusers LoRA checkpoints, the effects of scale are preserved for both the fusion and non-fusion paths. Here's my exploration notebook confirming this behavior: https://gist.github.com/sayakpaul/ddeed50516ecbd5be89e760c830da46f.

So this probably narrows down the area we need to look at to solve the current bug. Looking into it.
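
As a rough picture of the kind of check involved (the helper below is illustrative; the actual exploration is in the linked notebook): with a LoRA loaded, changing the scale should visibly move the output, and scale 0.0 should behave like the base model.

import numpy as np
import torch

def scale_changes_output(pipe, prompt, seed=0):
    # Rough sanity check: the scale=0.0 output should differ from the
    # scale=1.0 output whenever the LoRA actually affects the model.
    def run(scale):
        generator = torch.manual_seed(seed)
        return pipe(
            prompt,
            generator=generator,
            num_inference_steps=2,
            cross_attention_kwargs={"scale": scale},
            output_type="np",
        ).images

    return not np.allclose(run(0.0), run(1.0), atol=1e-3)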

@patrickvonplaten patrickvonplaten merged commit c81a88b into main Sep 4, 2023
with torch.no_grad():
    for parameter in lora_attn_parameters:
        if randn_weight:
-           parameter[:] = torch.randn_like(parameter)
+           parameter[:] = torch.randn_like(parameter) * var
Contributor

We should allow smaller LoRA weights so that we can keep reasonable precision in the tests.
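
In context, the helper from the hunk above could look roughly like this (the exact signature is an assumption); passing a small var keeps the random LoRA weights close to zero, which keeps the fused/unfused comparisons within test tolerances:

import torch

def set_lora_weights(lora_attn_parameters, randn_weight=False, var=1.0):
    with torch.no_grad():
        for parameter in lora_attn_parameters:
            if randn_weight:
                # Scale the random initialization so the LoRA delta stays small.
                parameter[:] = torch.randn_like(parameter) * var
            else:
                parameter.zero_()

# e.g. in a test: set_lora_weights(params, randn_weight=True, var=0.1)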

@patrickvonplaten patrickvonplaten deleted the lora-improvements-pt3 branch September 4, 2023 21:53
@patrickvonplaten
Contributor

@sayakpaul I think your PR was just fine. I'm quite certain the test only failed due to numerical precision issues, which were fixed by using LoRA layers with less extreme values (e.g., distributed as N(0, 0.1) instead of N(0, 1)).

Also ran the slow tests locally and they all passed. Think we're good here. Nice job!

@xhinker
Contributor

xhinker commented Sep 23, 2023

@sayakpaul Thank you for the PR. I was confused about why you add

self.lora_scale = lora_scale

at the end of the _fuse_lora function. After reading through #4473, I realized that this line enables a correct unfuse: without it, the unfuse step doesn't have the scale value needed to remove the LoRA weights from the base model. Am I right?

@sayakpaul
Member Author

Precisely.
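
A simplified illustration of that mechanism (not the actual diffusers implementation): the scale used during fusion is stored on the layer so that unfusion can subtract exactly the delta that was added.

import torch
import torch.nn as nn

class FusableLoRALinearSketch(nn.Linear):
    def __init__(self, in_features, out_features, lora_down=None, lora_up=None, **kwargs):
        super().__init__(in_features, out_features, **kwargs)
        self.lora_down = lora_down  # tensor of shape (rank, in_features)
        self.lora_up = lora_up      # tensor of shape (out_features, rank)
        self.lora_scale = None

    @torch.no_grad()
    def _fuse_lora(self, lora_scale=1.0):
        # Bake the scaled LoRA delta into the base weight ...
        self.weight.data += lora_scale * (self.lora_up @ self.lora_down)
        # ... and remember the scale so _unfuse_lora can undo exactly this delta.
        self.lora_scale = lora_scale

    @torch.no_grad()
    def _unfuse_lora(self):
        if self.lora_scale is None:
            return
        self.weight.data -= self.lora_scale * (self.lora_up @ self.lora_down)
        self.lora_scale = None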

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* throw warning when more than one lora is attempted to be fused.
* introduce support of lora scale during fusion.
* change test name
* changes
* change to _lora_scale
* lora_scale to call whenever applicable.
* debugging
* lora_scale additional.
* cross_attention_kwargs
* lora_scale -> scale.
* lora_scale fix
* lora_scale in patched projection.
* debugging (repeated across many intermediate commits)
* styling.
* debugging (repeated across many intermediate commits)
* remove unneeded prints.
* remove unneeded prints.
* assign cross_attention_kwargs.
* debugging (repeated across many intermediate commits)
* clean up.
* refactor scale retrieval logic a bit.
* fix nonetypw
* fix: tests
* add more tests
* more fixes.
* figure out a way to pass lora_scale.
* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* unify the retrieval logic of lora_scale.
* move adjust_lora_scale_text_encoder to lora.py.
* introduce dynamic adjustment lora scale support to sd
* fix up copies
* Empty-Commit
* add: test to check fusion equivalence on different scales.
* handle lora fusion warning.
* make lora smaller
* make lora smaller
* make lora smaller

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
AmericanPresidentJimmyCarter pushed a commit to AmericanPresidentJimmyCarter/diffusers that referenced this pull request Apr 26, 2024
(Same squashed commit message as the commit referenced above.)
Successfully merging this pull request may close these issues.

LoRA Still Influencing Output Despite Setting "scale" to 0