[HiDream LoRA] optimizations + small updates #11381
Conversation
…sed as well
2. save the model card even if the model is not pushed to the Hub
3. remove the scheduler initialization from the code example - not necessary anymore (it's now in the base model's config)
4. add `--skip_final_inference` - allows running with validation while skipping the final loading of the pipeline with the LoRA weights, to reduce memory requirements (a sketch of the flag is shown below)
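A minimal sketch of how such a `--skip_final_inference` flag could be wired up. The flag name comes from this PR; the argument parsing and the final-inference step shown here are illustrative, not the script's exact code:

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--skip_final_inference",
    action="store_true",
    help=(
        "Run validation during training, but skip reloading the full pipeline "
        "with the trained LoRA weights at the end, to reduce peak memory."
    ),
)
args = parser.parse_args()

# ... training and intermediate validation would happen here ...

if not args.skip_final_inference:
    # In the real script, this is where the pipeline would be loaded with the
    # trained LoRA weights for a final round of inference on the validation prompt.
    print("running final inference with the trained LoRA weights")
```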
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
@sayakpaul now that I'm thinking about it, even when validation is enabled, since we're not optimizing the text encoders, can't we just pre-encode the validation prompt embeddings as well? Then we wouldn't need to keep or load the text encoders for validation at all and could simply pass the embeddings to the pipeline.
We should. Let's do that.
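A rough sketch of what that could look like, assuming the text encoders stay frozen. `encode_prompt`, `text_encoders`, `tokenizers`, and `args` below are placeholders rather than the training script's actual names:

```python
import gc

import torch

# Pre-encode the validation prompt once, while the (frozen) text encoders are still loaded.
with torch.no_grad():
    # `encode_prompt`, `text_encoders`, `tokenizers`, and `args` stand in for
    # whatever the training script actually uses.
    validation_prompt_embeds = encode_prompt(text_encoders, tokenizers, args.validation_prompt)

# Since the text encoders are not being optimized, they are no longer needed:
# drop them instead of keeping them resident (or reloading them) for validation.
del text_encoders, tokenizers
gc.collect()
torch.cuda.empty_cache()

# At validation time, pass the cached embeddings instead of a raw prompt, e.g.:
# images = pipeline(prompt_embeds=validation_prompt_embeds, ...).images
```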
Thanks for the optims. LMK if the comments make sense or if anything is unclear.
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
@bot /style
Style fixes have been applied. View the workflow run here.
Thank you! Left some more comments. I think we can merge this today. It would also be good to test this on a 40GB GPU.
Thanks @sayakpaul! I added prints using your snippet (here) - right before caching and the pre-computation of prompt embeddings, and straight after deleting them.
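For reference, a measurement snippet along these lines could look roughly like the following (a hedged reconstruction, not the exact snippet linked above):

```python
import torch

def print_peak_memory(tag: str) -> None:
    # Peak CUDA memory allocated since the last reset, in GB.
    peak_gb = torch.cuda.max_memory_allocated() / 1024**3
    print(f"[{tag}] peak memory allocated: {peak_gb:.2f} GB")

torch.cuda.reset_peak_memory_stats()
# ... cache latents and pre-compute the prompt embeddings here ...
print_peak_memory("after caching & prompt embedding pre-computation")
# ... del text encoders, gc.collect(), torch.cuda.empty_cache() here ...
print_peak_memory("after deleting the encoders")
```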
without offloading & caching -
…s only pre-encoded if custom prompts are provided, but should be pre-encoded either way)
@bot /style
Style fixes have been applied. View the workflow run here.
@bot /style
Style fixes have been applied. View the workflow run here.
Looks great! Thanks Linoy!
```diff
@@ -1140,7 +1131,7 @@ def main(args):
     if args.lora_layers is not None:
         target_modules = [layer.strip() for layer in args.lora_layers.split(",")]
     else:
-        target_modules = ["to_k", "to_q", "to_v", "to_out.0"]
+        target_modules = ["to_k", "to_q", "to_v", "to_out"]
```
Perhaps we can add a comment explaining that including `to_out` will target all the expert layers.
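For example, the default selection plus such a comment could look roughly like this (a sketch; the `LoraConfig` hyperparameters shown are illustrative, not necessarily the script's defaults):

```python
from peft import LoraConfig

if args.lora_layers is not None:
    target_modules = [layer.strip() for layer in args.lora_layers.split(",")]
else:
    # Note: matching "to_out" (rather than "to_out.0") also targets the expert
    # layers, as suggested in the review, not just the attention output projection.
    target_modules = ["to_k", "to_q", "to_v", "to_out"]

transformer_lora_config = LoraConfig(
    r=args.rank,
    lora_alpha=args.rank,
    init_lora_weights="gaussian",
    target_modules=target_modules,
)
```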
…rovided and change example to 3d icons
@bot /style
some memory optimizations:
- `--skip_final_inference` - allows running with validation while skipping the final loading of the pipeline with the LoRA weights, to reduce memory requirements

other changes:
todo:
Yarn Art LoRA

training config