Merge changes by Skquark · Pull Request #213 · Skquark/diffusers

Skquark · 2025-07-04T07:51:15Z

No description provided.

* Fix typos in strings and comments Signed-off-by: co63oc <co63oc@users.noreply.github.com> * Update src/diffusers/hooks/hooks.py Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update src/diffusers/hooks/hooks.py Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update layerwise_casting.py * Apply style fixes * update --------- Signed-off-by: co63oc <co63oc@users.noreply.github.com> Co-authored-by: Aryan <contact.aryanvs@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

update torchao doc link

Use float32 for RoPE on MPS in Wan

misc changes in the bnb tests for consistency.

chore: rename lora model-level tests.

* cache * feedback

* initial * update * hunyuanvideo * ltx * fix * wan * gen guide * feedback * feedback * pipeline-level quant config * feedback * ltx

* update * update * update * update * update * update * update * update * update * update * update * updatee * update * update * update * update * update * update * update * update * update * update * update * update * update * update

…in. (#11656) bring PipelineQuantizationConfig at the top of the import chain.

[examples] flux-control: use num_training_steps_for_scheduler in get_scheduler instead of args.max_train_steps * accelerator.num_processes Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* use deterministic to get stable result Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add deterministic for int8 test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add a test for group offloading + compilation. * tests

* initial support * make fix-copies * fix no split modules * add conversion script * refactor * add pipeline test * refactor * fix bug with mask * fix for reference images * remove print * update docs * update slices * update * update * update example

* fix how compiler tester mixins are used. * propagate * more

…ocess (#11596) * update * update * update * update * update * update * update

* Add community class StableDiffusionXL_T5Pipeline Will be used with base model opendiffusionai/stablediffusionxl_t5 * Changed pooled_embeds to use projection instead of slice * "make style" tweaks * Added comments to top of code * Apply style fixes

…ly the inpainted area (#11658) * Update pipeline_flux_inpaint.py to fix padding_mask_crop returning only the inpainted area and not the entire image. * Apply style fixes * Update src/diffusers/pipelines/flux/pipeline_flux_inpaint.py

* allow loading from repo with dot in name * put new arg at the end to avoid breaking compatibility * add test for loading repo with dot in name --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

support Flux Control LoRA with bnb 8bit.

* fix: vae sampling mode * fix a typo

…or test cases (#11654) * enable torchao cases on XPU Signed-off-by: Matrix YAO <matrix.yao@intel.com> * device agnostic APIs Signed-off-by: YAO Matrix <matrix.yao@intel.com> * more Signed-off-by: YAO Matrix <matrix.yao@intel.com> * fix style Signed-off-by: YAO Matrix <matrix.yao@intel.com> * enable test_torch_compile_recompilation_and_graph_break on XPU Signed-off-by: YAO Matrix <matrix.yao@intel.com> * resolve comments Signed-off-by: YAO Matrix <matrix.yao@intel.com> --------- Signed-off-by: Matrix YAO <matrix.yao@intel.com> Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* start adding compilation tests for quantization. * fixes * make common utility. * modularize. * add group offloading+compile * xfail * update * Update tests/quantization/test_torch_compile_utils.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fixes --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add clarity in documentation for device_map * docs * fix how compiler tester mixins are used. * propagate * more * typo. * fix tests * fix order of decroators. * clarify more. * more test cases. * fix doc * fix device_map docstring in pipeline_utils. * more examples * more * update * remove code for stuff that is already supported. * fix stuff.

* improve docstrings for wan * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * make style --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix: remove redundant indexing * style

* add compilation bits to the bitsandbytes docs. * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * finish --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* [rfc][compile] compile method for DiffusionPipeline * Apply suggestions from code review Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> * Apply style fixes * Update docs/source/en/optimization/fp16.md * check --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* add test for checking compile on different shapes. * update * update * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

…11809) Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* support flux kontext * make fix-copies * add example * add tests * update docs * update * add note on integrity checker * make fix-copies issue * add copied froms * make style * update repository ids * more copied froms

* support flux kontext * make fix-copies * add example * add tests * update docs * update * add note on integrity checker * initial commit * initial commit * add readme section and fixes in the training script. * add test * rectify ckpt_id * fix ckpt * fixes * change id * update * Update examples/dreambooth/train_dreambooth_lora_flux_kontext.py Co-authored-by: Aryan <aryan@huggingface.co> * Update examples/dreambooth/README_flux.md --------- Co-authored-by: Aryan <aryan@huggingface.co> Co-authored-by: linoytsaban <linoy@huggingface.co> Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>

fix

* disable onnx, mps, flax * remove

* update * update * update * update * update * user property instead

…1804) * update * add test * address review comments * update * fixes * change decorator order to fix tests * try fix * fight tests

* fix: lora unloading behvaiour * fix * update

* feat: use exclude modules to loraconfig. * version-guard. * tests and version guard. * remove print. * describe the test * more detailed warning message + shift to debug * update * update * update * remove test

* ENH Improve speed of expanding LoRA scales

  Resolves #11816

  The following call proved to be a bottleneck when setting a lot of LoRA adapters in diffusers:
  https://github.com/huggingface/diffusers/blob/cdaf84a708eadf17d731657f4be3fa39d09a12c0/src/diffusers/loaders/peft.py#L482

  This is because we would repeatedly call unet.state_dict(), even though in the standard case, it is not necessary:
  https://github.com/huggingface/diffusers/blob/cdaf84a708eadf17d731657f4be3fa39d09a12c0/src/diffusers/loaders/unet_loader_utils.py#L55

  This PR fixes this by deferring this call, so that it is only run when it's necessary, not earlier.

* Small fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
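The deferral described in this commit message can be illustrated with a minimal sketch. It is not the diffusers implementation: `expand_scale`, `get_layer_names`, and the layer names are hypothetical stand-ins, and the expensive lookup is modeled as a callable that is invoked only when a per-block dict actually has to be expanded.

```python
from typing import Callable, Dict, List, Union

Scale = Union[float, Dict[str, float]]


def expand_scale(scale: Scale, get_layer_names: Callable[[], List[str]]) -> Scale:
    """Expand a per-block scale dict into per-layer scales.

    `get_layer_names` is a hypothetical stand-in for an expensive lookup
    (for example, walking a large state dict). It is passed as a callable
    so that it runs only when a dict really needs expanding.
    """
    if not isinstance(scale, dict):
        # Standard case: one number applies everywhere, so the expensive
        # lookup is skipped entirely.
        return scale

    layer_names = get_layer_names()  # deferred until this point
    expanded: Dict[str, float] = {}
    for name in layer_names:
        for prefix, value in scale.items():
            if name.startswith(prefix):
                expanded[name] = value
    return expanded


# Count how often the "expensive" lookup is actually triggered.
calls = {"count": 0}


def expensive_layer_names() -> List[str]:
    calls["count"] += 1
    return ["down.0.attn", "down.1.attn", "up.0.attn"]


plain = expand_scale(0.8, expensive_layer_names)
per_layer = expand_scale({"down": 0.5, "up": 1.0}, expensive_layer_names)
```

Passing the lookup as a callable rather than a precomputed value is what keeps the common scalar path cheap; only the dict path pays for it.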

remove print

…11825) * add resolution changes tests to hotswapping test suite. * fixes * docs * explain duck shapes * fix

* reset deterministic in tearDownClass Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * fix deterministic setting Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update * update * update docs

fix single_file example.

* use real instead of complex tensors in Wan2.1 RoPE * remove the redundant type conversion * unpack rotary_emb * register rotary embedding frequencies as non-persistent buffers * Apply style fixes --------- Co-authored-by: Aryan <aryan@huggingface.co> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
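As background for the switch from complex to real tensors, the sketch below checks in plain Python that rotating a pair of values with real cos/sin arithmetic gives the same result as multiplying by a complex exponential. It is an illustration only, not the Wan2.1 code: the function names, the pairing of adjacent elements, and the sample angles are assumptions.

```python
import cmath
import math
from typing import List


def rope_complex(x: List[float], theta: List[float]) -> List[float]:
    """Rotate each adjacent pair (x[2k], x[2k+1]) via complex multiplication."""
    out: List[float] = []
    for k, t in enumerate(theta):
        z = complex(x[2 * k], x[2 * k + 1]) * cmath.exp(1j * t)
        out += [z.real, z.imag]
    return out


def rope_real(x: List[float], theta: List[float]) -> List[float]:
    """Same rotation using only real cos/sin arithmetic."""
    out: List[float] = []
    for k, t in enumerate(theta):
        a, b = x[2 * k], x[2 * k + 1]
        c, s = math.cos(t), math.sin(t)
        # (a + ib) * (c + is) = (a*c - b*s) + i(a*s + b*c)
        out += [a * c - b * s, a * s + b * c]
    return out


x = [1.0, 2.0, 3.0, 4.0]
theta = [0.3, 1.1]  # one angle per pair
max_diff = max(
    abs(p - q) for p, q in zip(rope_complex(x, theta), rope_real(x, theta))
)
```

The two forms agree up to floating-point rounding, so a real-valued formulation can avoid complex dtypes without changing the rotation itself.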

* draft * fix * fix * feedback * feedback

add warning Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>


…ansformer` (#11807) * add `WandVACETransformer3DModel` in`SINGLE_FILE_LOADABLE_CLASSES` * add rename keys for `VACE` add rename keys for `VACE` * fix typo Sincere thanks to @nitinmukesh 🙇‍♂️ * support for `1.3B VACE` model Sincere thanks to @nitinmukesh again🙇‍♂️ * update * update * Apply style fixes --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* wan vace. * update * update * import problem

* update: FluxKontextInpaintPipeline support * fix: Refactor code, remove mask_image_latents and ruff check * feat: Add test case and fix with pytest * Apply style fixes * copies --------- Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* initial commit * initial commit * initial commit * fix import * fix prefix * remove print * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* update * Update docs/source/en/using-diffusers/schedulers.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update schedulers.md --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* start overhauling the benchmarking suite. * fixes * fixes * checking. * checking * fixes. * error handling and logging. * add flops and params. * add more models. * utility to fire execution of all benchmarking scripts. * utility to push to the hub. * push utility improvement * seems to be working. * okay * add torchprofile dep. * remove total gpu memory * fixes * fix * need a big gpu * better * what's happening. * okay * separate requirements and make it nightly. * add db population script. * update secret name * update secret. * population db update * disable db population for now. * change to every monday * Update .github/workflows/benchmark.yml Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * quality improvements. * reparate hub upload step. * repository * remove csv * check * update * update * threading. * update * update * updaye * update * update * update * remove peft dep * upgrade runner. * fix * fixes * fix merging csvs. * push dataset to the Space repo for analysis. * warm up. * add a readme * Apply suggestions from code review Co-authored-by: Luc Georges <McPatate@users.noreply.github.com> * address feedback * Apply suggestions from code review * disable db workflow. * update to bi weekly. * enable population * enable * updaye * update * metadata * fix --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

co63oc and others added 30 commits May 30, 2025 18:49

[docs] update torchao doc link (#11634)

b975bce

update torchao doc link

Use float32 RoPE freqs in Wan with MPS backends (#11643)

3a31b29

Use float32 for RoPE on MPS in Wan

[chore] misc changes in the bnb tests for consistency. (#11355)

d4dc4d7

misc changes in the bnb tests for consistency.

[tests] chore: rename lora model-level tests. (#11481)

20273e5

chore: rename lora model-level tests.

[docs] Caching methods (#11625)

9f48394

* cache * feedback

[docs] Model cards (#11112)

c934720

* initial * update * hunyuanvideo * ltx * fix * wan * gen guide * feedback * feedback * pipeline-level quant config * feedback * ltx

[chore] bring PipelineQuantizationConfig at the top of the import cha…

0142f6f

…in. (#11656) bring PipelineQuantizationConfig at the top of the import chain.

[examples] flux-control: use num_training_steps_for_scheduler (#11662)

745199a

[examples] flux-control: use num_training_steps_for_scheduler in get_scheduler instead of args.max_train_steps * accelerator.num_processes Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
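For context, one plausible way a value like `num_training_steps_for_scheduler` could be derived is sketched below. The formula and the helper name `scheduler_training_steps` are assumptions for illustration, not taken from this commit; the point is that when no explicit `max_train_steps` is given, a step count can still be computed from the dataloader length, so the scheduler does not depend on `args.max_train_steps * accelerator.num_processes` alone.

```python
import math
from typing import Optional


def scheduler_training_steps(
    max_train_steps: Optional[int],
    num_train_epochs: int,
    dataloader_len: int,
    num_processes: int,
    gradient_accumulation_steps: int,
) -> int:
    """Hypothetical helper: total steps to hand to an LR scheduler."""
    if max_train_steps is None:
        # Each process sees only its shard of the batches.
        len_after_sharding = math.ceil(dataloader_len / num_processes)
        # Optimizer updates happen once per accumulation window.
        updates_per_epoch = math.ceil(len_after_sharding / gradient_accumulation_steps)
        return num_train_epochs * updates_per_epoch * num_processes
    return max_train_steps * num_processes


derived = scheduler_training_steps(None, 3, 1000, 4, 2)
explicit = scheduler_training_steps(500, 3, 1000, 4, 2)
```

With these sample inputs, the derived path yields 1500 steps and the explicit path yields 2000.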

use deterministic to get stable result (#11663)

0f91f2f

* use deterministic to get stable result Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * add deterministic for int8 test Signed-off-by: jiqing-feng <jiqing.feng@intel.com> --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

[tests] add test for torch.compile + group offloading (#11670)

16c955c

* add a test for group offloading + compilation. * tests

Wan VACE (#11582)

73a9d58

* initial support * make fix-copies * fix no split modules * add conversion script * refactor * add pipeline test * refactor * fix bug with mask * fix for reference images * remove print * update docs * update slices * update * update * update example

fixed axes_dims_rope init (huggingface#11641) (#11678)

f46abfe

[tests] Fix how compiler mixin classes are used (#11680)

7c6e9ef

* fix how compiler tester mixins are used. * propagate * more

Introduce DeprecatedPipelineMixin to simplify pipeline deprecation pr…

5b0dab1

…ocess (#11596) * update * update * update * update * update * update * update

Allow remote code repo names to contain "." (#11652)

b79803f

* allow loading from repo with dot in name * put new arg at the end to avoid breaking compatibility * add test for loading repo with dot in name --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[LoRA] support Flux Control LoRA with bnb 8bit. (#11655)

8e88495

support Flux Control LoRA with bnb 8bit.

[Wan] Fix VAE sampling mode in WanVideoToVideoPipeline (#11639)

e27142a

* fix: vae sampling mode * fix a typo

Improve Wan docstrings (#11689)

f3e0911

* improve docstrings for wan * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * make style --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

Set _torch_version to N/A if torch is disabled. (#11645)

447ccd0

Avoid DtoH sync from access of nonzero() item in scheduler (#11696)

b272807

Apply Occam's Razor in position embedding calculation (#11562)

47ef794

* fix: remove redundant indexing * style

swap out token for style bot. (#11701)

648e895

anijain2305 and others added 29 commits June 26, 2025 08:41

adjust tolerance criteria for test_float16_inference in unit test (#…

27bf7fc

…11809) Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

Flux Kontext (#11812)

eea7689

* support flux kontext * make fix-copies * add example * add tests * update docs * update * add note on integrity checker * make fix-copies issue * add copied froms * make style * update repository ids * more copied froms

Kontext fixes (#11815)

d7dd924

fix

remove syncs before denoising in Kontext (#11818)

21543de

[CI] disable onnx, mps, flax from the CI (#11803)

e8e44a5

* disable onnx, mps, flax * remove

TorchAO compile + offloading tests (#11697)

cdaf84a

* update * update * update * update * update * user property instead

Support dynamically loading/unloading loras with group offloading (#1…

76ec3d1

…1804) * update * add test * address review comments * update * fixes * change decorator order to fix tests * try fix * fight tests

[lora] fix: lora unloading behvaiour (#11822)

05e7a85

* fix: lora unloading behvaiour * fix * update

[lora]feat: use exclude modules to loraconfig. (#11806)

bc34fa8

* feat: use exclude modules to loraconfig. * version-guard. * tests and version guard. * remove print. * describe the test * more detailed warning message + shift to debug * update * update * update * remove test

Remove print statement in SCM Scheduler (#11836)

f064b3b

remove print

[tests] add test for hotswapping + compilation on resolution changes (#…

87f83d3

…11825) * add resolution changes tests to hotswapping test suite. * fixes * docs * explain duck shapes * fix

[tests] Fix failing float16 cuda tests (#11835)

3f3f0c1

* update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[single file] Cosmos (#11801)

a79c3af

* update * update * update docs

[docs] fix single_file example. (#11847)

4704586

fix single_file example.

[docs] Batch generation (#11841)

d31b8ce

* draft * fix * fix * feedback * feedback

[docs] Deprecated pipelines (#11838)

64a9210

add warning Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

fix norm not training in train_control_lora_flux.py (#11832)

5ef74fd

[lora] tests for exclude_modules with Wan VACE (#11843)

6f1d669

* wan vace. * update * update * import problem

[Flux Kontext] Support Fal Kontext LoRA (#11823)

f864a9a

* initial commit * initial commit * initial commit * fix import * fix prefix * remove print * Apply style fixes --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Skquark merged commit c071386 into Skquark:main Jul 4, 2025
1 of 2 checks passed

Skquark merged 108 commits into Skquark:main from huggingface:main


20 participants
