Skip to content

Merge changes#213

Merged
Skquark merged 108 commits intoSkquark:mainfrom
huggingface:main
Jul 4, 2025
Merged

Merge changes#213
Skquark merged 108 commits intoSkquark:mainfrom
huggingface:main

Conversation

@Skquark
Copy link
Owner

@Skquark Skquark commented Jul 4, 2025

No description provided.

co63oc and others added 30 commits May 30, 2025 18:49
* Fix typos in strings and comments

Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------

Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
misc changes in the bnb tests for consistency.
chore: rename lora model-level tests.
* initial

* update

* hunyuanvideo

* ltx

* fix

* wan

* gen guide

* feedback

* feedback

* pipeline-level quant config

* feedback

* ltx
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* updatee

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
…in. (#11656)

bring PipelineQuantizationConfig at the top of the import chain.
[examples] flux-control: use num_training_steps_for_scheduler in get_scheduler instead of args.max_train_steps * accelerator.num_processes

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* use deterministic to get stable result

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add deterministic for int8 test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
* add a test for group offloading + compilation.

* tests
* initial support

* make fix-copies

* fix no split modules

* add conversion script

* refactor

* add pipeline test

* refactor

* fix bug with mask

* fix for reference images

* remove print

* update docs

* update slices

* update

* update

* update example
* fix how compiler tester mixins are used.

* propagate

* more
…ocess (#11596)

* update

* update

* update

* update

* update

* update

* update
* Add community class StableDiffusionXL_T5Pipeline
Will be used with base model opendiffusionai/stablediffusionxl_t5

* Changed pooled_embeds to use projection instead of slice

* "make style" tweaks

* Added comments to top of code

* Apply style fixes
…ly the inpainted area (#11658)

* Update pipeline_flux_inpaint.py to fix padding_mask_crop returning only the inpainted area and not the entire image.

* Apply style fixes

* Update src/diffusers/pipelines/flux/pipeline_flux_inpaint.py
* allow loading from repo with dot in name

* put new arg at the end to avoid breaking compatibility

* add test for loading repo with dot in name

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
support Flux Control LoRA with bnb 8bit.
* fix: vae sampling mode

* fix a typo
…or test cases (#11654)

* enable torchao cases on XPU

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* device agnostic APIs

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable test_torch_compile_recompilation_and_graph_break on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* resolve comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
* start adding compilation tests for quantization.

* fixes

* make common utility.

* modularize.

* add group offloading+compile

* xfail

* update

* Update tests/quantization/test_torch_compile_utils.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* add clarity in documentation for device_map

* docs

* fix how compiler tester mixins are used.

* propagate

* more

* typo.

* fix tests

* fix order of decroators.

* clarify more.

* more test cases.

* fix doc

* fix device_map docstring in pipeline_utils.

* more examples

* more

* update

* remove code for stuff that is already supported.

* fix stuff.
* improve docstrings for wan

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* fix: remove redundant indexing

* style
* add compilation bits to the bitsandbytes docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* finish

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
anijain2305 and others added 29 commits June 26, 2025 08:41
* [rfc][compile] compile method for DiffusionPipeline

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* Update docs/source/en/optimization/fp16.md

* check

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* add test for checking compile on different shapes.

* update

* update

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
…11809)

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
* support flux kontext

* make fix-copies

* add example

* add tests

* update docs

* update

* add note on integrity checker

* make fix-copies issue

* add copied froms

* make style

* update repository ids

* more copied froms
* support flux kontext

* make fix-copies

* add example

* add tests

* update docs

* update

* add note on integrity checker

* initial commit

* initial commit

* add readme section and fixes in the training script.

* add test

* rectify ckpt_id

* fix ckpt

* fixes

* change id

* update

* Update examples/dreambooth/train_dreambooth_lora_flux_kontext.py

Co-authored-by: Aryan <aryan@huggingface.co>

* Update examples/dreambooth/README_flux.md

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: linoytsaban <linoy@huggingface.co>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
* disable onnx, mps, flax

* remove
* update

* update

* update

* update

* update

* user property instead
…1804)

* update

* add test

* address review comments

* update

* fixes

* change decorator order to fix tests

* try fix

* fight tests
* fix: lora unloading behvaiour

* fix

* update
* feat: use exclude modules to loraconfig.

* version-guard.

* tests and version guard.

* remove print.

* describe the test

* more detailed warning message + shift to debug

* update

* update

* update

* remove test
* ENH Improve speed of expanding LoRA scales

Resolves #11816

The following call proved to be a bottleneck when setting a lot of LoRA
adapters in diffusers:

https://github.com/huggingface/diffusers/blob/cdaf84a708eadf17d731657f4be3fa39d09a12c0/src/diffusers/loaders/peft.py#L482

This is because we would repeatedly call unet.state_dict(), even though
in the standard case, it is not necessary:

https://github.com/huggingface/diffusers/blob/cdaf84a708eadf17d731657f4be3fa39d09a12c0/src/diffusers/loaders/unet_loader_utils.py#L55

This PR fixes this by deferring this call, so that it is only run when
it's necessary, not earlier.

* Small fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
…11825)

* add resolution changes tests to hotswapping test suite.

* fixes

* docs

* explain duck shapes

* fix
* reset deterministic in tearDownClass

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix deterministic setting

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* update

* update

* update docs
* use real instead of complex tensors in Wan2.1 RoPE

* remove the redundant type conversion

* unpack rotary_emb

* register rotary embedding frequencies as non-persistent buffers

* Apply style fixes

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* draft

* fix

* fix

* feedback

* feedback
add warning

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
…ansformer` (#11807)

* add `WandVACETransformer3DModel` in`SINGLE_FILE_LOADABLE_CLASSES`

* add rename keys for `VACE`

add rename keys for `VACE`

* fix typo

Sincere thanks to @nitinmukesh 🙇‍♂️

* support for `1.3B VACE` model

Sincere thanks to @nitinmukesh again🙇‍♂️

* update

* update

* Apply style fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* wan vace.

* update

* update

* import problem
* update: FluxKontextInpaintPipeline support

* fix: Refactor code, remove mask_image_latents and ruff check

* feat: Add test case and fix with pytest

* Apply style fixes

* copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* initial commit

* initial commit

* initial commit

* fix import

* fix prefix

* remove print

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* update

* Update docs/source/en/using-diffusers/schedulers.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update schedulers.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* start overhauling the benchmarking suite.

* fixes

* fixes

* checking.

* checking

* fixes.

* error handling and logging.

* add flops and params.

* add more models.

* utility to fire execution of all benchmarking scripts.

* utility to push to the hub.

* push utility improvement

* seems to be working.

* okay

* add torchprofile dep.

* remove total gpu memory

* fixes

* fix

* need a big gpu

* better

* what's happening.

* okay

* separate requirements and make it nightly.

* add db population script.

* update secret name

* update secret.

* population db update

* disable db population for now.

* change to every monday

* Update .github/workflows/benchmark.yml

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* quality improvements.

* reparate hub upload step.

* repository

* remove csv

* check

* update

* update

* threading.

* update

* update

* updaye

* update

* update

* update

* remove peft dep

* upgrade runner.

* fix

* fixes

* fix merging csvs.

* push dataset to the Space repo for analysis.

* warm up.

* add a readme

* Apply suggestions from code review

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* address feedback

* Apply suggestions from code review

* disable db workflow.

* update to bi weekly.

* enable population

* enable

* updaye

* update

* metadata

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
@Skquark Skquark merged commit c071386 into Skquark:main Jul 4, 2025
1 of 2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.