Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
asomoza
left a comment
thanks, looks great, just a couple of comments that aren't blockers, just my opinion.
We will first need to install some additional dependencies.
```shell
maybe we should start telling the users what the additional dependencies are, with a link to them, so they feel more secure and understand what they are installing?
we can also just add a link to the PyPI page: https://pypi.org/project/ftfy/
Also, now that I see it, maybe this shouldn't be a required dependency but an optional one? I'll take a look later at how it's used.
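For reference, `ftfy` repairs mojibake (text that was decoded with the wrong encoding). A minimal sketch of what it does; the input string is illustrative, `fix_text` is ftfy's main entry point:

```python
# ftfy repairs text that was decoded with the wrong encoding ("mojibake").
# The broken string below is illustrative.
import ftfy

broken = "The Mona Lisa doesnâ€™t have eyebrows."
print(ftfy.fix_text(broken))  # -> "The Mona Lisa doesn't have eyebrows."
```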
docs/source/en/api/pipelines/wan.md (Outdated)
## Recommendations for Inference:
- Keep `AutoencoderKLWan` in `torch.float32` for better decoding quality.
- `num_frames` should be of the form `4 * k + 1`, for example `49` or `81`.
maybe we can be more clear here and write that k is the frames per second, or fps, in more common language?
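As a quick illustration of the `4 * k + 1` form (a hedged sketch, not from the PR; `k` is just a positive integer here):

```python
# num_frames must be of the form 4 * k + 1, e.g. 49 (k=12) or 81 (k=20).
def is_valid_num_frames(num_frames: int) -> bool:
    return num_frames >= 1 and num_frames % 4 == 1

print([n for n in (16, 48, 49, 81) if is_valid_num_frames(n)])  # -> [49, 81]
```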
#### Block Level Group Offloading
We can reduce our VRAM requirements by applying group offloading to the larger model components of the pipeline: the `WanTransformer3DModel` and `UMT5EncoderModel`. Group offloading will break up the individual modules of a model and offload/onload them onto your GPU as needed during inference. In this example, we'll apply `block_level` offloading, which will group the modules in a model into blocks of size `num_blocks_per_group` and offload/onload them to the GPU. Moving between CPU and GPU does add latency to the inference process. You can trade off between latency and memory savings by increasing or decreasing `num_blocks_per_group`.
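To make that concrete, here is a minimal sketch of block-level group offloading, assuming the `apply_group_offloading` signature imported in the snippet quoted further down; the model ID and `num_blocks_per_group` value are illustrative:

```python
import torch
from diffusers import WanTransformer3DModel
from diffusers.hooks.group_offloading import apply_group_offloading
from transformers import UMT5EncoderModel

onload_device = torch.device("cuda")
offload_device = torch.device("cpu")

# Model ID is illustrative; substitute the checkpoint you are using.
transformer = WanTransformer3DModel.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", subfolder="transformer", torch_dtype=torch.bfloat16
)
text_encoder = UMT5EncoderModel.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", subfolder="text_encoder", torch_dtype=torch.bfloat16
)

# Group each model's modules into blocks of 4 and move each block on/off the
# GPU as needed. Larger num_blocks_per_group -> less latency, more VRAM.
for model in (transformer, text_encoder):
    apply_group_offloading(
        model,
        onload_device=onload_device,
        offload_device=offload_device,
        offload_type="block_level",
        num_blocks_per_group=4,
    )
```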
Thank you for this, super useful information. I've been struggling to get Wan i2v and group offloading working, and bnb as well. Are quantizations (e.g. bitsandbytes) supposed to work with Wan too?
from diffusers import AutoencoderKLWan, WanTransformer3DModel, WanImageToVideoPipeline
from diffusers.hooks.group_offloading import apply_group_offloading
from diffusers.utils import export_to_video, load_image
from transformers import UMT5EncoderModel, CLIPVisionMode
`CLIPVisionMode` is missing an "l", it should be `CLIPVisionModel`.
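With the typo fixed, the import line would read:

```python
from transformers import UMT5EncoderModel, CLIPVisionModel
```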
| "An astronaut hatching from an egg, on the surface of the moon, the darkness and depth of space realised in " | ||
| "the background. High quality, ultrarealistic detail and breath-taking movie-like camera shot." | ||
| ) | ||
| negative_prompt = "Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards |
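For completeness, a hedged sketch of how these prompts would typically feed into the pipeline; the model ID, input image URL, and generation parameters are illustrative assumptions, not from the PR:

```python
import torch
from diffusers import WanImageToVideoPipeline
from diffusers.utils import export_to_video, load_image

# Model ID is illustrative; substitute the checkpoint you are using.
pipe = WanImageToVideoPipeline.from_pretrained(
    "Wan-AI/Wan2.1-I2V-14B-480P-Diffusers", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

image = load_image("https://example.com/astronaut.png")  # placeholder input image
frames = pipe(
    image=image,
    prompt=prompt,                    # the multi-line prompt quoted above
    negative_prompt=negative_prompt,  # the negative prompt quoted above
    num_frames=81,                    # of the form 4 * k + 1, per the docs
).frames[0]
export_to_video(frames, "output.mp4", fps=16)
```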
What does this PR do?
Based on feedback here: https://huggingface.slack.com/archives/C065E480NN9/p1742176300453069
Fixes # (issue)
Before submitting
Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag members/contributors who may be interested in your PR.