[Wan] Standardize vae.encode() sampling mode in WanVideoToVideoPipeline #11639
DN6 merged 2 commits into huggingface:main
a-r-r-o-w left a comment
Looks correct to me, because we use argmax in the other pipelines as well, IIRC.
In your "current" fix mp4, it appears that you have outlier pixels above or below the RGB threshold. I'm talking about those red spots and the light beam that moves across the video.
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
When I complete SkyReels-V2 integration, I will return to this PR. It is almost done.
Thanks for merging! I have been thinking about this PR. The
Title changed: [Wan] Fix VAE sampling mode in WanVideoToVideoPipeline → [Wan] Standardize vae.encode() sampling mode in WanVideoToVideoPipeline
While integrating the SkyReels-V2 models, I came across this: major Wan-related repos, including Wan-Video/Wan2.1, modelscope/DiffSynth-Studio, and SkyworkAI/SkyReels-V2, prefer `sample_mode == "argmax"` for the output of their VAE encoders. The other Wan pipelines in `diffusers` do the same. Am I correct?

Also fixes a typo.
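For readers unfamiliar with the two modes, here is a minimal sketch of the difference. It assumes the encoder output is a diagonal Gaussian over the latents. `ToyDiagonalGaussian` and `pick_latents` are hypothetical names made up for illustration. They are not the `diffusers` API, and the sketch uses only the standard library.

```python
import math
import random


class ToyDiagonalGaussian:
    """Hypothetical stand-in for a VAE encoder's latent distribution."""

    def __init__(self, mean, logvar):
        self.mean = list(mean)
        # Standard deviation derived from the log-variance.
        self.std = [math.exp(0.5 * lv) for lv in logvar]

    def sample(self, rng):
        # Stochastic: mean plus Gaussian noise scaled by the std.
        return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(self.mean, self.std)]

    def mode(self):
        # Deterministic: the mode of a Gaussian is its mean.
        return list(self.mean)


def pick_latents(dist, sample_mode="sample", rng=None):
    """Toy dispatcher for the two modes discussed above (an assumption, not library code)."""
    if sample_mode == "sample":
        return dist.sample(rng or random.Random())
    if sample_mode == "argmax":
        return dist.mode()
    raise ValueError(f"unknown sample_mode: {sample_mode!r}")


dist = ToyDiagonalGaussian(mean=[0.5, -1.0], logvar=[0.0, -2.0])
print(pick_latents(dist, "argmax"))                    # always the mean
print(pick_latents(dist, "sample", random.Random(0)))  # depends on the seed
```

Under these assumptions, `"argmax"` gives the same latents for the same input video on every call, while `"sample"` adds seed-dependent noise on top of the mean.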
hiker.mp4
wan-v2v.mp4
wan-v2v-fixed.mp4
I am unsure if there is supposed to be a visible fix 🤔.
Reproducer
@DN6 @a-r-r-o-w