Add stochastic sampling to FlowMatchEulerDiscreteScheduler #11369
This PR adds stochastic sampling to FlowMatchEulerDiscreteScheduler based on Lightricks/LTX-Video@b1aeddd ltx_video/schedulers/rf.py
thanks @apolinario
dt = sigma_next - sigma

prev_sample = sample + dt * model_output
# Determine whether to use stochastic sampling for this step
use_stochastic = stochastic_sampling if stochastic_sampling is not None else self.config.stochastic_sampling
I think just having this in the config is enough, no?
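For readers following this thread, the difference between the deterministic step in the hunk above and a stochastic one can be sketched as follows. This is a minimal sketch, assuming the flow-matching convention x_sigma = (1 - sigma) * x0 + sigma * noise with the model predicting the velocity (noise - x0). The function names and the plain-list representation are illustrative and are not taken from the PR or from rf.py.

```python
import random


def euler_step(sample, model_output, sigma, sigma_next):
    """Deterministic Euler step: x_next = x + (sigma_next - sigma) * v."""
    dt = sigma_next - sigma
    return [x + dt * v for x, v in zip(sample, model_output)]


def stochastic_step(sample, model_output, sigma, sigma_next, rng):
    """Assumed stochastic variant: estimate x0, then re-noise it to sigma_next."""
    # Under the assumed convention, x - sigma * v recovers the clean sample.
    x0 = [x - sigma * v for x, v in zip(sample, model_output)]
    # Fresh Gaussian noise is drawn for every element at every step.
    return [(1.0 - sigma_next) * c + sigma_next * rng.gauss(0.0, 1.0) for c in x0]


if __name__ == "__main__":
    rng = random.Random(0)
    sample, velocity = [0.75, -0.875], [-0.5, 2.25]
    print(euler_step(sample, velocity, 0.5, 0.25))
    print(stochastic_step(sample, velocity, 0.5, 0.25, rng))
```

When sigma_next is 0 both functions return the same clean estimate. For any other sigma_next the stochastic step replaces the noise carried by the current sample with a new draw, which is why its result depends on the random generator.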
src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py
current_sigma = per_token_sigmas[..., None]
next_sigma = lower_sigmas[..., None]
dt = next_sigma - current_sigma  # Equivalent to sigma_next - sigma
@apolinario
Here it seems to be reversed, no?
before:
dt = (per_token_sigmas - lower_sigmas)[..., None]
now:
dt = lower_sigmas - per_token_sigmas
good catch!
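The sign issue raised in this thread can be illustrated with a small scalar sketch. It assumes only that the sign of dt has to match the sign used in the update that consumes it. The excerpt does not show how the per-token branch applies dt, so the three function names below are hypothetical and are not code from the PR.

```python
def step_sigma_minus_next(sample, velocity, sigma, sigma_next):
    """Convention A: dt = sigma - sigma_next (positive while denoising), then subtract."""
    dt = sigma - sigma_next
    return sample - dt * velocity


def step_next_minus_sigma(sample, velocity, sigma, sigma_next):
    """Convention B: dt = sigma_next - sigma (negative while denoising), then add."""
    dt = sigma_next - sigma
    return sample + dt * velocity


def step_mixed(sample, velocity, sigma, sigma_next):
    """Mismatch: convention B's dt combined with convention A's subtraction."""
    dt = sigma_next - sigma
    return sample - dt * velocity


if __name__ == "__main__":
    args = (0.75, -0.5, 0.5, 0.25)
    print(step_sigma_minus_next(*args), step_next_minus_sigma(*args), step_mixed(*args))
```

Conventions A and B give the same result, while the mixed version moves the sample in the opposite direction. Flipping the definition of dt without also flipping the update is the kind of silent regression the review comment is guarding against.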
Quick question: should it be LTXPipeline or LTXConditionPipeline?
Co-authored-by: YiYi Xu <yixu310@gmail.com>
thanks @apolinario!
Thank you for adding the sampling. Could you please share a few sample outputs that you created? I am not getting good results, so I want to compare and check whether something is wrong in the code.
@nitinmukesh I think it's probably better in LTXPipeline. The model here is 0.9.6-distilled (the only one that uses stochastic sampling as of now). "0.9.5" is included because the transformer and scheduler from 0.9.6 are inserted, which is fine because nothing else in the pipeline is different from 0.9.5 and there's currently nothing in the Lightricks 0.9.6 repos. 0.9.6-distilled is guidance-distilled, so it does not work in the condition pipeline, while 0.9.6 does.
Also, I find that increasing the scheduler's shift while using the distilled model helps to boost coherence. This is in line with FastVideo (PCM distillation), which says to set the shift to 17. The 1.0 in https://huggingface.co/multimodalart/ltxv-2b-0.9.6-distilled/blob/main/scheduler/scheduler_config.json doesn't seem like a good default. I'm unsure what it is in the original LTX repo. Some different shifts:
0.25: 0.25.mp4
0.5: 0.5.mp4
1.0: 1.mp4
2.0: 2.mp4
4.0: 4.mp4
8.0: 8.mp4
16.0: 16.mp4
32: 32.mp4
64: 64.mp4
16 seems like a good default (tested by adding
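To make the effect of the shift values compared above more concrete, here is a minimal sketch of a static time-shift remapping. It assumes the form sigma' = shift * sigma / (1 + (shift - 1) * sigma), which is the usual static shift in flow-matching schedulers of this kind. shift_sigma is an illustrative helper, not a diffusers function.

```python
def shift_sigma(sigma, shift):
    """Remap a sigma in [0, 1]; shift > 1 pushes values toward 1 (more noise)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


def shifted_schedule(num_steps, shift):
    """Linearly spaced sigmas from 1 down to 0, remapped by the shift."""
    base = [1.0 - i / num_steps for i in range(num_steps + 1)]
    return [shift_sigma(s, shift) for s in base]


if __name__ == "__main__":
    for shift in (0.25, 1.0, 16.0):
        print(shift, [round(s, 3) for s in shifted_schedule(4, shift)])
```

A shift of 1.0 leaves the schedule unchanged, and the endpoints 0 and 1 stay fixed for any shift. Larger values keep most steps at high noise levels, which is consistent with the observation in this thread that very large shifts eventually degrade detail. In diffusers the shift is a scheduler config value, so one way to experiment, assuming the usual config-override behaviour, is to rebuild the scheduler from its config with a different shift.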
Thank you @Ednaordinary. The information you provided is very helpful. Getting better results than before. distilled_scheduler1.mp4
249 frames: distilled_scheduler2.mp4
Looks great! What I've noticed so far is that the background is often repetitive in a weird way, as shown in your 249-frame example. Sometimes this can be solved by increasing the shift to an insanely large amount (think in the 200s), but that also incurs everything else that comes from running the shift that high (eventually, everything just turns into blobs).
Sure, will try that, thank you. Next I'm going to try whether distilled supports image-to-video (LTXImageToVideoPipeline).
I2V is working well. distilled_scheduler6.mp4
Result from 0.9.1 using the same image
What do you suggest for 0.9.6 dev, LTXPipeline or the conditioning pipeline, in case you tried it?
0.9.6 should be the condition pipeline, I'm pretty sure.
What does this PR do?
This PR adds stochastic sampling to FlowMatchEulerDiscreteScheduler based on Lightricks/LTX-Video@b1aeddd ltx_video/schedulers/rf.py, which was added with the release of 0.9.6-distilled. I decoupled the next and current sigma to try to get closer to the rf.py implementation of the stochastic sampling, but a second pair of eyes on this would be great.
To try it:
Who can review?
@yiyixuxu