
Add Timestep Sampling Function from SD3 Branch to SD (dev base) #1671

Open
wants to merge 3 commits into dev
Conversation

gesen2egee
Contributor

This PR introduces the timestep_sampling feature from the SD3 branch into the original SD model. The new timestep sampling options offer more concentrated probability distributions than the default uniform sampling, which helps the model focus on specific aspects of learning. The new options are selected via the --timestep_sampling argument.

  • Default (uniform): Keeps the original uniform timestep sampling, which distributes training evenly across all timesteps.
  • Sigmoid: Equivalent to shift when --discrete_flow_shift is left at its default value of 1.
  • Shift: Skews the timestep distribution, helping the model concentrate on style learning or on specific objects.
  • Flux Shift: Shifts the distribution based on the image size, similar to how FLUX behaves on larger images.

Additional Parameters:

  • --discrete_flow_shift: Defaults to 1, which leaves the sigmoid of the random normal sample unchanged (a balanced form). Shifting the distribution to the right helps the model focus more on style, while shifting it to the left aids in learning specific objects.
  • --sigmoid_scale: Adjusts the shape of the sigmoid function.

While flux_shift preserves FLUX's resolution-dependent shifting behavior, its effect may differ in SD due to differences in model architecture and training at fixed resolutions.
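
As a rough illustration only (not the exact code in this PR), the sketch below shows how these modes could map a normal sample onto an SD timestep. The flux_shift sequence-length formula and the final discretization to 0–999 are assumptions based on the SD3/FLUX-branch behavior described above.

```python
import math
import torch

def sample_timesteps(mode, batch_size, num_train_timesteps=1000,
                     discrete_flow_shift=1.0, sigmoid_scale=1.0,
                     height=1024, width=1024, device="cpu"):
    if mode == "uniform":
        # original SD behaviour: every discrete timestep equally likely
        return torch.randint(0, num_train_timesteps, (batch_size,), device=device)

    # sigmoid of a scaled normal sample concentrates t around 0.5
    t = torch.sigmoid(torch.randn(batch_size, device=device) * sigmoid_scale)

    if mode == "shift":
        # discrete_flow_shift > 1 skews t toward the noisier end,
        # < 1 toward the cleaner end; 1 leaves the sigmoid unchanged
        s = discrete_flow_shift
        t = (s * t) / (1 + (s - 1) * t)
    elif mode == "flux_shift":
        # resolution-dependent shift, mimicking FLUX on larger images;
        # the sequence-length and mu interpolation below are assumptions
        seq_len = (height // 16) * (width // 16)
        mu = 0.5 + (1.15 - 0.5) * (seq_len - 256) / (4096 - 256)
        t = math.exp(mu) / (math.exp(mu) + (1.0 / t - 1.0))

    # map continuous t in (0, 1) onto SD's discrete timestep indices
    return (t * num_train_timesteps).long().clamp(0, num_train_timesteps - 1)
```

With --discrete_flow_shift left at 1, the shift mode reduces to plain sigmoid sampling, which is why the two behave the same by default.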

It is recommended to use the --timestep_sampling sigmoid option combined with --soft_min_snr_gamma = 1 (from #1068 by rockerBOO) for better results, as these settings seem to significantly improve model performance.

It is suggested to merge this PR together with the soft min SNR gamma PR.
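
For reference, soft min SNR gamma is a smooth stand-in for the hard min(SNR, γ) loss-weight clip; one common harmonic formulation is sketched below. This is only an illustration and may differ from the exact implementation in #1068.

```python
import torch

def soft_min_snr_weight(snr: torch.Tensor, gamma: float = 1.0) -> torch.Tensor:
    # Harmonic "soft minimum" of snr and gamma (an assumed formulation, not
    # necessarily the exact one in #1068): close to snr when snr << gamma,
    # close to gamma when snr >> gamma, with a smooth transition in between.
    return (snr * gamma) / (snr + gamma)
```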


recris commented Oct 4, 2024

Great job! But I wonder if we could improve the approach to adding additional sampling functions.

The space of sampling strategies is vast, and there are other valid functions beyond the ones added here.

If we keep adding more parameters every time a new approach shows up, maintenance will become more difficult. I've been thinking about this problem for some time, and I'd like to propose a more flexible mechanism.

The idea is to make timestep sampling pluggable in the same way the LR scheduler or optimizer is. We could have a generic interface (class), say TimestepSampler, and different functions would be implemented as subclasses. We would then choose the class to use via a configuration parameter (like we do for optimizers with optimizer_type and optimizer_args).

This way we could add other functions in the future without touching the existing code; it would also simplify the maintenance of private forks by reducing the need to deal with rebase conflicts (which is the main reason I am interested in this).
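
A rough sketch of what such a pluggable interface could look like; the class names, registry, and argument-passing scheme below are illustrative only, not a proposed final API:

```python
from abc import ABC, abstractmethod

import torch


class TimestepSampler(ABC):
    """Base class; concrete samplers are selected via a config parameter."""

    def __init__(self, num_train_timesteps: int, **sampler_args):
        self.num_train_timesteps = num_train_timesteps
        self.sampler_args = sampler_args

    @abstractmethod
    def sample(self, batch_size: int, device) -> torch.Tensor:
        """Return a batch of discrete timestep indices."""


class UniformTimestepSampler(TimestepSampler):
    def sample(self, batch_size, device):
        return torch.randint(0, self.num_train_timesteps, (batch_size,), device=device)


class SigmoidTimestepSampler(TimestepSampler):
    def sample(self, batch_size, device):
        scale = self.sampler_args.get("sigmoid_scale", 1.0)
        t = torch.sigmoid(torch.randn(batch_size, device=device) * scale)
        return (t * self.num_train_timesteps).long().clamp(0, self.num_train_timesteps - 1)


# Selection could then mirror optimizer_type / optimizer_args, e.g.
# --timestep_sampler_type "sigmoid" --timestep_sampler_args "sigmoid_scale=1.5"
TIMESTEP_SAMPLERS = {"uniform": UniformTimestepSampler, "sigmoid": SigmoidTimestepSampler}
```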

I leave it to @kohya-ss to comment on this as a long-term approach.

@FurkanGozukara

@gesen2egee thank you so much

For FLUX dev, I found that Model Prediction Type = raw and Timestep Sampling = sigmoid work well together.

What do you think about this?

Commits: Update train_util.py, Update train_util.py
kohya-ss (Owner) commented Oct 6, 2024

Thank you for updating the PR! This seems to be effective, but what recris said also makes sense. Please give me some time to consider it.


bghira commented Oct 16, 2024

sampling continuous timesteps for a discrete model seems like a problem
