Add Timestep Sampling Function from SD3 Branch to SD (dev base) #1671
+45
−6
This PR brings the `timestep_sampling` feature from the SD3 branch to the original SD model. Compared with the default uniform sampling, the new options concentrate the sampling probability on a narrower range of timesteps, which helps the model focus on specific aspects of learning. Select an option with the `--timestep_sampling` argument:

- `uniform`: Keeps the original uniform timestep sampling, which evenly distributes the learning steps.
- `shift`: by default, `--discrete_flow_shift = 1`.

Additional Parameters:

- `--discrete_flow_shift`: Defaults to `1`. Timesteps are sampled from a random normal distribution. A rightward shift helps the model focus more on style, while a leftward shift aids in learning specific objects. The default value of `1` gives a balanced distribution.
- `--sigmoid_scale`: Adjusts the shape of the sigmoid function.

While `flux_shift` maintains the distortion effects of FLUX, its behavior in SD may differ because the models differ in nature and SD trains at fixed resolutions.
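
To illustrate how the `sigmoid` and `shift` options differ from `uniform`, here is a minimal sketch in plain Python. It is not the code from this PR. It assumes that `sigmoid` passes a standard normal draw, scaled by `--sigmoid_scale`, through a sigmoid, and that `shift` then applies the transform `t' = s*t / (1 + (s - 1)*t)` with `s = --discrete_flow_shift`. The helper names `sample_t` and `to_discrete` and the constant `NUM_TRAIN_TIMESTEPS` are hypothetical.

```python
import math
import random

NUM_TRAIN_TIMESTEPS = 1000  # assumption: typical SD noise-schedule length


def sample_t(mode, sigmoid_scale=1.0, discrete_flow_shift=1.0, rng=random):
    """Draw one fractional timestep t in [0, 1] (hypothetical helper)."""
    if mode == "uniform":
        return rng.random()  # flat density over [0, 1)
    # Squash a scaled standard normal draw through a sigmoid;
    # with scale 1 the mass concentrates around t = 0.5.
    t = 1.0 / (1.0 + math.exp(-sigmoid_scale * rng.gauss(0.0, 1.0)))
    if mode == "sigmoid":
        return t
    if mode == "shift":
        s = discrete_flow_shift
        # Assumed transform: s > 1 pushes t toward 1, s < 1 toward 0,
        # and s == 1 leaves t unchanged.
        return (s * t) / (1.0 + (s - 1.0) * t)
    raise ValueError(f"unknown timestep sampling mode: {mode}")


def to_discrete(t, num_timesteps=NUM_TRAIN_TIMESTEPS):
    """Map fractional t to an integer timestep index in [0, num_timesteps - 1]."""
    return min(int(t * num_timesteps), num_timesteps - 1)
```

Under these assumptions, `shift` with `--discrete_flow_shift = 1` reduces to `sigmoid`, which matches the description of the default as the balanced setting. Values above 1 move probability mass toward later (noisier) timesteps, and values below 1 move it toward earlier ones.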
It is recommended to use the `--timestep_sampling sigmoid` option combined with `--soft_min_snr_gamma = 1` (by rockerBOO, #1068) for better results, as these settings seem to significantly improve model performance.

I suggest merging this PR along with the soft min SNR gamma PR.
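
To make the recommended flag combination concrete, the following is a hypothetical `argparse` sketch of the options named in this description. It is not the parser from the repository. The `--sigmoid_scale` default of `1.0` and the `--soft_min_snr_gamma` default of `None` are assumptions, and `--soft_min_snr_gamma` only exists once #1068 is merged.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the flags named in this PR description."""
    p = argparse.ArgumentParser()
    p.add_argument(
        "--timestep_sampling",
        choices=["uniform", "sigmoid", "shift", "flux_shift"],
        default="uniform",  # keeps the original behavior unless changed
    )
    p.add_argument("--discrete_flow_shift", type=float, default=1.0)
    p.add_argument("--sigmoid_scale", type=float, default=1.0)  # default assumed
    # Flag from PR #1068; the default of None (disabled) is an assumption.
    p.add_argument("--soft_min_snr_gamma", type=float, default=None)
    return p


# The combination recommended above, parsed from an explicit argument list.
args = build_parser().parse_args(
    ["--timestep_sampling", "sigmoid", "--soft_min_snr_gamma", "1"]
)
print(args.timestep_sampling, args.soft_min_snr_gamma)
```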