Hi @crowsonkb, long time no see! I'm opening this issue to discuss potential improvements to the sampling methods for SDXL.
As I listed in #43, SDXL with DPM++2M shows visible artifacts due to numerical instability, especially with SDE solvers.
One possible fix is to make the final step a first-order solver step; e.g., sampling with 5 steps would use orders [1,2,2,2,1] instead of [1,2,2,2,2], as discussed in #43. I also list more examples in huggingface/diffusers#5541.
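To make the order pattern concrete, here is a minimal sketch of how the per-step orders could be chosen. The function name `solver_orders` and its parameters are hypothetical, written only for illustration; they are not k-diffusion's API.

```python
def solver_orders(num_steps, max_order=2, lower_order_final=True):
    """Return the solver order to use at each sampling step.

    Hypothetical helper for illustration only. A multistep solver has to
    warm up (step i can use at most i + 1 previous evaluations), and with
    lower_order_final=True the last step falls back to first order.
    """
    orders = [min(i + 1, max_order) for i in range(num_steps)]
    if lower_order_final and num_steps > 0:
        orders[-1] = 1  # final step uses the first-order update
    return orders


print(solver_orders(5))                           # first-order final step
print(solver_orders(5, lower_order_final=False))  # second-order final step
```

With 5 steps this gives [1, 2, 2, 2, 1] when the option is on and [1, 2, 2, 2, 2] when it is off, matching the two patterns above.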
Another possible fix is to change the step size scheduler. For example, the Karras step size scheduler you implemented is the most widely used step size scheduler in the community, and it can significantly improve sample quality. Recently I found that the Karras step sizes with $\rho=7$ are closely related to my "uniform logSNR" scheduler, which was proposed in the original DPM-Solver paper.
Specifically, note that the "Karras sigmas" correspond to $\sigma_t / \alpha_t = \exp(-\lambda_t)$, so the "log sigmas" in Karras' setting are just $-\lambda_t$. Moreover, as Karras uses a power-law spacing for the sigmas with a hyperparameter $\rho$, we can prove that when $\rho$ goes to infinity, the step sizes are equivalent to uniform $\lambda_t$, because of the definition of the exponential function, $\exp(x) = \lim_{\rho \rightarrow \infty} (1 + \frac{x}{\rho})^{\rho}$. As $\rho=7$ is already quite large, the samples from Karras sigmas and from my uniform lambdas are similar when using ODE solvers, and both can reduce the discretization errors.
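The limit can be checked numerically. The sketch below writes out the Karras spacing from its published formula, $\sigma_i = (\sigma_{\max}^{1/\rho} + \frac{i}{n-1}(\sigma_{\min}^{1/\rho} - \sigma_{\max}^{1/\rho}))^{\rho}$, and compares it with a spacing that is uniform in log sigma (and therefore uniform in $\lambda_t$). The function names and the values of `sigma_min` and `sigma_max` are assumptions chosen for illustration, and the code is plain Python rather than k-diffusion's implementation.

```python
import math


def karras_sigmas(n, sigma_min, sigma_max, rho=7.0):
    """Karras spacing: linear interpolation in sigma**(1/rho), from sigma_max down to sigma_min."""
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    return [
        (max_inv_rho + i / (n - 1) * (min_inv_rho - max_inv_rho)) ** rho
        for i in range(n)
    ]


def uniform_log_sigmas(n, sigma_min, sigma_max):
    """Spacing that is uniform in log(sigma), i.e. uniform in logSNR."""
    log_min, log_max = math.log(sigma_min), math.log(sigma_max)
    return [
        math.exp(log_max + i / (n - 1) * (log_min - log_max))
        for i in range(n)
    ]


def max_log_gap(n, sigma_min, sigma_max, rho):
    """Largest |log sigma| difference between the two schedules."""
    ks = karras_sigmas(n, sigma_min, sigma_max, rho)
    us = uniform_log_sigmas(n, sigma_min, sigma_max)
    return max(abs(math.log(k) - math.log(u)) for k, u in zip(ks, us))


# Illustrative sigma range (an assumption, not taken from the issue).
for rho in (7.0, 100.0, 10000.0):
    print(rho, max_log_gap(25, 0.03, 14.6, rho))
```

The printed gap shrinks as $\rho$ grows, which is the convergence claimed above. Both schedules share the same endpoints for every $\rho$, so the difference is entirely in how the intermediate steps are distributed.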
However, for SDE solvers, the Karras step sizes and my uniform logSNR step sizes behave quite differently, due to the Gaussian noise injected along the trajectory. For example, here is a sample for the prompt "a cat" with DPM++2M SDE, steps=25, SDXL (no refiner):
I think the uniform logSNR step size is quite interesting and it can also provide beautiful samples, so it may bring new insights to the community. Could you please also integrate this step size scheduler in your k-diffusion?
I intended to implement the first-order last step when I first saw this issue but forgot. I will get to it soon (it will be an option that is off by default).
The code is quite simple, for example: huggingface/diffusers@892fec9