
Unsatisfactory result. #23

Open
GuHuangAI opened this issue Dec 13, 2022 · 6 comments
@GuHuangAI

I have trained a 1000-step diffusion model, and I get fine results using the 1000-step reverse process.
These are the original images (the two are concatenated):
[image: ori2]
and these are the generated images:
[image: generate2]

However, when using dpm_solver I get unsatisfactory or even worse results.
Here is the 20-step image:
[image: dpm-20]

100-step:
[image: dpm]

What happened, and what should I do?

@LuChengTHU
Owner

Hi @GuHuangAI ,

Could you please provide a detailed example code for using DPM-Solver? (e.g. which algorithm and which hyperparameters)

@GuHuangAI
Author

GuHuangAI commented Dec 13, 2022

> Hi @GuHuangAI ,
>
> Could you please provide a detailed example code for using DPM-Solver? (e.g. which algorithm and which hyperparameters)

Thanks for your reply.
Actually, I modified the original UNet and designed a mask-controlled diffusion model. Therefore my model takes an additional input, and I modified the `model_fn` function of DPM_Solver accordingly.
My code looks like this:

```python
noise_schedule = NoiseScheduleVP(schedule='cosine')
model_fn = model_wrapper(
    model,
    noise_schedule,
    model_type="noise",  # or "x_start" or "v" or "score"
    # model_kwargs=None,
)
dpm_solver = DPM_Solver(model_fn, noise_schedule)  # instantiation implied by the sample() call below
x_T = torch.randn((mask.shape[0], 3, *mask.shape[2:]), device=mask.device)
x_sample = dpm_solver.sample(
    x_T,
    mask,  # extra conditioning input, handled by my modified model_fn
    steps=20,
    order=2,
    skip_type="time_uniform",
    method="multistep",
)
```

In a word, I use the mask to generate images:
Input:
[image: mask2]
Output:
[image: generate2]

Since this work is used in industrial projects, I'm sorry that I cannot share my code.

@LuChengTHU
Owner

Hi @GuHuangAI ,

Thank you for the detailed settings!

In my opinion, a mask-conditioned diffusion model is the classical "inpainting" problem in diffusion models. The traditional way to solve it is to apply the masked input at each step of the guided sampling procedure. Here is an example implementation of inpainting: https://github.com/LuChengTHU/dpm-solver/blob/main/example_v2/stable-diffusion/scripts/diffedit_inpaint.ipynb
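A minimal numpy sketch of the masked-replacement idea (the helper name `inpaint_step` is hypothetical; the notebook linked above is the authoritative reference): after each solver update, the known region of x_t is overwritten with a freshly noised copy of the ground-truth pixels at the current noise level.

```python
import numpy as np

def inpaint_step(x_t, known_x0, mask, alpha_t, sigma_t, rng=None):
    """Replace the known region of x_t with noised ground-truth pixels.

    mask == 1 marks pixels to generate; mask == 0 marks known pixels.
    alpha_t, sigma_t are the forward-process coefficients at the current time,
    i.e. q(x_t | x_0) = N(alpha_t * x_0, sigma_t**2 * I).
    """
    if rng is None:
        rng = np.random.default_rng(0)
    known_noised = alpha_t * known_x0 + sigma_t * rng.standard_normal(known_x0.shape)
    return mask * x_t + (1.0 - mask) * known_noised
```

In a real sampler this would be applied after every DPM-Solver (or DDIM) update, so the generated region stays consistent with the fixed pixels.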

However, a key point in implementing inpainting is that we do not use t_start=1.0 as the starting time. Instead, we often use something like t_start=0.5 or 0.6. The starting value is then a stochastic encoding of your masked image at time t_start (please check the above example for details).
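Concretely, the stochastic encoding at t_start is just a draw from the forward process q(x_t | x_0). A numpy sketch under the continuous cosine schedule (the `alpha_sigma_cosine` parameterization below follows Nichol & Dhariwal and is an assumption for illustration; in practice one would read alpha_t and sigma_t from the NoiseScheduleVP):

```python
import numpy as np

def alpha_sigma_cosine(t, s=0.008):
    # alpha_t, sigma_t for the continuous cosine schedule, satisfying
    # alpha_t**2 + sigma_t**2 == 1 (variance-preserving).
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    return np.sqrt(f), np.sqrt(1.0 - f)

def stochastic_encode(x0, t_start, rng=None):
    # Draw x_t ~ q(x_t | x_0) = N(alpha_t * x0, sigma_t**2 * I).
    if rng is None:
        rng = np.random.default_rng(0)
    alpha, sigma = alpha_sigma_cosine(t_start)
    return alpha * x0 + sigma * rng.standard_normal(x0.shape)

# Start sampling from t_start = 0.5 instead of pure noise at t = 1.0:
x_start = stochastic_encode(np.zeros((1, 3, 8, 8)), t_start=0.5)
```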

In addition, I have another question: does your model use a continuous cosine schedule? If not, please use the "discrete" schedule and provide the betas to NoiseScheduleVP.
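If the model was trained with a discrete cosine schedule, the betas can be recovered from the discrete alphas-cumprod and passed via `NoiseScheduleVP(schedule='discrete', betas=...)`, as the dpm-solver README describes. A sketch (the helper name `cosine_betas` is hypothetical):

```python
import numpy as np

def cosine_betas(num_steps=1000, s=0.008, max_beta=0.999):
    # Discrete betas for the cosine schedule (Nichol & Dhariwal, 2021):
    # beta_i = 1 - alphabar_i / alphabar_{i-1}, clipped for stability.
    t = np.arange(num_steps + 1) / num_steps
    f = np.cos((t + s) / (1 + s) * np.pi / 2) ** 2
    alphas_cumprod = f / f[0]
    betas = 1.0 - alphas_cumprod[1:] / alphas_cumprod[:-1]
    return np.clip(betas, 0.0, max_beta)

betas = cosine_betas(1000)
# Then, following the dpm-solver README (a torch tensor is expected):
# noise_schedule = NoiseScheduleVP(schedule='discrete', betas=torch.from_numpy(betas))
```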

@GuHuangAI
Author

@LuChengTHU, thanks for your suggestions. I have tried many combinations of hyperparameters; unfortunately, they all failed. Maybe my model is so sensitive that it only produces good samples with many steps. (-. -)

@LuChengTHU
Owner

Hi @GuHuangAI ,

Have you tried the original DDIM? Can it produce clean samples? If it cannot either, then the problem is probably related to your sensitive model. Please follow these instructions to check whether DPM-Solver is suitable for your model: https://github.com/LuChengTHU/dpm-solver#suggestions-for-choosing-the-hyperparameters

@CreamyLong

I have the same problem, and I did not get clean pictures using DDIM in guided-diffusion either. What could be the reason that DDIM or DPM-Solver does not work?
