-
Notifications
You must be signed in to change notification settings - Fork 867
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update debiased estimation loss function to accommodate V-pred #1715
Conversation
1) Updates debiased estimation loss function for V-pred. 2) Prevents now-deprecated scaling of loss if ztSNR is enabled.
At the current time I believe you never actually want to use scale_v_prediction_loss_like_noise_prediction, as it was originally a "fix" for an already outdated and wrongly implemented version of MinSNR (which is fixed now). Tying it to ZSNR doesn't really make any sense. |
You're right. I guess a warning only will have to suffice. |
Difficult for me to say affirmatively. I'd prefer waiting for a second or third opinion on it.
I can remove the deprecation notice if you would like. |
Your plots are a bit confusing, I assume "Debiased estimation" is the one in this PR with v_prediction=False, "Debiased estimation vpred" is with v_prediction=True, so far good, but with "SNR" it gets unclear. If you use "Scale vpred like noise vpred" with "SNR weighted loss vpred", then "Scale vpred like noise vpred" definitely doesn't look right to me. The weighting shouldn't decrease with timesteps increasing. The purpose of this change is to clamp debiased weighting between (0,1) to accomodate for v-prediction loss, which looks right. |
Sorry for the confusion. Each corresponds as follows.
As I understand it, |
Tried to do some math and plot this as well. I think it's important to see how each weighting strategy changes the effective SNR. However, in practice no one trains v-pred without ZSNR, it would make more sense to plot using ZSNR schedule. Notice that
If there's anything of note here, it's that Debiased+vpred WITH vpred-like loss look suspiciously alike to MinSNR+vpred at higher timesteps, which can suggest that Either way, discarding |
Thank you for the great diagrams and insight! I did not expect
It's true that they are for noise pred, so it can't be helped that they have no meaning with v-pred. @catboxanon |
Done. |
Considering that the original paper was published a long time ago, I thought it would be a good idea to refer to the cited paper for updates. |
I'm not exactly sure what do you mean by "color contamination", but if it's about weird color splotches, I do think this is rather strange. My theory is that they appear because of the new prediction target and schedule being wonky at first, and this will eventually go away with sufficient training. In fact, this model was trained on 300k samples using
I only found these loosely related papers, but looks like they don't focus on what this PR attempts to do at all. |
I've merged. Sorry for the delay. |
This PR:
[Feature Request] Update apply_debiased_estimation to work properly with v-prediction. #1058 (comment)
2) Adds a deprecation notice forscale_v_pred_loss_like_noise_pred
.For reference:
#934
https://github.com/TiankaiHang/Min-SNR-Diffusion-Training/blob/main/guided_diffusion/gaussian_diffusion.py#L864
Removed per discussion below.
cc @feffy380 @sdbds, let me know if I'm missing something.