Inquiry on some details of the method. #8
Comments
Hi, yes, we generate all tokens in one diffusion step. We use DDIM sampling to predict […]. Besides, the corresponding predicted […]. Hope this helps. If you have more questions, please feel free to contact me.
Yes, that's right. DDIM sampling helps trade off speed against generation quality. And predicting […].
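The speed/quality trade-off mentioned above comes from running the reverse process on a subsequence of timesteps rather than all of them. Below is a minimal, hypothetical sketch of that idea: `ddim_timesteps` picks an evenly spaced descending subsequence of the full chain, and `sample` applies a stand-in `reverse_step` only at those steps. The function names and the toy reverse step are illustrative assumptions, not DiffusionBERT's actual code.

```python
def ddim_timesteps(T, S):
    # Evenly spaced descending subsequence of the T-step chain,
    # in the spirit of DDIM's accelerated sampler (sketch only).
    stride = max(T // S, 1)
    return list(range(T - 1, -1, -stride))[:S]

def sample(x_start, reverse_step, T=1000, S=20):
    # Run the reverse (denoising) process only at the chosen
    # timesteps; fewer steps = faster but typically lower quality.
    x = x_start
    for t in ddim_timesteps(T, S):
        x = reverse_step(x, t)
    return x
```

With S = T this degenerates to the full reverse chain; shrinking S is what buys the speed-up.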
Hi, in fact, […]. Since a masked token loses all its information, the expected information loss of the i-th token at […]. Hope this helps.
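One common way to quantify "the information a token carries" is its unigram surprisal, -log p(w), estimated from corpus frequencies. The sketch below uses that proxy and orders positions so that frequent, low-information tokens come first, matching the "easy-first" intuition; this is an illustrative assumption, not the paper's exact spindle schedule (`unigram_information` and `mask_order` are hypothetical names).

```python
import math
from collections import Counter

def unigram_information(corpus_tokens):
    # Surprisal -log p(w) per token type from unigram frequencies;
    # a simple proxy for how much information a token carries.
    counts = Counter(corpus_tokens)
    total = sum(counts.values())
    return {w: -math.log(c / total) for w, c in counts.items()}

def mask_order(sentence, info):
    # Order positions from low to high information, so informative
    # (rare) tokens are kept longest during the forward process.
    return sorted(range(len(sentence)), key=lambda i: info[sentence[i]])
```

Frequent function words like "the" get low surprisal and are masked early under this ordering, while rare content words survive longer.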
Hi @leekum2018, you can refer to this: https://openreview.net/forum?id=h7-XixPCAL&noteId=xm7onR_Sg0L Hope it helps!
As said in the second paragraph of Section 4.3, "We attribute the superior performance of DiffusionBERT to its onetime sampling of all tokens". I wonder what "onetime sampling of all tokens" means: does it mean generating all the tokens in a sentence at once? If so, it seems to conflict with the demonstration in Table 1. Thank you!