
Mode collapse is a serious problem in Bayesian GAN #8

Closed
htt210 opened this issue Mar 1, 2018 · 2 comments

htt210 commented Mar 1, 2018

Dear authors,
As can be seen from the generated samples in Figures 2, 6, 7, and 8, mode collapse is a serious problem in the Bayesian GAN. Every generator exhibits mode collapse, and different generators collapse to the same modes.
In Figure 6, for example, generators 1 and 4 both exhibit mode collapse, and they collapse to the same mode (row 2, col 3 of generator 1 and row 3, col 3 of generator 4). Under the birthday-paradox mode-counting method (Arora et al. 2017), if duplicates appear within a batch with high probability, the number of modes in the model distribution is at most roughly the square of the batch size. Here mode collapse is visible at a batch size of only 16, which implies that each generator captures on the order of 16^2 = 256 modes at most. The total capacity of 10 generators is therefore much smaller than that of a single generator trained in the standard way.
This contradicts your claim that the Bayesian GAN explores a broader region of the target distribution. In my opinion, the current setting of the Bayesian GAN makes mode collapse worse.

andrewgordonwilson (Owner) commented Mar 7, 2018

As can be seen in Figure 1, the Bayesian GAN is better at modelling multimodal distributions and avoiding mode collapse. Multiple generators also have higher capacity than a single generator and can represent a much broader region of the target distribution.

We weren't sure what you meant by multiple generators having less capacity than a single generator. When sampling from the target distribution, we would not take a single generator and sample from it many times. Rather, we would sample a generator from the posterior over generators, draw a sample from that generator, then sample another generator from the posterior, draw a sample from it, and so on. We do the former in some of the figures to show the differences between generators, and indeed there is some amount of mode collapse within each generator, which is in fact partly why it can be critically useful to represent a posterior over multiple generators.

As we say in the text (and show with our own figures, which you are referencing), the Bayesian GAN is not immune to mode collapse, particularly within an arbitrary generator sampled from the posterior, nor was it intended to be. Modifications, for example using the Wasserstein metric, can further help alleviate mode collapse within generators if that is desired. It may not be desired: mode collapse within some generators is acceptable if these generators are part of an uncountably infinite set of plausible generators.
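For concreteness, the sampling scheme described above might look like the following sketch. This is illustrative only, not the repository's actual API: the names and the PyTorch framing are assumptions, and `posterior_generators` stands in for a list of generator networks, one per Monte Carlo sample of the generator weights.

```python
import random

import torch


def sample_from_posterior(posterior_generators, num_samples, latent_dim=100):
    """Draw each image by first sampling a generator from the (approximate)
    posterior over generator weights, then sampling one image from it.

    posterior_generators: hypothetical list of generator networks, one per
    posterior weight sample.
    """
    images = []
    for _ in range(num_samples):
        g = random.choice(posterior_generators)  # sample a generator from the posterior
        z = torch.randn(1, latent_dim)           # sample a latent code
        images.append(g(z))                      # draw one image from this generator
    return torch.cat(images, dim=0)
```

Mixing over generators in this way is what gives the posterior ensemble its coverage, even when any single generator, sampled repeatedly on its own, shows some collapse.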

htt210 (Author) commented Mar 7, 2018

Hi Andrew,
As can be seen in Figure 7, all of your generators exhibit mode collapse. By the birthday paradox, if there are n modes in the model distribution and we randomly select sqrt(n) samples, the probability that at least two samples belong to the same mode is about 0.5 (Arora et al. 2017). For your generators, mode collapse is already observed when 64 images are generated, which implies the number of modes in each generator is less than 64 * 64 = 4096 (and since the observed collision probability appears to exceed 0.5, the actual number of modes is likely even smaller). The maximum number of modes that 10 generators can cover is therefore less than 4096 * 10 = 40960, which is much smaller than the capacity of generators trained with standard methods.
Also in Figure 7, we can see that the first and third generators collapse to the same modes. They do not model different regions of the target distribution, as claimed in the paper.
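To make the estimate concrete, here is a minimal sketch of the birthday-paradox test of Arora & Zhang. The `same_mode` oracle is a hypothetical stand-in for their actual procedure (flagging near-duplicate image pairs and verifying them visually); everything else follows directly from the argument above.

```python
def batch_has_collision(samples, same_mode):
    """True if any two samples in the batch fall in the same mode.
    same_mode is a hypothetical oracle; Arora & Zhang approximate it by
    finding near-duplicate image pairs and checking them by eye."""
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            if same_mode(samples[i], samples[j]):
                return True
    return False


def support_upper_bound(generate_batch, same_mode, batch_size=64, trials=20):
    """Birthday-paradox estimate: if batches of size s show a collision
    with probability >= 0.5, the support has roughly s**2 modes or fewer."""
    collisions = sum(batch_has_collision(generate_batch(batch_size), same_mode)
                     for _ in range(trials))
    if collisions / trials >= 0.5:
        return batch_size ** 2  # e.g. 64**2 = 4096 modes at most
    return None  # no evidence of collapse at this batch size
```

With batch_size = 64 and collisions in at least half the trials, this returns the 64^2 = 4096 upper bound used above.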

Arora, Sanjeev, and Yi Zhang. "Do GANs Actually Learn the Distribution? An Empirical Study." arXiv preprint arXiv:1706.08224 (2017). https://arxiv.org/abs/1706.08224
