Comparison experiments between different loss functions and weight regularization techniques. For these experiments I chose the DCGAN architecture.
Comparisons (sketched in code below):
- BCE loss vs. W-loss
- Gradient penalty (GP) vs. Spectral norm (SN)
- Transposed Conv (Deconv) vs. Upsampling + Conv (for intermediate layers)
- BatchNorm (BN) vs. No BatchNorm
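For reference, here is a minimal PyTorch sketch of the compared pieces, not the repository's actual training code: the BCE and Wasserstein loss formulations, a WGAN-GP gradient penalty term, and wrapping a critic layer with spectral norm. The layer shapes and the critic argument are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

# --- BCE (standard GAN) vs. Wasserstein losses ---
bce = nn.BCEWithLogitsLoss()

def d_loss_bce(real_logits, fake_logits):
    # Discriminator: push real logits towards 1 and fake logits towards 0
    return bce(real_logits, torch.ones_like(real_logits)) + \
           bce(fake_logits, torch.zeros_like(fake_logits))

def g_loss_bce(fake_logits):
    # Generator: make the discriminator output 1 for fakes
    return bce(fake_logits, torch.ones_like(fake_logits))

def d_loss_w(real_scores, fake_scores):
    # Critic: maximise E[D(real)] - E[D(fake)] (minimise its negative)
    return fake_scores.mean() - real_scores.mean()

def g_loss_w(fake_scores):
    return -fake_scores.mean()

# --- Gradient penalty (WGAN-GP) ---
def gradient_penalty(critic, real, fake, device="cpu"):
    # Penalise the critic's gradient norm on points interpolated between
    # real and fake images, pushing the critic towards being 1-Lipschitz
    alpha = torch.rand(real.size(0), 1, 1, 1, device=device)
    mixed = alpha * real.detach() + (1 - alpha) * fake.detach()
    mixed.requires_grad_(True)
    scores = critic(mixed)
    grads = torch.autograd.grad(
        outputs=scores, inputs=mixed,
        grad_outputs=torch.ones_like(scores),
        create_graph=True, retain_graph=True,
    )[0]
    grad_norm = grads.view(grads.size(0), -1).norm(2, dim=1)
    return ((grad_norm - 1) ** 2).mean()

# --- Spectral norm: wrap critic layers instead of adding a penalty term ---
sn_conv = spectral_norm(nn.Conv2d(64, 128, kernel_size=4, stride=2, padding=1))
```

With W-loss, the gradient penalty is added to the critic loss as an extra term (typically scaled by a coefficient such as 10), while spectral norm replaces it by constraining the critic's weights directly.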
Every model was trained for 50 epochs on the Simpson Faces dataset from Kaggle; more precisely, on a slightly cleaned version of its cropped data.
❕ The experiments were not meant to produce the best-looking generated images, but rather to show the effects of different training stabilization techniques. Feel free to play around with model architectures, hyperparameters and the number of epochs to achieve better results.
❕❕ Nor were the results meant to estimate the general "goodness" of any tested combination of techniques. One should make decisions according to their specific case.
- Clone the repository
git clone https://github.com/ivankunyankin/gan.git
cd gan
- Create an environment and install the dependencies
python3 -m venv env
source env/bin/activate
pip3 install -r requirements.txt
- cd into the directory of the model you want to train
In order to start training you need to run:
python3 train.py
Add the --upsample flag if you want to train an upsampling + conv generator.
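As a rough sketch of what the flag toggles (hypothetical code, not the repository's actual generator), a single DCGAN-style upsampling block in PyTorch could look like this:

```python
import torch.nn as nn

def up_block(in_ch, out_ch, upsample=False):
    # One generator block that doubles the spatial resolution.
    # upsample=False: transposed convolution (deconv), as in the original DCGAN
    # upsample=True:  nearest-neighbour upsampling followed by a regular conv,
    #                 which tends to avoid checkerboard artifacts
    if upsample:
        resize = [
            nn.Upsample(scale_factor=2, mode="nearest"),
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=1, padding=1),
        ]
    else:
        resize = [nn.ConvTranspose2d(in_ch, out_ch, kernel_size=4, stride=2, padding=1)]
    return nn.Sequential(*resize, nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))
```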
You can play around with the hyperparameter values. You will find them in the corresponding config.yml
You can watch the training process and see the intermediate results in tensorboard. Run the following:
tensorboard --logdir logs/
In the images below you can see that, when using BCE as the loss function, removing BatchNorm led to mode collapse. This was not the case when using Upsampling + Conv instead of Deconv layers. Changing the way we increase the intermediate image size can indeed help with the low-level artifacts inherent to transposed convolutions. But most importantly, take a look at the colors: they are much less saturated with BCE than with W-loss.
With W-loss the colors are much better than with BCE loss. Interestingly, BatchNorm didn't improve image quality. As for the techniques for increasing the image size, they showed comparable results. Deconv produced slightly more head-like shapes, but the difference is not significant, nor are the results trustworthy enough to say that one is better than the other.
Overall, the experiments with spectral norm were not successful. That may be because this approach is more sensitive to the model architecture and hyperparameters during training, given that it is a stricter weight normalization technique than gradient penalty.