This toy project studies various generative models applied to small digit images, with the goal of gaining a better understanding of the different model types.
For simplicity, the project focuses on small digit images (resized to 12x12) from the MNIST dataset. Some models use a flattened version of the digit images for linear layers or 1D-domain applications.
Each generative model has its own directory, which contains the model's main file. You can select the model type, the mode (train or inference), and the submodel using command-line arguments. Replace [model], [mode], and [submodels] with the appropriate values as described in each model section.

```shell
python . [model] [mode] --model [submodels]
```

Model files are saved in the result directory, and the default file name is model. You can change the file name for saving or loading models with the --save_file and --load_file arguments:

```shell
python . [model] [mode] --model [submodels] --save_file [file_name] --load_file [file_name]
```

The following generative models have been implemented, covering concepts such as autoregressive, flow-based, and diffusion modeling. Each model is designed with either linear or convolutional layers to suit its application domain, whether the data is represented in one or two dimensions.
Autoregressive models generate data sequentially by learning the probability distribution of each element based on the preceding elements.
- model: arm
- modes: train or inference
- submodels: cnn1d, pixelcnn
Result images: original digits, Dilated 1D Convolution model samples, and PixelCNN model samples.
- An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
- Predictive Sampling with Forecasting Autoregressive Models
- WaveNet: A Generative Model for Raw Audio
- Pixel Recurrent Neural Networks
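The sequential sampling shared by these models can be sketched in plain Python. Here `sample_autoregressive` and `toy_p` are hypothetical stand-ins: `toy_p` replaces the trained network's conditional distribution over the next pixel.

```python
import random

def sample_autoregressive(length, predict_p, seed=0):
    """Generate a binary sequence pixel by pixel.

    predict_p(prefix) returns P(next pixel = 1 | prefix) -- a stand-in
    for the trained model's learned conditional distribution.
    """
    rng = random.Random(seed)
    seq = []
    for _ in range(length):
        p = predict_p(seq)             # condition on everything generated so far
        seq.append(1 if rng.random() < p else 0)
    return seq

# Toy conditional: the next pixel tends to repeat the previous one.
def toy_p(prefix):
    if not prefix:
        return 0.5
    return 0.8 if prefix[-1] == 1 else 0.2

digit = sample_autoregressive(12 * 12, toy_p)  # one flattened 12x12 digit
```

In the real models, `predict_p` is implemented by a dilated 1D convolution or a masked PixelCNN convolution so that each output only sees earlier pixels.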
Flow-based models learn an invertible transformation between the data distribution and a simple prior distribution, enabling efficient sampling and density estimation.
- model: flow
- modes: train or inference
- submodels: realnvp
Result images: original digits and RealNVP model samples.
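A minimal sketch of the affine coupling layer at the heart of RealNVP, applied to a flattened digit vector. The `scale_net` and `shift_net` functions below are toy stand-ins for the small learned networks the model actually uses.

```python
import math

def coupling_forward(x, scale_net, shift_net):
    """RealNVP-style affine coupling: the first half of the vector
    passes through unchanged; the second half is scaled and shifted
    by functions of the first half."""
    d = len(x) // 2
    x1, x2 = x[:d], x[d:]
    s, t = scale_net(x1), shift_net(x1)
    y2 = [x2[i] * math.exp(s[i]) + t[i] for i in range(d)]
    log_det = sum(s)  # log|det J| is just the sum of the log-scales
    return x1 + y2, log_det

def coupling_inverse(y, scale_net, shift_net):
    """Exact inverse: recompute s, t from the untouched half."""
    d = len(y) // 2
    y1, y2 = y[:d], y[d:]
    s, t = scale_net(y1), shift_net(y1)
    x2 = [(y2[i] - t[i]) * math.exp(-s[i]) for i in range(d)]
    return y1 + x2

# Toy stand-ins for the learned scale/shift networks.
scale_net = lambda h: [0.1 * v for v in h]
shift_net = lambda h: [v + 1.0 for v in h]

x = [0.2, -0.4, 0.9, 0.5]
y, log_det = coupling_forward(x, scale_net, shift_net)
x_back = coupling_inverse(y, scale_net, shift_net)
```

The cheap inverse and the triangular Jacobian (whose log-determinant is just `sum(s)`) are what make both sampling and exact density estimation efficient.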
GANs consist of a generator and a discriminator that are trained together in a two-player adversarial game.
- model: gan
- modes: train or inference
- submodels: lingan
Result images: original digits and samples from the GAN with linear layers.
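The two-player objective can be sketched as a pair of losses computed from the discriminator's outputs. This is a generic sketch using the common non-saturating generator loss, not necessarily the exact loss in this repo.

```python
import math

def gan_losses(d_real, d_fake, eps=1e-8):
    """d_real, d_fake: discriminator probabilities for a real and a
    generated sample. The discriminator wants d_real -> 1 and
    d_fake -> 0; the (non-saturating) generator wants d_fake -> 1."""
    d_loss = -(math.log(d_real + eps) + math.log(1.0 - d_fake + eps))
    g_loss = -math.log(d_fake + eps)
    return d_loss, g_loss

# At the theoretical equilibrium the discriminator outputs 0.5 everywhere:
d_loss, g_loss = gan_losses(0.5, 0.5)
```

In training, the two losses are minimized alternately: one step updates only the discriminator, the next updates only the generator.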
VAEs are generative models that learn a probabilistic mapping between data and latent spaces by optimizing the evidence lower bound (ELBO), providing an efficient way to learn complex data distributions.
- model: vae
- modes: train or inference
- submodels: mlp
Result images: original digits and samples from the VAE with linear layers.
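Two ELBO ingredients specific to the VAE, the reparameterization trick and the closed-form KL term for a diagonal Gaussian posterior against a standard normal prior, can be sketched as:

```python
import math
import random

def kl_diag_gaussian(mu, logvar):
    """KL(q(z|x) || N(0, I)) for a diagonal Gaussian in closed form:
    -0.5 * sum(1 + logvar - mu^2 - exp(logvar))."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))

def reparameterize(mu, logvar, rng):
    """z = mu + sigma * eps keeps the sample differentiable w.r.t.
    the encoder outputs mu and logvar."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]

rng = random.Random(0)
mu, logvar = [0.0, 0.0], [0.0, 0.0]
z = reparameterize(mu, logvar, rng)
kl = kl_diag_gaussian(mu, logvar)  # 0 when q already matches the prior
```

The full ELBO adds a reconstruction term (e.g. binary cross-entropy on the 12x12 pixels of the decoded digit) to this KL penalty.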
Diffusion models generate data through a sequence of Gaussian diffusion steps: a fixed forward process gradually adds noise to the data, and a learned reverse denoising process is trained to undo it by optimizing the ELBO.
- model: diffusion
- modes: train or inference
- submodels: naive-lin, ddm-unet
Result images: original digits, Naive Diffusion model (linear layers) samples, and Deep Denoising Diffusion model (U-Net) samples.
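The forward (noising) half of the process has a closed form that training samples from directly. A sketch with the usual linear beta schedule follows; the schedule constants are illustrative assumptions, not necessarily the values used in this repo.

```python
import math
import random

def make_alpha_bar(T=1000, beta_start=1e-4, beta_end=0.02):
    """Linear beta schedule; alpha_bar[t] is the cumulative product
    of (1 - beta) up to step t."""
    alpha_bar, prod = [], 1.0
    for i in range(T):
        beta = beta_start + (beta_end - beta_start) * i / (T - 1)
        prod *= 1.0 - beta
        alpha_bar.append(prod)
    return alpha_bar

def q_sample(x0, t, alpha_bar, rng):
    """Sample x_t ~ q(x_t | x_0) in closed form:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    a = alpha_bar[t]
    return [math.sqrt(a) * x + math.sqrt(1.0 - a) * rng.gauss(0.0, 1.0)
            for x in x0]

rng = random.Random(0)
alpha_bar = make_alpha_bar()
x0 = [0.0] * (12 * 12)                 # a flattened 12x12 digit
xT = q_sample(x0, len(alpha_bar) - 1, alpha_bar, rng)
```

Because `alpha_bar` shrinks toward zero, the final `xT` is close to pure Gaussian noise; the learned reverse process (linear layers or the U-Net, depending on the submodel) walks back from that noise to a digit.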