Official PyTorch implementation of "Wavelet Diffusion Models are fast and scalable Image Generators"

WaveDiff is a novel wavelet-based diffusion structure that employs low- and high-frequency components of wavelet subbands from both image and feature levels. These are adaptively implemented to accelerate the sampling process while maintaining good generation quality. Experimental results on CelebA-HQ, CIFAR-10, LSUN-Church, and STL-10 datasets show that WaveDiff provides state-of-the-art training and inference speed, which serves as a stepping-stone to offering real-time and high-fidelity diffusion models.

Details of the model architecture and experimental results can be found in our paper:

@article{hao2022wavelet,
  title={Wavelet Diffusion Models are fast and scalable Image Generators},
  author={Hao Phung and Quan Dao and Anh Tran},
  journal={arXiv preprint arXiv:<submit_number>},
  year={2022}
}

Please CITE our paper whenever this repository is used to help produce published results or incorporated into other software.

Installation

The latest PyTorch version is required.

Install the necessary libraries:

pip install -r requirements.txt

For pytorch_wavelets, please follow the installation instructions here.

Dataset preparation

We trained on four datasets: CIFAR10, STL10, LSUN Church Outdoor 256, and CelebA HQ (256 & 512).

CIFAR10 and STL10 are downloaded automatically on first execution.

For CelebA HQ (256) and LSUN, please check out here for dataset preparation.

For CelebA HQ (512), please download the data here and then generate an LMDB-format dataset with Torch Toolbox.

Once a dataset is downloaded, please put it in the data/ directory as follows:

data/
├── STL-10
├── celeba
├── celeba_512
├── cifar-10
└── lsun
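As a quick sanity check before training, the layout above can be verified with a few lines of Python. This is only a sketch: the helper name `missing_dataset_dirs` is hypothetical and not part of the repository, and the folder names are copied from the tree above. Since CIFAR10 and STL10 are downloaded automatically, their folders may legitimately be missing before the first run.

```python
from pathlib import Path

# Folder names copied from the directory tree above.
# The helper below is a hypothetical convenience, not part of the repository.
EXPECTED_DATASET_DIRS = ("STL-10", "celeba", "celeba_512", "cifar-10", "lsun")


def missing_dataset_dirs(data_root="data", expected=EXPECTED_DATASET_DIRS):
    """Return the expected dataset folders that are absent under data_root."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dataset_dirs()
    if missing:
        print("Missing dataset folders:", ", ".join(missing))
    else:
        print("All expected dataset folders are present.")
```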

How to run

We provide a bash script for our experiments on different datasets. The syntax is as follows:

bash run.sh <DATASET> <MODE> <#GPUS>

where:

<DATASET>: one of cifar10, stl10, celeba_256, celeba_512, or lsun.
<MODE>: train or test.
<#GPUS>: the number of GPUs (e.g. 1, 2, 4, 8).

Note: please set the --exp argument consistently for both train and test modes. All detailed configurations are set in run.sh.

GPU allocation: Our experiments were run on NVIDIA A100 40GB GPUs. For train mode, we use a single GPU for CIFAR10 and STL10, 2 GPUs for CelebA-HQ 256, 4 GPUs for LSUN, and 8 GPUs for CelebA-HQ 512. For test mode, only a single GPU is required for all experiments.
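For scripted launches, the command syntax and the GPU counts listed above can be captured in a small helper. The sketch below is illustrative only: `build_run_command` and the lookup tables are hypothetical names, and the default GPU counts simply mirror the allocation described in this README rather than anything enforced by run.sh.

```python
import shlex

# Values copied from this README; names are hypothetical, not from the repository.
DATASETS = ("cifar10", "stl10", "celeba_256", "celeba_512", "lsun")
MODES = ("train", "test")
TRAIN_GPUS = {"cifar10": 1, "stl10": 1, "celeba_256": 2, "lsun": 4, "celeba_512": 8}


def build_run_command(dataset, mode, num_gpus=None):
    """Build the argument list for `bash run.sh <DATASET> <MODE> <#GPUS>`."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if num_gpus is None:
        # Default to the GPU counts reported above; test mode needs one GPU.
        num_gpus = TRAIN_GPUS[dataset] if mode == "train" else 1
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return ["bash", "run.sh", dataset, mode, str(num_gpus)]


if __name__ == "__main__":
    # Prints: bash run.sh celeba_256 train 2
    print(shlex.join(build_run_command("celeba_256", "train")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Passing `num_gpus` explicitly overrides the default, which is useful on machines with fewer GPUs than were used in our experiments.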

Results

Model performance and pretrained checkpoints are provided below:

Model	FID	Recall	Time (s)	Checkpoints
CIFAR-10	4.01	0.55	0.08	netG_1300.pth
STL-10	12.93	0.41	0.38	netG_600.pth
CelebA-HQ (256 x 256)	5.94	0.37	0.79	netG_475.pth
CelebA-HQ (512 x 512)	6.40	0.35	0.59	netG_350.pth
LSUN Church	5.06	0.40	1.54	netG_400.pth

Inference time is computed over 300 trials on a single NVIDIA A100 GPU for a batch size of 100, except for high-resolution CelebA-HQ $(512 \times 512)$, where it is computed for a batch of 25 samples.
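A timing measurement of this kind can be sketched as follows, assuming the reported figure is the mean wall-clock time to generate one batch. The function `mean_batch_time` and its `warmup` and `sync` parameters are hypothetical and are not taken from the repository's test scripts; the warm-up runs in particular are an added assumption, not something stated above.

```python
import time


def mean_batch_time(generate_batch, trials=300, warmup=5, sync=None):
    """Average wall-clock seconds per call of generate_batch over `trials` runs.

    generate_batch: zero-argument callable that produces one batch of samples.
    warmup: untimed calls made first (an assumption, not from the README).
    sync: optional callable that blocks until pending device work is done,
          e.g. torch.cuda.synchronize when sampling on a GPU.
    """
    for _ in range(warmup):
        generate_batch()
    if sync is not None:
        sync()  # make sure warm-up work has finished before starting the clock
    start = time.perf_counter()
    for _ in range(trials):
        generate_batch()
    if sync is not None:
        sync()  # wait for queued work before stopping the clock
    return (time.perf_counter() - start) / trials
```

On a GPU, CUDA kernels are launched asynchronously, so without the `sync` call the measured time could reflect only the launch overhead rather than the actual sampling time.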

Downloaded pre-trained models should be put in the saved_info/wdd_gan/<DATASET>/<EXP> directory, where <DATASET> is defined in the How to run section and <EXP> corresponds to the folder name of the pre-trained checkpoints.
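To confirm that a downloaded checkpoint landed in the right place, the directory can be inspected with a short script. This sketch assumes the netG_<epoch>.pth files sit directly inside the <EXP> folder; the helper names and the example experiment name are hypothetical.

```python
import re
from pathlib import Path

# Matches checkpoint names such as netG_1300.pth (pattern taken from the table above).
CHECKPOINT_PATTERN = re.compile(r"netG_(\d+)\.pth")


def checkpoint_dir(dataset, exp, root="saved_info/wdd_gan"):
    """Return saved_info/wdd_gan/<DATASET>/<EXP> as a Path."""
    return Path(root) / dataset / exp


def find_checkpoints(dataset, exp, root="saved_info/wdd_gan"):
    """List netG_<epoch>.pth files in the checkpoint directory, sorted by epoch."""
    folder = checkpoint_dir(dataset, exp, root)
    found = []
    for path in folder.glob("netG_*.pth"):
        match = CHECKPOINT_PATTERN.fullmatch(path.name)
        if match:
            found.append((int(match.group(1)), path))
    return [path for _, path in sorted(found)]


if __name__ == "__main__":
    # "my_experiment" is a placeholder for the actual <EXP> folder name.
    for ckpt in find_checkpoints("cifar10", "my_experiment"):
        print(ckpt)
```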

Evaluation

Inference

Samples can be generated by calling run.sh with test mode.

FID

To compute the FID of a pretrained model at a specific epoch, add the arguments --compute_fid and --real_img_dir /path/to/real/images to the corresponding experiment in run.sh.
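For reference, FID is conventionally defined as the Fréchet distance between two Gaussians fitted to Inception features of real and generated images. The symbols below follow the standard definition and are not names from this repository: $\mu_r, \Sigma_r$ are the feature mean and covariance of the real images, and $\mu_g, \Sigma_g$ those of the generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.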

Recall

We adopt the official PyTorch implementation of StyleGAN2-ADA to compute the Recall of generated samples.

Acknowledgments

Thanks to Xiao et al. for releasing their official implementation of the DDGAN paper.

Contacts

If you have any problems, please open an issue in this repository or send an email to tienhaophung@gmail.com.
