This is an unofficial PyTorch implementation of Image Super-Resolution via Iterative Refinement (SR3).
Some parts are implemented from the paper description and may differ from the actual SR3 architecture, since some details are missing from the paper.
- We used the ResNet block and channel-concatenation style from vanilla DDPM.
- We used the attention mechanism on the low-resolution features (16×16), as in vanilla DDPM.
- We encode gamma with the FiLM structure used in WaveGrad, and embed it without an affine transformation.
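As a rough sketch of the noise-level conditioning described above, a WaveGrad-style sinusoidal encoding of the continuous noise level might look like the following (the embedding width and frequency scaling here are assumptions for illustration, not this repo's exact values):

```python
import math
import torch


def gamma_encoding(gamma: torch.Tensor, dim: int = 64) -> torch.Tensor:
    """Sinusoidal encoding of a batch of continuous noise levels `gamma`.

    `dim` is a hypothetical embedding width. The result is fed to each
    block directly, without an extra affine (scale/shift) projection.
    """
    half = dim // 2
    # Log-spaced frequencies, as in the standard transformer position encoding.
    freqs = torch.exp(
        -math.log(1e4) * torch.arange(half, dtype=torch.float32) / half
    )
    # Scale the noise level before applying sin/cos (scale factor assumed).
    args = gamma[:, None] * freqs[None, :] * 100.0
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)
```

The encoding is computed per sample, so a batch of noise levels of shape `(B,)` yields an embedding of shape `(B, dim)`.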
- 16×16 -> 128×128 on FFHQ-CelebaHQ
- 64×64 -> 512×512 on FFHQ-CelebaHQ
- 64×64 -> 256×256 on ImageNet
- 1024×1024 face generation by a cascade of 3 models
- log/logger
- metrics evaluation
- resume training
- multi-gpu support
We currently set the maximum number of reverse diffusion steps to 2000.
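For intuition, the step budget is simply the length of the beta schedule walked by the reverse process. A minimal DDPM-style reverse loop (a hedged sketch, not this repo's actual sampler):

```python
import torch


@torch.no_grad()
def p_sample_loop(denoise_fn, x_t, betas):
    """Minimal DDPM reverse loop sketch.

    `denoise_fn(x, t)` predicts the noise at step t; `betas` has one entry
    per reverse step, so `len(betas)` is the step budget (e.g. 2000).
    """
    alphas = 1.0 - betas
    alphas_cum = torch.cumprod(alphas, dim=0)
    for t in reversed(range(len(betas))):
        eps = denoise_fn(x_t, t)
        # Posterior mean of x_{t-1} given the predicted noise.
        coef = betas[t] / torch.sqrt(1.0 - alphas_cum[t])
        mean = (x_t - coef * eps) / torch.sqrt(alphas[t])
        # Add noise on every step except the last.
        noise = torch.randn_like(x_t) if t > 0 else torch.zeros_like(x_t)
        x_t = mean + torch.sqrt(betas[t]) * noise
    return x_t
```

A larger budget trades sampling time for fidelity; 2000 steps is the current ceiling here.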
Tasks/Metrics | SSIM(+) | PSNR(+) | FID(-) | IS(+) |
---|---|---|---|---|
16×16 -> 128×128 | 0.675 | 23.26 | - | - |
64×64 -> 512×512 | - | - | - | - |
1024×1024 | - | - | - | - |
- 16×16 -> 128×128 on FFHQ-CelebaHQ [More Results]
# Resize to get 16×16 LR_IMGS and 128×128 HR_IMGS, then prepare 128×128 Fake SR_IMGS by bicubic interpolation
python prepare.py --path [dataset root] --out [output root] --size 16,128 -l
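Under the hood, this preparation amounts to bicubic resizing. A minimal sketch with Pillow (the function name and layout are illustrative, not `prepare.py`'s actual internals):

```python
from PIL import Image


def make_lr_hr_sr(img_path, lr_size=16, hr_size=128):
    """Create LR, HR, and bicubic-upsampled fake SR versions of one image."""
    hr = Image.open(img_path).convert("RGB").resize((hr_size, hr_size), Image.BICUBIC)
    lr = hr.resize((lr_size, lr_size), Image.BICUBIC)
    # "Fake SR" is just the LR image upsampled back to HR size by bicubic.
    sr = lr.resize((hr_size, hr_size), Image.BICUBIC)
    return lr, hr, sr
```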
Tasks | Google Drive | Aliyun Drive |
---|---|---|
16×16 -> 128×128 on FFHQ-CelebaHQ | https://drive.google.com/drive/folders/12jh0K8XoM1FqpeByXvugHHAF3oAZ8KRu?usp=sharing | https://www.aliyundrive.com/s/EJXxgxqKy9z |
# Download the pretrained model and edit basic_ddpm.json to set "resume_state":
"resume_state": [your pretrained model path]
Due to time constraints, we have not trained the model to convergence, so there is still plenty of room for optimization.
# Edit basic_sr3.json to adjust the network structure and hyperparameters
python run.py -p train -c config/basic_sr3.json
# Edit basic_sr3.json to add the pretrained model path
python run.py -p val -c config/basic_sr3.json
# Quantitative evaluation using SSIM/PSNR metrics on a given dataset root
python eval.py -p [dataset root]
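For reference, PSNR can be computed in a few lines of NumPy (a hedged sketch, not `eval.py`'s actual implementation; SSIM requires a windowed computation, e.g. scikit-image's `structural_similarity`):

```python
import numpy as np


def psnr(img1: np.ndarray, img2: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio, in dB, between two images in [0, max_val]."""
    mse = np.mean((img1.astype(np.float64) - img2.astype(np.float64)) ** 2)
    if mse == 0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Higher PSNR (and SSIM closer to 1) means the super-resolved image is closer to the ground-truth HR image.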
Our work is based on the following theoretical works:
- Denoising Diffusion Probabilistic Models
- Image Super-Resolution via Iterative Refinement
- WaveGrad: Estimating Gradients for Waveform Generation
- Large Scale GAN Training for High Fidelity Natural Image Synthesis
and we benefited a lot from the following projects: