Implementation of our PR 2020 paper: *Unsupervised Text-to-Image Synthesis*
In this paper, we proposed to train a text-to-image synthesis model in an unsupervised manner, without resorting to any paired image-text data. To the best of our knowledge, this is the first attempt to tackle the unsupervised text-to-image synthesis task.
## Getting Started

Requirements: Python 3.6+, PyTorch 1.2, torchvision 0.4, CUDA 10.0, and at least 3.8 GB of GPU memory. All code has been tested on Linux (CentOS 7); other platforms have not been tested yet.
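Before running anything, it can help to sanity-check the environment. The following is a minimal sketch based on the requirements listed above; the version numbers come from this README, and the exact pins may differ from the repo's own requirements file.

```python
# Minimal environment check (versions are assumptions from the README).
import sys

def check_environment(min_python=(3, 6)):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    try:
        import torch  # noqa: F401  (PyTorch 1.2 expected)
    except ImportError:
        problems.append("PyTorch is not installed")
    return problems

print(check_environment())
```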
- Download `pretrains` from OneDrive or BaiduPan (extraction code `5bx6`), move `pretrains.zip` to the `data` directory, and unzip it there.
- Download `assets` from OneDrive or BaiduPan (extraction code `5bx6`) and move the data to the `data` directory.
- Download MSCOCO from the COCO site and extract `train2014.zip` and `val2014.zip` to `data/coco/images`.
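The download steps above imply a particular `data/` layout. The snippet below sketches that layout as I read it from the steps (directory names are taken from this README; adjust if the repo expects something different) and creates the target directory for the COCO zips before extraction.

```python
# Create the expected data layout (names assumed from the download steps).
import os

for d in ("data", "data/coco/images"):
    os.makedirs(d, exist_ok=True)  # no-op if the directory already exists

print(os.path.isdir("data/coco/images"))  # → True
```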
To reproduce our model, follow the pipeline below.
- Train the Concept-to-Sentence model: `sh scripts/con2sen_train.sh`
- Construct pseudo image-text pairs: `sh scripts/con2sen_infer.sh`
- Train the DAMSM model: `sh scripts/DAMSM.sh`
- Train the Stage-I UT2I model (VCD): `sh scripts/vcd.sh`
- Train the Stage-II UT2I model (GSC): `sh scripts/gsc.sh`

Our model adopts the evaluation code from ObjGAN.
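The pseudo-pair construction step in the pipeline can be illustrated with a toy sketch: visual concepts detected in an image are fed to a concept-to-sentence model, and the generated sentence is paired with the image as a pseudo caption. The `concept_to_sentence` below is a crude template-based placeholder for illustration only, not the paper's trained model.

```python
# Toy sketch of pseudo image-text pair construction (illustrative only).
def concept_to_sentence(concepts):
    # Placeholder stand-in for the trained Concept-to-Sentence model:
    # joins detected concept labels into a crude caption.
    return "a photo of " + " and ".join(concepts)

def build_pseudo_pairs(detections):
    # detections: mapping image_id -> list of detected concept labels
    return {img: concept_to_sentence(cs) for img, cs in detections.items()}

pairs = build_pseudo_pairs({"img_001": ["dog", "frisbee"]})
print(pairs["img_001"])  # → a photo of dog and frisbee
```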