eltociear/ViCo (forked from haoosz/ViCo)

Official PyTorch codes for the paper: "ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation"


ViCo

arXiv License

[teaser image]

⏳ To Do

  • Release inference code
  • Release pretrained models
  • Release training code
  • Hugging Face demo

⚙️ Set-up

Create a conda environment named vico and activate it:

conda env create -f environment.yaml
conda activate vico

⏬ Download

Download the pretrained stable diffusion v1-4 under models/ldm/stable-diffusion-v1.

We provide pretrained checkpoints at 300, 350, and 400 steps for 8 objects. You can download the sample images and their corresponding pretrained checkpoints, or download the data for any single object:

| Object | Sample images | Checkpoints |
| --- | --- | --- |
| barn | image | ckpt |
| batman | image | ckpt |
| clock | image | ckpt |
| dog7 | image | ckpt |
| monster toy | image | ckpt |
| pink sunglasses | image | ckpt |
| teddybear | image | ckpt |
| wooden pot | image | ckpt |

🚀 Inference

Before running the inference command, please set:

  • REF_IMAGE_PATH: Path of the reference image. It can be any image in the samples like batman/1.jpg.
  • CHECKPOINT_PATH: Path of the checkpoint weights. It should contain a subfolder with files like checkpoints/*-399.pt.
  • OUTPUT_PATH: Path of the generated images. For example, it can be like outputs/batman.
python scripts/vico_txt2img.py \
--ddim_eta 0.0  --n_samples 4  --n_iter 2  --scale 7.5  --ddim_steps 50  \
--ckpt_path models/ldm/stable-diffusion-v1/sd-v1-4.ckpt  \
--image_path REF_IMAGE_PATH \
--ft_path CHECKPOINT_PATH \
--load_step 399 \
--prompt "a photo of * on the beach" \
--outdir OUTPUT_PATH

You can specify load_step (300, 350, 400) and personalize the prompt (a prefix "a photo of" usually yields better results).
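As a minimal sketch, here is one way the placeholders above might be filled in for the batman sample. The exact paths (where you unpacked the samples and checkpoints) are assumptions about your local layout, not paths fixed by the repo; adjust them to match your downloads.

```shell
# Illustrative values only -- these paths assume you unpacked the batman
# sample images and checkpoints into these local folders.
REF_IMAGE_PATH="batman/1.jpg"        # any reference image from the samples
CHECKPOINT_PATH="ckpt/batman"        # folder containing checkpoints/*-399.pt
OUTPUT_PATH="outputs/batman"         # generated images are written here

# Assemble the inference command from the README's flags.
CMD="python scripts/vico_txt2img.py \
  --ddim_eta 0.0 --n_samples 4 --n_iter 2 --scale 7.5 --ddim_steps 50 \
  --ckpt_path models/ldm/stable-diffusion-v1/sd-v1-4.ckpt \
  --image_path $REF_IMAGE_PATH \
  --ft_path $CHECKPOINT_PATH \
  --load_step 399 \
  --prompt 'a photo of * on the beach' \
  --outdir $OUTPUT_PATH"
echo "$CMD"
```

Note that --load_step 399 in the command matches the *-399.pt checkpoint file naming.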

💻 Training

Coming soon!

📖 Citation

If you use this code in your research, please consider citing our paper:

@inproceedings{Hao2023ViCo,
  title={ViCo: Detail-Preserving Visual Condition for Personalized Text-to-Image Generation},
  author={Shaozhe Hao and Kai Han and Shihao Zhao and Kwan-Yee K. Wong},
  year={2023}
}
