[📄Paper] [🚩Project Page]
We propose TheaterGen, a tuning-free method for consistent multi-turn image generation. The key idea is to use an LLM for character management, tracking each character's layout and ID, and to customize each character to avoid attention leakage. We further propose CMIGBench, a benchmark for evaluating consistency in multi-turn image generation.
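To make the character-management idea concrete, here is a minimal sketch of how per-character IDs and layouts could be tracked across dialogue turns. It is not the code in this repository: the `Character` and `CharacterBank` classes, the `add`/`move`/`remove` operation names, and the `(x, y, w, h)` box format are all hypothetical, and the per-turn edits that an LLM would produce are hard-coded here.

```python
from dataclasses import dataclass, field


@dataclass
class Character:
    # Hypothetical record for one character; field names are assumptions.
    char_id: int
    description: str
    box: tuple  # normalized layout box as (x, y, w, h)


@dataclass
class CharacterBank:
    # Maps a stable character ID to its current record.
    characters: dict = field(default_factory=dict)

    def apply_turn(self, edits):
        """Apply one turn of edits and return the resulting layout."""
        for edit in edits:
            op, cid = edit["op"], edit["id"]
            if op == "add":
                self.characters[cid] = Character(
                    cid, edit["description"], tuple(edit["box"])
                )
            elif op == "move":
                # Only the layout changes; ID and description stay fixed.
                self.characters[cid].box = tuple(edit["box"])
            elif op == "remove":
                self.characters.pop(cid, None)
            else:
                raise ValueError(f"unknown op: {op}")
        return self.layout()

    def layout(self):
        """Return a fresh {id: box} mapping, ordered by character ID."""
        return {cid: c.box for cid, c in sorted(self.characters.items())}


# Stand-in for LLM output over three turns (hard-coded for illustration).
bank = CharacterBank()
turn1 = bank.apply_turn([
    {"op": "add", "id": 1, "description": "a white cat", "box": (0.1, 0.5, 0.3, 0.4)},
    {"op": "add", "id": 2, "description": "a brown dog", "box": (0.6, 0.5, 0.3, 0.4)},
])
turn2 = bank.apply_turn([{"op": "move", "id": 1, "box": (0.4, 0.5, 0.3, 0.4)}])
turn3 = bank.apply_turn([{"op": "remove", "id": 2}])
```

Because each character keeps the same ID and description from turn to turn, a later turn can change its layout without redefining its appearance. Each character could then be generated separately from its own description and placed into its box.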
- Deployment with GPT interface
- Release Benchmark
- Release code
- [2024.04.26] We have released our code and benchmark.
To install requirements:
pip install -r requirements.txt
Generate with CMIGBench, or replace it with your own demo:
python generate.py --task story --sd_version '1.5' --dataset_path CMIGBench
If you have any questions, please feel free to email us at howe4884@outlook.com.
Our work is based on Stable Diffusion, Grounded-SAM, T2I-Adapter, and IP-Adapter. We appreciate their outstanding contributions.