Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation
Yuanhuiyi Lyu1, Chi-Kit Wong1, Chenfei Liao1, Lutao Jiang1, Xu Zheng1, Zexin Lu4, Linfeng Zhang2, Xuming Hu4
1The Hong Kong University of Science and Technology (Guangzhou)
2Shanghai Jiao Tong University
3The Hong Kong University of Science and Technology
4Huawei Hong Kong Research Center
- Clone the repository:
    git clone https://github.com/qc-ly/UiG
    cd UiG
- Create an environment:
    conda create -n UiG python==3.10 -y
    conda activate UiG
- Install the required packages:
    pip install -r requirements.txt
    pip install flash_attn==2.5.8 --no-build-isolation
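After installation, a quick sanity check can confirm that the active interpreter and the `flash_attn` package match the steps above. The following is a minimal sketch, not part of the repository: the function name `check_environment` is hypothetical, and it only tests whether a package is importable, not which version is installed.

```python
import importlib.util
import sys


def check_environment(required_python=(3, 10), packages=("flash_attn",)):
    """Return a dict mapping each check name to True or False.

    Hypothetical helper: `required_python` mirrors the conda command above,
    and `packages` lists top-level module names to look for.
    """
    results = {
        # Compare only major.minor, e.g. (3, 10)
        "python": sys.version_info[:2] == tuple(required_python),
    }
    for name in packages:
        # find_spec returns None when a top-level package cannot be found
        results[name] = importlib.util.find_spec(name) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {'ok' if ok else 'missing or mismatched'}")
```

Running the script inside the `UiG` conda environment prints one line per check; a `missing or mismatched` line suggests repeating the corresponding installation step.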
- Follow the official instructions to download the BAGEL-7B-MoT checkpoint and save it to ./ckpts.
- Generate images from the prompts in ./prompts/test_prompt.txt:
    bash scripts/infer.sh
  For Slurm:
    bash scripts/infer_slurm.sh
- Generate images from input prompts:
    python infer.py --prompt_text "A larger person in yellow clothing is partially hidden by a smaller person in a different color."
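To run the single-prompt command over several custom prompts, a small wrapper can loop over a text file and call `infer.py` once per prompt. This is a sketch under stated assumptions rather than the repository's own tooling: it assumes the prompt file holds one prompt per line, and the helper names `load_prompts`, `build_command`, and `run_all` are hypothetical. Only the `--prompt_text` flag is taken from the command above. For the bundled prompts, `scripts/infer.sh` remains the documented route.

```python
import shlex
import subprocess
import sys
from pathlib import Path


def load_prompts(path):
    """Read non-empty lines from a text file (assumed: one prompt per line)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def build_command(prompt, script="infer.py"):
    """Build the argument list for one infer.py call with the current interpreter."""
    return [sys.executable, script, "--prompt_text", prompt]


def run_all(prompt_file="./prompts/test_prompt.txt", dry_run=True):
    """Print (dry run) or execute one infer.py call per prompt."""
    for prompt in load_prompts(prompt_file):
        cmd = build_command(prompt)
        if dry_run:
            # Show a shell-quoted version of the command without running it
            print(shlex.join(cmd))
        else:
            # check=True stops the loop on the first failing call
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Pass --run to actually execute; the default only prints the commands
    run_all(dry_run="--run" not in sys.argv)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or commas. Note that each call starts a new process, so the model is likely reloaded for every prompt; this is convenient for a handful of prompts but slow for large batches.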
We follow the official settings of TIIF-Bench and WISE-Bench to evaluate UiG.
The evaluation results are provided on Google Drive.
Our code is built on open-source projects. We thank their authors for their outstanding work and for making it open source!
If you find this repository useful, please consider giving it a star ⭐ and citing our paper:
@article{lyu2025understanding,
title={Understanding-in-Generation: Reinforcing Generative Capability of Unified Model via Infusing Understanding into Generation},
author={Lyu, Yuanhuiyi and Wong, Chi Kit and Liao, Chenfei and Jiang, Lutao and Zheng, Xu and Lu, Zexin and Zhang, Linfeng and Hu, Xuming},
journal={arXiv preprint arXiv:2509.18639},
year={2025}
}
If you have questions, suggestions, or bug reports, please email:
ryan.lyu.mail@gmail.com
