RealCustom Series

📖 Introduction

Existing text-to-image customization methods (i.e., subject-driven generation) face a fundamental challenge due to the entangled influence of visual and textual conditions. This inherent conflict forces a trade-off between subject fidelity and textual controllability, preventing simultaneous optimization of both objectives. We present RealCustom to disentangle subject similarity from text controllability, thereby allowing both to be optimized simultaneously without conflict. The core idea of RealCustom is to represent given subjects as real words that can be seamlessly integrated with given texts, and to further leverage the relevance between real words and image regions to disentangle the visual condition from the text condition.
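As a rough illustration of the idea above, the toy sketch below turns per-region relevance scores for a real word into a binary mask, then applies the visual condition only inside that mask. It is not the code in this repository: `influence_mask`, `blend`, and `keep_ratio` are hypothetical names, the top-k selection rule is an assumption, and the flat lists stand in for real attention maps and feature tensors.

```python
from typing import List


def influence_mask(relevance: List[float], keep_ratio: float) -> List[int]:
    """Return a 0/1 mask marking the regions most relevant to the target word.

    Assumption: relevance[i] is a score (e.g. from cross-attention) between
    the target real word and image region i. The top-k rule is illustrative.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    k = max(1, round(keep_ratio * len(relevance)))
    # Indices of the k highest relevance scores.
    top = sorted(range(len(relevance)), key=lambda i: relevance[i], reverse=True)[:k]
    mask = [0] * len(relevance)
    for i in top:
        mask[i] = 1
    return mask


def blend(text_feat: List[float], visual_feat: List[float], mask: List[int]) -> List[float]:
    """Use visually conditioned features inside the mask, text-only features elsewhere."""
    return [v if m else t for t, v, m in zip(text_feat, visual_feat, mask)]


# Toy example: four image regions and their relevance to the word "dog".
relevance = [0.05, 0.60, 0.30, 0.05]
mask = influence_mask(relevance, keep_ratio=0.5)
blended = blend([1.0, 1.0, 1.0, 1.0], [9.0, 9.0, 9.0, 9.0], mask)
```

In this toy setup only the two regions most relevant to "dog" receive the subject's visual features, while the remaining regions stay purely text-controlled.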

🔥 News

[10/2025] 🔥 RealCustom++ is accepted by T-PAMI.
[04/2025] 🔥 We release our new customization framework UNO.
[04/2025] 🔥 The code and model of RealCustom are released.

⚡️ Quick Start

🔧 Requirements and Installation

Install the requirements

bash envs/init.sh

🎭 Download Models

You can download all the models from Hugging Face and put them in ckpts/.
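Before running inference, it can help to confirm that the checkpoints actually landed in ckpts/. The helper below is a minimal sketch using only the Python standard library. `list_checkpoint_files` is a hypothetical name and not part of this repository, and it makes no assumption about which files the models contain.

```python
from pathlib import Path
from typing import List


def list_checkpoint_files(ckpt_dir: str = "ckpts") -> List[str]:
    """Return the relative paths of all files under ckpt_dir, sorted.

    Raises FileNotFoundError if the directory does not exist, which usually
    means the models have not been downloaded yet.
    """
    root = Path(ckpt_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{ckpt_dir}/ not found; download the models first")
    # Walk the directory tree and keep regular files only.
    return sorted(p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file())
```

Calling `list_checkpoint_files()` from the repository root prints nothing by itself; an empty list or a `FileNotFoundError` suggests the download step still needs to be completed.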

✍️ Inference

Single-image inference

bash inference/inference_single_image.sh

Batch-image inference

bash inference/inference_batch_images.sh

🌟 Gradio Demo

python inference/app.py

🖌️ Multi-Round Generation

Our real-word paradigm naturally supports multi-round generation, where the output from each round serves as the reference subject image for the next. This enables flexible customization in each round by specifying different target real words. For example, in the first row, the initial round uses "dog" as the target word, preserving only the dog's characteristics. In the second round, the target word "dog with the pink hat" incorporates the pink hat generated in the previous round, allowing RealCustom++ to retain both features. This demonstrates the strong generalization capability of RealCustom++, enabling the progressive accumulation and preservation of subject characteristics across multiple rounds.
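The multi-round loop described above can be sketched as follows. This is only an illustration of the control flow: `generate` is a hypothetical callable standing in for whatever inference entry point you use (it is not an API of this repository), and the stub used here returns strings in place of images.

```python
from typing import Callable, List, Tuple, TypeVar

Image = TypeVar("Image")


def multi_round(
    generate: Callable[[Image, str, str], Image],
    reference: Image,
    rounds: List[Tuple[str, str]],
) -> List[Image]:
    """Run several customization rounds, feeding each output into the next.

    Each entry of `rounds` is (prompt, target_word). The output of round i
    becomes the reference subject image of round i + 1.
    """
    outputs: List[Image] = []
    for prompt, target_word in rounds:
        reference = generate(reference, prompt, target_word)
        outputs.append(reference)
    return outputs


def fake_generate(ref: str, prompt: str, target_word: str) -> str:
    """Stub generator (assumption): records which reference and word were used."""
    return f"{ref}>{target_word}"


results = multi_round(
    fake_generate,
    "img0",
    [
        ("a dog wearing a pink hat", "dog"),
        ("a dog with the pink hat on a beach", "dog with the pink hat"),
    ],
)
```

The second round receives the first round's output as its reference, so a broader target word such as "dog with the pink hat" can carry over features that were introduced in the previous round.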

🎨 Enjoy on Dreamina

RealCustom was previously deployed commercially in Dreamina and Doubao at ByteDance. You can also try the more advanced customization algorithm in Dreamina!

Step 1: Create A Character:

Create a character image and a corresponding appearance description through prompt descriptions or uploaded reference images. Specifically:
1. Character Image: works best with a clean background, a close-up view, a prominent subject, and high resolution.
2. Character Description: keep it brief, covering the subject and its key appearance elements.

Step 2: Character-Driven Generation:

Enter a prompt in which the subject is replaced by the selected character. The prompt then guides changes to the character, such as style, actions, expressions, scenes, and modifiers. There is no need to describe the subject itself in the prompt. "Face Reference Strength" is the weight for ID retention, and "Body Reference Strength" is the weight for IP retention.

Citation

If you find this project useful for your research, please consider citing our papers:

@inproceedings{huang2024realcustom,
  title={RealCustom: narrowing real text word for real-time open-domain text-to-image customization},
  author={Huang, Mengqi and Mao, Zhendong and Liu, Mingcong and He, Qian and Zhang, Yongdong},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={7476--7485},
  year={2024}
}
@ARTICLE{11206511,
  author={Mao, Zhendong and Huang, Mengqi and Ding, Fei and Liu, Mingcong and He, Qian and Zhang, Yongdong},
  journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, 
  title={RealCustom++: Representing Images as Real Textual Word for Real-Time Customization}, 
  year={2025},
  volume={},
  number={},
  pages={1-18},
  doi={10.1109/TPAMI.2025.3623025}
}
@article{wu2025less,
  title={Less-to-More Generalization: Unlocking More Controllability by In-Context Generation},
  author={Wu, Shaojin and Huang, Mengqi and Wu, Wenxu and Cheng, Yufeng and Ding, Fei and He, Qian},
  journal={arXiv preprint arXiv:2504.02160},
  year={2025}
}
