Personalized diffusion models have shown remarkable success in Text-to-Image (T2I) generation by enabling the injection of user-defined concepts into diverse contexts. However, balancing concept fidelity with contextual alignment remains a challenging open problem. In this work, we propose an RL-based approach that leverages the diverse outputs of T2I models to address this issue. Our method eliminates the need for human-annotated scores by generating a synthetic paired dataset for DPO-like training using external quality metrics. These better–worse pairs are specifically constructed to improve both concept fidelity and prompt adherence. Moreover, our approach supports flexible adjustment of the trade-off between image fidelity and textual alignment. Through multi-step training, our approach outperforms a naive baseline in convergence speed and output quality. We conduct extensive qualitative and quantitative analysis, demonstrating the effectiveness of our method across various architectures and fine-tuning techniques.
DreamBoothDPO leverages synthetic preference pairs and CLIP-based metrics to automate personalized generation, dynamically optimizing the trade-off between concept accuracy and prompt alignment through iterative multi-stage training.
- [27/05/2025] 🔥🔥🔥 DreamBoothDPO release. The paper is available on arXiv.
You need the following hardware and Python version to run our method.
- Linux
- NVIDIA GPU + CUDA CuDNN
- Conda 24.1.0+ or Python 3.11+
- Clone this repo:
git clone https://github.com/ControlGenAI/DreamBoothDPO.git
cd DreamBoothDPO
- Create Conda environment:
conda create -n dbdpo python=3.11
conda activate dbdpo
- Install the dependencies in your environment:
pip install -r requirements.txt
# 0.0 Download and extract COCO annotations
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip annotations_trainval2014.zip
# 0.1 Collect prompts from COCO.
python gen_prompts_from_coco.py <args...>
# 0.2 Merge COCO prompts with ChatGPT prompts.
python data/merge.py <args...>
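Conceptually, prompt collection amounts to reading the image captions from the downloaded COCO annotation file and keeping a manageable subset of them as generation prompts. Below is a minimal sketch of that idea using the standard COCO captions JSON layout; the filtering rule and output file name are illustrative, and the actual logic lives in `gen_prompts_from_coco.py`.

```python
import json
import random

# Standard COCO captions layout: {"annotations": [{"image_id": ..., "caption": ...}, ...]}
with open("annotations/captions_train2014.json") as f:
    coco = json.load(f)

captions = [ann["caption"].strip() for ann in coco["annotations"]]

# Illustrative filtering: keep reasonably short captions and sample a fixed number.
random.seed(0)
short = [c for c in captions if len(c.split()) <= 15]
prompts = random.sample(short, k=min(1000, len(short)))

# Hypothetical output file; the real script defines its own arguments and format.
with open("coco_prompts.txt", "w") as f:
    f.write("\n".join(prompts))
```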
# 1.1 Train the base personalized generation model (e.g., DreamBooth).
bash scripts/train_dreambooth.sh
# 1.2 Generate images for validation prompts with the base model.
bash scripts/generate_val.sh
# 1.3 Get CLIP scores to find the best checkpoint on the Pareto frontier for the base model.
bash scripts/evaluate_exp.sh
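The evaluation step scores each checkpoint along two axes: CLIP image similarity between generated images and the concept's reference images (CLIP-I), and CLIP similarity between a generated image and its prompt (CLIP-T). A checkpoint lies on the Pareto frontier if no other checkpoint beats it on both axes at once. The sketch below illustrates the idea with the Hugging Face CLIP API; the exact metrics and aggregation used by `scripts/evaluate_exp.sh` may differ.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def clip_scores(gen_image: Image.Image, ref_image: Image.Image, prompt: str):
    """Return (CLIP-I, CLIP-T): image-image and image-text cosine similarities."""
    img_inputs = processor(images=[gen_image, ref_image], return_tensors="pt")
    img_emb = model.get_image_features(**img_inputs)
    img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)

    txt_inputs = processor(text=[prompt], return_tensors="pt", padding=True)
    txt_emb = model.get_text_features(**txt_inputs)
    txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)

    clip_i = (img_emb[0] @ img_emb[1]).item()
    clip_t = (img_emb[0] @ txt_emb[0]).item()
    return clip_i, clip_t

def pareto_front(points):
    """Keep (clip_i, clip_t) points that no other point dominates on both axes."""
    return [p for p in points
            if not any(q[0] >= p[0] and q[1] >= p[1] and q != p for q in points)]
```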
# 2.1 Get a subsample of the prompts.
python data/subset.py <args...>
# 2.2 Generate images for the collected prompts.
bash scripts/generate_prompts*.sh
# 2.3 Get CLIP scores for generated samples.
bash scripts/evaluate_exp.sh
# 2.4 Collect pairs of generated samples based on score differences and angles.
bash scripts/collect_pairs.sh
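With the two CLIP metrics above, every generated sample is a point in the (CLIP-I, CLIP-T) plane. One plausible reading of "score differences and angles" is that a (better, worse) pair is kept only when the better sample improves on the worse one by a sufficient margin and the direction of that improvement falls inside a chosen angular range, which is what lets the fidelity/alignment trade-off be steered. The snippet below is a hypothetical illustration of such a rule, not the exact criteria implemented in `scripts/collect_pairs.sh`.

```python
import math
from itertools import permutations

def collect_pairs(samples, min_diff=0.05, angle_range=(20.0, 70.0)):
    """samples: list of (sample_id, clip_i, clip_t) for one prompt.

    Returns (winner_id, loser_id) pairs. Thresholds are illustrative placeholders:
    `min_diff` is the minimum norm of the score difference, and `angle_range`
    (in degrees, measured from the CLIP-I axis) bounds the improvement direction.
    """
    pairs = []
    for (id_w, i_w, t_w), (id_l, i_l, t_l) in permutations(samples, 2):
        di, dt = i_w - i_l, t_w - t_l
        if di < 0 or dt < 0:                 # the winner must not lose on either metric
            continue
        if math.hypot(di, dt) < min_diff:    # require a clear quality gap
            continue
        # 0 deg = pure fidelity gain, 90 deg = pure prompt-alignment gain.
        angle = math.degrees(math.atan2(dt, di))
        if angle_range[0] <= angle <= angle_range[1]:
            pairs.append((id_w, id_l))
    return pairs
```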
# 3 Train DPO on collected pairs.
bash scripts/train_ddpo_pairs*.sh
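The DPO stage follows the Diffusion-DPO formulation: for each (winner, loser) pair, the trainable model and a frozen reference model both predict the added noise, and the loss encourages the trainable model to lower its denoising error on the winner, relative to the loser, more than the reference model does. The function below is a minimal sketch of that objective, assuming standard latent-diffusion tensors; the actual training entry point, schedule, and hyper-parameters are in `scripts/train_ddpo_pairs*.sh`, and the `beta` value is illustrative.

```python
import torch
import torch.nn.functional as F

def diffusion_dpo_loss(noise_pred_w, noise_pred_l,
                       ref_pred_w, ref_pred_l,
                       noise_w, noise_l, beta=5000.0):
    """Diffusion-DPO objective for a batch of (winner, loser) noised latents.

    *_pred_w / *_pred_l are noise predictions for the preferred and rejected
    images at the same timestep; noise_w / noise_l are the true added noises.
    """
    # Per-sample MSE denoising errors for the trainable and reference models.
    err_w = F.mse_loss(noise_pred_w, noise_w, reduction="none").mean(dim=[1, 2, 3])
    err_l = F.mse_loss(noise_pred_l, noise_l, reduction="none").mean(dim=[1, 2, 3])
    ref_err_w = F.mse_loss(ref_pred_w, noise_w, reduction="none").mean(dim=[1, 2, 3])
    ref_err_l = F.mse_loss(ref_pred_l, noise_l, reduction="none").mean(dim=[1, 2, 3])

    # Reward the model for beating the reference on the winner and
    # penalize it for beating the reference on the loser.
    diff = (err_w - ref_err_w) - (err_l - ref_err_l)
    return -F.logsigmoid(-beta * diff).mean()
```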
# 4.1 Generate images for validation prompts with the trained model.
bash scripts/generate_val*.sh
# 4.2 Score samples.
bash scripts/evaluate_exp.sh
# After step 1 and before step 2: split the prompt set into parts, one for each training step.
python data/split_to_parts.py <args...>
# Repeat steps 2 and 3 with the appropriate checkpoints.
bash scripts/pipeline*.sh
The main setups from the paper are:
- SD2-DB:
scripts/pipeline.sh
- SD2-SVD:
scripts/pipeline_svd_full.sh
- SDXL-SVD:
scripts/pipeline_sdxl.sh
This repository builds on several codebases:
- DreamBooth dataset and prompts
- Implementations of the DreamBooth, Textual Inversion, and LoRA fine-tuning methods from diffusers
- Implementation of SVDiff
- DiffusionDPO implementation
If you use this code or our findings for your research, please cite our paper:
@misc{ayupov2025dreamboothdpoimprovingpersonalizedgeneration,
    title={DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization},
    author={Shamil Ayupov and Maksim Nakhodnov and Anastasia Yaschenko and Andrey Kuznetsov and Aibek Alanov},
    year={2025},
    eprint={2505.20975},
    archivePrefix={arXiv},
    primaryClass={cs.CV},
    url={https://arxiv.org/abs/2505.20975},
}