DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization


Personalized diffusion models have shown remarkable success in Text-to-Image (T2I) generation by enabling the injection of user-defined concepts into diverse contexts. However, balancing concept fidelity with contextual alignment remains a challenging open problem. In this work, we propose an RL-based approach that leverages the diverse outputs of T2I models to address this issue. Our method eliminates the need for human-annotated scores by generating a synthetic paired dataset for DPO-like training using external quality metrics. These better–worse pairs are specifically constructed to improve both concept fidelity and prompt adherence. Moreover, our approach supports flexible adjustment of the trade-off between image fidelity and textual alignment. Through multi-step training, our approach outperforms a naive baseline in convergence speed and output quality. We conduct extensive qualitative and quantitative analysis, demonstrating the effectiveness of our method across various architectures and fine-tuning techniques.

Teaser: DreamBoothDPO leverages synthetic preference pairs and CLIP-based metrics to automate personalized generation, dynamically optimizing the trade-off between concept accuracy and prompt alignment through iterative multi-stage training.

Updates

  • [27/05/2025] 🔥🔥🔥 DreamBoothDPO release. The paper is available on arXiv.

Prerequisites

You need the following hardware and Python version to run our method.

  • Linux
  • NVIDIA GPU + CUDA CuDNN
  • Conda 24.1.0+ or Python 3.11+

Installation

  • Clone this repo:
git clone https://github.com/ControlGenAI/DreamBoothDPO.git
cd DreamBoothDPO
  • Create Conda environment:
conda create -n dbdpo python=3.11
conda activate dbdpo
  • Install the dependencies in your environment:
pip install -r requirements.txt

Training pipeline

Prompts preparation

# 0.0 Download and extract COCO annotations
wget http://images.cocodataset.org/annotations/annotations_trainval2014.zip
unzip annotations_trainval2014.zip

# 0.1 Collect prompts from COCO.
python gen_prompts_from_coco.py <args...>

# 0.2 Merge COCO prompts with ChatGPT prompts.
python data/merge.py <args...>
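
For orientation, here is a minimal sketch of what the prompt-collection step could look like. It is only a sketch under stated assumptions: the caption-filtering rule, the ChatGPT prompt file name, and the output path are hypothetical, and the real gen_prompts_from_coco.py and data/merge.py scripts take their own arguments.

import json
import random

# Load the COCO 2014 caption annotations downloaded in step 0.0.
with open("annotations/captions_train2014.json") as f:
    coco = json.load(f)

# Illustrative filtering: keep short, single-sentence captions as prompts.
captions = [ann["caption"].strip() for ann in coco["annotations"]]
prompts = [c for c in captions if 5 <= len(c.split()) <= 15 and c.count(".") <= 1]

# Merge with a ChatGPT-generated prompt list (hypothetical file name) and deduplicate.
with open("data/chatgpt_prompts.txt") as f:
    extra = [line.strip() for line in f if line.strip()]

merged = sorted(set(prompts + extra))
random.shuffle(merged)

with open("data/prompts.txt", "w") as f:  # hypothetical output path
    f.write("\n".join(merged))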

Base model preparation

# 1.1 Train the base personalized generation model (e.g., DreamBooth).
bash scripts/train_dreambooth.sh

# 1.2 Generate images for validation prompts with the base model.
bash scripts/generate_val.sh

# 1.3 Get CLIP scores to find the best checkpoint on the Pareto frontier for the base model.
bash scripts/evaluate_exp.sh
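
The "best checkpoint on the Pareto frontier" balances concept fidelity (an image-similarity score) against prompt adherence (a text-similarity score). Below is a minimal, self-contained sketch of such a selection, assuming per-checkpoint scores are already available; the concrete logic behind scripts/evaluate_exp.sh may differ.

# Checkpoint name -> (image-alignment score, text-alignment score); higher is better.
# The numbers below are illustrative only.
scores = {
    "ckpt-200":  (0.78, 0.22),
    "ckpt-400":  (0.82, 0.24),
    "ckpt-800":  (0.79, 0.28),
    "ckpt-1200": (0.74, 0.31),
}

def pareto_front(points):
    """Keep checkpoints that are not dominated in both metrics by another checkpoint."""
    front = []
    for name, (img, txt) in points.items():
        dominated = any(
            i >= img and t >= txt and (i > img or t > txt)
            for other, (i, t) in points.items() if other != name
        )
        if not dominated:
            front.append(name)
    return front

print(pareto_front(scores))  # "ckpt-200" is dominated by "ckpt-400" and "ckpt-800"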

Generate DPO dataset

# 2.1 Get a subsample of prompts.
python data/subset.py <args...>

# 2.2 Generate images for collected prompts.
bash scripts/generate_prompts*.sh 

# 2.3 Get CLIP scores for generated samples.
bash scripts/evaluate_exp.sh

# 2.4 Collect pairs of generated samples based on score differences and angles.
bash scripts/collect_pairs.sh 
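
Step 2.4 turns the scored samples into better-worse pairs. One plausible reading of "score differences and angles", sketched below as an assumption rather than the repo's exact rule: for two images generated from the same prompt, form the score-difference vector (Δ image score, Δ text score) and accept the pair only if that vector is long enough and falls inside an angular cone whose direction sets the fidelity-vs-alignment trade-off. Thresholds, angle conventions, and defaults in scripts/collect_pairs.sh may differ.

import math

def keep_pair(score_a, score_b, angle_deg=45.0, cone_deg=60.0, min_norm=0.02):
    """Decide whether two same-prompt images form a usable (winner, loser) pair.

    score_a, score_b: (image_score, text_score) tuples; higher is better.
    angle_deg:  preferred improvement direction in the (image, text) plane;
                0 degrees favors image fidelity only, 90 degrees text alignment only.
    cone_deg:   total width of the accepted angular region.
    min_norm:   minimum length of the score-difference vector.
    """
    d_img = score_a[0] - score_b[0]
    d_txt = score_a[1] - score_b[1]
    if math.hypot(d_img, d_txt) < min_norm:
        return None  # the two images are too close in quality to form a pair

    def within(angle, center, width):
        return abs(angle - center) <= width / 2

    angle_ab = math.degrees(math.atan2(d_txt, d_img))    # "a" better than "b"
    angle_ba = math.degrees(math.atan2(-d_txt, -d_img))  # "b" better than "a"
    if within(angle_ab, angle_deg, cone_deg):
        return ("a", "b")
    if within(angle_ba, angle_deg, cone_deg):
        return ("b", "a")
    return None

In this sketch, moving angle_deg toward 0 biases the collected pairs (and hence the subsequent DPO training) toward concept fidelity, while moving it toward 90 biases them toward prompt adherence, mirroring the adjustable trade-off described above.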

Train Diffusion-DPO model

# 3 Train DPO on collected pairs.
bash scripts/train_ddpo_pairs*.sh 
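
Step 3 fine-tunes the diffusion model on the collected pairs with a DPO-style objective. Below is a schematic version of the Diffusion-DPO loss (Wallace et al., 2023), which this kind of pairwise training is typically built on; the actual training scripts may add timestep weighting, LoRA/SVDiff-specific parameterizations, and other options, and the beta value shown is only a placeholder.

import torch
import torch.nn.functional as F

def diffusion_dpo_loss(model_err_w, model_err_l, ref_err_w, ref_err_l, beta=1000.0):
    """Schematic Diffusion-DPO objective on a batch of (winner, loser) pairs.

    Each *_err_* tensor holds per-example denoising errors
    ||eps - eps_pred(x_t, t, prompt)||^2 for the winner (w) or loser (l) image,
    under the trainable model (model_*) or the frozen reference model (ref_*).
    """
    model_diff = model_err_w - model_err_l    # how much better the model denoises winners
    ref_diff = ref_err_w - ref_err_l          # the same margin for the reference model
    logits = -beta * (model_diff - ref_diff)  # reward preferring winners more than the reference does
    return -F.logsigmoid(logits).mean()

# Illustrative call with random "errors"; in training these come from the U-Net.
loss = diffusion_dpo_loss(*[torch.rand(8) for _ in range(4)])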

Evaluate the final model

# 4.1 Generate images for validation prompts with the trained model.
bash scripts/generate_val*.sh 

# 4.2 Score samples.
bash scripts/evaluate_exp.sh 
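
Scoring presumably follows the two standard personalization metrics: CLIP image-image similarity between generated images and the concept's reference photos (concept fidelity) and CLIP image-text similarity against the prompt (prompt adherence). Below is a minimal sketch using the Hugging Face CLIP model; scripts/evaluate_exp.sh may use a different CLIP variant or additional scores.

import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

device = "cuda" if torch.cuda.is_available() else "cpu"
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").to(device).eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def clip_scores(generated: Image.Image, reference: Image.Image, prompt: str):
    """Return (CLIP-I, CLIP-T) cosine similarities for one generated image."""
    imgs = processor(images=[generated, reference], return_tensors="pt").to(device)
    txt = processor(text=[prompt], return_tensors="pt", padding=True).to(device)

    img_feat = model.get_image_features(**imgs)
    txt_feat = model.get_text_features(**txt)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)

    clip_i = (img_feat[0] @ img_feat[1]).item()  # generated vs. reference concept image
    clip_t = (img_feat[0] @ txt_feat[0]).item()  # generated image vs. prompt
    return clip_i, clip_t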

Multistep training pipeline

# After step 1 and before step 2: split the prompt set into one part per training step.
python data/split_to_parts.py <args...>

# Repeat steps 2 and 3 with the appropriate checkpoints.
bash scripts/pipeline*.sh

The main setups from the paper are:

  1. SD2-DB: scripts/pipeline.sh
  2. SD2-SVD: scripts/pipeline_svd_full.sh
  3. SDXL-SVD: scripts/pipeline_sdxl.sh
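
Conceptually, each pipeline*.sh script iterates steps 2-3 over the prompt parts, always regenerating data and training from the latest checkpoint. Below is a purely structural sketch of that loop; the real scripts are parameterized (paths, checkpoints, hyperparameters) in ways omitted here, and the script names inside the loop stand in for whichever generate/train variant the setup uses.

import subprocess

NUM_PARTS = 3  # hypothetical number of parts produced by data/split_to_parts.py

def run(cmd):
    print(">>", cmd)
    subprocess.run(cmd, shell=True, check=True)

checkpoint = "path/to/best_base_checkpoint"  # best base checkpoint from step 1
for part in range(NUM_PARTS):
    # Step 2: generate images for this part's prompts, score them, collect pairs.
    # (In the real pipeline these scripts receive the part and checkpoint as arguments.)
    run("bash scripts/generate_prompts.sh")
    run("bash scripts/evaluate_exp.sh")
    run("bash scripts/collect_pairs.sh")
    # Step 3: DPO training starting from the current checkpoint.
    run("bash scripts/train_ddpo_pairs.sh")
    checkpoint = f"path/to/dpo_checkpoint_part_{part}"  # hypothetical output location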

References & Acknowledgments

This repository builds upon several existing codebases.

Citation

If you use this code or our findings for your research, please cite our paper:

@misc{ayupov2025dreamboothdpoimprovingpersonalizedgeneration,
      title={DreamBoothDPO: Improving Personalized Generation using Direct Preference Optimization}, 
      author={Shamil Ayupov and Maksim Nakhodnov and Anastasia Yaschenko and Andrey Kuznetsov and Aibek Alanov},
      year={2025},
      eprint={2505.20975},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2505.20975}, 
}
