
CaO2

Official repository for CaO2: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation (ICCV 2025).

Update Log

(2025.7.7) Uploaded the distilled datasets.

(2025.6.26) Started updating the repository.

Preparation

Dataset

Download the ImageNet dataset, and place it at your intended IMAGENET_PATH.
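
For reference, a minimal setup sketch, assuming the standard torchvision ImageFolder layout for ImageNet (an assumption; adjust if the code expects a different structure):

export IMAGENET_PATH=/path/to/imagenet    # hypothetical path; set to your own
# Expected layout (assumption):
#   $IMAGENET_PATH/train/n01440764/*.JPEG
#   $IMAGENET_PATH/val/n01440764/*.JPEG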

Installation

git clone https://github.com/hatchetProject/CaO2.git
cd CaO2
conda env create -f environment.yaml
conda activate cao2

Download the pretrained DiT model (DiT-XL/2, 256×256), and place it in pretrained_models.
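
For example, the checkpoint can be fetched from the official DiT release (URL taken from the facebookresearch/DiT repository; verify it is still current before use):

mkdir -p pretrained_models
wget -P pretrained_models https://dl.fbaipublicfiles.com/DiT/models/DiT-XL-2-256x256.pt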

Download the pretrained classification models for RDED evaluation from RDED and place them in data/pretrain_models. We also provide a script for running evaluation without teacher supervision.
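
A sketch of the expected directory (the filename below is hypothetical; use whatever RDED provides):

mkdir -p data/pretrain_models
# e.g. data/pretrain_models/resnet18_imagenet.pth    # hypothetical filename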

Usage

We provide an example script, run.sh, for running the code; see the usage sketch below. Note that you will need to modify the variables in the script for it to run properly.
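
A minimal usage sketch (the variable names in the comments are hypothetical; edit whichever ones run.sh actually defines):

# Inside run.sh, point the paths at your setup, e.g. (hypothetical names):
#   IMAGENET_PATH=/path/to/imagenet
#   MODEL_PATH=pretrained_models/DiT-XL-2-256x256.pt
bash run.sh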

Hyperparameters can be tuned for better performance, but the gains are usually marginal.

Distilled Datasets

For direct evaluation and benchmarking, the distilled datasets are available for download here (>7 GB).

TODO

  • Complete the code upload
  • Upload the distilled datasets
  • Update code for MAR distillation

Acknowledgements

Thanks to these amazing repositories: Minimax Diffusion, RDED, and many other inspiring works.

Citation

If you find this work useful, please consider citing:

@misc{wang2025cao2rectifyinginconsistenciesdiffusionbased,
      title={CaO$_2$: Rectifying Inconsistencies in Diffusion-Based Dataset Distillation}, 
      author={Haoxuan Wang and Zhenghao Zhao and Junyi Wu and Yuzhang Shang and Gaowen Liu and Yan Yan},
      year={2025},
      eprint={2506.22637},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2506.22637}, 
}
