Hello!!! Thanks for checking out our repo and paper! 🍻
Clone our codebase with the command below:

```shell
git clone https://github.com/justincui03/tesla.git
```
Then install the required environment:

```shell
cd tesla
conda env create -f requirements.yaml
```
Please make sure to activate the environment before running any of the following commands:

```shell
conda activate tesla
```
Our method builds on MTT, so we follow its general workflow. The first step is to generate the expert trajectories with the command below:

```shell
python buffer.py --dataset=CIFAR10 --model=ConvNet --train_epochs=50 --num_experts=100 --zca
```
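Conceptually, generating expert trajectories in an MTT-style pipeline means training many teacher networks from scratch on the real dataset and saving a parameter snapshot after every epoch. Below is a minimal, hedged sketch of that buffering loop using a toy quadratic objective in place of a real ConvNet; `train_expert` and its arguments are illustrative stand-ins, not the repo's actual `buffer.py` API.

```python
import numpy as np

def train_expert(num_epochs, param_dim, lr=0.1, seed=0):
    """Toy stand-in for one expert: gradient descent on a quadratic,
    recording a parameter snapshot after every epoch (the 'trajectory')."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=param_dim)   # random init, like a fresh network
    trajectory = [theta.copy()]          # snapshot at initialization
    for _ in range(num_epochs):
        grad = theta                     # gradient of 0.5 * ||theta||^2
        theta = theta - lr * grad
        trajectory.append(theta.copy())  # one snapshot per epoch
    return trajectory

# Mimic --num_experts and --train_epochs on a tiny scale:
buffer = [train_expert(num_epochs=50, param_dim=8, seed=s) for s in range(3)]
print(len(buffer), len(buffer[0]))  # 3 experts, 51 snapshots each (incl. init)
```

The real `buffer.py` plays the same role at scale: each expert is a full training run, and the saved per-epoch checkpoints form the trajectory buffer that distillation later matches against.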
The command we used to generate ImageNet-1K trajectories is:

```shell
python buffer.py --dataset=ImageNet --model=ConvNetD4 --train_epochs=50 --num_experts=50
```
To download the ImageNet dataset, please follow the steps here.
Below are a few example commands to get started:

```shell
python distill.py --dataset=CIFAR10 --ipc=1 --syn_steps=50 --expert_epochs=2 --max_start_epoch=5 --lr_img=1000 --lr_lr=1e-07 --lr_teacher=0.01
python distill.py --dataset=CIFAR100 --ipc=50 --syn_steps=80 --expert_epochs=2 --max_start_epoch=40 --lr_img=1000 --lr_lr=1e-05 --lr_teacher=0.01 --batch_syn=125 --zca
python distill.py --dataset=ImageNet --ipc=1 --syn_steps=10 --expert_epochs=3 --max_start_epoch=10 --lr_img=10000 --lr_lr=1e-04 --lr_teacher=0.01 --batch_syn=100 --model=ConvNetD4 --teacher_label
```
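To relate the flags above to the objective: trajectory-matching distillation initializes a student at an expert checkpoint (sampled up to `--max_start_epoch`), trains it for `--syn_steps` steps on the synthetic images, and penalizes the normalized distance to the expert's parameters `--expert_epochs` later. A minimal sketch of just the matching loss (the unrolled student training on synthetic data is omitted; function names here are illustrative, not the repo's API):

```python
import numpy as np

def matching_loss(student_end, expert_start, expert_end):
    """Normalized trajectory-matching objective used in MTT-style distillation:
    squared distance from the student's final parameters to the expert's
    target parameters, normalized by how far the expert itself moved."""
    num = np.sum((student_end - expert_end) ** 2)
    den = np.sum((expert_start - expert_end) ** 2)
    return num / den

# Toy check: a student that lands exactly on the expert target has zero loss;
# a student that never moves from the expert's start point has loss 1.
start = np.array([1.0, 2.0, 3.0])
end = np.array([0.5, 1.0, 1.5])
print(matching_loss(end, start, end))    # -> 0.0
print(matching_loss(start, start, end))  # -> 1.0
```

The normalization by the expert's own displacement keeps the loss scale comparable across different start epochs, which is why distinct `--lr_img` values still work across datasets.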
Part of the ImageNet-1K expert trajectories (5GB) can be downloaded here for quick experimentation.
Our code is developed based on the following codebases; thanks for sharing:
- Dataset Distillation
- Dataset Distillation by Matching Training Trajectories
- Dataset Condensation with Differentiable Siamese Augmentation
- Dataset Distillation using Neural Feature Regression
- Flexible Dataset Distillation: Learn Labels Instead of Images
👍 Special thanks to Bo Zhao, George Cazenavette and Mingyang Chen for their valuable feedback.
If you find our code useful for your research, please cite our paper:

```bibtex
@inproceedings{cui2023scaling,
  title={Scaling up dataset distillation to imagenet-1k with constant memory},
  author={Cui, Justin and Wang, Ruochen and Si, Si and Hsieh, Cho-Jui},
  booktitle={International Conference on Machine Learning},
  pages={6565--6590},
  year={2023},
  organization={PMLR}
}
```