MaskedKD (Masked Knowledge Distillation)

This is the official implementation of the paper "The Role of Masking for Efficient Supervised Knowledge Distillation of Vision Transformers" (ECCV 2024).


Seungwoo Son, Jegwang Ryu, Namhoon Lee, Jaeho Lee
Pohang University of Science and Technology (POSTECH)

Summary

Our method, MaskedKD, reduces the cost of supervision in knowledge distillation by masking the teacher ViT's input patches according to the student's attention, maintaining student accuracy while saving teacher computation.
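In pseudocode, the masking step works roughly as follows. This is a minimal sketch rather than the repository code: it assumes the student's [CLS]-to-patch attention, averaged over heads, is available as cls_attn, and it simply keeps the top num_keep teacher patch tokens (exact tensor shapes and the saliency computation in the actual implementation may differ).

import torch

def select_teacher_patches(patch_tokens, cls_attn, num_keep):
    """Keep only the patch tokens the student attends to most.

    patch_tokens: (B, N, D) teacher patch embeddings (no [CLS] token)
    cls_attn:     (B, N)    student [CLS]-to-patch attention, averaged over heads
    num_keep:     number of patch tokens to forward through the teacher
    """
    # Indices of the num_keep most-attended patches per image.
    idx = cls_attn.topk(num_keep, dim=1).indices                   # (B, num_keep)
    # Gather the corresponding teacher patch tokens.
    idx_exp = idx.unsqueeze(-1).expand(-1, -1, patch_tokens.size(-1))
    kept = torch.gather(patch_tokens, dim=1, index=idx_exp)        # (B, num_keep, D)
    return kept, idx

# Example: a ViT with 196 patches (224x224 input, patch size 16), keeping half of them.
B, N, D = 4, 196, 768
patch_tokens = torch.randn(B, N, D)
cls_attn = torch.rand(B, N).softmax(dim=1)
kept, idx = select_teacher_patches(patch_tokens, cls_attn, num_keep=98)
print(kept.shape)  # torch.Size([4, 98, 768])

The teacher then runs its forward pass only on the kept tokens, which is where the compute saving comes from.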

Installation

Since our code is implemented on top of the official DeiT and MAE codebases, simply follow MAE's installation guideline.

Data preparation

Download and extract the ImageNet train and val images from http://image-net.org/. The expected directory structure is:

│path/to/imagenet/
├──train/
│  ├── n01440764
│  │   ├── n01440764_10026.JPEG
│  │   ├── n01440764_10027.JPEG
│  │   ├── ......
│  ├── ......
├──val/
│  ├── n01440764
│  │   ├── ILSVRC2012_val_00000293.JPEG
│  │   ├── ILSVRC2012_val_00002138.JPEG
│  │   ├── ......
│  ├── ......
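As a quick sanity check of this layout, the sketch below (not part of this repository) loads the two splits with torchvision's ImageFolder; the transforms here are placeholders, since the training script builds its own.

import torchvision.datasets as datasets
import torchvision.transforms as transforms

# Minimal sketch: verify that the ImageNet folder layout loads correctly.
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

train_set = datasets.ImageFolder('/path/to/imagenet/train', transform=transform)
val_set = datasets.ImageFolder('/path/to/imagenet/val', transform=transform)
print(len(train_set.classes), 'classes')  # expect 1000 for ImageNet-1K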

Experiments

Training on ImageNet-1K

To train a DeiT-Small student with a DeiT-Base teacher, run:

python -m torch.distributed.launch --nproc_per_node=8 --use_env main.py \
--model deit_small_patch16_224  --teacher_model deit_base --epochs 300 \
--batch-size 128 --data-path /path/to/ILSVRC2012/ --distillation-type soft \
--distillation-alpha 0.5 --distillation-tau 1  --input-size 224 --maskedkd --len_num_keep 98 \
--output_dir /path/to/output_dir/
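For reference, --distillation-type soft with --distillation-alpha 0.5 and --distillation-tau 1 corresponds to a DeiT-style soft distillation loss of roughly the form below; this is a hedged sketch, not the exact code in main.py. Note that --len_num_keep 98 keeps 98 of the 196 patch tokens (about half) for the teacher's forward pass at --input-size 224.

import torch
import torch.nn.functional as F

def soft_distillation_loss(student_logits, teacher_logits, labels, alpha=0.5, tau=1.0):
    """Blend cross-entropy on labels with KL to the teacher's softened outputs."""
    ce = F.cross_entropy(student_logits, labels)
    kd = F.kl_div(
        F.log_softmax(student_logits / tau, dim=1),
        F.log_softmax(teacher_logits / tau, dim=1),
        reduction='batchmean',
        log_target=True,
    ) * (tau * tau)
    return (1 - alpha) * ce + alpha * kd

# Toy usage with random logits and labels.
loss = soft_distillation_loss(torch.randn(4, 1000), torch.randn(4, 1000),
                              torch.randint(0, 1000, (4,)))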

Acknowledgment

This repo is based on DeiT, MAE and pytorch-image-models.
