Official PyTorch implementation of our paper: Lossless Token Merging Even Without Fine-Tuning in Vision Transformers (ECAI 2025)

To reproduce our experiments, run the following command:
python main.py --eval --resume /path/to/model --model deit_small_patch16_224 --data-path /path/to/ILSVRC2012 --batch-size 1024 --alpha 0.99 --beta 0.04 --lower-bound 0.88

The optimal hyperparameter combination can be found in the Appendix of the paper.
Our proposed ATM adaptively merges similar token pairs based on the characteristics of each image.
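
To give a rough idea of what "adaptive" merging can mean, here is a minimal, dependency-free sketch of threshold-based token merging. It is not the ATM implementation in this repository: the function names, the fixed `threshold` argument, and the alternating split of tokens into two sets (borrowed from ToMe-style bipartite matching) are all assumptions made for illustration, and the threshold is not meant to correspond to any of the command-line flags above. The point it shows is that a similarity threshold, unlike a fixed merge count, lets the number of merged tokens vary from image to image.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def merge_similar_tokens(tokens, threshold):
    """Hypothetical helper: merge token pairs whose similarity exceeds `threshold`.

    Tokens are split alternately into a source set and a destination set.
    Each source token is matched to its most similar destination token and
    averaged into it only if the similarity is above the threshold.
    Token order is not preserved in the returned list.
    """
    src = tokens[0::2]
    dst_ref = tokens[1::2]                # unmodified copies used for matching
    if not dst_ref:
        return [list(t) for t in tokens]
    dst = [list(t) for t in dst_ref]      # running merged destination tokens
    counts = [1] * len(dst)               # how many tokens each entry represents
    kept = []
    for t in src:
        sims = [cosine(t, d) for d in dst_ref]
        best = max(range(len(sims)), key=sims.__getitem__)
        if sims[best] > threshold:
            n = counts[best]
            # Size-weighted running mean of the merged tokens.
            dst[best] = [(d * n + a) / (n + 1) for d, a in zip(dst[best], t)]
            counts[best] = n + 1
        else:
            kept.append(list(t))
    return kept + dst


# Two toy "images" with four 2-D tokens each.
redundant = [[1.0, 0.0], [1.0, 0.01], [0.0, 1.0], [1.0, 0.0]]
diverse = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0], [0.0, -1.0]]

print(len(merge_similar_tokens(redundant, threshold=0.9)))
print(len(merge_similar_tokens(diverse, threshold=0.9)))
```

With the same threshold, the toy input containing near-duplicate tokens ends up with fewer tokens than the one whose tokens are all dissimilar, which is the behaviour a fixed per-layer merge count cannot provide.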

Our code is based on DeiT and ToMe. We thank the authors for their great work.