MATE is the first 3D Test-Time Training (TTT) method that makes 3D object recognition architectures robust to the distribution shifts that commonly occur in 3D point clouds. MATE follows the classical TTT paradigm of using an auxiliary objective to make the network robust to distribution shifts at test time. To this end, MATE uses a self-supervised test-time objective: reconstructing aggressively masked input point cloud patches.
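The masked-reconstruction objective described above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the mask ratio of 0.9, the function names, and the use of a squared-distance Chamfer variant are assumptions made for illustration, and the "prediction" is passed in directly instead of coming from a decoder. How patches are formed from the raw point cloud is not shown.

```python
import random


def chamfer_distance(a, b):
    """Symmetric Chamfer distance (squared L2 variant) between two point lists."""
    def one_way(src, dst):
        # For each source point, take the squared distance to its nearest target point.
        return sum(
            min(sum((p - q) ** 2 for p, q in zip(s, d)) for d in dst) for s in src
        ) / len(src)
    return one_way(a, b) + one_way(b, a)


def split_masked(patches, mask_ratio, rng):
    """Randomly split patch indices into (visible, masked) index lists."""
    n_masked = int(round(mask_ratio * len(patches)))
    order = list(range(len(patches)))
    rng.shuffle(order)
    return sorted(order[n_masked:]), sorted(order[:n_masked])


def reconstruction_loss(patches, predicted, masked):
    """Mean Chamfer distance between predicted and true patches, masked patches only."""
    return sum(chamfer_distance(predicted[i], patches[i]) for i in masked) / len(masked)


# Toy data: 10 patches of 4 points each (hypothetical values).
patches = [[(float(i), float(j), 0.0) for j in range(4)] for i in range(10)]
visible, masked = split_masked(patches, mask_ratio=0.9, rng=random.Random(0))
```

In a test-time training loop, only the visible patches would be fed to the encoder, and the loss over the masked patches would drive the gradient step on each test batch.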
In this repository we provide our pre-trained models and codebase to reproduce the results reported in our paper.
PyTorch >= 1.7.0 < 1.11.0
python >= 3.7
CUDA >= 9.0
GCC >= 4.9
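Before installing the extensions, it can help to confirm that the interpreter and PyTorch fall inside the ranges listed above. The following is a minimal sketch; the helper names are hypothetical and the version parsing is deliberately simple (it ignores pre-release tags and local suffixes such as +cu113).

```python
import re
import sys


def parse_version(text):
    """Turn a string such as '1.10.2+cu113' into the tuple (1, 10, 2)."""
    core = text.split("+")[0]
    parts = []
    for piece in core.split(".")[:3]:
        match = re.match(r"\d+", piece)
        parts.append(int(match.group()) if match else 0)
    parts += [0] * (3 - len(parts))
    return tuple(parts)


def in_range(version, minimum, below=None):
    """True if minimum <= version, and version < below when `below` is given."""
    v = parse_version(version)
    if v < parse_version(minimum):
        return False
    return below is None or v < parse_version(below)


def check_environment():
    """Report whether Python and PyTorch satisfy the listed requirements."""
    report = {"python": sys.version_info[:2] >= (3, 7)}
    try:
        import torch
    except ImportError:
        report["torch"] = None  # PyTorch is not installed
    else:
        report["torch"] = in_range(torch.__version__, "1.7.0", "1.11.0")
    return report
```

Calling check_environment() returns a small dictionary; a value of False or None points to the requirement that needs attention. CUDA and GCC are not checked here.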
To install all additional requirements (open command line and run):
pip install -r requirements.txt
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../..
cd ./extensions/emd
python setup.py install --user
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
pip install --upgrade https://github.com/unlimblue/KNN_CUDA/releases/download/0.2/KNN_CUDA-0.2-py3-none-any.whl
Our code currently supports three different datasets: ModelNet40, ShapeNetCore and ScanObjectNN.
To use these datasets with our code, first download them from the following sources:
- ModelNet40
- ShapeNetCore
- ScanObjectNN (you must first agree to its terms of use)
Then, extract all of these folders into the same directory for easier use.
To add distribution shifts to the data, corruptions from ModelNet40-C are used.
For experiments on corrupted ModelNet data, the ModelNet40-C dataset can be downloaded here.
If needed, compute the same corruptions for ShapeNetCore and ScanObjectNN with:
python ./datasets/create_corrupted_dataset.py --main_path <path/to/dataset/parent/directory> --dataset <dataset_name>
Replace <dataset_name> with either scanobjectnn or shapenet as required.
Note that for computation of the corruptions "occlusion" and "lidar", model meshes are needed. These are computed with the open3d library.
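To give a feel for what a corruption does to a point cloud, here is a minimal sketch of additive Gaussian jitter, one simple kind of distribution shift. It is not the ModelNet40-C code or the logic in create_corrupted_dataset.py; the function name, the noise level, and the toy cloud are assumptions for illustration only.

```python
import random


def jitter(points, sigma=0.02, seed=0):
    """Return a copy of `points` with independent Gaussian noise on every coordinate."""
    rng = random.Random(seed)
    return [tuple(c + rng.gauss(0.0, sigma) for c in p) for p in points]


# Hypothetical three-point cloud; real clouds contain many more points.
cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
noisy = jitter(cloud, sigma=0.02)
```

Corruptions such as "occlusion" and "lidar" cannot be expressed this way because they depend on the object surface, which is why they need meshes.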
All our pretrained models are available at this Google-Drive.
The jt
models are jointly trained for reconstruction and classification, src_only
models are trained for only the classification task.
For TTT, go to cfgs/tta/tta_<dataset_name>.yaml and set the tta_dataset_path variable to the relative path of the dataset parent directory.
E.g., if your data for ModelNet-C is in ./data/tta_datasets/modelnet-c, set the variable to ./data/tta_datasets.
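The rule for tta_dataset_path is simply "the parent of the dataset folder". A small sketch that derives the value from a dataset directory is shown below; the helper name is hypothetical and not part of the codebase.

```python
from pathlib import Path


def parent_for_config(dataset_dir):
    """Return the parent of `dataset_dir`, keeping a './' prefix for relative paths."""
    parent = Path(dataset_dir).parent
    rel = parent.as_posix()
    if parent.is_absolute() or rel == ".":
        return rel
    return "./" + rel


# Example from the text: the ModelNet-C data lives in ./data/tta_datasets/modelnet-c
value = parent_for_config("./data/tta_datasets/modelnet-c")
```

The resulting string is what would be written as the tta_dataset_path value in the YAML file.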
A jointly trained model can be used for test-time training with:
CUDA_VISIBLE_DEVICES=0 python ttt.py --dataset_name <dataset_name> --online --grad_steps 1 --config cfgs/tta/tta_<dataset_name>.yaml --ckpts <path/to/pretrained/model>
This runs TTT-Online (with one gradient step).
To run TTT-Standard, use the following command:
CUDA_VISIBLE_DEVICES=0 python ttt.py --dataset_name <dataset_name> --grad_steps 20 --config cfgs/tta/tta_<dataset_name>.yaml --ckpts <path/to/pretrained/model>
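The two commands above differ only in the --online flag and the number of gradient steps. When running several datasets, a thin wrapper can assemble them; the sketch below uses only the flags shown above. The function names, the dataset name scanobjectnn as a value for --dataset_name, and the checkpoint path are assumptions for illustration.

```python
import os
import subprocess


def build_ttt_command(dataset_name, ckpt_path, online):
    """Assemble the ttt.py argument list for TTT-Online or TTT-Standard."""
    args = ["python", "ttt.py", "--dataset_name", dataset_name]
    if online:
        args.append("--online")
    # 1 gradient step for TTT-Online, 20 for TTT-Standard, as in the commands above.
    args += ["--grad_steps", "1" if online else "20"]
    args += ["--config", f"cfgs/tta/tta_{dataset_name}.yaml", "--ckpts", ckpt_path]
    return args


def run_ttt(dataset_name, ckpt_path, online, gpu="0"):
    """Run ttt.py on the given GPU; raises CalledProcessError on failure."""
    env = dict(os.environ, CUDA_VISIBLE_DEVICES=gpu)
    subprocess.run(build_ttt_command(dataset_name, ckpt_path, online), env=env, check=True)


# Hypothetical dataset name and checkpoint path.
online_cmd = build_ttt_command("scanobjectnn", "ckpt.pth", online=True)
standard_cmd = build_ttt_command("scanobjectnn", "ckpt.pth", online=False)
```

Building the argument list separately from running it makes the command easy to print and inspect before launching a long job.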
To train a new model on one of the three datasets, go to cfgs/dataset_configs/<dataset_name>.yaml and set the DATA_PATH variable in the file to the relative path of the dataset folder.
After setting the paths, a model can be jointly trained with:
CUDA_VISIBLE_DEVICES=0 python train.py --jt --config cfgs/pre_train/pretrain_<dataset_name>.yaml --dataset <dataset_name>
A model for a supervised-only baseline can be trained with:
CUDA_VISIBLE_DEVICES=0 python train.py --only_cls --config cfgs/pre_train/pretrain_<dataset_name>.yaml --dataset <dataset_name>
The trained models can then be found in the corresponding experiments subfolder.
For a basic inference baseline without adaptation, use
CUDA_VISIBLE_DEVICES=0 python test.py --dataset_name <dataset_name> --config cfgs/pre_train/pretrain_<dataset_name>.yaml --ckpts <path/to/pretrained/model> --test_source
Scripts for pretraining, testing and test-time training can also be found in commands.sh.
@article{mirza2023mate,
author = {Mirza, M. Jehanzeb and Shin, Inkyu and Lin, Wei and Schriebl, Andreas and Sun, Kunyang and
Choe, Jaesung and Kozinski, Mateusz and Possegger, Horst and Kweon, In So and Yoon, Kun-Jin and Bischof, Horst},
title = {MATE: Masked Autoencoders are Online 3D Test-Time Learners},
journal = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
year = {2023}
}
We also acknowledge PointMAE for their open-source implementation, which we use extensively in this project.