MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries. Paper - Project Page
This repo implements the paper MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries. Our implementation is built on MMDetection3D.
Most of the code is in the directory plugin/track. To use this code with MMDetection3D, you need older versions of the MMDetection3D family of packages (see the Environment section), and you need to replace mmdet3d/api with the mmdet3d/api provided here.
First, install:
- mmcv==1.3.14
- mmdetection==2.12.0
- nuscenes-devkit
- Note: for tracking you need to install motmetrics==1.1.3, not a newer version such as motmetrics==1.2.0.
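The version pins above can be checked before going further. The following is a minimal sketch, not part of this repo. It assumes Python 3.8 or newer (for importlib.metadata), and the distribution names in PINNED are assumptions about how each package is published (mmcv, for instance, is often installed as mmcv-full), so adjust them to match your environment. The helper names installed_version and check_pins are hypothetical.

```python
from importlib import metadata

# Pins taken from the install list. The keys are ASSUMED distribution names;
# change them if your environment uses different ones (e.g. "mmcv-full").
PINNED = {
    "mmcv": "1.3.14",
    "mmdet": "2.12.0",
    "motmetrics": "1.1.3",
}


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return {name: (wanted, found)} for every package that does not match."""
    mismatches = {}
    for name, wanted in pins.items():
        found = lookup(name)
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in check_pins(PINNED).items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
```

An empty result from check_pins(PINNED) means every listed package matches its pin. The lookup parameter exists only so the check can be exercised without touching the real environment.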
Second, clone mmdetection3d v0.13.0, then replace its mmdet3d/api/ with the mmdet3d/api/ from this repo.
For example:
git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v0.13.0
# cp -r ../mmdet3d/api mmdet3d/
# cp ../mmdet3d/models/builder.py mmdet3d/models/
# cp ../mmdet3d/models/detectors/mvx_two_stage.py mmdet3d/models/detectors/mvx_two_stage.py
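# replace mmdetection3d/mmdet3d with mmdet3d_full
# (remove the original first; otherwise cp nests mmdet3d_full inside the existing mmdet3d/)
rm -rf ./mmdet3d
cp -r ../mmdet3d_full ./mmdet3d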
cp -r ../plugin ./
cp -r ../tools ./
# then install mmdetection3d following its instructions;
# mmdetection3d is now your working directory.
After preparing the nuScenes dataset following the mmdetection3d instructions, you need to generate a meta file (a .pkl file):
python3 tools/data_converter/nusc_track.py
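To confirm that the meta file was generated as expected, it can help to look at its top-level structure. The sketch below is an illustration only: the output path and the internal layout of the .pkl file are not specified here, so the function makes no assumptions about key names and simply reports what it finds. The names summarize_pickle and _describe are hypothetical helpers, not part of this repo.

```python
import pickle


def _describe(value):
    """Return the type name of a value, plus its length if it has one."""
    name = type(value).__name__
    return f"{name}[{len(value)}]" if hasattr(value, "__len__") else name


def summarize_pickle(path):
    """Load a pickle file and describe its top-level structure.

    Only load pickle files you generated yourself: unpickling
    untrusted data can execute arbitrary code.
    """
    with open(path, "rb") as f:
        obj = pickle.load(f)
    if isinstance(obj, dict):
        # Map each top-level key to a short description of its value.
        return {key: _describe(value) for key, value in obj.items()}
    return _describe(obj)
```

Call summarize_pickle with the path of the file the converter wrote (the actual location depends on your data setup) and print the result.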
I provide a template config file in plugin/track/configs/resnet101_fpn_3frame.py. You can use this config directly, or read the file (especially its comments) and modify whatever you want. I recommend using DETR3D pre-trained models or other nuScenes 3D detection pre-trained models.
Basic training script on a machine with 8 GPUs:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash tools/dist_train_tracker.sh plugin/track/configs/resnet101_fpn_3frame.py 8 --work-dir=work_dirs/experiment_name
Basic test scripts:
# You can perform inference, then save the result file
python3 tools/test.py plugin/track/configs/resnet101_fpn_3frame.py <model-path> --format-only --eval-options jsonfile_prefix=<dir-name-for-saving-json-results>
# or you can perform inference and directly perform the evaluation
python3 tools/test.py plugin/track/configs/resnet101_fpn_3frame.py <model-path> --eval bbox
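Once a results file has been saved, a quick summary can catch obvious problems (for example, an empty file or a single track ID) before running the full evaluation. This is a rough sketch under an assumption: that the saved JSON follows the nuScenes tracking submission layout, with a top-level "results" dict mapping sample tokens to lists of boxes that carry "tracking_id" and "tracking_name" fields. The file name written under the jsonfile_prefix directory is not stated here, so the path is left to the caller. summarize_tracking_results is a hypothetical helper, not part of this repo.

```python
import json
from collections import Counter


def summarize_tracking_results(path):
    """Summarize a results JSON file.

    ASSUMED layout (nuScenes tracking submission format):
    {"meta": {...}, "results": {sample_token: [box, ...]}}
    """
    with open(path) as f:
        data = json.load(f)
    results = data.get("results", {})
    # Flatten the per-sample lists into one list of boxes.
    boxes = [box for frame in results.values() for box in frame]
    track_ids = {box["tracking_id"] for box in boxes if "tracking_id" in box}
    class_counts = Counter(box.get("tracking_name") for box in boxes)
    return {
        "num_samples": len(results),
        "num_boxes": len(boxes),
        "num_tracks": len(track_ids),
        "boxes_per_class": dict(class_counts),
    }
```

If the file uses a different layout, the counts will come back as zero rather than raising, which is itself a useful hint that the assumption does not hold.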
For visualization, I suggest generating the results JSON file first. I provide some sample code at tools/nusc_visualizer.py for visualizing the predictions; see the function _test_pred() in tools/nusc_visualize.py for examples.
Backbone | AMOTA-val | AMOTP-val | IDS-val | Download |
---|---|---|---|---|
ResNet-101 w/ FPN | 29.5 | 1.498 | 4388 | model, val results |
ResNet-50 w/ FPN | 25.2 | 1.573 | 3899 | model, val results |
For the implementation, we rely heavily on MMCV, MMDetection, MMDetection3D, MOTR, and DETR3D.
- DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
- FUTR3D: A Unified Sensor Fusion Framework for 3D Detection
- For more projects on autonomous driving, check out our camera-centered autonomous driving projects page.
@article{zhang2022mutr3d,
title={MUTR3D: A Multi-camera Tracking Framework via 3D-to-2D Queries},
author={Zhang, Tianyuan and Chen, Xuanyao and Wang, Yue and Wang, Yilun and Zhao, Hang},
journal={arXiv preprint arXiv:2205.00613},
year={2022}
}
Contact: Tianyuan Zhang at tianyuaz@andrew.cmu.edu or tianyuanzhang1998@gmail.com