This page provides basic tutorials about the usage of MMTracking. For installation instructions, please see install.md.
- It is recommended to symlink the root of the datasets to `$MMTRACKING/data`. For example,

  ```shell
  mkdir data

  # object detection: symlink MS COCO
  ln -s $MSCOCO_ROOT/images data/coco/source_data
  ln -s $MSCOCO_ROOT/annotations data/coco/json_annotations

  # video object detection: symlink ImageNet DET and ImageNet VID
  ln -s $IMAGENETDET_IMAGENETVID_ROOT data/imagenetdet_imagenetvid/source_data

  # single object tracking: symlink LaSOT
  ln -s $LASOT_ROOT data/lasot/source_data

  # multiple object tracking: symlink MOT17
  ln -s $MOT17 data/MOT17
  ```

  Download the txt files for the training of video object detection, and put these txt files into `data/imagenetdet_imagenetvid/data/Lists/`.

- If your folder structure is different from the following, you may need to change the corresponding paths in config files.

  ```
  mmtracking
  ├── mmtrack
  ├── tools
  ├── configs
  ├── data
  │   ├── coco
  │   │   ├── source_data
  │   │   │   ├── train2017
  │   │   ├── json_annotations
  │   │   │   ├── instances_train2017.json
  │   ├── imagenetdet_imagenetvid
  │   │   ├── source_data
  │   │   │   ├── Data
  │   │   │   │   ├── DET
  │   │   │   │   ├── VID
  │   │   │   ├── Annotations
  │   │   │   │   ├── DET
  │   │   │   │   ├── VID
  │   │   ├── json_annotations
  │   │   │   ├── imagenet_det_30plus1cls.json (generated by tools/convert_datasets/imagenet2coco_det.py)
  │   │   │   ├── imagenet_vid_train.json (generated by tools/convert_datasets/imagenet2coco_vid.py)
  │   │   │   ├── imagenet_vid_val.json (generated by tools/convert_datasets/imagenet2coco_vid.py)
  │   ├── lasot
  │   │   ├── source_data
  │   │   │   ├── airplane-1
  │   │   │   ├── airplane-13
  │   │   ├── json_annotations
  │   │   │   ├── lasot_test.json (generated by tools/convert_datasets/lasot2coco.py)
  │   ├── MOT17
  │   │   ├── train
  │   │   ├── test
  ```
- Generate the json annotations of the MS COCO, LaSOT, ImageNet DET and ImageNet VID datasets.

  ```shell
  # Generate imagenet_det_30plus1cls.json
  python ./tools/convert_datasets/imagenet2coco_det.py \
      -i ./data/imagenetdet_imagenetvid/source_data \
      -o ./data/imagenetdet_imagenetvid/json_annotations

  # Generate imagenet_vid_train.json and imagenet_vid_val.json
  python ./tools/convert_datasets/imagenet2coco_vid.py \
      -i ./data/imagenetdet_imagenetvid/source_data \
      -o ./data/imagenetdet_imagenetvid/json_annotations

  # Generate lasot_test.json
  python ./tools/convert_datasets/lasot2coco.py \
      -i ./data/lasot/source_data \
      -o ./data/lasot/json_annotations

  # Generate annotation files for MOT17
  python ./tools/convert_datasets/mot2coco.py \
      -i ./data/MOT17/ \
      -o ./data/MOT17/annotations \
      --split-train --convert-det
  ```

  A quick way to sanity-check the generated files is sketched right after this list.
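The generated files are ordinary JSON, so they can be inspected directly with Python's `json` module. The path below is simply one of the files listed above; adjust it to whichever file you want to check.

```python
import json

# inspect one of the generated annotation files
with open('data/imagenetdet_imagenetvid/json_annotations/imagenet_vid_train.json') as f:
    ann = json.load(f)

# print the top-level keys and basic counts to confirm the conversion ran
print(list(ann.keys()))
print('images:', len(ann.get('images', [])))
print('annotations:', len(ann.get('annotations', [])))
```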
We provide testing scripts to evaluate a whole dataset, as well as some high-level APIs for easier integration into other projects (a short Python sketch of the latter follows the testing examples below).
- single GPU
- single node multiple GPU
You can use the following commands to test a dataset.
```shell
# single-gpu testing
python tools/test.py ${CONFIG_FILE} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]

# multi-gpu testing
./tools/dist_test.sh ${CONFIG_FILE} ${GPU_NUM} [--checkpoint ${CHECKPOINT_FILE}] [--out ${RESULT_FILE}] [--eval ${EVAL_METRICS}]
```
Optional arguments:

- `CHECKPOINT_FILE`: Filename of the checkpoint. You do not need to pass it for MOT tasks; the checkpoints are specified in the config instead.
- `RESULT_FILE`: Filename of the output results in pickle format. If not specified, the results will not be saved to a file.
- `EVAL_METRICS`: Items to be evaluated on the results. Allowed values depend on the dataset, e.g., `bbox` is available for ImageNet VID, `track` is available for LaSOT and MOT17.
Examples:
Assume that you have already downloaded the checkpoints to the directory `checkpoints/`.
- Test DFF on ImageNet VID, and evaluate the bbox mAP.

  ```shell
  python tools/test.py configs/vid/dff/dff_faster_rcnn_r101_dc5_1x_imagenetvid.py \
      --checkpoint checkpoints/dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172720-ad732e17.pth \
      --out results.pkl \
      --eval bbox
  ```
- Test DFF with 8 GPUs, and evaluate the bbox mAP.

  ```shell
  ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/vid/dff/dff_faster_rcnn_r101_dc5_1x_imagenetvid.py \
      --checkpoint checkpoints/dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201218_172720-ad732e17.pth \
      --out results.pkl \
      --eval bbox
  ```
- Test SiameseRPN++ on LaSOT, and evaluate the success and normed precision.

  ```shell
  python tools/test.py configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py \
      --checkpoint checkpoints/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth \
      --out results.pkl \
      --eval track
  ```
- Test SiameseRPN++ with 8 GPUs, and evaluate the success and normed precision.

  ```shell
  ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} configs/sot/siamese_rpn/siamese_rpn_r50_1x_lasot.py \
      --checkpoint checkpoints/siamese_rpn_r50_1x_lasot_20201218_051019-3c522eff.pth \
      --out results.pkl \
      --eval track
  ```
- Test Tracktor on MOT17, and evaluate CLEAR MOT metrics.

  ```shell
  python tools/test.py configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py \
      --eval track
  ```
- Test Tracktor with 8 GPUs, and evaluate CLEAR MOT metrics.

  ```shell
  ./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} \
      configs/mot/tracktor/tracktor_faster-rcnn_r50_fpn_4e_mot17-public-half.py \
      --eval track
  ```
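The high-level APIs mentioned at the top of this section can also be called directly from Python. The sketch below shows multiple object tracking on a local video; it assumes the `init_model` and `inference_mot` functions from `mmtrack.apis`, and every path in it is a placeholder to be replaced with your own config, checkpoint and video.

```python
import mmcv
from mmtrack.apis import inference_mot, init_model

# hypothetical paths -- replace with a real MOT config and its checkpoint
config_file = 'configs/mot/your_mot_config.py'
checkpoint_file = 'checkpoints/your_mot_checkpoint.pth'  # may be None if the config pulls its own weights

# build the tracker from the config and load the weights
model = init_model(config_file, checkpoint_file, device='cuda:0')

# run the tracker frame by frame over a local video
video = mmcv.VideoReader('demo.mp4')  # hypothetical input video
for frame_id, img in enumerate(video):
    result = inference_mot(model, img, frame_id=frame_id)
    # `result` contains the detected and tracked boxes for this frame
```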
MMTracking implements distributed training and non-distributed training, which use `MMDistributedDataParallel` and `MMDataParallel` respectively.
All outputs (log files and checkpoints) will be saved to the working directory, which is specified by `work_dir` in the config file.
By default we evaluate the model on the validation set after each epoch. You can change the evaluation interval by adding the `interval` argument in the training config.

```python
evaluation = dict(interval=12)  # This evaluates the model every 12 epochs.
```
**Important**: The default learning rate in config files is for 8 GPUs. According to the Linear Scaling Rule, you need to set the learning rate proportional to the batch size if you use different GPUs or images per GPU, e.g., `lr=0.01` for 8 GPUs * 1 img/gpu and `lr=0.04` for 16 GPUs * 2 imgs/gpu.
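As an illustration, suppose the base config was tuned for 8 GPUs * 1 img/gpu with `lr=0.01`; moving to 16 GPUs * 2 imgs/gpu multiplies the total batch size by 4, so the learning rate is scaled by the same factor. The SGD fields below are only an assumption about what the optimizer section of a typical config looks like; check your actual config before editing it.

```python
# assumed base setting (8 GPUs * 1 img/gpu):
# optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)

# adapted setting (16 GPUs * 2 imgs/gpu): total batch size is 4x larger,
# so the learning rate is scaled by 4 following the Linear Scaling Rule
optimizer = dict(type='SGD', lr=0.04, momentum=0.9, weight_decay=0.0001)
```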
To train a model with a single GPU, run

```shell
python tools/train.py ${CONFIG_FILE} [optional arguments]
```
If you want to specify the working directory in the command, you can add the argument `--work-dir ${YOUR_WORK_DIR}`.
To train a model with multiple GPUs, run

```shell
./tools/dist_train.sh ${CONFIG_FILE} ${GPU_NUM} [optional arguments]
```
Optional arguments are:

- `--no-validate` (not suggested): By default, the codebase performs evaluation every k epochs during training (k defaults to 1 and can be changed via the `evaluation` config shown above). Use `--no-validate` to disable this behavior.
- `--work-dir ${WORK_DIR}`: Override the working directory specified in the config file.
- `--resume-from ${CHECKPOINT_FILE}`: Resume from a previous checkpoint file.
- `--options 'Key=value'`: Override some settings in the used config, as shown in the example after this list.
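For example, `--options` lets you tweak config entries from the command line without editing the file. The specific keys below (`evaluation.interval`, `data.samples_per_gpu`) are only illustrative assumptions about a typical config; use keys that actually exist in yours.

```shell
# override the evaluation interval and per-GPU batch size for this run only
python tools/train.py ${CONFIG_FILE} \
    --work-dir ./work_dirs/my_experiment \
    --options evaluation.interval=2 data.samples_per_gpu=2
```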
Difference between `resume-from` and `load-from`:

`resume-from` loads both the model weights and optimizer status, and the epoch is also inherited from the specified checkpoint. It is usually used for resuming a training process that was interrupted accidentally.

`load-from` only loads the model weights, and the training epoch starts from 0. It is usually used for finetuning.
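For illustration (paths are hypothetical): resume an interrupted run from the command line, or set `load_from` in the config when finetuning from pretrained weights.

```shell
# resume training: restores weights, optimizer state and epoch counter
./tools/dist_train.sh ${CONFIG_FILE} 8 --resume-from work_dirs/my_experiment/latest.pth
```

```python
# in the config file: finetune from pretrained weights, training restarts from epoch 0
load_from = 'checkpoints/pretrained_model.pth'  # hypothetical checkpoint path
```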
If you run MMTracking on a cluster managed with Slurm, you can use the script `slurm_train.sh`. (This script also supports single machine training.)
```shell
[GPUS=${GPUS}] ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```
Here is an example of using 16 GPUs to train DFF on the dev partition.
```shell
GPUS=16 ./tools/slurm_train.sh dev dff_r101_1x configs/dff_faster_rcnn_r101_dc5_1x_imagenetvid.py /nfs/xxxx/dff_faster_rcnn_r101_dc5_1x_imagenetvid
```
You can check `slurm_train.sh` for full arguments and environment variables.
If you have multiple machines connected only with Ethernet, you can refer to the PyTorch launch utility. It is usually slow if you do not have high-speed networking like InfiniBand.
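As a rough sketch, assuming two 8-GPU machines and the `--launcher pytorch` option of `tools/train.py` (the same launcher `dist_train.sh` uses), a two-node launch could look like this; `${MASTER_ADDR}` stands for the IP of the first node.

```shell
# on the first machine (node rank 0)
python -m torch.distributed.launch --nnodes=2 --node_rank=0 \
    --master_addr=${MASTER_ADDR} --master_port=29500 --nproc_per_node=8 \
    tools/train.py ${CONFIG_FILE} --launcher pytorch

# on the second machine (node rank 1)
python -m torch.distributed.launch --nnodes=2 --node_rank=1 \
    --master_addr=${MASTER_ADDR} --master_port=29500 --nproc_per_node=8 \
    tools/train.py ${CONFIG_FILE} --launcher pytorch
```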
If you launch multiple jobs on a single machine, e.g., 2 jobs of 4-GPU training on a machine with 8 GPUs, you need to specify different ports (29500 by default) for each job to avoid communication conflict.
If you use `dist_train.sh` to launch training jobs, you can set the port in the commands.

```shell
CUDA_VISIBLE_DEVICES=0,1,2,3 PORT=29500 ./tools/dist_train.sh ${CONFIG_FILE} 4
CUDA_VISIBLE_DEVICES=4,5,6,7 PORT=29501 ./tools/dist_train.sh ${CONFIG_FILE} 4
```
If you launch training jobs with Slurm, there are two ways to specify the ports.
- Set the port through `--options`. This is recommended since it does not change the original configs.

  ```shell
  CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR} --options 'dist_params.port=29500'
  CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR} --options 'dist_params.port=29501'
  ```
- Modify the config files (usually the 6th line from the bottom in config files) to set different communication ports.

  In `config1.py`,

  ```python
  dist_params = dict(backend='nccl', port=29500)
  ```

  In `config2.py`,

  ```python
  dist_params = dict(backend='nccl', port=29501)
  ```

  Then you can launch two jobs with `config1.py` and `config2.py`.

  ```shell
  CUDA_VISIBLE_DEVICES=0,1,2,3 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config1.py ${WORK_DIR}
  CUDA_VISIBLE_DEVICES=4,5,6,7 GPUS=4 ./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} config2.py ${WORK_DIR}
  ```
We provide lots of useful tools under the `tools/` directory.
Before you upload a model to AWS, you may want to (1) convert model weights to CPU tensors, (2) delete the optimizer states and (3) compute the hash of the checkpoint file and append the hash id to the filename.
```shell
python tools/publish_model.py ${INPUT_FILENAME} ${OUTPUT_FILENAME}
```
E.g.,

```shell
python tools/publish_model.py work_dirs/dff_faster_rcnn_r101_dc5_1x_imagenetvid/latest.pth dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201230.pth
```
The final output filename will be `dff_faster_rcnn_r101_dc5_1x_imagenetvid_20201230-{hash id}.pth`.
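If you want to see what those three steps boil down to, here is a rough, standalone sketch in plain PyTorch. It is not the actual `tools/publish_model.py` implementation, just an illustration of the idea.

```python
import hashlib
import os

import torch


def publish(in_file, out_file):
    # (1) load the checkpoint and map every tensor to CPU
    checkpoint = torch.load(in_file, map_location='cpu')
    # (2) drop the optimizer states to shrink the file
    checkpoint.pop('optimizer', None)
    torch.save(checkpoint, out_file)
    # (3) hash the saved file and append the first 8 hex digits to the filename
    with open(out_file, 'rb') as f:
        sha = hashlib.sha256(f.read()).hexdigest()
    final_file = out_file.replace('.pth', f'-{sha[:8]}.pth')
    os.rename(out_file, final_file)
    return final_file


# e.g. publish('work_dirs/my_experiment/latest.pth', 'my_model_20201230.pth')
```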