This thesis project was carried out at Ho Chi Minh City University of Technology, Vietnam, while we were students. We apply deep learning, using the PyTorch framework and MMDetection, to vehicle detection, tracking, and speed estimation. The dataset was collected at an overpass in Ho Chi Minh City, Vietnam, and labeled by our team. You can find more information about our work in the Project summary.
Our main work is summarized as follows:
- We divided the work into four parts for development: detection, tracking, speed estimation, and the dataset. In each part we focus only on reading papers, understanding their ideas, and applying them to improve the results.
- For object detection, we only study and apply existing network architectures such as RetinaNet and Faster R-CNN, together with recent object detection techniques including ATSS, data augmentation, and Focal KL Loss, to improve accuracy.
- For tracking and speed estimation, we apply the IOU tracker and modify it for more stable tracking results, and we apply the formula V = S/t (distance traveled divided by elapsed time) for speed estimation. We mainly evaluate the tracking results by visual inspection because labels for those parts are limited.
- Make a new dataset: The main problem we encountered is limited GPU resources for training deep learning networks. The existing datasets are extremely large and heavy, so we could not train on them. Hence, we built a new, lighter dataset and applied transfer learning to reach our target. The details of our dataset are given in a later section.
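To make the tracking and speed estimation ideas above concrete, here is a minimal Python sketch. It is not the code used in this project: the function names, the greedy matching strategy, the 0.5 IoU threshold, and the assumption that the real-world distance in meters is already known are all illustrative choices.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def match_tracks(tracks: Dict[int, Box], detections: List[Box],
                 iou_thr: float = 0.5) -> Dict[int, int]:
    """Greedily assign each track to its best-overlapping unused detection.

    Returns a mapping from track id to detection index. Tracks with no
    detection at or above `iou_thr` are left unmatched.
    """
    matches: Dict[int, int] = {}
    used = set()
    for track_id, box in tracks.items():
        best_idx, best_iou = -1, iou_thr
        for idx, det in enumerate(detections):
            if idx in used:
                continue
            overlap = iou(box, det)
            if overlap >= best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx >= 0:
            matches[track_id] = best_idx
            used.add(best_idx)
    return matches


def speed_kmh(distance_m: float, frames: int, fps: float) -> float:
    """V = S / t with t = frames / fps, converted from m/s to km/h."""
    seconds = frames / fps
    return distance_m / seconds * 3.6
```

For example, a vehicle that covers 20 m in 30 frames of a 30 FPS video travels for one second, so `speed_kmh(20, 30, 30)` is about 72 km/h.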
Structure of this README
- Overall architecture
- Installation
- Dataset preparation
- Train
- Test
- Object detection results summary
- Object detection visualization
- Team's information
- Citation
- OS: Ubuntu 18.04
- Python: 3.7.9
conda create -n vdts python=3.7 -y
conda activate vdts
conda install pytorch=1.5 torchvision -c pytorch
pip install "git+https://github.com/open-mmlab/cocoapi.git#subdirectory=pycocotools"
pip install git+https://github.com/open-mmlab/mmdetection.git@v2.2.0
pip install mmcv==0.6.2
Note: Make sure that your compilation CUDA version and runtime CUDA version match. You can check the supported CUDA versions for precompiled packages on the PyTorch website.
pip install git+https://github.com/thuyngch/cvut
pip install future tensorboard
pip install ipdb
- We collected a dataset of vehicles at an overpass in Ho Chi Minh City and labeled it by hand. It contains 1029 images with 5 classes. The annotation files are in COCO Object Detection format. The details of our dataset are shown in the table below:
- Download the dataset from the Google Drive link and unzip it.
- After extraction, the data should have the following structure:
- Make a symbolic link to the dataset you just downloaded from the project directory:
ln -s <PATH TO DATASET> data
For example, my dataset, named `data`, is located at `/home/tuan/Desktop`, so I run the following command:
The result in the image above is a symbolic link named `data` that points to the folder containing the dataset.
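Because the annotations follow the COCO Object Detection format, a quick sanity check after linking the dataset is to count the labeled boxes per class. The sketch below is only an illustration: the function name and the category names in the in-memory example are placeholders, not the actual classes of this dataset.

```python
from collections import Counter
from typing import Dict


def count_boxes_per_class(coco: dict) -> Dict[str, int]:
    """Count annotated boxes per category name in a COCO-style dict."""
    id_to_name = {cat["id"]: cat["name"] for cat in coco["categories"]}
    counts = Counter(id_to_name[ann["category_id"]] for ann in coco["annotations"])
    # Keep categories with zero boxes so that missing classes are visible.
    return {name: counts.get(name, 0) for name in id_to_name.values()}


# Tiny in-memory example; the category names are placeholders.
example = {
    "images": [{"id": 1, "file_name": "0001.jpg", "width": 1920, "height": 1080}],
    "categories": [{"id": 1, "name": "class_a"}, {"id": 2, "name": "class_b"}],
    "annotations": [
        {"id": 1, "image_id": 1, "category_id": 1, "bbox": [10, 20, 50, 40]},
        {"id": 2, "image_id": 1, "category_id": 1, "bbox": [80, 30, 60, 45]},
    ],
}
print(count_boxes_per_class(example))
```

To run it on a real annotation file, load the file with `json.load` and pass the resulting dictionary to the function.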
- Run the following command in bash shell:
#!/usr/bin/env bash
set -e
WORKDIR="../trained_weight/atss_r50_fpn_1x_street" # directory for saving checkpoints during training
CONFIG="configs/street/atss_r50_fpn_1x_street.py" # path to your config file
GPUS=2 # number of GPUs used for training
LOAD_FROM="../pretrained/atss_r50_fpn_1x_coco.pth" # pretrained weights from the COCO dataset
export CUDA_VISIBLE_DEVICES=0,1
bash tools/dist_train.sh $CONFIG $GPUS --work-dir $WORKDIR --options DATA_ROOT=$DATA_ROOT --load_from $LOAD_FROM
- In the above example, the config file is `configs/street/atss_r50_fpn_1x_street.py` and the pretrained weight is `atss_r50_fpn_1x_coco.pth`, saved at `../pretrained`. Checkpoints will be saved under `../trained_weight/atss_r50_fpn_1x_street`.
- NOTE: The pretrained COCO weights are downloaded from the MMDetection repo; a following section gives the specific links.
- NOTE: The trained weights can be downloaded in section 6. Result
- Run the following command in bash shell:
#!/usr/bin/env bash
set -e
export PYTHONPATH="$(dirname $0)/..":$PYTHONPATH
CONFIG="configs/street/atss_r50_fpn_1x_street.py" # path to your config file
CHECKPOINT="../trained_weight/atss_r50_fpn_1x_street_epoch_12.pth" # path to your checkpoint file, in this case, checkpoint file is `atss_r50_fpn_1x_street_epoch_12.pth`
GPUS=2
export CUDA_VISIBLE_DEVICES=0,1
python -m torch.distributed.launch --nproc_per_node=$GPUS --master_port=$((RANDOM + 10000)) \
tools/test.py $CONFIG $CHECKPOINT --launcher pytorch --eval bbox
- Run the following command to test the speed of network:
python tools/benchmark.py <YOUR_CONFIG_FILE.py>
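As a rough illustration of what an FPS figure means, the sketch below times repeated calls of an arbitrary function after a few warm-up iterations. It is a generic, assumed helper (the name `measure_fps` and its defaults are made up here) and does not reproduce the logic of `tools/benchmark.py`.

```python
import time
from typing import Callable


def measure_fps(step: Callable[[], None], num_iters: int = 50,
                num_warmup: int = 5) -> float:
    """Return how many times per second `step` runs, ignoring warm-up calls."""
    # Warm-up calls are excluded because the first iterations are often slower.
    for _ in range(num_warmup):
        step()
    start = time.perf_counter()
    for _ in range(num_iters):
        step()
    elapsed = time.perf_counter() - start
    return num_iters / elapsed


if __name__ == "__main__":
    # Stand-in workload; replace it with one forward pass of the detector.
    print(f"{measure_fps(lambda: sum(range(100000))):.1f} iterations/s")
```

When timing a model on a GPU, call `torch.cuda.synchronize()` inside `step` after the forward pass; otherwise the timer may stop before the asynchronous GPU work has finished.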
Our object detection results are summarized in the following table:
No. | Method | Albu | Multiscale training | Scheduler | mAP | FPS | Image size | Config | COCO Pretrained | Our weight | Note |
---|---|---|---|---|---|---|---|---|---|---|---|
1 | YOLOv3 (3fpn levels) | +Cutmix | yes | ---- | 0.63 | --- | x | | | | Original |
2 | Vanilla RetinaNet - R50 | no | no | 1x | 0.632 | 18.2 | 1518x800 | retinanet_r50_fpn_1x_street.py | retinanet_r50_fpn_2x_coco.pth | retinanet_r50_fpn_1x_street | |
3 | Faster R-CNN | no | no | 1x | 0.481 | x | x | faster_rcnn_r50_fpn_1x_street.py | faster_rcnn_r50_fpn_2x_coco.pth | faster_rcnn_r50_fpn_1x_street | |
4 | ATSS-R50 | no | no | 1x | 0.747 | 17.9 | --- | atss_r50_fpn_1x_street.py | atss_r50_fpn_1x_coco.pth | atss_r50_fpn_1x_street | Baseline |
5 | ATSS+Net-R18 | no | no | 1x | 0.522 | 30 | --- | | | | Different Backbones |
6 | ATSS+MobileNetV3 | no | no | 1x | 0.646 | 32.5 | --- | | | | |
7 | Vanilla RetinaNet with MobileNetV3 | no | no | 1x | 0.38 | x | --- | | | | |
8 | ATSS-R50 | yes | no | 1x | 0.686 | --- | --- | atss_r50_fpn_albu_1x_street.py | | | Augment |
9 | ATSS-R50 | no | no | 1x | 0.759 | --- | 4096x3072 | | | | Big size |
10 | ATSS-R50 | no | no | 1x | 0.679 | --- | --- | | | | lr=1e-3 |
11 | ATSS-R50 (3fpn levels) | no | no | 1x | 0.656 | --- | --- | | | | |
12 | ATSS-R50 | yes | no | 1x | 0.667 | --- | --- | | | | |
13 | ATSS-R50 | no | yes | 2x | 0.728 | --- | --- | atss_r50_fpn_2x_street.py | | | 2x |
14 | ATSS-R50 | yes | yes | 2x | 0.75 | --- | --- | | | | |
15 | ATSS-R50 | no | no | 2x | 0.747 | --- | --- | | | | |
16 | ATSS-R50 - Focal KL loss | no | no | 1x | 0.607 | --- | --- | | | | Focal KL Loss |
NOTE: The COCO-pretrained weights are taken from the MMDetection repo.
NOTE: The trained weights can be downloaded in section 6. Result
Run the following bash script to visualize the detections:
#!/usr/bin/env bash
set -e
CONFIG="configs/street/atss_r50_fpn_1x_street.py" # Path to your config file
CHECKPOINT="../trained_weight/atss_r50_fpn_1x_street_epoch_12.pth" # Path to checkpoint file. In this case, the checkpoint file is `atss_r50_fpn_1x_street_epoch_12.pth`
DATADIR="data/" # Path to your data directory
THR=0.5 # Detection threshold
OUTDIR="./cache/street" # Path to save output images
python tools/visualize_testset.py $CONFIG --ckpt $CHECKPOINT --data_dir $DATADIR --det_thr $THR --out_dir $OUTDIR --num_imgs 200
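The detection threshold (`THR=0.5` above) decides which boxes are drawn. As a minimal sketch of that filtering step, assuming detections are stored per class as rows of `(x1, y1, x2, y2, score)`, which is the usual layout of MMDetection 2.x box results, one could write the following. The function name is hypothetical and this is not the code of `tools/visualize_testset.py`.

```python
from typing import List, Sequence

Detection = Sequence[float]  # (x1, y1, x2, y2, score)


def filter_by_score(per_class_dets: List[List[Detection]],
                    thr: float = 0.5) -> List[List[Detection]]:
    """Keep only detections whose confidence score is at least `thr`.

    The outer list has one entry per class, so class indices are preserved.
    """
    return [[det for det in dets if det[4] >= thr] for dets in per_class_dets]


# Two classes: the first has one confident and one weak box, the second is empty.
results = [
    [(10, 10, 50, 50, 0.92), (12, 14, 48, 52, 0.31)],
    [],
]
print(filter_by_score(results, thr=0.5))
```

A lower threshold shows more boxes, including more false positives; a higher one shows fewer but more reliable boxes.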
- Tuan Tang Ngoc - Developed the object detection part
- Nam Cao Van - Developed the object tracking and speed estimation parts
- Ph.D. Hao Nguyen Vinh - Supervisor
@article{vdtse,
title = {Vehicles Detection Tracking Speed Estimation},
author = {Tuan Tang Ngoc and Nam Cao Van and Hao Nguyen Vinh},
journal= {https://github.com/TuanTNG/Vehicles-Detection-Tracking-Speed-estimation-pytorch-mmdet},
year={2020}
}