Authors: Jingwei Guo, Yitai Cheng, Meihui Wang, Ilya Ilyankou, Natchapon Jongwiriyanurak, Xiaowei Gao, Nicola Christie, James Haworth
This package is used for object detection, object tracking, and overtaking behaviour detection on panoramic (360°) equirectangular videos, initially developed as part of Jingwei Guo's MSc thesis.
The approach improves detection by projecting each equirectangular frame into four overlapping perspective sub-images, applying a detector to each, and then reprojecting and merging the bounding boxes, which handles projection distortions and long objects that span sub-image borders. YOLOv5 models pre-trained on the COCO dataset are used as the default detectors, with Faster R-CNN (via Detectron2) as an alternative. Tracking is based on StrongSORT (see https://github.com/yitai-cheng/StrongSORT), modified to incorporate object category information and boundary continuity, which reduces false positives and ID switches in panoramic views. The overtaking detection module builds on these tracking results, identifying and classifying overtaking manoeuvres by vehicles around cyclists.
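For intuition, here is a minimal sketch of the projection step. It is not the package's own code: `equirect_to_perspective`, its sign conventions, and the pinhole model are illustrative assumptions, but FOV, THETA (yaw), and PHI (pitch) follow the command-line arguments documented below.

```python
import cv2
import numpy as np

def equirect_to_perspective(frame, fov=120, theta=0, phi=-10, size=640):
    """Extract one square perspective sub-image from an equirectangular frame."""
    h_eq, w_eq = frame.shape[:2]
    f = 0.5 * size / np.tan(np.radians(fov) / 2)  # pinhole focal length (px)

    # A viewing ray through every pixel; optical axis along +z.
    xs = np.arange(size) - (size - 1) / 2
    xx, yy = np.meshgrid(xs, xs)
    rays = np.stack([xx, -yy, np.full_like(xx, f)], axis=-1)
    rays /= np.linalg.norm(rays, axis=-1, keepdims=True)

    # Tilt by phi (pitch, about x), then pan by theta (yaw, about y).
    r_pitch, _ = cv2.Rodrigues(np.radians(float(phi)) * np.array([1.0, 0.0, 0.0]))
    r_yaw, _ = cv2.Rodrigues(np.radians(float(theta)) * np.array([0.0, 1.0, 0.0]))
    rays = rays @ r_pitch.T @ r_yaw.T

    # Ray direction -> longitude/latitude -> equirectangular pixel coords.
    lon = np.arctan2(rays[..., 0], rays[..., 2])        # [-pi, pi]
    lat = np.arcsin(np.clip(rays[..., 1], -1.0, 1.0))   # [-pi/2, pi/2]
    map_x = ((lon / (2 * np.pi) + 0.5) * (w_eq - 1)).astype(np.float32)
    map_y = ((0.5 - lat / np.pi) * (h_eq - 1)).astype(np.float32)
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_WRAP)
```

Calling this four times with theta in {0, 90, 180, 270} and a 120° FOV yields four sub-images with 30° of overlap between neighbours, which is what allows boxes to be merged after reprojection.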
The package requires Python 3.8+ with the dependencies listed in requirements.txt, plus Detectron2 (only a version released before Aug 5, 2022; see the pinned install below).
- First, clone the repository:

```
git clone https://github.com/SpaceTimeLab/360_object_tracking
```
- To install all the dependencies (except Detectron2), create and activate a new conda environment (called, for example, `360`) and run:

```
conda create --name 360 -c conda-forge python=3.8
conda activate 360
conda install pip
pip install -r requirements.txt
```
- Newer versions of Detectron2 (released after Aug 5, 2022) changed some of the APIs this package relies on, so install this older pinned commit:

```
pip install -e git+https://github.com/facebookresearch/detectron2.git@5aeb252b194b93dc2879b4ac34bc51a31b5aee13#egg=detectron2
pip install pillow==9.5.0  # see https://github.com/facebookresearch/detectron2/issues/5010#issuecomment-1752284625
```
- Download the pre-trained ReID network used in DeepSORT:

```
pip install gdown
cd deep_sort/deep/checkpoint
gdown 'https://drive.google.com/uc?export=download&id=1_qwTWdzT9dWNudpusgKavj_4elGgbkUN'
cd ../../../
```
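A quick optional sanity check that the pinned Detectron2 build and the ReID weights are in place (a sketch; the checkpoint filename `ckpt.t7` is assumed from the original DeepSORT release):

```python
# Hypothetical post-install check; adjust the checkpoint name if it differs.
import os

import detectron2

print("Detectron2 version:", detectron2.__version__)
print("ReID checkpoint found:",
      os.path.exists("deep_sort/deep/checkpoint/ckpt.t7"))
```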
The implementation of each functionality (object detection, object tracking, and overtaking detection) is explained in detail in `Code Explanation.ipynb`.
To run object detection on panoramic videos in equirectangular projection, execute Object_Detection.py from the terminal:

```
python Object_Detection.py [--input_video_path INPUT_VIDEO_PATH] [--output_video_path OUTPUT_VIDEO_PATH] [--classes_to_detect CLASSES_TO_DETECT] [--FOV FOV] [--THETAs THETAS] [--PHIs PHIS] [--sub_image_width SUB_IMAGE_WIDTH] [--short_edge_size SHORT_EDGE_SIZE] [--model_type MODEL_TYPE] [--score_threshold SCORE_THRESHOLD] [--nms_threshold NMS_THRESHOLD] [--use_mymodel USE_MYMODEL]
```
The following arguments are provided:
| Argument | Description | Required? | Defaults |
|---|---|---|---|
| INPUT_VIDEO_PATH | Path of the input video | ✔️ | |
| OUTPUT_VIDEO_PATH | Path of the output video | ✔️ | |
| CLASSES_TO_DETECT | Index numbers of the COCO categories to detect | | [0, 1, 2, 3, 5, 7, 9] |
| FOV | Field of view of the sub-images, in degrees | | 120 |
| THETAS | List of theta (yaw) angles, one per sub-image | | [0, 90, 180, 270] |
| PHIS | List of phi (pitch) angles, one per sub-image | | [-10, -10, -10, -10] |
| SUB_IMAGE_WIDTH | Width (and height) of the square sub-images, in pixels | | 640 |
| MODEL_TYPE | Which detector to use: "YOLO" or "Faster RCNN" | | "YOLO" |
| SCORE_THRESHOLD | Confidence score threshold | | 0.4 |
| NMS_THRESHOLD | Non-Maximum Suppression (NMS) threshold | | 0.45 |
| USE_MYMODEL | Whether to use the improved detection pipeline; if False, the frame is detected as a whole instead of being split into 4 sub-images | | True |
| SHORT_EDGE_SIZE | Length of the short edge of the detector input | | 0 |
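Note that the list-valued arguments (classes_to_detect, THETAs, PHIs) are passed as space-separated values, as in the examples at the end of this README. For instance (the paths here are placeholders):

```
python Object_Detection.py --input_video_path test.mp4 --output_video_path out.mp4 --classes_to_detect 0 1 2 --THETAs 0 90 180 270 --PHIs 0 0 0 0
```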
To run object tracking on panoramic videos in equirectangular projection, execute Object_Tracking.py from the terminal:

```
python Object_Tracking.py [--input_video_path INPUT_VIDEO_PATH] [--output_video_path OUTPUT_VIDEO_PATH] [--MOT_text_path MOT_TEXT_PATH] [--prevent_different_classes_match PREVENT_DIFFERENT_CLASSES_MATCH] [--match_across_boundary MATCH_ACROSS_BOUNDARY] [--classes_to_detect CLASSES_TO_DETECT] [--FOV FOV] [--THETAs THETAS] [--PHIs PHIS] [--short_edge_size SHORT_EDGE_SIZE] [--model_type MODEL_TYPE] [--score_threshold SCORE_THRESHOLD] [--nms_threshold NMS_THRESHOLD] [--use_mymodel USE_MYMODEL]
```
The following arguments are provided:
| Argument | Description | Required? | Defaults |
|---|---|---|---|
| INPUT_VIDEO_PATH | Path of the input video | ✔️ | |
| OUTPUT_VIDEO_PATH | Path of the output video | ✔️ | |
| MOT_TEXT_PATH | Path of the output tracking results in MOT text format | | |
| PREVENT_DIFFERENT_CLASSES_MATCH | Whether to use the multi-category support in DeepSORT (prevents matches between different classes) | | True |
| MATCH_ACROSS_BOUNDARY | Whether to use the boundary-continuity support in DeepSORT (allows matches across the left/right frame boundary) | | True |
| CLASSES_TO_DETECT | Index numbers of the COCO categories to detect | | [0, 1, 2, 3, 5, 7, 9] |
| FOV | Field of view of the sub-images, in degrees | | 120 |
| THETAS | List of theta (yaw) angles, one per sub-image | | [0, 90, 180, 270] |
| PHIS | List of phi (pitch) angles, one per sub-image | | [-10, -10, -10, -10] |
| SHORT_EDGE_SIZE | Width (and height) of the square sub-images, in pixels | | 1280 |
| MODEL_TYPE | Which detector to use: "YOLO" or "Faster RCNN" | | "YOLO" |
| SCORE_THRESHOLD | Confidence score threshold | | 0.4 |
| NMS_THRESHOLD | Non-Maximum Suppression (NMS) threshold | | 0.45 |
| USE_MYMODEL | Whether to use the improved detection pipeline; if False, the frame is detected as a whole instead of being split into 4 sub-images | | True |
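To illustrate the boundary-continuity option: in an equirectangular frame the left and right edges meet, so horizontal distances between track centres should wrap around the seam. A minimal sketch of the assumed logic (not the package's own code):

```python
def wrapped_dx(cx1: float, cx2: float, frame_width: float) -> float:
    """Horizontal distance between two box centres on a 360-degree frame."""
    dx = abs(cx1 - cx2)
    return min(dx, frame_width - dx)  # take the shorter way around the seam

# Boxes at x=10 and x=1910 in a 1920-px-wide frame are only 20 px apart
# once the wrap-around is taken into account.
print(wrapped_dx(10, 1910, 1920))  # -> 20
```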
To run overtaking behaviour detection on panoramic videos in equirectangular projection, execute Overtaking_Detection.py from the terminal:

```
python Overtaking_Detection.py [--input_video_path INPUT_VIDEO_PATH] [--output_video_path OUTPUT_VIDEO_PATH] [--mode MODE] [--prevent_different_classes_match PREVENT_DIFFERENT_CLASSES_MATCH] [--match_across_boundary MATCH_ACROSS_BOUNDARY] [--classes_to_detect CLASSES_TO_DETECT] [--classes_to_detect_movement CLASSES_TO_DETECT_MOVEMENT] [--size_thresholds SIZE_THRESHOLDS] [--FOV FOV] [--THETAs THETAS] [--PHIs PHIS] [--sub_image_width SUB_IMAGE_WIDTH] [--model_type MODEL_TYPE] [--score_threshold SCORE_THRESHOLD] [--nms_threshold NMS_THRESHOLD] [--use_mymodel USE_MYMODEL]
```
The following arguments are provided:
| Argument | Description | Required? | Defaults |
|---|---|---|---|
| INPUT_VIDEO_PATH | Path of the input video | ✔️ | |
| OUTPUT_VIDEO_PATH | Path of the output video | ✔️ | |
| MODE | Which kind of overtaking behaviour to detect: "Confirmed" or "Unconfirmed" | | "Confirmed" |
| PREVENT_DIFFERENT_CLASSES_MATCH | Whether to use the multi-category support in DeepSORT (prevents matches between different classes) | | True |
| MATCH_ACROSS_BOUNDARY | Whether to use the boundary-continuity support in DeepSORT (allows matches across the left/right frame boundary) | | True |
| CLASSES_TO_DETECT | Index numbers of the COCO categories to detect | | [0, 1, 2, 3, 5, 7, 9] |
| CLASSES_TO_DETECT_MOVEMENT | Index numbers of the COCO categories for movement detection; must be a subset of CLASSES_TO_DETECT | | [2, 5, 7] |
| SIZE_THRESHOLDS | List of bounding-box area thresholds (in pixels), one per entry in CLASSES_TO_DETECT_MOVEMENT; a track whose box area exceeds the corresponding threshold is considered close to the user | | [500 * 500, 900 * 900, 600 * 600] |
| FOV | Field of view of the sub-images, in degrees | | 120 |
| THETAS | List of theta (yaw) angles, one per sub-image | | [0, 90, 180, 270] |
| PHIS | List of phi (pitch) angles, one per sub-image | | [-10, -10, -10, -10] |
| SUB_IMAGE_WIDTH | Width (and height) of the square sub-images, in pixels | | 1280 |
| MODEL_TYPE | Which detector to use: "YOLO" or "Faster RCNN" | | "YOLO" |
| SCORE_THRESHOLD | Confidence score threshold | | 0.4 |
| NMS_THRESHOLD | Non-Maximum Suppression (NMS) threshold | | 0.45 |
| USE_MYMODEL | Whether to use the improved detection pipeline; if False, the frame is detected as a whole instead of being split into 4 sub-images | | True |
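As a concrete reading of the SIZE_THRESHOLDS rule above, here is a hypothetical helper (names are illustrative, not the package's API) showing how a per-class area threshold marks a tracked vehicle as close:

```python
# Defaults from the table above: car, bus, truck (COCO ids 2, 5, 7).
CLASSES_TO_DETECT_MOVEMENT = [2, 5, 7]
SIZE_THRESHOLDS = [500 * 500, 900 * 900, 600 * 600]

def is_close(class_id: int, box_w: float, box_h: float) -> bool:
    """A track counts as close once its box area exceeds its class threshold."""
    i = CLASSES_TO_DETECT_MOVEMENT.index(class_id)
    return box_w * box_h > SIZE_THRESHOLDS[i]

print(is_close(2, 600, 500))  # car with a 600x500 box -> True (> 250000)
```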
For better understanding, several examples of using this package are listed below:
- To detect bicycles, cars, and motorbikes ([1, 2, 3] in COCO) in a video called test.mp4 with the original (whole-frame) Faster R-CNN and output the result video as test_object_detection.mp4, run the following command:

```
python Object_Detection.py --input_video_path test.mp4 --output_video_path test_object_detection.mp4 --classes_to_detect 1 2 3 --use_mymodel False --model_type "Faster RCNN"
```
- To track people and cars ([0, 2] in COCO) in a video called test.mp4 with the improved YOLOv5 at an input resolution of 1280, and output the result video and MOT text as test_object_tracking.mp4 and test_object_tracking.txt, run the following command:

```
python Object_Tracking.py --input_video_path test.mp4 --output_video_path test_object_tracking.mp4 --MOT_text_path test_object_tracking.txt --classes_to_detect 0 2 --short_edge_size 1280
```
- To track people, bicycles, cars, motorbikes, buses, trucks, and traffic lights ([0, 1, 2, 3, 5, 7, 9] in COCO) in a video called test.mp4, detect close unconfirmed overtakes by cars only (area threshold 160000, i.e. 400 × 400) with the improved YOLOv5 at an input resolution of 640, and output the result video as test_overtaking_detection.mp4, run the following command:

```
python Overtaking_Detection.py --input_video_path test.mp4 --output_video_path test_overtaking_detection.mp4 --mode 'Unconfirmed' --classes_to_detect_movement 2 --size_thresholds 160000 --sub_image_width 640
```
See evaluation_code/ for evaluation scripts. They:
- Load ground truth annotations and compare against predictions
- Compute standard COCO metrics (AP, AR) using cocoeval
- Evaluate tracking performance using py-motmetrics
- Run overtaking detection on the video dataset and compare the result with ground truth
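For reference, a minimal py-motmetrics sketch (toy inputs, not the repository's evaluation code) of how tracking metrics such as MOTA and IDF1 are computed:

```python
import motmetrics as mm

acc = mm.MOTAccumulator(auto_id=True)
# One frame: ground-truth ids [1, 2], hypothesis ids [1], and the
# ground-truth-to-hypothesis distance matrix (e.g. 1 - IoU).
acc.update([1, 2], [1], [[0.2], [0.9]])

mh = mm.metrics.create()
summary = mh.compute(acc, metrics=["mota", "idf1", "num_switches"], name="seq")
print(mm.io.render_summary(summary))
```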
If you find the project useful in your research, please consider citing:
```
@misc{guo2024multipleobjectdetectiontracking,
      title={Multiple Object Detection and Tracking in Panoramic Videos for Cycling Safety Analysis},
      author={Jingwei Guo and Yitai Cheng and Meihui Wang and Ilya Ilyankou and Natchapon Jongwiriyanurak and Xiaowei Gao and Nicola Christie and James Haworth},
      year={2024},
      eprint={2407.15199},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2407.15199},
}
```