This project is modified from the PyTorch implementation of YOLOv2 by Long Chen (https://github.com/longcw/yolo2-pytorch).
For video processing, OpenCV and FFmpeg are used for real-time and frame-based transformation.
Real-time object detection works on 720p videos of various formats.
The model is mainly based on darkflow and darknet.
A Cython extension is used for postprocessing and `multiprocessing.Pool` for image preprocessing (see the sketch below).
Testing an image from VOC2007 takes about 13~20 ms.
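As a rough illustration of the preprocessing side, the sketch below parallelizes frame preparation with `multiprocessing.Pool`; the 416x416 input size, the `preprocess` helper, and the dummy frames are illustrative assumptions, not the exact code in this repository.

```python
import cv2
import numpy as np
from multiprocessing import Pool

def preprocess(frame):
    """Resize to an assumed YOLOv2 input size and scale pixels to [0, 1]."""
    image = cv2.resize(frame, (416, 416))
    return image.astype(np.float32) / 255.0

if __name__ == '__main__':
    # Dummy 720p frames stand in for frames read from the video stream.
    frames = [np.zeros((720, 1280, 3), dtype=np.uint8) for _ in range(8)]
    with Pool(processes=4) as pool:
        batch = pool.map(preprocess, frames)
    print(len(batch), batch[0].shape)  # 8 (416, 416, 3)
```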
For details about YOLO and YOLOv2, please refer to their project page and the paper *YOLO9000: Better, Faster, Stronger* by Joseph Redmon and Ali Farhadi.
- Python 3.6
- Anaconda3
- PyTorch 0.3.0+
- gcc
- CUDA 8.0+
- Clone this repository: `git clone git@github.com:judichunt/yolo2-pytorch-realtime-video`
- Build the reorg layer (`tf.extract_image_patches`): `cd yolo2-pytorch` then `./make.sh`
- Install OpenCV: `conda install -c conda-forge opencv`
- Download the trained model `yolo-voc.weights.h5`. Set the model path `h5_fname`, and the `input_dir` and `filename` of the video file (`.avi`, `.mpeg`, ...) in `realtime_OD.py`.
- Run `python realtime_OD.py`. The real-time object detection video is played while the model runs, and the output video is saved in the same format as the input video. A minimal sketch of this loop is shown after these steps.
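The following is a minimal sketch of the kind of read-detect-write loop that `realtime_OD.py` performs with OpenCV. The `h5_fname` and `filename` settings follow the names mentioned in the steps above, but `load_model` and `detect` are placeholder stubs standing in for this project's network and Cython postprocessing, not its actual API.

```python
import cv2

# Placeholder settings following the names mentioned above.
h5_fname = 'models/yolo-voc.weights.h5'
filename = 'input_video.avi'

def load_model(path):
    # Stub: the real project builds the Darknet network and loads the
    # trained weights from the .h5 file here.
    return None

def detect(net, frame):
    # Stub: the real project runs a forward pass plus Cython
    # postprocessing and returns bounding boxes, scores, and classes.
    return [], [], []

net = load_model(h5_fname)

cap = cv2.VideoCapture(filename)
fps = cap.get(cv2.CAP_PROP_FPS) or 25.0
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
# Write the annotated output with the same frame size as the input.
fourcc = cv2.VideoWriter_fourcc(*'XVID')
out = cv2.VideoWriter('output.avi', fourcc, fps, (width, height))

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break
    boxes, scores, classes = detect(net, frame)
    for (x1, y1, x2, y2) in boxes:
        cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)),
                      (0, 255, 0), 2)
    cv2.imshow('Real-time Object Detection', frame)  # play while running
    out.write(frame)                                 # save annotated frame
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
out.release()
cv2.destroyAllWindows()
```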
Follow the instructions in [YOLOv2 in PyTorch](https://github.com/longcw/yolo2-pytorch).