Skip to content

Latest commit

 

History

History

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 

README.md

Deploy

In this section, we are trying to deploy light-weight, pruned or quantized YOLOv5 via different inference framework on different devices.

TensorRT

1. Install

pip install -U nvidia-tensorrt --index-url https://pypi.ngc.nvidia.com

2. Export

python export_onnx_trt.py --weights yolov5s.pt --device 0

Now get yolov5s.engine.

3. Detect

python export_onnx_trt.py --weights yolov5s.engine --device 0

ncnn

1. Install

Check Tencent/ncnn for help.

2. Export

python export_onnx_trt.py --weights yolov5s.pt --device 0 --train --simplify

Now get yolov5s.onnx. Then use NCNN's onnx2ncnn tool to convert *.onnx to *.param and *.bin.
Navigate to ncnn/build/tools/onnx, run

./onnx2ncnn yolov5s.onnx yolov5s.param yolov5s.bin

You can also use NCNN's ncnnoptimize tool to reduce model size.

3. Detect

We provide yolov5_ncnn.cpp for detection and timekeeping. Build it with NCNN, and run

./yolov5 test.jpg yolov5s

Experiment

On RTX3090

Backend Model File Size latency(ms per img)
TensorRT YOLOv5s 17M 2.3
TensorRT YOLOv5s-EagleEye@0.6 11M 2.0
TensorRT YOLOv5l-MobileNetv3Small 44M 2.9
TensorRT YOLOv5l-EfficientNetLite0 47M 3.0
ncnn(Vulkan) YOLOv5s 14M 235
ncnn(Vulkan) YOLOv5s-EagleEye@0.6 7.5M 215

Input size is 640x640.

On Jetson Xavier NX

Backend Model File Size latency(ms per img)
ncnn(Vulkan) YOLOv5s 14M 520
ncnn(Vulkan) YOLOv5s-EagleEye@0.6 7.5M 610

Input size is 640x640.

More statistics is coming soon...