简体中文 | English

Introduction to SOPHON-DEMO

Introduction

SOPHON-DEMO is developed based on the SOPHONSDK interface and provides a series of samples for mainstream algorithms. It covers model compilation and quantization based on TPU-NNTC and TPU-MLIR, inference engine porting based on BMRuntime, and migration of pre- and post-processing algorithms based on BMCV/OpenCV.

SOPHONSDK is SOPHGO's custom deep learning SDK, built on its self-developed AI processors. It covers model optimization, efficient runtime support, and the other capabilities required for the inference phase of neural networks, providing an easy-to-use and efficient full-stack solution for developing and deploying deep learning applications. It is currently compatible with BM1684/BM1684X/BM1688(CV186X).

Directory Structure and Description

The examples provided by SOPHON-DEMO are divided into three modules: tutorial, sample, and application. The tutorial module contains examples of basic interfaces, the sample module contains a series of examples of classic algorithms running on SOPHONSDK, and the application module contains typical applications for typical scenarios.

| tutorial | introduction |
|---|---|
| resize | Usage of the resize API: rescale image data. |
| crop | Usage of the crop API: crop the target area from the input image. |
| crop_and_resize_padding | Crop the target area from the input image, resize the crop, and place it in another image; the padding is filled with a customizable pixel value. |
| ocv_jpgbasic | Decode and encode JPGs with sophon-opencv, which is hardware accelerated. |
| ocv_vidbasic | Decode video with sophon-opencv, which is hardware accelerated, and save frames as JPGs or PNGs. |
| blend | Blend two pictures. |
| stitch | Stitch two pictures. |
| avframe_ocv | Convert an AVFrame to a cv::Mat. |
| ocv_avframe | Convert a BGR cv::Mat to a YUV420P AVFrame. |
| bm1688_2core2task_yolov5 | YOLOv5 deployment using the 2core-2task feature of BM1688. |
| mmap | Usage of the mmap API: map TPU memory to the CPU. |
| video_encode | Video encoding and stream pushing. |
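
The crop_and_resize_padding entry describes three steps: crop a region, resize it, and paste it onto a canvas filled with a chosen pixel value. The following is a minimal pure-Python sketch of that sequence for illustration only. It does not use the SOPHONSDK interfaces; the function names, the list-of-rows image representation, and the nearest-neighbour resize are all assumptions made for this example, and it assumes the resized patch fits inside the canvas.

```python
def crop(image, x, y, w, h):
    """Return the w x h region whose top-left corner is (x, y)."""
    return [row[x:x + w] for row in image[y:y + h]]


def resize_nearest(image, out_w, out_h):
    """Nearest-neighbour resize of an image stored as a list of rows."""
    in_h, in_w = len(image), len(image[0])
    return [
        [image[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def crop_resize_pad(image, box, size, canvas, offset, fill):
    """Crop `box`, resize it to `size`, and paste it at `offset` on a
    canvas of dimensions `canvas` whose remaining pixels are `fill`."""
    x, y, w, h = box
    out_w, out_h = size
    canvas_w, canvas_h = canvas
    off_x, off_y = offset
    patch = resize_nearest(crop(image, x, y, w, h), out_w, out_h)
    # Start from a canvas filled entirely with the padding value.
    result = [[fill] * canvas_w for _ in range(canvas_h)]
    # Overwrite the target area row by row with the resized patch.
    for r, row in enumerate(patch):
        result[off_y + r][off_x:off_x + out_w] = row
    return result


# Hypothetical 4x4 single-channel image with pixel value = row * 4 + column.
demo_image = [[r * 4 + c for c in range(4)] for r in range(4)]
demo_result = crop_resize_pad(
    demo_image, box=(1, 1, 2, 2), size=(4, 4), canvas=(6, 6), offset=(1, 1), fill=0
)
```

In this sketch the fill value is a single number; for a multi-channel image it could just as well be a tuple such as `(114, 114, 114)`, since the code never inspects the pixel type.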

| contents | category | code | BModel |
|---|---|---|---|
| LPRNet | License Plate Recognition | C++/Python | FP32/FP16/INT8 |
| ResNet | Image classification | C++/Python | FP32/FP16/INT8 |
| RetinaFace | Face Detection | C++/Python | FP32/FP16/INT8 |
| SCRFD | Face Detection | C++/Python | FP32/FP16/INT8 |
| segformer | Semantic Segmentation | C++/Python | FP32/FP16 |
| SAM | Semantic Segmentation | Python | FP32/FP16 |
| SAM2 | Semantic Segmentation | Python | FP32/FP16 |
| yolact | Instance Segmentation | C++/Python | FP32/FP16/INT8 |
| YOLOv8_seg | Instance Segmentation | C++/Python | FP32/FP16/INT8 |
| YOLOv9_seg | Instance Segmentation | C++/Python | FP32/FP16/INT8 |
| PP-OCR | OCR | C++/Python | FP32/FP16 |
| OpenPose | Keypoint Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv8_pose | Keypoint Detection | C++/Python | FP32/FP16/INT8 |
| HRNet_pose | Keypoint Detection | C++/Python | FP32/FP16/INT8 |
| C3D | Video Recognition | C++/Python | FP32/FP16/INT8 |
| SlowFast | Video Recognition | C++/Python | FP32/FP16/INT8 |
| DeepSORT | Object Tracking | C++/Python | FP32/FP16/INT8 |
| ByteTrack | Object Tracking | C++/Python | FP32/FP16/INT8 |
| CenterNet | Object Detection + Pose Estimation | C++/Python | FP32/FP16/INT8 |
| YOLOv5 | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv34 | Object Detection | C++/Python | FP32/INT8 |
| YOLOX | Object Detection | C++/Python | FP32/INT8 |
| SSD | Object Detection | C++/Python | FP32/INT8 |
| YOLOv7 | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv8_det | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv5_opt | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv5_fuse | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv9_det | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv10 | Object Detection | C++/Python | FP32/FP16/INT8 |
| YOLOv11_det | Object Detection | C++/Python | FP32/FP16/INT8 |
| ppYOLOv3 | Object Detection | C++/Python | FP32/FP16/INT8 |
| ppYoloe | Object Detection | C++/Python | FP32/FP16 |
| YOLOv8_obb | Oriented Object Detection | C++/Python | FP32/FP16 |
| WeNet | Speech Recognition | C++/Python | FP32/FP16 |
| Whisper | Speech Recognition | Python | FP16 |
| Seamless | Speech Recognition | Python | FP32/FP16 |
| BERT | Language | C++/Python | FP32/FP16 |
| ChatGLM2 | Large Language Model | C++/Python | FP16/INT8/INT4 |
| Llama2 | Large Language Model | C++ | FP16/INT8/INT4 |
| ChatGLM3 | Large Language Model | Python | FP16/INT8/INT4 |
| Qwen | Large Language Model | Python | FP16/INT8/INT4 |
| MiniCPM | Large Language Model | C++ | INT8/INT4 |
| Baichuan2 | Large Language Model | Python | INT8/INT4 |
| ChatGLM4 | Large Language Model | Python | FP16/INT8/INT4 |
| StableDiffusionV1.5 | Image Generation | Python | FP32/FP16 |
| StableDiffusionXL | Image Generation | Python | FP32/FP16 |
| FLUX.1 | Image Generation | Python | FP32/INT4 |
| GroundingDINO | MultiModal Object Detection | Python | FP16 |
| Qwen-VL-Chat | Large Vision Language Model | Python | FP16/INT8 |
| InternVL2 | Large Vision Language Model | Python | INT4 |
| Real-ESRGAN | Super Resolution | C++/Python | FP32/FP16/INT8 |
| P2PNet | Crowd Counting | C++/Python | FP32/FP16/INT8 |
| CLIP | Image Captioning | C++/Python | FP16 |
| BLIP | Large Image-Text Model | Python | FP32 |
| SuperGlue | Keypoint Matching | C++ | FP32/FP16 |
| VITS_CHINESE | Text To Speech | Python | FP32/FP16 |
| DirectMHP | Head pose estimation | C++/Python | FP32/FP16 |

| application | scenarios | code |
|---|---|---|
| VLPR | Multi-stream Vehicle License Plate Recognition | C++/Python |
| YOLOv5_multi | Multi-stream Object Detection | C++ |
| YOLOv5_multi_QT | Multi-stream Object Detection + QT_HDMI display | C++ |
| Grounded-sam | Automatic image detection and segmentation system | Python |
| cv-demo | Binocular Fisheye and Wide-angle Stitching | C++ |
| YOLOv5_fuse_multi_QT | Multi-stream Object Detection + QT_HDMI display | C++ |

Release Notes

| version | description |
|---|---|
| 0.2.6 | Fix documentation and other issues. Release new samples including YOLOv11_det/FLUX.1/SlowFast/YOLOv8_obb. |
| 0.2.5 | Fix documentation and other issues. Remove all samples' common dependencies. Release new samples including SAM2/HRNet_pose/InternVL2/BLIP/DirectMHP/VITS_CHINESE, and new applications cv-demo and YOLOv5_fuse_multi_QT. |
| 0.2.4 | Fix documentation and other issues. Fix host memory leak in VideoDecFFM. Release new samples including YOLOv8_pose/Qwen-VL-Chat, and the new application Grounded-sam. |
| 0.2.3 | Fix documentation and other issues. Release new samples including StableDiffusionXL/ChatGLM4/Seamless/YOLOv10, and new tutorials including mmap/video_encode. |
| 0.2.2 | Fix documentation and other issues; some examples support CV186X. Release new samples including Whisper/Real-ESRGAN/SCRFD/P2PNet/MiniCPM/CLIP/SuperGlue/YOLOv5_fuse/YOLOv8_seg/YOLOv9_seg/Baichuan2, and new tutorials including avframe_ocv/ocv_avframe/bm1688_2core2task_yolov5. |
| 0.2.1 | Fix documentation and other issues; some examples support CV186X; sample/YOLOv5 supports SG2042. Release new samples GroundingDINO and Qwen1_5. StableDiffusionV1_5 now supports models with multiple resolutions. Qwen/Llama2/ChatGLM3 add web and multi-session support. The tutorial module adds blend and stitch examples. |
| 0.2.0 | Fix documentation and other issues. Release the application and tutorial modules. Release new samples ChatGLM3 and Qwen. Add a web UI to SAM. BERT/ByteTrack/C3D support BM1688. YOLOv8 is renamed to YOLOv8_det and adds C++ post-processing acceleration. Optimize auto_test in commonly used samples. Upgrade TPU-MLIR installation to pip. |
| 0.1.10 | Fix documentation and other issues. Add ppYoloe/YOLOv8_seg/StableDiffusionV1.5/SAM. Refactor yolact. CenterNet/YOLOX/YOLOv8 support BM1688. YOLOv5/ResNet/PP-OCR/DeepSORT add BM1688 performance statistics. WeNet provides a C++ cross-compile option. |
| 0.1.9 | Fix documentation and other issues, add segformer/YOLOv7/Llama2, refactor YOLOv34/YOLOv5/ResNet/PP-OCR/DeepSORT/LPRNet/RetinaFace/YOLOv34/WeNet support BM1688, add OpenPose post-processing acceleration, ChatGLM2 supports INT8/INT4 and its README adds the compile method. |
| 0.1.8 | Fix documentation and other issues, add BERT/ppYOLOv3/ChatGLM2, refactor YOLOX, add beam search to PP-OCR, add tpu-kernel post-processing acceleration to OpenPose, and update the SFTP download method. |
| 0.1.7 | Fix documentation and other issues, some demos support BM1684 mlir, refactor PP-OCR/CenterNet, sail support YOLOv5. |
| 0.1.6 | Fix documentation and other issues, add ByteTrack/YOLOv5_opt/WeNet samples. |
| 0.1.5 | Fix documentation and other issues, add DeepSORT sample, refactor ResNet/LPRNet samples. |
| 0.1.4 | Fix documentation and other issues, add C3D and YOLOv8 samples. |
| 0.1.3 | Add OpenPose sample, refactor YOLOv5 sample (including support for Arm PCIe, support for compiling BM1684X models with TPU-MLIR, and use of the ffmpeg component instead of opencv for decoding). |
| 0.1.2 | Fix documentation and other issues, refactor SSD related samples, LPRNet/cpp/lprnet_bmcv uses the ffmpeg component instead of opencv for decoding. |
| 0.1.1 | Fix documentation and other issues, refactor LPRNet/cpp/lprnet_bmcv with BMNN related classes. |
| 0.1.0 | Provide LPRNet and other 10 samples, support BM1684X (x86 PCIe, SoC), BM1684 (x86 PCIe, SoC). |

Environment dependencies

SOPHON-DEMO mainly depends on TPU-MLIR, TPU-NNTC, LIBSOPHON, SOPHON-FFMPEG, SOPHON-OPENCV, and SOPHON-SAIL. For the BM1684/BM1684X SOPHONSDK, the version requirements are as follows:

| SOPHON-DEMO | TPU-MLIR | TPU-NNTC | LIBSOPHON | SOPHON-FFMPEG | SOPHON-OPENCV | SOPHON-SAIL | SOPHONSDK |
|---|---|---|---|---|---|---|---|
| 0.2.6 | >=1.10 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.8.0 | >=v24.04.01 |
| 0.2.5 | >=1.9 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v24.04.01 |
| 0.2.4 | >=1.9 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v24.04.01 |
| 0.2.3 | >=1.8 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v24.04.01 |
| 0.2.2 | >=1.8 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v23.10.01 |
| 0.2.1 | >=1.7 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v23.10.01 |
| 0.2.0 | >=1.6 | >=3.1.7 | >=0.5.0 | >=0.7.3 | >=0.7.3 | >=3.7.0 | >=v23.10.01 |
| 0.1.10 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=v23.07.01 |
| 0.1.9 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.7.0 | >=v23.07.01 |
| 0.1.8 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=v23.07.01 |
| 0.1.7 | >=1.2.2 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.6.0 | >=v23.07.01 |
| 0.1.6 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=v23.05.01 |
| 0.1.5 | >=0.9.9 | >=3.1.7 | >=0.4.6 | >=0.6.0 | >=0.6.0 | >=3.4.0 | >=v23.03.01 |
| 0.1.4 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | >=v22.12.01 |
| 0.1.3 | >=0.7.1 | >=3.1.5 | >=0.4.4 | >=0.5.1 | >=0.5.1 | >=3.3.0 | - |
| 0.1.2 | Not supported | >=3.1.4 | >=0.4.3 | >=0.5.0 | >=0.5.0 | >=3.2.0 | - |
| 0.1.1 | Not supported | >=3.1.3 | >=0.4.2 | >=0.4.0 | >=0.4.0 | >=3.1.0 | - |
| 0.1.0 | Not supported | >=3.1.3 | >=0.3.0 | >=0.2.4 | >=0.2.4 | >=3.1.0 | - |

For BM1688/CV186AH SOPHONSDK, version requirements are as follows:

| SOPHON-DEMO | TPU-MLIR | LIBSOPHON | SOPHON-FFMPEG | SOPHON-OPENCV | SOPHON-SAIL | SOPHONSDK |
|---|---|---|---|---|---|---|
| 0.2.6 | >=1.10 | >=0.4.9 | >=1.7.0 | >=1.7.0 | >=3.8.0 | >=v1.7.0 |
| 0.2.5 | >=1.9 | >=0.4.9 | >=1.7.0 | >=1.7.0 | >=3.8.0 | >=v1.7.0 |
| 0.2.4 | >=1.9 | >=0.4.9 | >=1.7.0 | >=1.7.0 | >=3.8.0 | >=v1.7.0 |
| 0.2.3 | >=1.8 | >=0.4.9 | >=1.7.0 | >=1.7.0 | >=3.8.0 | >=v1.7.0 |
| 0.2.2 | >=1.8 | >=0.4.9 | >=1.6.0 | >=1.6.0 | >=3.8.0 | >=v1.6.0 |
| 0.2.1 | >=1.7 | >=0.4.9 | >=1.5.0 | >=1.5.0 | >=3.8.0 | >=v1.5.0 |
| 0.2.0 | >=1.6 | >=0.4.9 | >=1.5.0 | >=1.5.0 | >=3.7.0 | >=v1.5.0 |

Note:

  1. Version requirements may vary from sample to sample; refer to the README of each sample. Other third-party libraries may also need to be installed.
  2. The SDK for BM1688(CV186X) is not the same as the SDK for BM1684/BM1684X. The two are listed separately on our official site, so please make sure you pick the right one.
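
The minimum versions above are easy to get wrong when compared as plain strings, since "1.9" sorts after "1.10" alphabetically. The following is a small sketch of a numeric comparison against the SOPHON-DEMO 0.2.6 row for BM1684/BM1684X. The helper names and the `installed` dictionary are hypothetical; how the installed versions are obtained on a given machine is not covered here, and pre-release suffixes are not handled.

```python
import re

# Minimum versions for SOPHON-DEMO 0.2.6 on BM1684/BM1684X,
# copied from the version requirements table.
MIN_VERSIONS_0_2_6 = {
    "TPU-MLIR": "1.10",
    "TPU-NNTC": "3.1.7",
    "LIBSOPHON": "0.5.0",
    "SOPHON-FFMPEG": "0.7.3",
    "SOPHON-OPENCV": "0.7.3",
    "SOPHON-SAIL": "3.8.0",
    "SOPHONSDK": "v24.04.01",
}


def parse_version(text):
    """Turn a string such as 'v24.04.01' into a tuple of ints, (24, 4, 1)."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


def is_older(have, minimum):
    """Return True if version `have` is strictly older than `minimum`."""
    a, b = parse_version(have), parse_version(minimum)
    width = max(len(a), len(b))
    # Pad with zeros so that '0.5' and '0.5.0' compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a < b


def unmet_requirements(installed, required=MIN_VERSIONS_0_2_6):
    """Map each missing or outdated component to (installed, minimum)."""
    problems = {}
    for name, minimum in required.items():
        have = installed.get(name)
        if have is None or is_older(have, minimum):
            problems[name] = (have, minimum)
    return problems


# Hypothetical environment: TPU-MLIR is too old and SOPHONSDK is unknown.
installed = {
    "TPU-MLIR": "1.9",
    "TPU-NNTC": "3.1.7",
    "LIBSOPHON": "0.5.0",
    "SOPHON-FFMPEG": "0.7.3",
    "SOPHON-OPENCV": "0.7.3",
    "SOPHON-SAIL": "3.8.0",
}
report = unmet_requirements(installed)
```

Checking a different SOPHON-DEMO release, or the BM1688/CV186AH table, only requires passing a different `required` dictionary built from the corresponding row.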

Technical Data

Related documents, materials, and video tutorials are available under Technical Materials on the official SOPHGO website.

Community

The SOPHGO community encourages developers to communicate and learn together through the following channels:

SOPHGO community website: https://www.sophgo.com/

SOPHGO Developer Forum: https://developer.sophgo.com/forum/index.html

Contribution

Contributions are welcome. For more details, please refer to our Contributor Wiki.

License

Apache License 2.0