Releases: PaddlePaddle/FastDeploy
FastDeploy 1.0.0
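1.0.0 Release Note
⚡️FastDeploy 1.0.0, a high-performance AI deployment toolkit for all scenarios, is officially released! 🎉 It supports high-performance deployment of 150+ models from PaddlePaddle and the open-source community on multiple hardware platforms, giving developers a new deployment experience that covers all scenarios, is simple to use, and is highly efficient.
Multiple Inference Backends and Hardware Support
FastDeploy supports inference deployment on multiple kinds of hardware through different backends. Each backend module can be compiled and integrated flexibly according to the developer's needs; to build it yourself, refer to the FastDeploy build documentation.
Backend | Platform | Supported Model Format | Supported Hardware |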
---|---|---|---|
Paddle Inference | Linux(x64)/Windows(x64) | Paddle | x86 CPU/NVIDIA GPU/Jetson/GraphCore IPU |
Paddle Lite | Linux(aarch64/armhf)/Android | Paddle | Arm CPU/Kunlun R200/RV1126 |
Poros | Linux(x64) | TorchScript | x86 CPU/NVIDIA GPU |
OpenVINO | Linux(x64)/Windows(x64)/OSX(x86) | Paddle/ONNX | x86 CPU/Intel GPU |
TensorRT | Linux(x64/aarch64)/Windows(x64) | Paddle/ONNX | NVIDIA GPU/Jetson |
ONNX Runtime | Linux(x64/aarch64)/Windows(x64)/OSX(x86/arm64) | Paddle/ONNX | x86 CPU/Arm CPU/NVIDIA GPU |
In addition, FastDeploy supports deploying models in web front ends and smart mini programs based on Paddle.js; see Web Deployment for more details.
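Rich End-to-End Inference for AI Models
FastDeploy supports end-to-end deployment of the following PaddlePaddle model suites: PaddleOCR, PaddleDetection, PaddleSeg, PaddleClas, and PaddleGAN.
Beyond the PaddlePaddle development suites, FastDeploy also supports deploying popular deep learning models from the open-source community. Release 1.0 supports 150+ models in total; the table below shows some of the key models, and the deployment examples give more detail.
Task | Supported Models |
---|---|
Image Classification | ResNet/MobileNet/PP-LCNet/YOLOv5-Clas and other model series |
Object Detection | PP-YOLOE/PicoDet/RCNN/YOLOv5/YOLOv6/YOLOv7/YOLOX/NanoDet and other model series |
Semantic Segmentation | PP-LiteSeg/PP-HumanSeg/DeepLabv3p/UNet and other model series |
Image/Video Matting | PP-Matting/PP-Mattingv2/ModNet/RobustVideoMatting |
OCR | PP-OCRv2/PP-OCRv3 |
Video Super-Resolution | PP-MSVSR/BasicVSR/EDVR |
Object Tracking | PP-Tracking |
Pose/Keypoint Recognition | PP-TinyPose/HeadPose-FSANet |
Face Alignment | PFLD/FaceLandmark1000/PIPNet and other model series |
Face Detection | RetinaFace/UltraFace/YOLOv5-Face/SCRFD and other model series |
Face Recognition | ArcFace/CosFace/PartialFC/VPL/AdaFace and other model series |
Speech Synthesis | PaddleSpeech streaming speech synthesis model |
Semantic Representation | PaddleNLP ERNIE 3.0 Tiny model series |
Information Extraction | PaddleNLP Universal Information Extraction (UIE) model |
Text-to-Image Generation | Stable Diffusion |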
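High-Performance Serving Deployment
FastDeploy provides serving deployment based on Triton Inference Server, supporting high-performance serving of Paddle/ONNX models on different hardware and different backends.
Automatic Compression and Model Conversion
PaddleSlim Automatic Compression
FastDeploy provides a one-click quantization tool based on PaddleSlim. The following command compresses and accelerates a model with near-lossless accuracy.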
fastdeploy compress --config_path=./configs/detection/yolov5s_quant.yaml \
--method='PTQ' --save_dir='./yolov5s_ptq_model/'
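FastDeploy has completed adaptation testing of quantized models with the following backends:
Hardware/Inference Backend | ONNX Runtime | Paddle Inference | TensorRT | Paddle Inference TensorRT | Paddle Lite |
---|---|---|---|---|---|
CPU | Supported | Supported | - | - | Supported |
GPU | - | - | Supported | Supported | - |
RV1126 | - | - | - | - | Supported |
As the accuracy and performance comparison published with the release shows, automatic compression is nearly lossless in accuracy and improves performance by up to 400%.
For more details and usage of one-click compression, see FastDeploy One-Click Compression.
Model Conversion
To make it easier to deploy models from multiple frameworks, FastDeploy has built-in X2Paddle conversion. After installing FastDeploy, the following command converts a model so that it can then be deployed with FastDeploy.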
fastdeploy convert --framework onnx --model yolov5s.onnx --save_dir yolov5s_paddle_model
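When these commands are driven from a script, it can help to build the argument list in one place. The sketch below only assembles the argument vectors for the `fastdeploy compress` and `fastdeploy convert` invocations shown in this section, using exactly the flags that appear there; the helper names `compress_argv` and `convert_argv` are hypothetical, and nothing is executed.

```python
import shlex


def compress_argv(config_path, method, save_dir):
    """Argument vector for `fastdeploy compress`, using the flags shown above."""
    return [
        "fastdeploy", "compress",
        f"--config_path={config_path}",
        f"--method={method}",
        f"--save_dir={save_dir}",
    ]


def convert_argv(framework, model, save_dir):
    """Argument vector for `fastdeploy convert`, using the flags shown above."""
    return [
        "fastdeploy", "convert",
        "--framework", framework,
        "--model", model,
        "--save_dir", save_dir,
    ]


if __name__ == "__main__":
    # Print the commands instead of running them; pass the list to
    # subprocess.run(...) once FastDeploy is installed.
    print(shlex.join(compress_argv(
        "./configs/detection/yolov5s_quant.yaml", "PTQ", "./yolov5s_ptq_model/")))
    print(shlex.join(convert_argv("onnx", "yolov5s.onnx", "yolov5s_paddle_model")))
```

Passing the list form to `subprocess.run` avoids shell quoting issues when paths contain spaces.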
For more usage, see FastDeploy Model Conversion.
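End-to-End Deployment Performance Optimization
In deploying each model, FastDeploy focuses on the end-to-end deployment experience and performance. Version 1.0 makes the following end-to-end optimizations:
- On the server side, fuse preprocessing steps to reduce memory allocation overhead and computation
- On mobile, integrate FlyCV, a high-performance image processing library developed in-house by Baidu's Visual Technology Department
Combined with FastDeploy's multi-backend support, end-to-end performance of all models improves substantially compared with the original deployment code. Test data for some of these models is published with the release.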
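One common way to fuse preprocessing steps, shown here only as an illustration, is to fold pixel scaling and mean/std normalization into a single multiply-add per channel. The sketch below is plain Python under that assumption; the function names and the mean/std values are hypothetical, and this is not FastDeploy's actual fusion code.

```python
import math


def normalize_two_step(pixel, mean, std, scale=1.0 / 255.0):
    """Reference path: scale to [0, 1], then subtract mean and divide by std."""
    return [(p * scale - m) / s for p, m, s in zip(pixel, mean, std)]


def fuse_normalize(mean, std, scale=1.0 / 255.0):
    """Precompute per-channel alpha and beta so that y = p * alpha + beta."""
    alpha = [scale / s for s in std]
    beta = [-m / s for m, s in zip(mean, std)]
    return alpha, beta


def normalize_fused(pixel, alpha, beta):
    """Fused path: one multiply-add per channel, no intermediate buffer."""
    return [p * a + b for p, a, b in zip(pixel, alpha, beta)]


if __name__ == "__main__":
    mean = [0.485, 0.456, 0.406]  # illustrative values only
    std = [0.229, 0.224, 0.225]
    pixel = [128, 64, 255]
    alpha, beta = fuse_normalize(mean, std)
    reference = normalize_two_step(pixel, mean, std)
    fused = normalize_fused(pixel, alpha, beta)
    print(all(math.isclose(r, f, abs_tol=1e-9) for r, f in zip(reference, fused)))
```

Because `alpha` and `beta` are computed once per model, the per-pixel work drops from three operations to two and the intermediate scaled image never has to be allocated.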
1.0.0 Release Note
We are excited to announce the release of ⚡️FastDeploy 1.0.0! 🎉 FastDeploy supports high-performance end-to-end deployment of over 150 AI models from PaddlePaddle and the open-source community on multiple hardware platforms.
Multiple Inference Backend and Hardware Support
FastDeploy supports inference deployment on multiple kinds of hardware with different backends. Each backend module can be compiled and integrated flexibly according to the developer's needs; please refer to the FastDeploy compilation documentation.
Backend | Platform | Model Format | Supported Hardware in FastDeploy |
---|---|---|---|
Paddle Inference | Linux(x64)/Windows(x64) | Paddle | x86 CPU/NVIDIA GPU/GraphCore IPU |
Paddle Lite | Linux(aarch64/armhf)/Android | Paddle | Arm CPU/Kunlun R200/RV1126 |
Poros | Linux(x64)/Windows(x64) | TorchScript | x86 CPU/NVIDIA GPU |
OpenVINO | Linux(x64)/Windows(x64)/OSX(x86) | Paddle/ONNX | x86 CPU/Intel GPU |
TensorRT | Linux(x64/aarch64)/Windows(x64) | Paddle/ONNX | NVIDIA GPU/Jetson |
ONNX Runtime | Linux(x64/aarch64)/Windows(x64)/OSX(x86/arm64) | Paddle/ONNX | x86 CPU/Arm CPU/NVIDIA GPU |
In addition, FastDeploy supports deploying models on the web and in mini programs based on Paddle.js; see Web Deployment for more details.
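The backend table above can be read as a lookup from model format and hardware to candidate backends. The sketch below transcribes the English table into a plain dictionary and filters it; the `BACKENDS` structure and the `candidate_backends` helper are hypothetical conveniences for reading the table, not part of the FastDeploy API, and the platform column is left out for brevity.

```python
# Transcribed from the backend table above (model format and hardware columns).
BACKENDS = {
    "Paddle Inference": {
        "formats": {"Paddle"},
        "hardware": {"x86 CPU", "NVIDIA GPU", "GraphCore IPU"},
    },
    "Paddle Lite": {
        "formats": {"Paddle"},
        "hardware": {"Arm CPU", "Kunlun R200", "RV1126"},
    },
    "Poros": {
        "formats": {"TorchScript"},
        "hardware": {"x86 CPU", "NVIDIA GPU"},
    },
    "OpenVINO": {
        "formats": {"Paddle", "ONNX"},
        "hardware": {"x86 CPU", "Intel GPU"},
    },
    "TensorRT": {
        "formats": {"Paddle", "ONNX"},
        "hardware": {"NVIDIA GPU", "Jetson"},
    },
    "ONNX Runtime": {
        "formats": {"Paddle", "ONNX"},
        "hardware": {"x86 CPU", "Arm CPU", "NVIDIA GPU"},
    },
}


def candidate_backends(model_format, hardware):
    """Return the backends from the table that list both the format and the hardware."""
    return [
        name
        for name, spec in BACKENDS.items()
        if model_format in spec["formats"] and hardware in spec["hardware"]
    ]


if __name__ == "__main__":
    print(candidate_backends("ONNX", "NVIDIA GPU"))
    print(candidate_backends("Paddle", "Arm CPU"))
```

Which candidate performs best for a given model still has to be measured on the target device.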
AI Model End-to-end Inference Support
FastDeploy supports end-to-end deployment of the following PaddlePaddle model suites:
- PaddleOCR deployment tutorial
- PaddleDetection deployment tutorial
- PaddleSeg deployment tutorial
- PaddleClas deployment tutorial
- PaddleGAN deployment tutorial
In addition, FastDeploy supports deploying popular deep learning models from the open-source community. Over 150 models are supported in release 1.0; the table below shows some of the key models, and the deployment examples give more details.
Task | Supported Models |
---|---|
Classification | ResNet/MobileNet/PP-LCNet/YOLOv5-Clas and other series models |
Object Detection | PP-YOLOE/PicoDet/RCNN/YOLOv5/YOLOv6/YOLOv7/YOLOX/NanoDet and other series models |
Segmentation | PP-LiteSeg/PP-HumanSeg/DeepLabv3p/UNet and other series models |
Image/Video Matting | PP-Matting/PP-Mattingv2/ModNet/RobustVideoMatting |
OCR | PP-OCRv2/PP-OCRv3 |
Video Super-Resolution | PP-MSVSR/BasicVSR/EDVR |
Object Tracking | PP-Tracking |
Posture/Key-point Recognition | PP-TinyPose/HeadPose-FSANet |
Face Alignment | PFLD/FaceLandmark1000/PIPNet and other series models |
Face Detection | RetinaFace/UltraFace/YOLOv5-Face/SCRFD and other series models |
Face Recognition | ArcFace/CosFace/PartialFC/VPL/AdaFace and other series models |
Text-to-Speech | PaddleSpeech Streaming Speech Synthesis Model |
Semantic Representation | PaddleNLP ERNIE 3.0 series models |
Information Extraction | PaddleNLP Universal Information Extraction UIE model |
Content Generation | Stable Diffusion |
High Performance Serving Deployment
⚡️FastDeploy provides hig...
FastDeploy 0.8.0
0.8.0 Release Note
- Support PIPNet, FaceLandmark1000 face alignment models deployment Details
- Support the Video Super-Resolution series models PP-MSVSR, EDVR, and BasicVSR Details
- Upgrade YOLOv7 deployment code to add batch_predict deployment support #611
- Support UIE service-based deployment Details
- Fix a bug with the Cosine Similarity calculation in the ArcFace sample code #648
- [Test functions] Support OpenVINO backend Device settings, support for integrated/discrete graphics card #472
- Support Android image classification, object detection, semantic segmentation, OCR, and face detection APK projects and examples
Image Classification | Object Detection | Semantic Segmentation | OCR | Face Detection |
---|---|---|---|---|
Project Code | Project Code | Project Code | Project Code | Project Code |
Scan the code or click the link to install and try it out | Scan the code or click the link to install and try it out | Scan the code or click the link to install and try it out | Scan the code or click the link to install and try it out | Scan the code or click the link to install and try it out |
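For reference, cosine similarity between two face embeddings is the dot product divided by the product of the vector norms. The sketch below is a plain-Python definition for illustration; it is not the ArcFace sample code fixed in #648, whose details are not given in these notes.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


if __name__ == "__main__":
    # Parallel vectors score close to 1, orthogonal vectors score 0.
    print(cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
    print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Dividing by both norms is what makes the score independent of embedding magnitude.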
New Contributors
- @jm12138 made their first contribution in #613
- @Xiue233 made their first contribution in #633
- @ChrisKong93 made their first contribution in #648
Full Changelog: release/0.7.0...release/0.8.0
FastDeploy 0.7.0 Release Note
0.7.0 Release Note
- Integrate Paddle Lite TIM-VX to support hardware such as the Rockchip RV1126. Details
- Support the face detection model SCRFD on Rockchip RK3588, RK3568, and other hardware.
- Support Stable Diffusion model deployment.
- Upgrade the PaddleClas, PaddleDetection, and YOLOv5 deployment code to support predict and batch_predict.
- Support converting Paddle models larger than 2 GB to ONNX for deployment.
- Support PaddleClas model service-based deployment.
- Add the Pad function operator for FDTensor to support padding of the input during batch prediction.
- Add the Python API to_dlpack interface for FDTensor to support zero-copy transfer of FDTensor between frameworks.
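Padding matters for batch prediction because inputs of different lengths have to share one rectangular batch. The sketch below pads plain Python lists to the longest sequence; `pad_batch` is a hypothetical helper written for illustration and does not reproduce the FDTensor Pad function or its signature.

```python
def pad_batch(sequences, pad_value=0):
    """Right-pad every sequence to the length of the longest one."""
    if not sequences:
        return []
    max_len = max(len(seq) for seq in sequences)
    return [list(seq) + [pad_value] * (max_len - len(seq)) for seq in sequences]


if __name__ == "__main__":
    batch = pad_batch([[1, 2, 3], [4], [5, 6]])
    # Every row now has the same length, so the batch is rectangular.
    print(batch)
```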
New Contributors
- @GodIsBoom made their first contribution in #529
- @yingshengBD made their first contribution in #557
- @triple-Mu made their first contribution in #563
Full Changelog: release/0.6.0...release/0.7.0
FastDeploy 0.6.0 Release Note
0.6.0 Release Note
Model
- Support FSANet head pose recognition model Details
- Support PFLD face alignment model Details
- PP-Tracking model adds track visualisation Details
- Support ERNIE text classification model Details
Service-based Deployment
- FastDeploy Runtime adds Clone interface support for service-based deployment, reducing the memory and GPU memory usage of the Paddle Inference, TensorRT, and OpenVINO backends when running multiple instances.
Edge Deployment
- Support RKNPU2(3588) Details.
Performance Optimisation
- Optimize preprocessing and postprocessing memory creation logic on YOLO series, PaddleClas, PaddleDetection.
- Integrate visual preprocessing operations, optimize the preprocessing performance of PaddleClas and PaddleDetection, and improve end-to-end performance.
- Integrate the TensorRT BatchedNMSDynamic_TRT plugin to improve the performance of TensorRT end-to-end deployments.
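To make the last item concrete, non-maximum suppression (NMS) keeps the highest-scoring box and discards boxes that overlap it too much. The sketch below is a single-class, plain-Python version of greedy NMS for illustration only; the TensorRT plugin runs a batched version inside the engine, and the function names and the 0.5 threshold here are assumptions.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

Running this step inside the inference engine avoids copying every candidate box back to the host before filtering.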
Others
- Fix several documentation issues
- Add FastDeploy Runtime C++ usage examples Details
New Contributors
- @rainyfly made their first contribution in #453
- @WinterGeng made their first contribution in #487
Full Changelog: release/0.4.0...release/0.6.0
FastDeploy 0.5.0
What's Changed
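Backends
- Add inference through Paddle Inference TensorRT
- Add inference on IPU hardware through Paddle Inference
- Fix native TensorRT's lack of support for INT64 input and output data
- Add multi-stream support to the ONNX Runtime, Paddle Inference, and TensorRT backends
Models
Others
- Fix PP-Matting prediction with non-fixed shapes
- Fix the Python visualization function for semantic segmentation models
- Fix the usage documentation for some models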
New Contributors
Full Changelog: release/0.4.0...release/0.5.0
FastDeploy 0.4.0
Version 0.4.0 adds Android mobile deployment support!
What's Changed
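Mobile Deployment
- Add the FastDeploy Android C++ inference library, supporting the arm64-v8a and armeabi-v7a architectures; see Prebuilt Library Download
- Add Android deployment of the PicoDet object detection model; see the example
- Add Android deployment of the PaddleClas image classification model series; see the example
Models
- Optimize end-to-end performance of YOLOv5/6/7 GPU deployment: after enabling GPU preprocessing with YOLOv5::UseCudaPreprocessing(), performance on a T4 GPU (TensorRT) improves by 30%~50%; see PR #370
- Add 7 web-side JS deployment examples; see the JS deployment examples
- Add deployment of TinyPose and of the cascaded PicoDet + TinyPose pipeline; see the example
- Add deployment of the Torch Vision ResNet model series; see the example
- Rename PPOCRSystemv2 & PPOCRSystemv3 to PPOCRv2 & PPOCRv3
- Improve the warning messages of some PaddleSeg & PaddleOCR models
Serving Deployment
Inference Backends
- Add the EnablePinedMemory interface for GPU deployment, which lets Paddle Inference and TensorRT use pinned memory during inference to speed up copying data from GPU to CPU; see PR #403
Documentation (still being improved)
- Launch the Python API documentation; see Python API Documentation
- Launch the C++ API documentation; see C++ API Documentation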
New Contributors
- @HexToString made their first contribution in #384
- @wang-xinyu made their first contribution in #370
- @LDOUBLEV made their first contribution in #392
- @chenqianhe made their first contribution in #415
Full Changelog: release/0.3.0...release/0.4.0
FastDeploy v0.3.0
What's Changed
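Models
Quantization Acceleration
Build
- Support building against OpenCV, OpenVINO, and ONNX Runtime dependencies at custom paths in the user's environment
- Add build support for the OpenVINO backend on Mac x86
- Add Paddle-Lite backend support on Arm
- Support building and installing on Jetson; see the reference documentation
Serving Deployment
Code Improvements
- Fix Predict modifying the input image passed to the model
- Add a max_workspace_size setting interface to the TensorRT backend
- Improve the messages shown for PaddleSeg deployment models under dynamic shapes
- Fix the failure to load TensorRT serialized files on Windows
- Add fastdeploy_init.sh and fastdeploy_init.bat to help developers import the FastDeploy dependency libraries quickly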
New Contributors
- @onecatcn made their first contribution in #264
- @Zheng-Bicheng made their first contribution in #290
- @TrellixVulnTeam made their first contribution in #315
- @yeliang2258 made their first contribution in #257
Full Changelog: release/0.2.1...release/0.3.1
FastDeploy v0.2.1
What's Changed
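Models
- Add end-to-end deployment of PaddleDetection MaskRCNN/PPYOLOE+/PPOCRv2/PPOCRv3/PPMatting and other vision models; see FastDeploy/examples/vision
- Add end-to-end deployment of the UIE text NLP model; see FastDeploy/examples/text
Inference Backends
- Add the OpenVINO inference backend. Thanks to support from the OpenVINO team, most Paddle models can now use OpenVINO to accelerate inference on CPU
- Improve the TensorRT experience: it is no longer necessary to call SetTrtInputShape manually to set the input range, which is now set dynamically during inference by default
See the documentation on how to switch inference backends for more details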
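One way to picture setting the input range dynamically is a tracker that widens a per-dimension minimum and maximum as new input shapes arrive, signalling when the range grew. The sketch below is a hypothetical illustration of that idea in plain Python; the `ShapeRange` class is an assumption and does not describe how the TensorRT backend in FastDeploy is implemented.

```python
class ShapeRange:
    """Tracks the per-dimension min and max of the input shapes seen so far."""

    def __init__(self):
        self.min_shape = None
        self.max_shape = None

    def update(self, shape):
        """Record a shape; return True if the tracked range had to grow."""
        shape = tuple(shape)
        if self.min_shape is None:
            self.min_shape = self.max_shape = shape
            return True
        if len(shape) != len(self.min_shape):
            raise ValueError("rank of the input shape changed")
        new_min = tuple(min(a, b) for a, b in zip(self.min_shape, shape))
        new_max = tuple(max(a, b) for a, b in zip(self.max_shape, shape))
        grew = (new_min, new_max) != (self.min_shape, self.max_shape)
        self.min_shape, self.max_shape = new_min, new_max
        return grew


if __name__ == "__main__":
    tracker = ShapeRange()
    for shape in [(1, 3, 224, 224), (1, 3, 224, 224), (4, 3, 320, 320)]:
        # In a real backend, a True result would be the point to rebuild
        # or re-profile the engine for the wider range.
        print(shape, tracker.update(shape))
```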
Usage Experience
- Add usage documentation covering build, SDK usage, and more
- Fix some usability issues when building and using FastDeploy on Windows
New Contributors
Full Changelog: release/0.2.0...release/0.2.1
FastDeploy v0.2.0
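Multiple Inference Backend Support
- Integrate the Paddle Inference, ONNX Runtime, and TensorRT backends, with automatic selection of the best inference backend for the model.
- Support building from source for more flexible backend selection; see the FastDeploy build documentation
More Vision Model Support
- Add deployment of the full YOLO series (YOLOv7/6/5, etc.) on CPU/GPU and with TensorRT
- Add support for portrait matting, face detection, face recognition, and other models; see the FastDeploy vision model deployment examples for more details
Documentation Improvements
- Add Python/C++ API documentation and deployment examples for 44 models; see the FastDeploy deployment examples for more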
FastDeploy v0.1.0
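⚡️FastDeploy v0.1.0 beta released! 🎉
💎 Release SDKs supporting 40 key models in 8 key software and hardware environments
😊 Support two ways to download and use: the web page and the pip package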