If you find this tutorial helpful, please give us a star! ⭐⭐⭐
This repository is organized into two parts, Fundamentals and Practical Application, and aims to provide a comprehensive guide to model quantization in TensorRT.
Both the video and the code for the Fundamentals part are completely open-source.
- Video Tutorial: Bilibili Link
- Code Repository: GitHub Link
- Principles of Model Quantization
  - 1.1 Definition and Significance of Quantization
    - 1.1.1 Model Weight Analysis
    - 1.1.2 Importance of Quantization
  - 1.2 Symmetric vs Asymmetric Quantization
    - 1.2.1 Definition of Symmetric Quantization
    - 1.2.2 Handwritten Code for Symmetric Quantization
    - 1.2.3 Definition of Asymmetric Quantization
    - 1.2.4 Handwritten Code for Asymmetric Quantization
  - 1.3 Common Methods for Dynamic Range Calculation
    - 1.3.1 Max
    - 1.3.2 Histogram
    - 1.3.3 Entropy
  - 1.4 Introduction to PTQ and QAT
  - 1.5 Handwriting a Quantized Program with Ops
- TensorRT Quantization Library
  - 2.1 Understanding Quantizer
  - 2.2 Understanding InputQuant/MixQuant
  - 2.3 Automatic Insertion of QDQ Nodes
  - 2.4 Manual Insertion of QDQ Nodes
  - 2.5 How to Quantize a Custom Layer
  - 2.6 Sensitivity Layer Analysis
  - 2.7 Pitfalls and Lessons Learned
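
To give a feel for what sections 1.2 and 1.3.1 of the outline cover, here is a minimal pure-Python sketch of symmetric and asymmetric 8-bit quantization with a max-based dynamic range. It is not the tutorial's own code: the function names (`symmetric_quantize`, `asymmetric_quantize`, `dequantize`) and the choice of signed int8 for the symmetric case and unsigned uint8 for the asymmetric case are assumptions made for illustration only.

```python
def symmetric_quantize(values, num_bits=8):
    """Symmetric quantization: the range is [-max|x|, +max|x|] and 0.0 maps to 0."""
    qmax = 2 ** (num_bits - 1) - 1           # 127 for 8 bits (assumed signed range)
    max_abs = max(abs(v) for v in values)    # "Max" dynamic-range calibration
    scale = max_abs / qmax if max_abs > 0 else 1.0
    quantized = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return quantized, scale


def asymmetric_quantize(values, num_bits=8):
    """Asymmetric quantization: the range is [min(x), max(x)], shifted by a zero point."""
    qmin, qmax = 0, 2 ** num_bits - 1        # 0..255 for 8 bits (assumed unsigned range)
    lo = min(min(values), 0.0)               # widen the range so 0.0 is representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / (qmax - qmin) if hi > lo else 1.0
    zero_point = round(qmin - lo / scale)    # integer that real 0.0 maps to
    quantized = [
        max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point


def dequantize(quantized, scale, zero_point=0):
    """Map integers back to approximate real values."""
    return [(q - zero_point) * scale for q in quantized]


if __name__ == "__main__":
    weights = [-0.4, 0.0, 0.3, 1.2]          # made-up example values

    q_sym, s_sym = symmetric_quantize(weights)
    print("symmetric :", q_sym, dequantize(q_sym, s_sym))

    q_asym, s_asym, zp = asymmetric_quantize(weights)
    print("asymmetric:", q_asym, dequantize(q_asym, s_asym, zp))
```

In this sketch the symmetric variant spends part of its integer range on negative values that the data never reaches, while the asymmetric variant uses the full range at the cost of carrying a zero point through every computation.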
The practical application section is paid content. Please visit the link below to purchase: