Add calibration based INT8 quantization to TensorRT EP #5842
Conversation
static const std::string kDumpSubgraphs = "ORT_TENSORRT_DUMP_SUBGRAPHS";
static const std::string kEngineCacheEnable = "ORT_TENSORRT_ENGINE_CACHE_ENABLE";
static const std::string kEngineCachePath = "ORT_TENSORRT_ENGINE_CACHE_PATH";
static const std::string kCachePath = "ORT_TENSORRT_CACHE_PATH";
This is a compatibility-breaking change.
Users who are already using the previous name may not know it has changed.
I think it would be safer to add a check that throws an error if the previous name is found set in the environment variables,
or we don't change the name at all.
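A minimal sketch of the guard suggested above, assuming the rename is from ORT_TENSORRT_ENGINE_CACHE_PATH to ORT_TENSORRT_CACHE_PATH as the diff implies; the helper name, error text, and call site are illustrative, not part of this PR:

```cpp
// Sketch of the suggested guard: fail loudly if the deprecated variable is
// still set, instead of silently ignoring it after the rename.
#include <cstdlib>
#include <stdexcept>
#include <string>

static const std::string kEngineCachePathOld = "ORT_TENSORRT_ENGINE_CACHE_PATH";
static const std::string kCachePath = "ORT_TENSORRT_CACHE_PATH";

// Hypothetical helper; where the TRT EP would actually call this is assumed.
void ThrowIfDeprecatedCachePathIsSet() {
  if (std::getenv(kEngineCachePathOld.c_str()) != nullptr) {
    throw std::runtime_error(kEngineCachePathOld + " has been renamed to " +
                             kCachePath + "; please set the new variable.");
  }
}
```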
Can you handle this in a subsequent PR?
The native TRT calibration table only works for models that can run entirely on native TensorRT.
The ORT-generated calibration table also works for models of which only some subgraphs can run on TRT; those subgraphs will run in INT8 precision in the TRT EP where possible.
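A minimal sketch of how the choice between the two table types might be expressed, assuming the TRT EP's documented ORT_TENSORRT_INT8_ENABLE and ORT_TENSORRT_INT8_USE_NATIVE_CALIBRATION_TABLE environment variables; the parsing helper and control flow are simplified stand-ins, not this PR's actual implementation:

```cpp
// Illustrative selection between the two calibration-table modes described
// above. Only the environment-variable names follow the TRT EP's naming
// convention; everything else is a simplified stand-in.
#include <cstdlib>
#include <cstring>
#include <iostream>

static bool EnvVarIsSetToTrue(const char* name) {
  const char* value = std::getenv(name);
  return value != nullptr && std::strcmp(value, "1") == 0;
}

int main() {
  const bool int8_enabled = EnvVarIsSetToTrue("ORT_TENSORRT_INT8_ENABLE");
  const bool use_native_table =
      EnvVarIsSetToTrue("ORT_TENSORRT_INT8_USE_NATIVE_CALIBRATION_TABLE");

  if (!int8_enabled) {
    std::cout << "INT8 disabled; no calibration table is consulted.\n";
  } else if (use_native_table) {
    // Only valid when the entire model runs on native TensorRT.
    std::cout << "Using the native TRT calibration table.\n";
  } else {
    // Also covers models where only subgraphs run on TRT; those subgraphs
    // run in INT8 precision where possible.
    std::cout << "Using the ORT-generated calibration table.\n";
  }
  return 0;
}
```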