Add calibration based INT8 quantization to TensorRT EP #5842

Merged
merged 6 commits into from
Nov 20, 2020

stevenlix
Contributor

@stevenlix stevenlix commented Nov 18, 2020

  1. Add calibration-based INT8 support to the TensorRT EP.
  2. Both native TRT calibration tables and calibration tables generated by the ORT tool are supported.
    A native TRT calibration table works only for models that can run as a whole on native TRT.
    An ORT-generated calibration table also works with models where only some subgraphs can run on TRT. Those subgraphs will run in INT8 precision in the TRT EP where possible.
  3. Mixed precision (FP32/FP16/INT8) is supported.
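
As general background on what a calibration table feeds into, here is a minimal sketch of a common symmetric INT8 scheme: a per-tensor absolute maximum collected during calibration becomes a scale, and values are rounded and clamped to [-127, 127]. This is not code from this PR or from TensorRT. The function names and the `abs_max` parameter are hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: derive a per-tensor scale from a calibrated
// absolute maximum (the largest magnitude seen during calibration).
inline float ScaleFromAbsMax(float abs_max) {
  assert(abs_max > 0.0f);
  return abs_max / 127.0f;
}

// Quantize one value: divide by the scale, round to nearest,
// then clamp to the symmetric INT8 range [-127, 127].
inline std::int8_t QuantizeInt8(float x, float scale) {
  float q = std::round(x / scale);
  q = std::max(-127.0f, std::min(127.0f, q));
  return static_cast<std::int8_t>(q);
}

// Map a quantized value back to an approximate float.
inline float DequantizeInt8(std::int8_t q, float scale) {
  return static_cast<float>(q) * scale;
}
```

Values outside the calibrated range saturate at the ends of the INT8 range, which is why the quality of the calibration data matters.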

@stevenlix stevenlix requested a review from a team as a code owner November 18, 2020 05:20
@stevenlix stevenlix changed the title Add calibration based INT8 to TensorRT EP Add calibration based INT8 quantization to TensorRT EP Nov 18, 2020
static const std::string kDumpSubgraphs = "ORT_TENSORRT_DUMP_SUBGRAPHS";
static const std::string kEngineCacheEnable = "ORT_TENSORRT_ENGINE_CACHE_ENABLE";
static const std::string kEngineCachePath = "ORT_TENSORRT_ENGINE_CACHE_PATH";
static const std::string kCachePath = "ORT_TENSORRT_CACHE_PATH";
Member

This is a compatibility-breaking change.
Users who are already using the previous name may not know it has changed.
I think it would be safer to add a check that throws an error if the previous name is still set in the environment variables.
Or we don't change the name at all.
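
The check suggested above could look roughly like the following sketch. It is not the EP's actual code: `RejectLegacyEnvVar` is a hypothetical helper, and the caller supplies both variable names, since the excerpt does not state which of the two cache-path names is the previous one.

```cpp
#include <cassert>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical helper: throw if a retired environment variable is still
// set, so a renamed setting fails loudly instead of being silently ignored.
// Both names come from the caller; nothing here is taken from ORT itself.
inline void RejectLegacyEnvVar(const std::string& legacy_name,
                               const std::string& current_name) {
  assert(!legacy_name.empty() && !current_name.empty());
  const char* value = std::getenv(legacy_name.c_str());
  if (value != nullptr) {
    throw std::runtime_error("Environment variable " + legacy_name +
                             " is no longer supported; set " + current_name +
                             " instead.");
  }
}
```

Calling this once during provider setup, with the previous and the new name as arguments, would turn a silent misconfiguration into an immediate error. A softer variant could log a warning and fall back to the old value instead of throwing.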

Member

you can handle this in a subsequent PR?

@jywu-msft jywu-msft merged commit dfea929 into master Nov 20, 2020
@jywu-msft jywu-msft deleted the stevenlix/precision branch November 20, 2020 01:10