Improve TensorRT GetCapability to Enable More Models #1012
Conversation
Seems like the Windows TensorRT build fails.
If GetCapability() is now reliable, doesn't this mean we can re-enable many of the disabled TRT unit tests?
Yes, I think the exclusions for most of these disabled unit tests could be removed. But if we keep the tests disabled, people know which tests can't run on TensorRT; otherwise we lose that tracking, since those tests just fall back to other execution providers.
We do not want to leave tests disabled. It was only done as a temporary workaround until GetCapability is fixed.
}

std::unique_ptr<IndexedSubGraph> GetSubGraph(SubGraph_t graph_nodes_index, int& kernels_index,
It would be good to have explicit unit tests for these new methods (GetSubGraph, GetSupportedList).
std::unique_ptr<IndexedSubGraph> GetSubGraph(SubGraph_t graph_nodes_index, int& kernels_index,
                                             const onnxruntime::GraphViewer& graph) const;

SubGraphCollection_t GetSupportedList(SubGraphCollection_t supported_nodes_list, int iterations, const int& max_iterations,
Why is max_iterations a reference?
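For illustration only (this is not the actual onnxruntime signature, which is truncated in the diff above): an int is trivially copyable, so taking it by value is at least as cheap as taking it by const reference and is simpler to read. The helper below is hypothetical.

// Hypothetical helper, shown only to illustrate the parameter-passing point:
// for an int, const int& adds an indirection with no benefit over plain by-value.
int RemainingIterations(int iterations, int max_iterations) {
  return max_iterations - iterations;
}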
CHECK_CUDA(cudaMemcpy(output_tensors[output_index].data, buffers[i + num_binding_inputs], batch_size * output_dim_sizes[i] * sizeof(float), cudaMemcpyDeviceToHost));
int output_size = batch_size * output_dim_sizes[i];
if (output_types[i] == TensorProto::FLOAT) {
Add comments for what we're doing here.
I see the TRT provider exclusions were removed in gemm_test.cc.
@@ -173,7 +174,7 @@ TEST(UpsampleOpTest, UpsampleOpNearest222XTest) {
   };

   test.AddOutput<float>("Y", {N*2, C, (int64_t)(H * scales[2]), (int64_t)(W * scales[3])}, Y);
-  test.Run(OpTester::ExpectResult::kExpectSuccess, "", {kTensorrtExecutionProvider});  //TensorRT parser: Assertion failed: scales[0] == 1 && scales[1] == 1
+  test.Run();  //TensorRT parser: Assertion failed: scales[0] == 1 && scales[1] == 1
Should these assertion-failure comments be removed if we are re-enabling the TensorrtExecutionProvider for these tests?
Same comment for the other tests.
Removed the comments.
OrtValue* output_tensor = ort.KernelContext_GetOutput(context, output_index, output_shapes[i].data(), output_shapes[i].size());
if (output_types[i] == TensorProto::FLOAT) {
  CHECK_CUDA(cudaMemcpy(ort.GetTensorMutableData<float>(output_tensor), buffers[i + num_binding_inputs], batch_size * output_dim_sizes[i] * sizeof(float), cudaMemcpyDeviceToHost));
  // If output tensor type is INT64, TensorRT processes data as INT32 and the output will be converted to INT64.
Move the comment to where the applicable code is.
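For context, here is a minimal self-contained sketch of the INT32-to-INT64 widening that the comment describes. It is an illustration only: in the PR the source is a CUDA device buffer and the destination is the ORT output tensor, and the function name here is hypothetical.

#include <cstdint>
#include <vector>

// TensorRT produces the data as INT32 (per the comment above), so an INT64 output
// is filled by widening each element; the conversion is lossless in this direction.
std::vector<int64_t> WidenToInt64(const std::vector<int32_t>& trt_output) {
  std::vector<int64_t> widened(trt_output.size());
  for (size_t j = 0; j < trt_output.size(); ++j) {
    widened[j] = trt_output[j];
  }
  return widened;
}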
ort.ReleaseTensorTypeAndShapeInfo(tensor_info);
const float* input = ort.GetTensorData<float>(input_tensor);

const int input_batch_size = tensor_shape[0];
if (i > 0 && batch_size != input_batch_size) {
  ORT_THROW("Input batch size is inconsistent");
I thought we should avoid throwing exceptions in the compute function?
I see other places in the Compile() function where ORT_ENFORCE is used instead of returning a Status. Please revisit.
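A minimal sketch of the pattern being suggested, using a hypothetical standalone helper rather than the PR's actual compute function: validate the inputs and report failure through the return value instead of throwing (via ORT_THROW or ORT_ENFORCE) on the compute path.

#include <cstdint>
#include <vector>

// Returns 0 on success and non-zero on failure instead of throwing when the
// per-input batch sizes disagree; the name and return convention are assumptions.
int CheckConsistentBatchSize(const std::vector<int64_t>& input_batch_sizes) {
  for (size_t i = 1; i < input_batch_sizes.size(); ++i) {
    if (input_batch_sizes[i] != input_batch_sizes[0]) {
      return 1;  // inconsistent batch size: let the caller handle the error
    }
  }
  return 0;  // all inputs agree
}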
  CHECK_CUDA(cudaMalloc(&buffers[i], input_size * sizeof(int32_t)));
  CHECK_CUDA(cudaMemcpy(buffers[i], input, input_size * sizeof(int32_t), cudaMemcpyHostToDevice));
} else {
  Status(common::ONNXRUNTIME, common::FAIL, "Input tensor type " + std::to_string(tensor_type) + " is not supported.");
Why allocate a Status()? It doesn't seem to be used anywhere.
Same comment for the other Status() allocations.
It looks like this compute_func() always returns 0, which doesn't seem correct. If there are errors, we need to return 1.
} else if (output_types[i] == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT32 || output_types[i] == ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64) {
  CHECK_CUDA(cudaMalloc(&buffers[i + num_binding_inputs], batch_size * output_dim_sizes[i] * sizeof(int32_t)));
} else {
  Status(common::ONNXRUNTIME, common::FAIL, "Output tensor type " + std::to_string(output_types[i]) + " is not supported.");
Same comment as above for Status().
  for (int j = 0; j < output_size; ++j) {
    ort.GetTensorMutableData<int64_t>(output_tensor)[j] = output[j];
  }
  delete[] output;
} else {
  Status(common::ONNXRUNTIME, common::FAIL, "Output type is not supported by TensorRT");
  Status(common::ONNXRUNTIME, common::FAIL, "Output tensor type " + std::to_string(output_types[i]) + " is not supported.");
Same Status() comment.
  }
  delete[] output;
} else {
  return 1;
Better to return a specific status code instead of 1 (same for other places): https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/common/status.h#L33
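A minimal sketch of what that could look like, assuming the compute function keeps an int return convention; the helper name is hypothetical, and the include path is assumed to be the status.h linked above, which defines the onnxruntime::common::StatusCode enum.

#include "core/common/status.h"  // assumed path; defines onnxruntime::common::StatusCode

// Instead of a bare return 1; (or constructing a Status object that is immediately
// discarded), return a StatusCode value so callers can tell failure kinds apart,
// e.g. FAIL here, or a more specific code such as INVALID_ARGUMENT.
int ReportUnsupportedTensorType() {
  return static_cast<int>(onnxruntime::common::StatusCode::FAIL);
}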
Looks like you'll need to merge master to pick up #1097.