Fix TRT custom op issue #12283
Conversation
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux Nuphar CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline,ONNX Runtime Web CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline
Azure Pipelines successfully started running 10 pipeline(s).
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,onnxruntime-python-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed
Azure Pipelines successfully started running 6 pipeline(s).
ORT_MINIMAL_BUILD failed.
@stevenlix,
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux Nuphar CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline,ONNX Runtime Web CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,onnxruntime-python-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed
Azure Pipelines successfully started running 10 pipeline(s).
Azure Pipelines successfully started running 6 pipeline(s).
@@ -743,7 +743,11 @@ struct ProviderHostImpl : ProviderHost {
   void GraphViewer__operator_delete(GraphViewer* p) override { delete p; }
   std::unique_ptr<Model> GraphViewer__CreateModel(const GraphViewer* graph_viewer, const logging::Logger& logger) override {
     return std::make_unique<Model>(graph_viewer->Name(), true, ModelMetaData(), PathString(),
+#if !defined(ORT_MINIMAL_BUILD)
+                                   IOnnxRuntimeOpSchemaRegistryList({graph_viewer->GetSchemaRegistry()}), graph_viewer->DomainToVersionMap(),
please remove extra leading spaces
How many spaces should I keep? These are function arguments and are aligned with the args on Ln745.
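For illustration, the alignment question is roughly the following (a sketch with hypothetical names, not the committed code): continuation arguments are conventionally aligned with the opening parenthesis of the call, while preprocessor directives are conventionally placed at column 0, so the argument lines inside the #if block inherit the deep indentation of the surrounding call:

auto model = MakeModel(first_arg, second_arg,
#if !defined(SOME_FLAG)
                       full_featured_arg,   // aligned with the args above
#else
                       minimal_arg,         // same alignment in the fallback branch
#endif
                       last_arg);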
@@ -743,7 +743,11 @@ struct ProviderHostImpl : ProviderHost {
   void GraphViewer__operator_delete(GraphViewer* p) override { delete p; }
   std::unique_ptr<Model> GraphViewer__CreateModel(const GraphViewer* graph_viewer, const logging::Logger& logger) override {
     return std::make_unique<Model>(graph_viewer->Name(), true, ModelMetaData(), PathString(),
+#if !defined(ORT_MINIMAL_BUILD)
+                                   IOnnxRuntimeOpSchemaRegistryList({graph_viewer->GetSchemaRegistry()}), graph_viewer->DomainToVersionMap(),
+#else
+                                   IOnnxRuntimeOpSchemaRegistryList(), graph_viewer->DomainToVersionMap(),
please remove extra leading spaces
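Reading the two hunks together, the patched function plausibly reads as follows (a reconstruction from the hunks above; the closing #endif and the trailing constructor arguments are inferred, since the collapsed diff does not show them):

std::unique_ptr<Model> GraphViewer__CreateModel(const GraphViewer* graph_viewer, const logging::Logger& logger) override {
  return std::make_unique<Model>(graph_viewer->Name(), true, ModelMetaData(), PathString(),
#if !defined(ORT_MINIMAL_BUILD)
                                 // Full build: forward the graph's schema registry so custom op
                                 // schemas survive the GraphViewer -> Model conversion.
                                 IOnnxRuntimeOpSchemaRegistryList({graph_viewer->GetSchemaRegistry()}), graph_viewer->DomainToVersionMap(),
#else
                                 // Minimal build: schema registries are compiled out.
                                 IOnnxRuntimeOpSchemaRegistryList(), graph_viewer->DomainToVersionMap(),
#endif
                                 /* remaining arguments unchanged */ logger);
}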
/azp run Linux CPU CI Pipeline,Linux CPU Minimal Build E2E CI Pipeline,Linux GPU CI Pipeline,Linux GPU TensorRT CI Pipeline,Linux Nuphar CI Pipeline,Linux OpenVINO CI Pipeline,MacOS CI Pipeline,ONNX Runtime Web CI Pipeline,Windows CPU CI Pipeline,Windows GPU CI Pipeline
/azp run Windows GPU TensorRT CI Pipeline,onnxruntime-binary-size-checks-ci-pipeline,onnxruntime-python-checks-ci-pipeline,orttraining-linux-ci-pipeline,orttraining-linux-gpu-ci-pipeline,orttraining-ortmodule-distributed
Azure Pipelines successfully started running 10 pipeline(s).
Azure Pipelines successfully started running 6 pipeline(s).
* Pass schema registry on CreateModel.
* Fix ORT_MINIMAL_BUILD.
* Fix build issue.
* update package version
* Prevent unbounded growth of command allocator memory (#12114)
* Update supported ops md for NNAPI/CoreML EP (#12245)
  * update supported ops md
  * address pr comments
  * address pr comments
  * wording
* Change native folder name for java macos arm64 (#12335)
* Bump async from 2.6.3 to 2.6.4 in /js/react_native/e2e (#11280)
  Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4.
  - [Release notes](https://github.com/caolan/async/releases)
  - [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md)
  - [Commits](caolan/async@v2.6.3...v2.6.4)
  updated-dependencies:
  - dependency-name: async
    dependency-type: indirect
  Signed-off-by: dependabot[bot] <support@github.com>
  Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* [js/rn] upgrade dependencies for e2e test (#11863)
  * use JDK11 only for gradle
  * expand variable
* [js/rn] upgrade package react-native@^0.69.1 (#12155)
  * upgrade compile sdk to v31
  * update ios version requirement
  * update pod path for onnxruntime-react-native
* add missing build_java in Android testing stage. (#12187)
* Use specific Android NDK version in CI builds. (#12350)
  Current builds use an NDK version that happens to be on the build machine. The build machine environment may change in ways that are outside of our control. This change installs a specific version of NDK (the current LTS version 25.0.8775105) and uses it.
* Remove preview keyword from DirectML pacakge (#12368)
  Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
* Scope CreateFileMapping2 to valid API partitions (#12374)
* Fix TRT custom op issue (#12283)
  * Pass schema registry on CreateModel.
  * Fix ORT_MINIMAL_BUILD.
  * Fix build issue.
* Manually add optimization flag for Android Release builds. (#12390)
  With recent versions of NDK (since 23), the `-O` optimization level compile flag is not being passed when building in the "Release" configuration. More details here: android/ndk#1740
  Our "Release" Android builds have been built without the optimization flag since we upgraded from NDK 21. This change is a workaround to manually add `-O3` for "Release" Android builds.
* resolve conflicts in tensorRT related changes
* Enable support of multi-level nested control flow ops model for TRT EP (#12147)
  * Make multiple-level nested control flow op model work
  * find correct input index
  * find correct input index (cont.)
  * enable nested layer unit tests for TRT EP
  * add comment
  * add Scan op to current workaround support of control flow op

Co-authored-by: Jeff Bloomfield <38966965+jeffbloo@users.noreply.github.com>
Co-authored-by: Rachel Guo <35738743+YUNQIUGUO@users.noreply.github.com>
Co-authored-by: Changming Sun <chasun@microsoft.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Yulong Wang <7679871+fs-eire@users.noreply.github.com>
Co-authored-by: Yi Zhang <zhanyi@microsoft.com>
Co-authored-by: Edward Chen <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: sumitsays <sumitagarwal330@gmail.com>
Co-authored-by: Sumit Agarwal <sumitagarwal@microsoft.com>
Co-authored-by: Justin Stoecker <justoeck@microsoft.com>
Co-authored-by: Yateng Hong <yatengh@microsoft.com>
Co-authored-by: Chi Lo <54722500+chilo-ms@users.noreply.github.com>
Description:
Fix issue #12282: the TRT EP fails to create a model session with a CUDA custom op.
Motivation and Context
Why is this change required? What problem does it solve?
When executing a model with a custom op, ORT crashes in the TRT EP's GetSupportedList. Inside this function, a graph builder is created first, and then subgraphs are constructed:
onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
Lines 721 to 722 in f2533d3
However, the model created from the graph viewer loses the schema registry information for custom ops, which causes an exception at the line below:
onnxruntime/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc
Line 763 in f2533d3
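To make the failure mode concrete, here is a minimal sketch of the triggering scenario from the user's side, assuming the standard ORT C++ API of that era; the model file name is a placeholder, and the custom op object itself (an Ort::CustomOpBase subclass with a CUDA kernel) is elided:

#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "trt_custom_op_repro");
  Ort::SessionOptions session_options;

  // Register a custom op domain; the op's schema lives in the session's
  // schema registry. The pointer below stands in for &your_op_instance.
  const OrtCustomOp* my_custom_op = nullptr;  // placeholder
  Ort::CustomOpDomain domain("my.domain");
  domain.Add(my_custom_op);
  session_options.Add(domain);

  // Enable the TensorRT EP. Before this fix, session creation crashed in the
  // TRT EP's GetSupportedList, because the sub-model rebuilt from the graph
  // viewer carried no schema registry and the custom op schema was unknown.
  OrtTensorRTProviderOptions trt_options{};
  trt_options.device_id = 0;
  session_options.AppendExecutionProvider_TensorRT(trt_options);

  Ort::Session session(env, ORT_TSTR("model_with_custom_op.onnx"), session_options);
  return 0;
}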
The fix is to pass the schema registries inside GraphViewer__CreateModel.
If it fixes an open issue, please link to the issue here.
TRT EP failed to create model session with CUDA custom op #12282