Upgrade to CUDA10.2 for TensorRT #3084
Conversation
stevenlix commented on Feb 24, 2020
- Upgrade from CUDA 10.0 to CUDA 10.2, since TensorRT doesn't have a CUDA 10.1 package
- Check whether TensorRT subgraph inputs have shapes specified. If not, remind the user to run offline shape inference first
- Enable more unit tests for TensorRT as a result of bug fixes in the parser
- Provide an optimization profile for dynamic-shape inputs only
@@ -19,6 +19,8 @@ status = session_object.Load(model_file_name);
```
The C API details are [here](../C_API.md#c-api).
If certain operators in the model are not supported by TensorRT, ONNX Runtime partitions the graph and sends only the supported subgraphs to the TensorRT execution provider. Because TensorRT requires that all inputs of a subgraph have their shapes specified, ONNX Runtime will throw an error if any input shape is missing. In that case, please run shape inference on the model first using the script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/core/providers/nuphar/scripts/symbolic_shape_infer.py).
Can you add `## Symbolic Shape Inference` at line 21?
This will allow one to create a permalink to the section with the instructions.
ORT_THROW_IF_ERROR(ORT_MAKE_STATUS(ONNXRUNTIME, EP_FAIL,
                                   "TensorRT input: " + input_arg->Name() + " has no shape specified. " +
                                   "Please run shape inference on the onnx model first. Details can be found in " +
                                   "https://github.com/microsoft/onnxruntime/blob/master/docs/execution_providers/TensorRT-ExecutionProvider.md"));
If you add a `##` header as suggested above, you can provide a direct link to the exact section about the shape inference script, rather than to the entire TRT EP document.
We might consider returning FAIL instead of EP_FAIL here to make it a fatal error.
Otherwise, under Python, the runtime would retry with the CPU provider and succeed, and the user might not notice this error.
@@ -42,7 +42,7 @@ jobs:
    displayName: 'Generate cmake config'
    inputs:
      scriptPath: '$(Build.SourcesDirectory)\tools\ci_build\build.py'
-     arguments: '--config $(BuildConfig) --build_dir $(Build.BinariesDirectory) --skip_submodule_sync --build_shared_lib --update --cmake_generator "Visual Studio 16 2019" --msvc_toolset 14.16 --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="C:\local\TensorRT-7.0.0.11.cuda-10.0.cudnn7.6\TensorRT-7.0.0.11" --cuda_version=10.0 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0" --cudnn_home="C:\local\cudnn-10.0-windows10-x64-v7.6.5.32\cuda" --cmake_extra_defines CMAKE_SYSTEM_VERSION=10.0.18362.0'
+     arguments: '--config $(BuildConfig) --build_dir $(Build.BinariesDirectory) --skip_submodule_sync --build_shared_lib --update --cmake_generator "Visual Studio 16 2019" --msvc_toolset 14.16 --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="C:\local\TensorRT-7.0.0.11.cuda-10.2.cudnn7.6\TensorRT-7.0.0.11" --cuda_version=10.2 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --cudnn_home="C:\local\cudnn-10.2-windows10-x64-v7.6.5.32\cuda" --cmake_extra_defines CMAKE_SYSTEM_VERSION=10.0.18362.0'
If you update CUDA to 10.2, you'll need to add some code to build.py to update PATH to point to the CUDA 10.2 directory so it takes precedence over any other CUDA versions.
I suggest separating the CUDA 10.2 update into a different PR so it can undergo more testing.
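A minimal sketch of the reviewer's PATH suggestion (the function name is hypothetical, not an existing build.py helper): prepend the chosen CUDA installation's `bin` directory to PATH so it shadows any other CUDA versions already on the machine.

```python
# Hypothetical helper for build.py: make the requested CUDA install win
# over any other CUDA directories already on PATH.
import os

def prepend_cuda_to_path(cuda_home):
    # Entries earlier on PATH take precedence, so putting <cuda_home>/bin
    # first shadows older CUDA toolkits.
    cuda_bin = os.path.join(cuda_home, "bin")
    os.environ["PATH"] = cuda_bin + os.pathsep + os.environ.get("PATH", "")
    return cuda_bin

cuda_bin = prepend_cuda_to_path(
    r"C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2")
```

Note that `os.environ` changes only affect the current process and its children, which is sufficient when build.py itself launches the compiler and test steps.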
@@ -81,7 +81,7 @@ jobs:
    del wheel_filename_file
    python.exe -m pip install -q --upgrade %WHEEL_FILENAME%
    set PATH=$(Build.BinariesDirectory)\$(BuildConfig)\$(BuildConfig);%PATH%
-   python $(Build.SourcesDirectory)\tools\ci_build\build.py --config $(BuildConfig) --build_dir $(Build.BinariesDirectory) --skip_submodule_sync --build_shared_lib --test --cmake_generator "Visual Studio 16 2019" --msvc_toolset 14.16 --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="C:\local\TensorRT-7.0.0.11.cuda-10.0.cudnn7.6\TensorRT-7.0.0.11" --cuda_version=10.0 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.0" --cudnn_home="C:\local\cudnn-10.0-windows10-x64-v7.6.5.32\cuda" --cmake_extra_defines CMAKE_SYSTEM_VERSION=10.0.18362.0
+   python $(Build.SourcesDirectory)\tools\ci_build\build.py --config $(BuildConfig) --build_dir $(Build.BinariesDirectory) --skip_submodule_sync --build_shared_lib --test --cmake_generator "Visual Studio 16 2019" --msvc_toolset 14.16 --build_wheel --enable_onnx_tests --use_tensorrt --tensorrt_home="C:\local\TensorRT-7.0.0.11.cuda-10.2.cudnn7.6\TensorRT-7.0.0.11" --cuda_version=10.2 --cuda_home="C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.2" --cudnn_home="C:\local\cudnn-10.2-windows10-x64-v7.6.5.32\cuda" --cmake_extra_defines CMAKE_SYSTEM_VERSION=10.0.18362.0
`--msvc_toolset 14.16` is not needed when the CUDA version is >= 10.1.
@stevenlix, it looks like using the newer toolset runs into a build issue?
e.g.
Error C2127: 'getMatrixOp': illegal initialization of 'constexpr' entity with a non-constant expression
We/NVIDIA should fix these issues so we don't remain on the 14.16 MSVC toolset.
Please address the remaining review comments. Thanks!
Seems like a build error with the TensorRT parser update?