Triton Inference Server is officially supported on JetPack starting from JetPack 4.6. Triton Inference Server on Jetson supports trained AI models from multiple frameworks, including NVIDIA TensorRT, TensorFlow, and ONNX Runtime.
Although the HTTP/REST and GRPC inference protocols are supported on JetPack, direct C API integration is recommended for edge use cases; a minimal sketch of the C API approach is shown after the feature list below.
Triton Inference Server support on JetPack includes:
- Running models on GPU and NVDLA
- Support for multiple frameworks: TensorRT, TensorFlow, and ONNX Runtime
- Concurrent model execution
- Dynamic batching
- Model pipelines
- Extensible backends
- HTTP/REST and GRPC inference protocols
- C API
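
As a quick orientation for the C API route, here is a minimal sketch (in C++) of starting Triton in-process via tritonserver.h, linking against the libtritonserver.so shipped in the .tar file, and checking liveness and readiness. The model repository path and backend directory used here are placeholder assumptions, and inference request construction is omitted; refer to the shipped examples for a complete program.

```cpp
// Minimal in-process server sketch using the Triton C API.
// Adjust the include path to wherever tritonserver.h is installed
// from the extracted .tar file.
#include <cstdlib>
#include <iostream>

#include "tritonserver.h"

// Print the Triton error message and exit if a call fails.
static void
Check(TRITONSERVER_Error* err, const char* what)
{
  if (err != nullptr) {
    std::cerr << what << ": " << TRITONSERVER_ErrorMessage(err) << std::endl;
    TRITONSERVER_ErrorDelete(err);
    std::exit(1);
  }
}

int
main()
{
  // The model repository path and backend directory below are placeholders;
  // point them at your own repository and installation.
  TRITONSERVER_ServerOptions* options = nullptr;
  Check(TRITONSERVER_ServerOptionsNew(&options), "creating server options");
  Check(
      TRITONSERVER_ServerOptionsSetModelRepositoryPath(
          options, "/path/to/model_repository"),
      "setting model repository path");
  Check(
      TRITONSERVER_ServerOptionsSetBackendDirectory(
          options, "/opt/tritonserver/backends"),
      "setting backend directory");

  // Create the in-process server; the options can be released afterwards.
  TRITONSERVER_Server* server = nullptr;
  Check(TRITONSERVER_ServerNew(&server, options), "creating server");
  Check(TRITONSERVER_ServerOptionsDelete(options), "deleting server options");

  // Confirm the server is live and ready before constructing and submitting
  // TRITONSERVER_InferenceRequest objects (omitted in this sketch).
  bool live = false;
  bool ready = false;
  Check(TRITONSERVER_ServerIsLive(server, &live), "checking liveness");
  Check(TRITONSERVER_ServerIsReady(server, &ready), "checking readiness");
  std::cout << "live=" << live << " ready=" << ready << std::endl;

  Check(TRITONSERVER_ServerStop(server), "stopping server");
  Check(TRITONSERVER_ServerDelete(server), "deleting server");
  return 0;
}
```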
You can download the .tar files for Jetson from the "Jetson JetPack Support" section of the Triton Inference Server release page. The .tar file contains the Triton executables and shared libraries, as well as the C++ and Python client libraries and examples.
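
For instance, a short C++ HTTP client built against the shipped http_client.h might look like the sketch below. The model name simple_fp32, the tensor names INPUT0/OUTPUT0, and the [1, 16] FP32 shape are hypothetical; substitute the values from your own model configuration.

```cpp
// Minimal HTTP client sketch using the C++ client library from the .tar.
// The model name, tensor names, shape, and datatype are hypothetical;
// substitute the values from your own model configuration.
#include <cstdlib>
#include <iostream>
#include <memory>
#include <string>
#include <vector>

#include "http_client.h"

namespace tc = triton::client;

// Print a message and exit if a client call fails.
static void
Check(const tc::Error& err, const std::string& what)
{
  if (!err.IsOk()) {
    std::cerr << what << ": " << err << std::endl;
    std::exit(1);
  }
}

int
main()
{
  // Connect to a Triton instance serving HTTP on the default port.
  std::unique_ptr<tc::InferenceServerHttpClient> client;
  Check(
      tc::InferenceServerHttpClient::Create(&client, "localhost:8000"),
      "creating client");

  // Describe a single FP32 input of shape [1, 16] for a hypothetical
  // model named "simple_fp32".
  std::vector<float> input_data(16, 1.0f);
  tc::InferInput* input = nullptr;
  Check(
      tc::InferInput::Create(&input, "INPUT0", {1, 16}, "FP32"),
      "creating input");
  Check(
      input->AppendRaw(
          reinterpret_cast<const uint8_t*>(input_data.data()),
          input_data.size() * sizeof(float)),
      "setting input data");

  // Request one output tensor by name.
  tc::InferRequestedOutput* output = nullptr;
  Check(tc::InferRequestedOutput::Create(&output, "OUTPUT0"), "creating output");

  std::vector<tc::InferInput*> inputs{input};
  std::vector<const tc::InferRequestedOutput*> outputs{output};

  // Run inference and read back the raw output bytes.
  tc::InferOptions options("simple_fp32");
  tc::InferResult* result = nullptr;
  Check(client->Infer(&result, options, inputs, outputs), "running inference");

  const uint8_t* buf = nullptr;
  size_t byte_size = 0;
  Check(result->RawData("OUTPUT0", &buf, &byte_size), "reading output");
  std::cout << "received " << byte_size << " bytes for OUTPUT0" << std::endl;

  delete result;
  delete input;
  delete output;
  return 0;
}
```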
Note that perf_analyzer is supported on Jetson, while model_analyzer is currently not available for Jetson. To run perf_analyzer against the C API, include the --service-kind=triton_c_api option:
```shell
perf_analyzer -m graphdef_int32_int32_int32 --service-kind=triton_c_api --triton-server-directory=/opt/tritonserver --model-repository=/workspace/qa/L0_perf_analyzer_capi/models
```
Refer to these examples, which demonstrate how to use Triton Inference Server on Jetson.