Triton Inference Server Support for Jetson and JetPack

A release of Triton for JetPack 4.6.1 is provided as a tar file attached to the release notes.

Triton Inference Server support on JetPack includes:

Running models on GPU and NVDLA
Concurrent model execution
Dynamic batching
Model pipelines
Extensible backends
HTTP/REST and GRPC inference protocols
C API

Limitations on Jetson/JetPack:

The ONNX Runtime backend does not support the OpenVINO execution provider. The TensorRT execution provider, however, is supported.
The Python backend supports neither GPU tensors nor async BLS.
CUDA IPC (shared memory) is not supported. System shared memory, however, is supported.
GPU metrics, GCS storage, S3 storage and Azure storage are not supported.

Although the HTTP/REST and GRPC inference protocols are supported on JetPack, direct C API integration is recommended for edge use cases.

You can download the .tar files for Jetson from the Triton Inference Server release page in the "Jetson JetPack Support" section.

The .tar file contains the Triton server executable and shared libraries, as well as the C++ and Python client libraries and examples.
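As a rough sketch of unpacking the release, the helper below extracts a downloaded archive into a target directory and lists the top-level entries it produced. The function name `unpack_triton` is hypothetical, and the internal layout of the archive is not assumed here; the listing is only a quick way to see what was unpacked.

```shell
# Extract a release archive into a destination directory and list
# the top-level entries that were unpacked.
# unpack_triton is a hypothetical helper name, not part of Triton.
unpack_triton() {
    local archive="$1" dest="$2"
    if [ ! -f "$archive" ]; then
        echo "error: archive not found: $archive" >&2
        return 1
    fi
    # tar detects the compression format of the archive on extraction.
    mkdir -p "$dest" && tar -xf "$archive" -C "$dest" && ls "$dest"
}
```

A call might look like `unpack_triton ./tritonserver-jetpack.tgz /opt/tritonserver`, where the archive name is a placeholder for the file downloaded from the release page and the destination is an assumed install location.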

Installation and Usage

The following dependencies must be installed before building / running Triton server:

apt-get update && \
        apt-get install -y --no-install-recommends \
            software-properties-common \
            autoconf \
            automake \
            build-essential \
            git \
            libb64-dev \
            libre2-dev \
            libssl-dev \
            libtool \
            libboost-dev \
            rapidjson-dev \
            patchelf \
            pkg-config \
            libopenblas-dev \
            libarchive-dev \
            zlib1g-dev \
            python3 \
            python3-pip \
            python3-dev
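After installing the packages, it can help to confirm that the command-line tools they provide are actually on the PATH. The sketch below is one way to do that; `missing_tools` is a hypothetical helper, and the tool list is only a sample of the commands the packages above are expected to provide.

```shell
# Print the name of every given command that is not found on PATH.
# missing_tools is a hypothetical helper, not part of Triton or JetPack.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# Check a sample of the build tools; prints nothing when all are present.
missing_tools git autoconf automake patchelf pkg-config python3 pip3
```

Library packages such as libssl-dev do not install a command, so they are not covered by this check.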

Additional PyTorch dependencies:

apt-get -y install autoconf \
            bc \
            g++-8 \
            gcc-8 \
            clang-8 \
            lld-8

pip3 install --upgrade expecttest xmlrunner hypothesis aiohttp pyyaml scipy ninja typing_extensions protobuf

In addition to the above PyTorch dependencies, the PyTorch wheel corresponding to this release must also be installed:

pip3 install --upgrade https://developer.download.nvidia.com/compute/redist/jp/v461/pytorch/torch-1.11.0a0+17540c5+nv22.01-cp36-cp36m-linux_aarch64.whl

Note: The PyTorch backend depends on libomp.so, which is not loaded automatically. If you use the PyTorch backend in Triton, add the directory containing libomp.so to LD_LIBRARY_PATH before launching Triton, and export the variable so that the Triton process inherits it:

export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/llvm-8/lib"
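If this setting lives in a shell profile or a launch script that may be sourced more than once, the path can end up repeated. A minimal sketch of an idempotent variant follows; `add_lib_dir` is a hypothetical helper name, and the directory is the one given above.

```shell
# Append a directory to LD_LIBRARY_PATH unless it is already listed,
# then export the variable so child processes inherit it.
# add_lib_dir is a hypothetical helper name used only for this sketch.
add_lib_dir() {
    local dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}$dir" ;;
    esac
    export LD_LIBRARY_PATH
}

add_lib_dir /usr/lib/llvm-8/lib
```

The `${LD_LIBRARY_PATH:+...}` expansion avoids a leading colon when the variable starts out empty.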

Note: Building Triton on Jetson requires a recent version of cmake; we recommend cmake 3.21.0. The following script upgrades cmake to 3.21.0:

apt remove cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
      gpg --dearmor - | \
      tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
    apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' && \
    apt-get update && \
    apt-get install -y --no-install-recommends \
        cmake-data=3.21.0-0kitware1ubuntu18.04.1 cmake=3.21.0-0kitware1ubuntu18.04.1
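To confirm the result, the installed cmake version can be compared against 3.21.0. The sketch below assumes GNU `sort` with the `-V` (version sort) option is available, and `version_ge` is a hypothetical helper name.

```shell
# Succeed when version $1 is greater than or equal to version $2.
# version_ge is a hypothetical helper; it relies on sort -V.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Compare the installed cmake, if any, against the recommended 3.21.0.
if command -v cmake >/dev/null 2>&1; then
    # The first line of the output has the form "cmake version X.Y.Z".
    installed="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$installed" 3.21.0; then
        echo "cmake $installed is recent enough"
    else
        echo "cmake $installed is older than 3.21.0"
    fi
else
    echo "cmake is not installed"
fi
```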

Note: A core dump when using numpy 1.19.5 on Jetson is a known issue. To work around it, we recommend using numpy 1.19.4 or earlier.

To build / run the Triton client libraries and examples on Jetson, the following dependencies must also be installed:

apt-get install -y --no-install-recommends \
            curl \
            jq

pip3 install --upgrade wheel setuptools cython && \
    pip3 install --upgrade grpcio-tools numpy==1.19.4 future attrdict && \
    pip3 install --upgrade six requests flake8 flatbuffers pillow

Note: OpenCV 4.1.1 is installed as a part of JetPack. It is one of the dependencies for the client build.

Note: On Jetson, the backend directory must be explicitly specified using the --backend-directory flag. Triton defaults to TensorFlow 1.x; to use TensorFlow 2.x, a version string must be supplied through --backend-config.

tritonserver --model-repository=/path/to/model_repo --backend-directory=/path/to/tritonserver/backends \
             --backend-config=tensorflow,version=2
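Because both paths must be supplied by hand on Jetson, a small wrapper that checks them before launching can catch typos early. This is only a sketch: `start_triton` and the `TRITON_BIN` override are hypothetical names, and the flags are the ones shown in the command above.

```shell
# Launch tritonserver after checking that the model repository and the
# backend directory both exist.
# start_triton and TRITON_BIN are hypothetical names for this sketch;
# TRITON_BIN lets the caller point at a tritonserver binary outside PATH.
start_triton() {
    local repo="$1" backends="$2" dir
    for dir in "$repo" "$backends"; do
        if [ ! -d "$dir" ]; then
            echo "error: directory not found: $dir" >&2
            return 1
        fi
    done
    "${TRITON_BIN:-tritonserver}" \
        --model-repository="$repo" \
        --backend-directory="$backends" \
        --backend-config=tensorflow,version=2
}
```

It would be called as `start_triton /path/to/model_repo /path/to/tritonserver/backends`. Drop the `--backend-config` line if TensorFlow 1.x is the intended version.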

Note: perf_analyzer is supported on Jetson, but model_analyzer is currently not available for Jetson. To run perf_analyzer against the C API, use the CLI flag --service-kind=triton_c_api:

perf_analyzer -m graphdef_int32_int32_int32 --service-kind=triton_c_api \
    --triton-server-directory=/opt/tritonserver \
    --model-repository=/workspace/qa/L0_perf_analyzer_capi/models

Refer to these examples that demonstrate how to use Triton Inference Server on Jetson.
