A release of Triton for JetPack 4.6.1 is provided as a tar file attached to the release notes.
Triton Inference Server support on JetPack includes:
- Running models on GPU and NVDLA
- Concurrent model execution
- Dynamic batching
- Model pipelines
- Extensible backends
- HTTP/REST and GRPC inference protocols
- C API
Limitations on Jetson/JetPack:
- The ONNX Runtime backend does not support the OpenVINO execution provider. The TensorRT execution provider, however, is supported.
- The Python backend does not support GPU Tensors and Async BLS.
- CUDA IPC (shared memory) is not supported. System shared memory, however, is supported.
- GPU metrics, GCS storage, S3 storage and Azure storage are not supported.
Although the HTTP/REST and GRPC inference protocols are supported on JetPack, direct C API integration is recommended for edge use cases.
You can download the .tar files for Jetson from the Triton Inference Server release page in the "Jetson JetPack Support" section.
The .tar file contains the Triton server executable and shared libraries, as well as the C++ and Python client libraries and examples.
The following dependencies must be installed before building / running Triton server:
apt-get update && \
apt-get install -y --no-install-recommends \
software-properties-common \
autoconf \
automake \
build-essential \
git \
libb64-dev \
libre2-dev \
libssl-dev \
libtool \
libboost-dev \
rapidjson-dev \
patchelf \
pkg-config \
libopenblas-dev \
libarchive-dev \
zlib1g-dev \
python3 \
python3-pip \
python3-dev
Additional PyTorch dependencies:
apt-get -y install autoconf \
bc \
g++-8 \
gcc-8 \
clang-8 \
lld-8
pip3 install --upgrade expecttest xmlrunner hypothesis aiohttp pyyaml scipy ninja typing_extensions protobuf
In addition to the above PyTorch dependencies, the PyTorch wheel corresponding to this release must also be installed:
pip3 install --upgrade https://developer.download.nvidia.com/compute/redist/jp/v461/pytorch/torch-1.11.0a0+17540c5+nv22.01-cp36-cp36m-linux_aarch64.whl
Note: The PyTorch backend depends on libomp.so, which is not loaded automatically. If using the PyTorch backend in Triton, you need to set the LD_LIBRARY_PATH to allow libomp.so to be loaded as needed before launching Triton.
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/llvm-8/lib"
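If the launch environment is set up by a script that may run more than once, a small guard can keep the path from accumulating duplicate entries. The following is a minimal sketch, not part of Triton: the helper name add_lib_dir is hypothetical, and the directory is the one given in the note above.

```shell
# Hypothetical helper (not part of Triton): append a directory to
# LD_LIBRARY_PATH only if it is not already listed.
add_lib_dir() {
    _ald_dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":${_ald_dir}:"*) ;;   # already present, leave the path unchanged
        *) LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+${LD_LIBRARY_PATH}:}${_ald_dir}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Directory taken from the note above; run this before starting tritonserver.
add_lib_dir /usr/lib/llvm-8/lib
```

Exporting the variable inside the helper ensures that a tritonserver process started from the same shell inherits it.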
Note: When building Triton on Jetson, you will require a recent version of cmake. We recommend using cmake 3.21.0. Below is a script to upgrade your cmake version to 3.21.0.
apt remove cmake
wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
gpg --dearmor - | \
tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' && \
apt-get update && \
apt-get install -y --no-install-recommends \
cmake-data=3.21.0-0kitware1ubuntu18.04.1 cmake=3.21.0-0kitware1ubuntu18.04.1
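After the upgrade, it can be useful to confirm which cmake version is actually picked up from PATH. The following is a minimal sketch: the helper name cmake_version_from_output is hypothetical, and it assumes that the first line of the cmake --version output has the form "cmake version X.Y.Z".

```shell
# Hypothetical helper: extract the version number from `cmake --version`
# output supplied on stdin (first line has the form "cmake version X.Y.Z").
cmake_version_from_output() {
    sed -n '1s/^cmake version //p'
}

# Report the installed version if cmake is on PATH.
if command -v cmake >/dev/null 2>&1; then
    echo "cmake $(cmake --version | cmake_version_from_output) found"
else
    echo "cmake not found on PATH"
fi
```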
Note: Seeing a core dump when using numpy 1.19.5 on Jetson is a known issue. We recommend using numpy version 1.19.4 or earlier to work around this issue.
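A setup script can check for an affected numpy version before running anything that imports it. The following is a minimal sketch under stated assumptions: the helper names version_le and numpy_version_ok are hypothetical, and the comparison relies on the version sort (sort -V) provided by GNU coreutils.

```shell
# version_le A B: succeed when version A is less than or equal to version B.
# Relies on `sort -V` (version sort) from GNU coreutils.
version_le() {
    [ "$1" = "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" ]
}

# Hypothetical helper: succeed when the given numpy version is 1.19.4 or
# earlier, matching the recommendation in the note above.
numpy_version_ok() {
    version_le "$1" "1.19.4"
}
```

One way to use it is to pass in the version reported by python3 -c 'import numpy; print(numpy.__version__)' and print a warning when numpy_version_ok fails.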
To build / run the Triton client libraries and examples on Jetson, the following dependencies must also be installed.
apt-get install -y --no-install-recommends \
curl \
jq
pip3 install --upgrade wheel setuptools cython && \
pip3 install --upgrade grpcio-tools numpy==1.19.4 future attrdict
pip3 install --upgrade six requests flake8 flatbuffers pillow
Note: OpenCV 4.1.1 is installed as a part of JetPack. It is one of the dependencies for the client build.
Note: On Jetson, the backend directory must be explicitly specified using the --backend-directory flag. Triton defaults to using TensorFlow 1.x and a version string is required to use TensorFlow 2.x.
tritonserver --model-repository=/path/to/model_repo --backend-directory=/path/to/tritonserver/backends \
--backend-config=tensorflow,version=2
Note: perf_analyzer is supported on Jetson, while the model_analyzer is currently not available for Jetson. To execute perf_analyzer for C API, use the CLI flag --service-kind=triton_c_api:
perf_analyzer -m graphdef_int32_int32_int32 --service-kind=triton_c_api \
--triton-server-directory=/opt/tritonserver \
--model-repository=/workspace/qa/L0_perf_analyzer_capi/models
Refer to these examples that demonstrate how to use Triton Inference Server on Jetson.