Update main post-23.05 release (#5880)
* Update README and versions for 23.05 branch

* Changes to support 23.05 (#5782)

* Update python and conda version

* Update CMAKE installation

* Update checksum version

* Update ubuntu base image to 22.04

* Use ORT 1.15.0

* Set CMAKE to pull latest version

* Update libre2 package version

* Removing unused argument

* Adding condition for ubuntu 22.04

* Removing installation of the package from the devel container

* Nnshah1 u22.04 (#5770)

* Update CMAKE installation

* Update python and conda version

* Update CMAKE installation

* Update checksum version

* Update ubuntu base image to 22.04

* updating versions for ubuntu 22.04

* remove re2

---------

Co-authored-by: Neelay Shah <neelays@neelays-dt.nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>

* Set ONNX version to 1.13.0

* Fix L0_custom_ops for ubuntu 22.04 (#5775)

* add back rapidjson-dev

---------

Co-authored-by: Neelay Shah <neelays@neelays-dt.nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: nv-kmcgill53 <101670481+nv-kmcgill53@users.noreply.github.com>

* Fix L0_mlflow (#5805)

* working thread

* remove default install of blinker

* merge issue fixed

* Fix L0_backend_python/env test (#5799)

* Fix L0_backend_python/env test

* Address comment

* Update the copyright

* Fix up

* Fix L0_http_fuzz (#5776)

* installing python 3.8.16 for test

* spelling

Co-authored-by: Neelay Shah <neelays@nvidia.com>

* use util functions to install python3.8 in an easier way

---------

Co-authored-by: Neelay Shah <neelays@nvidia.com>

* Update Windows versions for 23.05 release (#5826)

* Rename Ubuntu 20.04 mentions to 22.04 (#5849)

* Update DCGM version (#5856)

* Update DCGM version (#5857)

* downgrade DCGM version to 2.4.7 (#5860)

* Updating link for latest release notes to 23.05

---------

Co-authored-by: Neelay Shah <neelays@neelays-dt.nvidia.com>
Co-authored-by: Neelay Shah <neelays@nvidia.com>
Co-authored-by: nv-kmcgill53 <101670481+nv-kmcgill53@users.noreply.github.com>
Co-authored-by: Iman Tabrizian <iman.tabrizian@gmail.com>
5 people committed Jun 1, 2023
1 parent b9e8125 commit e0e2c1a
Showing 39 changed files with 235 additions and 199 deletions.
25 changes: 13 additions & 12 deletions Dockerfile.QA
@@ -62,13 +62,15 @@ RUN apt-get update && \
 RUN pip3 install --upgrade pip && \
     pip3 install --upgrade wheel setuptools
 
-RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
-    gpg --dearmor - | \
-    tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
-    apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main' && \
+RUN apt update && apt install -y gpg wget && \
+    wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
+    gpg --dearmor - | \
+    tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && \
+    . /etc/os-release && \
+    echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | \
+    tee /etc/apt/sources.list.d/kitware.list >/dev/null && \
     apt-get update && \
-    apt-get install -y --no-install-recommends \
-        cmake-data=3.25.2-0kitware1ubuntu20.04.1 cmake=3.25.2-0kitware1ubuntu20.04.1
+    apt-get install -y --no-install-recommends cmake cmake-data
 
 # Add inception_graphdef model to example repo
 WORKDIR /workspace/docs/examples/model_repository
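The new block above replaces the deprecated `/etc/apt/trusted.gpg.d` setup with a dedicated keyring plus a `signed-by` sources entry, and derives the apt suite from `/etc/os-release` instead of hard-coding `focal`. A minimal standalone sketch of that codename resolution (the function name, demo file, and demo values are hypothetical, not part of the repo):

```shell
# Hypothetical demo: build the Kitware sources.list line the way the new RUN
# step does, by sourcing an os-release style file to obtain UBUNTU_CODENAME.
kitware_repo_line() {
    # $1: path to an os-release style file
    . "$1"
    echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main"
}

# On Ubuntu 22.04, /etc/os-release carries UBUNTU_CODENAME=jammy.
printf 'VERSION_ID="22.04"\nUBUNTU_CODENAME=jammy\n' > /tmp/os-release-demo
kitware_repo_line /tmp/os-release-demo
```

Because the codename comes from the image itself, the same `RUN` step works unmodified on focal- and jammy-based base images.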
@@ -294,12 +296,16 @@ RUN if [ $(cat /etc/os-release | grep 'VERSION_ID="20.04"' | wc -l) -ne 0 ]; then \
     apt-get update && \
     apt-get install -y --no-install-recommends \
         libpng-dev; \
+    elif [ $(cat /etc/os-release | grep 'VERSION_ID="22.04"' | wc -l) -ne 0 ]; then \
+    apt-get update && \
+    apt-get install -y --no-install-recommends \
+        libpng-dev; \
     elif [ $(cat /etc/os-release | grep 'VERSION_ID="18.04"' | wc -l) -ne 0 ]; then \
     apt-get update && \
     apt-get install -y --no-install-recommends \
         libpng-dev; \
     else \
-    echo "Ubuntu version must be either 18.04 or 20.04" && \
+    echo "Ubuntu version must be either 18.04, 20.04 or 22.04" && \
     exit 1; \
     fi
 
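The chain above branches on `VERSION_ID` in `/etc/os-release`. A standalone sketch of that guard (the function name and demo files are hypothetical):

```shell
# Return 0 if the given os-release file names a supported Ubuntu release,
# mirroring the grep-based VERSION_ID checks in the Dockerfile above.
supported_ubuntu() {
    for v in 18.04 20.04 22.04; do
        if [ "$(grep -c "VERSION_ID=\"$v\"" "$1")" -ne 0 ]; then
            return 0
        fi
    done
    echo "Ubuntu version must be either 18.04, 20.04 or 22.04" >&2
    return 1
}

printf 'NAME="Ubuntu"\nVERSION_ID="22.04"\n' > /tmp/os-release-qa
supported_ubuntu /tmp/os-release-qa && echo "supported"
```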
@@ -333,11 +339,6 @@ RUN pip3 install --upgrade wheel setuptools && \
     pip3 install --upgrade numpy pillow attrdict future grpcio requests gsutil \
             awscli six grpcio-channelz prettytable virtualenv
 
-# L0_http_fuzz is hitting similar issue with boofuzz with latest version (0.4.0):
-# https://github.com/jtpereyda/boofuzz/issues/529
-# Hence, fixing the boofuzz version to 0.3.0
-RUN pip3 install 'boofuzz==0.3.0'
-
 # go needed for example go client test.
 RUN if [ "$TARGETPLATFORM" = "linux/arm64" ]; then \
     wget https://golang.org/dl/go1.19.1.linux-arm64.tar.gz && \
36 changes: 14 additions & 22 deletions Dockerfile.sdk
@@ -29,21 +29,20 @@
 #
 
 # Base image on the minimum Triton container
-ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.04-py3-min
+ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:23.05-py3-min
 
 ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
 ARG TRITON_COMMON_REPO_TAG=main
 ARG TRITON_CORE_REPO_TAG=main
 ARG TRITON_BACKEND_REPO_TAG=main
 ARG TRITON_THIRD_PARTY_REPO_TAG=main
 ARG TRITON_MODEL_ANALYZER_REPO_TAG=main
-ARG CMAKE_UBUNTU_VERSION=20.04
 ARG TRITON_ENABLE_GPU=ON
 ARG JAVA_BINDINGS_MAVEN_VERSION=3.8.4
 ARG JAVA_BINDINGS_JAVACPP_PRESETS_TAG=1.5.8
 
 # DCGM version to install for Model Analyzer
-ARG DCGM_VERSION=2.2.9
+ARG DCGM_VERSION=2.4.7
 
 ARG NVIDIA_TRITON_SERVER_SDK_VERSION=unknown
 ARG NVIDIA_BUILD_ID=unknown
@@ -87,25 +86,18 @@ RUN apt-get update && \
     pip3 install --upgrade grpcio-tools && \
     pip3 install --upgrade pip
 
-ARG CMAKE_UBUNTU_VERSION
 # Client build requires recent version of CMake (FetchContent required)
-RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
-    gpg --dearmor - | \
-    tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
-    if [ "$CMAKE_UBUNTU_VERSION" = "20.04" ]; then \
-      apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main' && \
-      apt-get update && \
-      apt-get install -y --no-install-recommends \
-        cmake-data=3.25.2-0kitware1ubuntu20.04.1 cmake=3.25.2-0kitware1ubuntu20.04.1; \
-    elif [ "$CMAKE_UBUNTU_VERSION" = "18.04" ]; then \
-      apt-add-repository 'deb https://apt.kitware.com/ubuntu/ bionic main' && \
-      apt-get update && \
-      apt-get install -y --no-install-recommends \
-        cmake-data=3.18.4-0kitware1 cmake=3.18.4-0kitware1; \
-    else \
-      echo "ERROR: Only support CMAKE_UBUNTU_VERSION to be 18.04 or 20.04" && false; \
-    fi && \
-    cmake --version
+# Using CMAKE installation instruction from:: https://apt.kitware.com/
+RUN apt update && apt install -y gpg wget && \
+    wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
+    gpg --dearmor - | \
+    tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && \
+    . /etc/os-release && \
+    echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | \
+    tee /etc/apt/sources.list.d/kitware.list >/dev/null && \
+    apt-get update && \
+    apt-get install -y --no-install-recommends cmake cmake-data && \
+    cmake --version
 
 # Build expects "python" executable (not python3).
 RUN rm -f /usr/bin/python && \
@@ -224,7 +216,7 @@ RUN pip3 install --upgrade numpy pillow attrdict && \
 RUN if [ "$TRITON_ENABLE_GPU" = "ON" ]; then \
     [ "$(uname -m)" != "x86_64" ] && arch="sbsa" || arch="x86_64" && \
     curl -o /tmp/cuda-keyring.deb \
-      https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/$arch/cuda-keyring_1.0-1_all.deb \
+      https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/$arch/cuda-keyring_1.0-1_all.deb \
     && apt install /tmp/cuda-keyring.deb && rm /tmp/cuda-keyring.deb && \
     apt-get update && apt-get install -y datacenter-gpu-manager=1:${DCGM_VERSION}; \
     fi
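The `uname -m` test in the hunk above selects the repo path for the keyring package; NVIDIA publishes `sbsa` packages for ARM servers and `x86_64` packages otherwise. A standalone sketch of that mapping (the function name is hypothetical):

```shell
# Map a machine name to the arch segment of NVIDIA's CUDA repo URLs,
# as the Dockerfile's inline test does.
arch_for_cuda_repo() {
    [ "$1" != "x86_64" ] && echo "sbsa" || echo "x86_64"
}

# Build the keyring URL for the current machine.
echo "https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/$(arch_for_cuda_repo "$(uname -m)")/cuda-keyring_1.0-1_all.deb"
```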
16 changes: 8 additions & 8 deletions Dockerfile.win10.min
@@ -71,7 +71,7 @@ WORKDIR /
 #
 # Installing Vcpkg
 #
-ARG VCPGK_VERSION=2023.02.24
+ARG VCPGK_VERSION=2022.11.14
 RUN git clone --single-branch --depth=1 -b %VCPGK_VERSION% https://github.com/microsoft/vcpkg.git
 WORKDIR /vcpkg
 RUN bootstrap-vcpkg.bat
@@ -103,9 +103,9 @@ LABEL CMAKE_VERSION=${CMAKE_VERSION}
 #
 # Installing CUDA
 #
-ARG CUDA_MAJOR=11
-ARG CUDA_MINOR=8
-ARG CUDA_PATCH=0
+ARG CUDA_MAJOR=12
+ARG CUDA_MINOR=1
+ARG CUDA_PATCH=1
 ARG CUDA_VERSION=${CUDA_MAJOR}.${CUDA_MINOR}.${CUDA_PATCH}
 ARG CUDA_PACKAGES="nvcc_${CUDA_MAJOR}.${CUDA_MINOR} \
         cudart_${CUDA_MAJOR}.${CUDA_MINOR} \
@@ -135,8 +135,8 @@ LABEL CUDA_VERSION="${CUDA_VERSION}"
 #
 # Installing Tensorrt
 #
-ARG TENSORRT_VERSION=8.5.3.1
-ARG TENSORRT_ZIP="TensorRT-${TENSORRT_VERSION}.Windows10.x86_64.cuda-11.8.zip"
+ARG TENSORRT_VERSION=8.6.1.6
+ARG TENSORRT_ZIP="TensorRT-${TENSORRT_VERSION}.Windows10.x86_64.cuda-12.0.zip"
 ARG TENSORRT_SOURCE=${TENSORRT_ZIP}
 # COPY ${TENSORRT_ZIP} /tmp/${TENSORRT_ZIP}
 ADD ${TENSORRT_SOURCE} /tmp/${TENSORRT_ZIP}
@@ -153,8 +153,8 @@ LABEL TENSORRT_VERSION="${TENSORRT_VERSION}"
 #
 # Installing CUDNN
 #
-ARG CUDNN_VERSION=8.8.1.3
-ARG CUDNN_ZIP=cudnn-windows-x86_64-${CUDNN_VERSION}_cuda11-archive.zip
+ARG CUDNN_VERSION=8.9.1.23
+ARG CUDNN_ZIP=cudnn-windows-x86_64-${CUDNN_VERSION}_cuda12-archive.zip
 ARG CUDNN_SOURCE=${CUDNN_ZIP}
 
 ADD ${CUDNN_SOURCE} /tmp/${CUDNN_ZIP}
10 changes: 5 additions & 5 deletions README.md
@@ -32,8 +32,8 @@
 
 **LATEST RELEASE: You are currently on the main branch which tracks
 under-development progress towards the next release. The current release is
-version [2.33.0](https://github.com/triton-inference-server/server/tree/r23.04)
-and corresponds to the 23.04 container release on
+version [2.34.0](https://github.com/triton-inference-server/server/tree/r23.05)
+and corresponds to the 23.05 container release on
 [NVIDIA GPU Cloud (NGC)](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/tritonserver).**
 
 ----
@@ -88,16 +88,16 @@ Inference Server with the
 
 ```bash
 # Step 1: Create the example model repository
-git clone -b r23.04 https://github.com/triton-inference-server/server.git
+git clone -b r23.05 https://github.com/triton-inference-server/server.git
 cd server/docs/examples
 ./fetch_models.sh
 
 # Step 2: Launch triton from the NGC Triton container
-docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:23.04-py3 tritonserver --model-repository=/models
+docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:23.05-py3 tritonserver --model-repository=/models
 
 # Step 3: Sending an Inference Request
 # In a separate console, launch the image_client example from the NGC Triton SDK container
-docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.04-py3-sdk
+docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:23.05-py3-sdk
 /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
 
 # Inference should return the following
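Between Step 2 and Step 3 of the quickstart above it can help to wait until the server reports ready: Triton exposes `/v2/health/ready` on its default HTTP port 8000, which returns 200 once the server is up. A small polling helper (the helper itself is illustrative, not part of the repo):

```shell
# Poll a probe command until it succeeds or the retry budget runs out.
wait_ready() {
    probe="$1"; tries="${2:-30}"; i=0
    until $probe; do
        i=$((i + 1))
        if [ "$i" -ge "$tries" ]; then
            return 1
        fi
        sleep 1
    done
    return 0
}

# Typical use against the container started in Step 2:
# wait_ready "curl -sf localhost:8000/v2/health/ready" 60
```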
35 changes: 19 additions & 16 deletions build.py
@@ -67,14 +67,15 @@
 # incorrectly load the other version of the openvino libraries.
 #
 TRITON_VERSION_MAP = {
+
     '2.35.0dev': (
         '23.06dev',  # triton container
-        '23.04',  # upstream container
-        '1.14.1',  # ORT
+        '23.05',  # upstream container
+        '1.15.0',  # ORT
         '2022.1.0',  # ORT OpenVINO
         '2022.1.0',  # Standalone OpenVINO
-        '2.2.9',  # DCGM version
-        'py38_4.12.0')  # Conda version.
+        '2.4.7',  # DCGM version
+        'py310_23.1.0-1')  # Conda version.
 }
 
 CORE_BACKENDS = ['ensemble']
@@ -830,7 +831,7 @@ def install_dcgm_libraries(dcgm_version, target_machine):
 ENV DCGM_VERSION {}
 # Install DCGM. Steps from https://developer.nvidia.com/dcgm#Downloads
 RUN curl -o /tmp/cuda-keyring.deb \
-    https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/sbsa/cuda-keyring_1.0-1_all.deb \
+    https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/sbsa/cuda-keyring_1.0-1_all.deb \
     && apt install /tmp/cuda-keyring.deb && rm /tmp/cuda-keyring.deb && \
     apt-get update && apt-get install -y datacenter-gpu-manager=1:{}
 '''.format(dcgm_version, dcgm_version)
@@ -839,7 +840,7 @@ def install_dcgm_libraries(dcgm_version, target_machine):
 ENV DCGM_VERSION {}
 # Install DCGM. Steps from https://developer.nvidia.com/dcgm#Downloads
 RUN curl -o /tmp/cuda-keyring.deb \
-    https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-keyring_1.0-1_all.deb \
+    https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.0-1_all.deb \
     && apt install /tmp/cuda-keyring.deb && rm /tmp/cuda-keyring.deb && \
     apt-get update && apt-get install -y datacenter-gpu-manager=1:{}
 '''.format(dcgm_version, dcgm_version)
@@ -857,9 +858,9 @@ def install_miniconda(conda_version, target_machine):
                 .format(FLAGS.version))
     miniconda_url = f"https://repo.anaconda.com/miniconda/Miniconda3-{conda_version}-Linux-{target_machine}.sh"
     if target_machine == 'x86_64':
-        sha_sum = "3190da6626f86eee8abf1b2fd7a5af492994eb2667357ee4243975cdbb175d7a"
+        sha_sum = "32d73e1bc33fda089d7cd9ef4c1be542616bd8e437d1f77afeeaf7afdb019787"
     else:
-        sha_sum = "0c20f121dc4c8010032d64f8e9b27d79e52d28355eb8d7972eafc90652387777"
+        sha_sum = "80d6c306b015e1e3b01ea59dc66c676a81fa30279bc2da1f180a7ef7b2191d6e"
     return f'''
 RUN mkdir -p /opt/
 RUN wget "{miniconda_url}" -O miniconda.sh -q && \
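The `sha_sum` values updated above pin the Miniconda installer download; the generated Dockerfile presumably feeds them into a `sha256sum --check` style gate. A standalone sketch of that verification (function and file names are illustrative):

```shell
# Verify a downloaded file against a pinned SHA-256 digest, failing closed.
verify_sha256() {
    # $1: file path, $2: expected hex digest
    echo "$2  $1" | sha256sum --check --status
}

# Demo with a tiny stand-in file; sha256("hello") is a well-known test vector.
printf 'hello' > /tmp/installer-demo.bin
verify_sha256 /tmp/installer-demo.bin \
    2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824 \
    && echo "checksum OK"
```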
@@ -945,13 +946,15 @@ def create_dockerfile_buildbase(ddir, dockerfile_name, argmap):
     mv /tmp/boost_1_80_0/boost /usr/include/boost
 # Server build requires recent version of CMake (FetchContent required)
-RUN wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
-    gpg --dearmor - | \
-    tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null && \
-    apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main' && \
+RUN apt update && apt install -y gpg wget && \
+    wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null | \
+    gpg --dearmor - | \
+    tee /usr/share/keyrings/kitware-archive-keyring.gpg >/dev/null && \
+    . /etc/os-release && \
+    echo "deb [signed-by=/usr/share/keyrings/kitware-archive-keyring.gpg] https://apt.kitware.com/ubuntu/ $UBUNTU_CODENAME main" | \
+    tee /etc/apt/sources.list.d/kitware.list >/dev/null && \
     apt-get update && \
-    apt-get install -y --no-install-recommends \
-        cmake-data=3.25.2-0kitware1ubuntu20.04.1 cmake=3.25.2-0kitware1ubuntu20.04.1
+    apt-get install -y --no-install-recommends cmake cmake-data
 '''

if FLAGS.enable_gpu:
@@ -1136,7 +1139,7 @@ def dockerfile_prepare_container_linux(argmap, backends, enable_gpu,
             software-properties-common \
             libb64-0d \
             libcurl4-openssl-dev \
-            libre2-5 \
+            libre2-9 \
             git \
             gperf \
             dirmngr \
@@ -1314,7 +1317,7 @@ def create_build_dockerfiles(container_build_dir, images, backends, repoagents,
         base_image = 'nvcr.io/nvidia/tritonserver:{}-py3-min'.format(
             FLAGS.upstream_container_version)
     else:
-        base_image = 'ubuntu:20.04'
+        base_image = 'ubuntu:22.04'
 
     dockerfileargmap = {
         'NVIDIA_BUILD_REF':
2 changes: 1 addition & 1 deletion compose.py
@@ -434,7 +434,7 @@ def create_argmap(images, skip_pull):
"nvcr.io/nvidia/tritonserver:{}-cpu-only-py3".format(
FLAGS.container_version),
"min":
-        "ubuntu:20.04"
+        "ubuntu:22.04"
}
fail_if(
len(images) < 2,
2 changes: 1 addition & 1 deletion deploy/aws/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.04-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.05-py3
pullPolicy: IfNotPresent
modelRepositoryPath: s3://triton-inference-server-repository/model_repository
numGpus: 1
2 changes: 1 addition & 1 deletion deploy/fleetcommand/Chart.yaml
@@ -26,7 +26,7 @@

apiVersion: v1
# appVersion is the Triton version; update when changing release
-appVersion: "2.33.0"
+appVersion: "2.34.0"
description: Triton Inference Server (Fleet Command)
name: triton-inference-server
# version is the Chart version; update when changing anything in the chart
6 changes: 3 additions & 3 deletions deploy/fleetcommand/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.04-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.05-py3
pullPolicy: IfNotPresent
numGpus: 1
serverCommand: tritonserver
@@ -46,13 +46,13 @@ image:
# Model Control Mode (Optional, default: none)
#
# To set model control mode, uncomment and configure below
-# See https://github.com/triton-inference-server/server/blob/r23.04/docs/model_management.md
+# See https://github.com/triton-inference-server/server/blob/r23.05/docs/model_management.md
# for more details
#- --model-control-mode=explicit|poll|none
#
# Additional server args
#
-# see https://github.com/triton-inference-server/server/blob/r23.04/README.md
+# see https://github.com/triton-inference-server/server/blob/r23.05/README.md
# for more details

service:
2 changes: 1 addition & 1 deletion deploy/gcp/values.yaml
@@ -27,7 +27,7 @@
replicaCount: 1

image:
-  imageName: nvcr.io/nvidia/tritonserver:23.04-py3
+  imageName: nvcr.io/nvidia/tritonserver:23.05-py3
pullPolicy: IfNotPresent
modelRepositoryPath: gs://triton-inference-server-repository/model_repository
numGpus: 1
@@ -33,7 +33,7 @@ metadata:
namespace: default
spec:
containers:
-    - image: nvcr.io/nvidia/tritonserver:23.04-py3-sdk
+    - image: nvcr.io/nvidia/tritonserver:23.05-py3-sdk
imagePullPolicy: Always
name: nv-triton-client
securityContext:
4 changes: 2 additions & 2 deletions deploy/gke-marketplace-app/server-deployer/build_and_push.sh
@@ -27,8 +27,8 @@
export REGISTRY=gcr.io/$(gcloud config get-value project | tr ':' '/')
export APP_NAME=tritonserver
export MAJOR_VERSION=2.33
-export MINOR_VERSION=2.33.0
-export NGC_VERSION=23.04-py3
+export MINOR_VERSION=2.34.0
+export NGC_VERSION=23.05-py3

docker pull nvcr.io/nvidia/$APP_NAME:$NGC_VERSION

@@ -28,4 +28,4 @@ apiVersion: v1
appVersion: "2.33"
description: Triton Inference Server
name: triton-inference-server
-version: 2.33.0
+version: 2.34.0
@@ -32,13 +32,13 @@ tritonProtocol: HTTP
# HPA GPU utilization autoscaling target
HPATargetAverageValue: 85
modelRepositoryPath: gs://triton_sample_models/23_04
-publishedVersion: '2.33.0'
+publishedVersion: '2.34.0'
gcpMarketplace: true

image:
registry: gcr.io
repository: nvidia-ngc-public/tritonserver
-  tag: 23.04-py3
+  tag: 23.05-py3
pullPolicy: IfNotPresent
# modify the model repository here to match your GCP storage bucket
numGpus: 1
@@ -27,7 +27,7 @@
x-google-marketplace:
schemaVersion: v2
applicationApiVersion: v1beta1
-  publishedVersion: '2.33.0'
+  publishedVersion: '2.34.0'
publishedVersionMetadata:
releaseNote: >-
Initial release.
