
Commit 4a5a9cb

version: TPRD-1695: Update version for 25.09 release (#8386)
1 parent 44273e8 commit 4a5a9cb

19 files changed (+39, -44 lines)


Dockerfile.sdk

Lines changed: 1 addition & 1 deletion
@@ -29,7 +29,7 @@
 #
 
 # Base image on the minimum Triton container
-ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:25.08-py3-min
+ARG BASE_IMAGE=nvcr.io/nvidia/tritonserver:25.09-py3-min
 
 ARG TRITON_CLIENT_REPO_SUBDIR=clientrepo
 ARG TRITON_REPO_ORGANIZATION=http://github.com/triton-inference-server
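
Building the SDK container against the bumped base works as before; the sketch below is illustrative only (the local tag `tritonserver-sdk:25.09` is made up, and the --build-arg merely restates the new default).

```bash
# Build the SDK image from the repository root (sketch; local tag name is arbitrary).
# BASE_IMAGE now defaults to the 25.09-py3-min image, so the override below is
# redundant and shown only to make the knob explicit.
docker build -f Dockerfile.sdk \
    --build-arg BASE_IMAGE=nvcr.io/nvidia/tritonserver:25.09-py3-min \
    -t tritonserver-sdk:25.09 .
```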

README.md

Lines changed: 11 additions & 16 deletions
@@ -27,11 +27,6 @@
 -->
 [![License](https://img.shields.io/badge/License-BSD3-lightgrey.svg)](https://opensource.org/licenses/BSD-3-Clause)
 
->[!WARNING]
->You are currently on the `main` branch which tracks under-development progress
->towards the next release. The current release is version [2.60.0](https://github.com/triton-inference-server/server/releases/latest)
->and corresponds to the 25.08 container release on NVIDIA GPU Cloud (NGC).
-
 # Triton Inference Server
 
 Triton Inference Server is an open source inference serving software that
@@ -61,7 +56,7 @@ Major features include:
 - Provides [Backend API](https://github.com/triton-inference-server/backend) that
   allows adding custom backends and pre/post processing operations
 - Supports writing custom backends in python, a.k.a.
-  [Python-based backends.](https://github.com/triton-inference-server/backend/blob/main/docs/python_based_backends.md#python-based-backends)
+  [Python-based backends.](https://github.com/triton-inference-server/backend/blob/r25.09/docs/python_based_backends.md#python-based-backends)
 - Model pipelines using
   [Ensembling](docs/user_guide/architecture.md#ensemble-models) or [Business
   Logic Scripting
@@ -90,16 +85,16 @@ Inference Server with the
 
 ```bash
 # Step 1: Create the example model repository
-git clone -b r25.08 https://github.com/triton-inference-server/server.git
+git clone -b r25.09 https://github.com/triton-inference-server/server.git
 cd server/docs/examples
 ./fetch_models.sh
 
 # Step 2: Launch triton from the NGC Triton container
-docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:25.08-py3 tritonserver --model-repository=/models --model-control-mode explicit --load-model densenet_onnx
+docker run --gpus=1 --rm --net=host -v ${PWD}/model_repository:/models nvcr.io/nvidia/tritonserver:25.09-py3 tritonserver --model-repository=/models --model-control-mode explicit --load-model densenet_onnx
 
 # Step 3: Sending an Inference Request
 # In a separate console, launch the image_client example from the NGC Triton SDK container
-docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:25.08-py3-sdk /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
+docker run -it --rm --net=host nvcr.io/nvidia/tritonserver:25.09-py3-sdk /workspace/install/bin/image_client -m densenet_onnx -c 3 -s INCEPTION /workspace/images/mug.jpg
 
 # Inference should return the following
 Image '/workspace/images/mug.jpg':
@@ -172,10 +167,10 @@ configuration](docs/user_guide/model_configuration.md) for the model.
   [Python](https://github.com/triton-inference-server/python_backend), and more
 - Not all the above backends are supported on every platform supported by Triton.
   Look at the
-  [Backend-Platform Support Matrix](https://github.com/triton-inference-server/backend/blob/main/docs/backend_platform_support_matrix.md)
+  [Backend-Platform Support Matrix](https://github.com/triton-inference-server/backend/blob/r25.09/docs/backend_platform_support_matrix.md)
   to learn which backends are supported on your target platform.
 - Learn how to [optimize performance](docs/user_guide/optimization.md) using the
-  [Performance Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/main/README.md)
+  [Performance Analyzer](https://github.com/triton-inference-server/perf_analyzer/blob/r25.09/README.md)
   and
   [Model Analyzer](https://github.com/triton-inference-server/model_analyzer)
 - Learn how to [manage loading and unloading models](docs/user_guide/model_management.md) in
@@ -189,14 +184,14 @@ A Triton *client* application sends inference and other requests to Triton. The
 [Python and C++ client libraries](https://github.com/triton-inference-server/client)
 provide APIs to simplify this communication.
 
-- Review client examples for [C++](https://github.com/triton-inference-server/client/blob/main/src/c%2B%2B/examples),
-  [Python](https://github.com/triton-inference-server/client/blob/main/src/python/examples),
-  and [Java](https://github.com/triton-inference-server/client/blob/main/src/java/src/main/java/triton/client/examples)
+- Review client examples for [C++](https://github.com/triton-inference-server/client/blob/r25.09/src/c%2B%2B/examples),
+  [Python](https://github.com/triton-inference-server/client/blob/r25.09/src/python/examples),
+  and [Java](https://github.com/triton-inference-server/client/blob/r25.09/src/java/src/main/java/triton/client/examples)
 - Configure [HTTP](https://github.com/triton-inference-server/client#http-options)
   and [gRPC](https://github.com/triton-inference-server/client#grpc-options)
   client options
 - Send input data (e.g. a jpeg image) directly to Triton in the [body of an HTTP
-  request without any additional metadata](https://github.com/triton-inference-server/server/blob/main/docs/protocol/extension_binary_data.md#raw-binary-request)
+  request without any additional metadata](https://github.com/triton-inference-server/server/blob/r25.09/docs/protocol/extension_binary_data.md#raw-binary-request)
 
 ### Extend Triton
 
@@ -205,7 +200,7 @@ designed for modularity and flexibility
 
 - [Customize Triton Inference Server container](docs/customization_guide/compose.md) for your use case
 - [Create custom backends](https://github.com/triton-inference-server/backend)
-  in either [C/C++](https://github.com/triton-inference-server/backend/blob/main/README.md#triton-backend-api)
+  in either [C/C++](https://github.com/triton-inference-server/backend/blob/r25.09/README.md#triton-backend-api)
   or [Python](https://github.com/triton-inference-server/python_backend)
 - Create [decoupled backends and models](docs/user_guide/decoupled_models.md) that can send
   multiple responses for a request or not send any responses for a request

build.py

Lines changed: 2 additions & 2 deletions
@@ -73,12 +73,12 @@
 DEFAULT_TRITON_VERSION_MAP = {
     "release_version": "2.62.0dev",
     "triton_container_version": "25.10dev",
-    "upstream_container_version": "25.08",
+    "upstream_container_version": "25.09",
     "ort_version": "1.23.0",
     "ort_openvino_version": "2025.3.0",
     "standalone_openvino_version": "2025.3.0",
     "dcgm_version": "4.4.0-1",
-    "vllm_version": "0.9.2",
+    "vllm_version": "0.10.1.1",
     "rhel_py_version": "3.12.3",
 }
 
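A quick sanity check on the bumped pins (not part of the commit, just an illustrative grep) is to read the relevant keys back out of the default version map:

```bash
# Print the container and vLLM pins from DEFAULT_TRITON_VERSION_MAP (illustrative check).
grep -E '"(upstream_container_version|triton_container_version|vllm_version)"' build.py
# Expected output after this change includes "25.09", "25.10dev", and "0.10.1.1".
```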

deploy/aws/values.yaml

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 replicaCount: 1
 
 image:
-  imageName: nvcr.io/nvidia/tritonserver:25.08-py3
+  imageName: nvcr.io/nvidia/tritonserver:25.09-py3
   pullPolicy: IfNotPresent
   modelRepositoryPath: s3://triton-inference-server-repository/model_repository
   numGpus: 1
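
Installing the AWS chart picks up the new default image automatically; in the sketch below the release name `example-triton` is hypothetical, and the --set line only restates the new default for anyone pinning a different tag.

```bash
# Install the chart with its new default 25.09 image (sketch; release name is made up).
helm install example-triton deploy/aws

# Equivalent explicit override of the image tag from values.yaml:
helm install example-triton deploy/aws \
    --set image.imageName=nvcr.io/nvidia/tritonserver:25.09-py3
```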

deploy/fleetcommand/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@
 
 apiVersion: v1
 # appVersion is the Triton version; update when changing release
-appVersion: "2.60.0"
+appVersion: "2.61.0"
 description: Triton Inference Server (Fleet Command)
 name: triton-inference-server
 # version is the Chart version; update when changing anything in the chart

deploy/fleetcommand/values.yaml

Lines changed: 3 additions & 3 deletions
@@ -27,7 +27,7 @@
 replicaCount: 1
 
 image:
-  imageName: nvcr.io/nvidia/tritonserver:25.08-py3
+  imageName: nvcr.io/nvidia/tritonserver:25.09-py3
   pullPolicy: IfNotPresent
   numGpus: 1
   serverCommand: tritonserver
@@ -47,13 +47,13 @@ image:
     #
     # To set model control mode, uncomment and configure below
     # TODO: Fix the following url, it is invalid
-    # See https://github.com/triton-inference-server/server/blob/r25.08/docs/user_guide/model_management.md
+    # See https://github.com/triton-inference-server/server/blob/r25.09/docs/user_guide/model_management.md
     # for more details
     #- --model-control-mode=explicit|poll|none
     #
     # Additional server args
     #
-    # see https://github.com/triton-inference-server/server/blob/r25.08/README.md
+    # see https://github.com/triton-inference-server/server/blob/r25.09/README.md
     # for more details
 
 service:
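
Rendering the chart locally is one way to confirm the bumped appVersion and the image tag stay consistent; the check below is illustrative and not part of this commit.

```bash
# Render the Fleet Command chart and list every Triton image reference (sketch).
helm template deploy/fleetcommand | grep -n "tritonserver:"
# Every match should now point at nvcr.io/nvidia/tritonserver:25.09-py3.
```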

deploy/gcp/values.yaml

Lines changed: 1 addition & 1 deletion
@@ -27,7 +27,7 @@
 replicaCount: 1
 
 image:
-  imageName: nvcr.io/nvidia/tritonserver:25.08-py3
+  imageName: nvcr.io/nvidia/tritonserver:25.09-py3
   pullPolicy: IfNotPresent
   modelRepositoryPath: gs://triton-inference-server-repository/model_repository
   numGpus: 1

deploy/gke-marketplace-app/benchmark/perf-analyzer-script/triton_client.yaml

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ metadata:
   namespace: default
 spec:
   containers:
-  - image: nvcr.io/nvidia/tritonserver:25.08-py3-sdk
+  - image: nvcr.io/nvidia/tritonserver:25.09-py3-sdk
     imagePullPolicy: Always
     name: nv-triton-client
     securityContext:
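
The updated client pod can be launched directly with kubectl; the follow-up inspection step below is illustrative.

```bash
# Deploy the perf-analyzer client pod with the 25.09 SDK image (sketch).
kubectl apply -f deploy/gke-marketplace-app/benchmark/perf-analyzer-script/triton_client.yaml
kubectl get pod -n default   # the pod spec above targets the default namespace
```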

deploy/gke-marketplace-app/server-deployer/build_and_push.sh

Lines changed: 3 additions & 3 deletions
@@ -27,9 +27,9 @@
 
 export REGISTRY=gcr.io/$(gcloud config get-value project | tr ':' '/')
 export APP_NAME=tritonserver
-export MAJOR_VERSION=2.60
-export MINOR_VERSION=2.60.0
-export NGC_VERSION=25.08-py3
+export MAJOR_VERSION=2.61
+export MINOR_VERSION=2.61.0
+export NGC_VERSION=25.09-py3
 
 docker pull nvcr.io/nvidia/$APP_NAME:$NGC_VERSION
 
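With the exports bumped to 2.61 and 25.09-py3, the deployer images are rebuilt by running the script; the gcloud auth step is an assumed prerequisite, not something this script performs.

```bash
# Rebuild and push the GKE marketplace deployer images with the new versions (sketch).
gcloud auth configure-docker   # assumed prerequisite so docker can push to gcr.io
bash deploy/gke-marketplace-app/server-deployer/build_and_push.sh
```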

deploy/gke-marketplace-app/server-deployer/chart/triton/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -28,4 +28,4 @@ apiVersion: v1
 appVersion: "2.60"
 description: Triton Inference Server
 name: triton-inference-server
-version: 2.60.0
+version: 2.61.0
