Wholesale updates for tritonserver version (nv-morpheus#369)
General refresh of some out-of-date tritonserver image references

Authors:
  - Pete MacKinnon (https://github.com/pdmack)

Approvers:
  - David Gardner (https://github.com/dagardner-nv)
  - https://github.com/raykallen

URL: nv-morpheus#369
pdmack authored Sep 21, 2022
1 parent e114cb2 commit 74f7c3c
Showing 13 changed files with 21 additions and 24 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -104,7 +104,7 @@ Use the following command to launch a Docker container for Triton loading all of
```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 \
-v $PWD/models:/models \
-nvcr.io/nvidia/tritonserver:22.06-py3 \
+nvcr.io/nvidia/tritonserver:22.08-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--log-info=true \
@@ -149,7 +149,7 @@ Note: This step assumes you have both [Docker](https://docs.docker.com/engine/in
From the root of the Morpheus project we will launch a Triton Docker container with the `models` directory mounted into the container:

```shell
-docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --log-info=true
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --log-info=true
```

Once we have Triton running, we can verify that it is healthy using [curl](https://curl.se/). The `/v2/health/live` endpoint should return a 200 status code:
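
A minimal sketch of that check (not part of this commit's diff), assuming Triton's HTTP port is published on `localhost:8000` as in the command above:

```bash
# Liveness probe: the -v flag prints the response status line,
# which should read "HTTP/1.1 200 OK" once Triton is up.
curl -v localhost:8000/v2/health/live
```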
4 changes: 2 additions & 2 deletions examples/abp_nvsmi_detection/README.md
@@ -65,12 +65,12 @@ This example utilizes the Triton Inference Server to perform inference.

Pull the Docker image for Triton:
```bash
-docker pull nvcr.io/nvidia/tritonserver:22.02-py3
+docker pull nvcr.io/nvidia/tritonserver:22.08-py3
```

From the Morpheus repo root directory, run the following to launch Triton and load the `abp-nvsmi-xgb` XGBoost model:
```bash
-docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model abp-nvsmi-xgb
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model abp-nvsmi-xgb
```

This will launch Triton and only load the `abp-nvsmi-xgb` model. This model has been configured with a max batch size of 32768, and to use dynamic batching for increased performance.
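
To confirm that the model actually loaded, Triton's per-model readiness endpoint can be queried; this is a hedged sketch, not part of the commit, and assumes the default HTTP port 8000 is published as in the command above:

```bash
# Returns HTTP 200 once the abp-nvsmi-xgb model is loaded and ready to serve
curl -v localhost:8000/v2/models/abp-nvsmi-xgb/ready
```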
4 changes: 2 additions & 2 deletions examples/abp_pcap_detection/README.md
@@ -23,7 +23,7 @@ To run this example, an instance of Triton Inference Server and a sample dataset

### Triton Inference Server
```bash
-docker pull nvcr.io/nvidia/tritonserver:22.02-py3
+docker pull nvcr.io/nvidia/tritonserver:22.08-py3
```

##### Deploy Triton Inference Server
@@ -35,7 +35,7 @@ Bind the provided `abp-pcap-xgb` directory to the docker container model repo at
cd <MORPHEUS_ROOT>/examples/abp_pcap_detection

# Launch the container
-docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/abp-pcap-xgb:/models/abp-pcap-xgb --name tritonserver nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models --exit-on-error=false --model-control-mode=poll --repository-poll-secs=30
+docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/abp-pcap-xgb:/models/abp-pcap-xgb --name tritonserver nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models --exit-on-error=false --model-control-mode=poll --repository-poll-secs=30
```

##### Verify Model Deployment
4 changes: 2 additions & 2 deletions examples/log_parsing/README.md
@@ -26,14 +26,14 @@ Pull Docker image from NGC (https://ngc.nvidia.com/catalog/containers/nvidia:tri
Example:

```
-docker pull nvcr.io/nvidia/tritonserver:22.02-py3
+docker pull nvcr.io/nvidia/tritonserver:22.08-py3
```

##### Start Triton Inference Server container
```
cd ${MORPHEUS_ROOT}/models
-docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model log-parsing-onnx
+docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model log-parsing-onnx
```

##### Verify Model Deployment
2 changes: 1 addition & 1 deletion examples/nlp_si_detection/README.md
@@ -77,7 +77,7 @@ This example utilizes the Triton Inference Server to perform inference. The neur
From the Morpheus repo root directory, run the following to launch Triton and load the `sid-minibert` model:

```bash
-docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.02-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
```

Where `22.02-py3` can be replaced with the current year and month of the Triton version to use. For example, to use May 2021, specify `nvcr.io/nvidia/tritonserver:21.05-py3`. Ensure that the version of TensorRT that is used in Triton matches the version of TensorRT elsewhere (see [NGC Deep Learning Frameworks Support Matrix](https://docs.nvidia.com/deeplearning/frameworks/support-matrix/index.html)).
4 changes: 2 additions & 2 deletions examples/ransomware_detection/README.md
@@ -27,15 +27,15 @@ Pull Docker image from NGC (https://ngc.nvidia.com/catalog/containers/nvidia:tri
Example:

```
-docker pull nvcr.io/nvidia/tritonserver:22.06-py3
+docker pull nvcr.io/nvidia/tritonserver:22.08-py3
```

##### Start Triton Inference Server container
```bash
cd ${MORPHEUS_ROOT}/examples/ransomware_detection

# Run Triton in explicit mode
-docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models/triton-model-repo nvcr.io/nvidia/tritonserver:22.06-py3 \
+docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD/models:/models/triton-model-repo nvcr.io/nvidia/tritonserver:22.08-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
2 changes: 1 addition & 1 deletion examples/sid_visualization/docker-compose.yml
@@ -25,7 +25,7 @@ x-with-gpus: &with_gpus

services:
  triton:
-   image: nvcr.io/nvidia/tritonserver:22.06-py3
+   image: nvcr.io/nvidia/tritonserver:22.08-py3
    <<: *with_gpus
    command: "tritonserver --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx --model-repository=/models/triton-model-repo"
    environment:
7 changes: 2 additions & 5 deletions models/triton-model-repo/README.md
@@ -27,7 +27,7 @@ To launch Triton with one of the models in `triton-model-repo`, this entire repo
### Load `sid-minibert-onnx` Model with Default Triton Image

```bash
-docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD:/models --name tritonserver nvcr.io/nvidia/tritonserver:21.06-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
+docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD:/models --name tritonserver nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-onnx
```

### Load `abp-nvsmi-xgb` Model with FIL Backend Triton
@@ -36,9 +36,6 @@ docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD:/model
docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD:/models --name tritonserver triton_fil tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model abp-nvsmi-xgb
```

-Note: The FIL Backend Triton image was built with `docker build -t triton_fil -f ops/Dockerfile .`. Adjust the image name as necessary.


### Load `sid-minibert-trt` Model with Default Triton Image from Morpheus Repo

To load a TensorRT model, it first must be compiled with the `morpheus tools onnx-to-trt` utility (See `triton-model-repo/sid-minibert-trt/1/README.md` for more info):
@@ -51,5 +48,5 @@ morpheus tools onnx-to-trt --input_model ../../sid-minibert-onnx/1/sid-minibert.
Then launch Triton:

```bash
-docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/models:/models --name tritonserver nvcr.io/nvidia/tritonserver:21.06-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-trt
+docker run --rm --gpus=all -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/models:/models --name tritonserver nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --exit-on-error=false --model-control-mode=explicit --load-model sid-minibert-trt
```
6 changes: 3 additions & 3 deletions scripts/validation/kafka_testing.md
@@ -171,7 +171,7 @@ For this test we are going to replace the from & to file stages from the ABP val
1. In a new terminal launch Triton:
```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v ${MORPHEUS_ROOT}/models:/models \
-  nvcr.io/nvidia/tritonserver:22.02-py3 \
+  nvcr.io/nvidia/tritonserver:22.08-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
@@ -338,7 +338,7 @@ For this test we are going to replace the from & to file stages from the Phishin
1. In a new terminal launch Triton:
```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v ${MORPHEUS_ROOT}/models:/models \
-  nvcr.io/nvidia/tritonserver:22.02-py3 \
+  nvcr.io/nvidia/tritonserver:22.08-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
@@ -411,7 +411,7 @@ Note: Due to the complexity of the input data and a limitation of the cudf reade
1. In a new terminal launch Triton:
```bash
docker run --rm -ti --gpus=all -p8000:8000 -p8001:8001 -p8002:8002 -v ${MORPHEUS_ROOT}/models:/models \
-  nvcr.io/nvidia/tritonserver:22.02-py3 \
+  nvcr.io/nvidia/tritonserver:22.08-py3 \
tritonserver --model-repository=/models/triton-model-repo \
--exit-on-error=false \
--model-control-mode=explicit \
2 changes: 1 addition & 1 deletion scripts/validation/val-globals.sh
@@ -26,7 +26,7 @@ export e="\033[0;90m"
export y="\033[0;33m"
export x="\033[0m"

-export TRITON_IMAGE=${TRITON_IMAGE:-"nvcr.io/nvidia/tritonserver:22.06-py3"}
+export TRITON_IMAGE=${TRITON_IMAGE:-"nvcr.io/nvidia/tritonserver:22.08-py3"}

# TRITON_GRPC_PORT is only used when TRITON_URL is undefined
export TRITON_GRPC_PORT=${TRITON_GRPC_PORT:-"8001"}
2 changes: 1 addition & 1 deletion scripts/validation/val-utils.sh
@@ -68,7 +68,7 @@ function wait_for_triton {

function ensure_triton_running {

-TRITON_IMAGE=${TRITON_IMAGE:-"nvcr.io/nvidia/tritonserver:22.06-py3"}
+TRITON_IMAGE=${TRITON_IMAGE:-"nvcr.io/nvidia/tritonserver:22.08-py3"}

IS_RUNNING=$(is_triton_running)

4 changes: 2 additions & 2 deletions tests/benchmarks/README.md
@@ -24,14 +24,14 @@ Pull Docker image from NGC (https://ngc.nvidia.com/catalog/containers/nvidia:tri
Example:

```
-docker pull nvcr.io/nvidia/tritonserver:22.02-py3
+docker pull nvcr.io/nvidia/tritonserver:22.08-py3
```

##### Start Triton Inference Server container
```
cd ${MORPHEUS_ROOT}/models
-docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.06-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx
+docker run --gpus=1 --rm -p8000:8000 -p8001:8001 -p8002:8002 -v $PWD:/models nvcr.io/nvidia/tritonserver:22.08-py3 tritonserver --model-repository=/models/triton-model-repo --model-control-mode=explicit --load-model sid-minibert-onnx --load-model abp-nvsmi-xgb --load-model phishing-bert-onnx
```

##### Verify Model Deployments
