Commit 3d6e3d6

Author: Swetha Mandava (committed)
fix incorrect link in triton readme

1 parent 33ea90e

File tree

1 file changed (+2 −3 lines):
  • TensorFlow/LanguageModeling/BERT/triton

TensorFlow/LanguageModeling/BERT/triton/README.md

Lines changed: 2 additions & 3 deletions
@@ -1,4 +1,3 @@
-
 # Deploying the BERT TensorFlow model using Triton Inference Server
 
 This folder contains instructions for deployment and exemplary client application to run inference on
@@ -183,7 +182,7 @@ For more information about `perf_client`, refer to the [official documentation](
 
 ### Latency vs Throughput for TensorRT Engine
 
-Performance numbers for BERT Large, sequence length=384 are obtained from [experiments]([https://github.com/NVIDIA/TensorRT/tree/release/7.1/demo/BERT#inference-performance-nvidia-a100-40gb](https://github.com/NVIDIA/TensorRT/tree/release/7.1/demo/BERT#inference-performance-nvidia-a100-40gb)) on NVIDIA A100 with 1x A100 40G GPUs. Throughput is measured in samples/second, and latency in milliseconds.
+Performance numbers for BERT Large, sequence length=384 are obtained from [experiments](https://github.com/NVIDIA/TensorRT/tree/release/7.1/demo/BERT#inference-performance-nvidia-a100-40gb) on NVIDIA A100 with 1x A100 40G GPUs. Throughput is measured in samples/second, and latency in milliseconds.
 
 ![](../data/images/bert_trt_throughput_vs_latency.png?raw=true)
 

@@ -232,4 +231,4 @@ April 2020
 TRTIS -> TRITON
 
 October 2019
-Initial release
+Initial release
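The broken link this commit fixes has a recognizable shape: a markdown link whose target was accidentally wrapped in a second link, `[text]([url](url))`, instead of the plain `[text](url)`. A minimal sketch of how such doubled links could be detected and collapsed automatically — the regex and helper name are hypothetical, not part of this repository:

```python
import re

# Matches the doubled form [text]([inner-url](url)) produced by pasting a
# rendered link into a markdown link target.
DOUBLED_LINK = re.compile(r"\[([^\]]+)\]\(\[([^\]]+)\]\(([^)]+)\)\)")

def fix_doubled_links(markdown: str) -> str:
    # Keep the link text (group 1) and the innermost URL (group 3);
    # drop the redundant wrapper link.
    return DOUBLED_LINK.sub(r"[\1](\3)", markdown)

broken = "[experiments]([https://github.com/NVIDIA/TensorRT](https://github.com/NVIDIA/TensorRT))"
print(fix_doubled_links(broken))
# -> [experiments](https://github.com/NVIDIA/TensorRT)
```

Running a check like this over README files in CI would flag the doubled form before it lands, rather than requiring a follow-up fix commit.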
