
Commit 746c414

Merge pull request NVIDIA#353 from rajeevsrao/master
Minor README cleanup for TRT Tacotron2 example

Authored by GrzegorzKarchNV, 2 parents: 2fc48f1 + 42555b7


PyTorch/SpeechSynthesis/Tacotron2/trt/README.md

Lines changed: 26 additions & 32 deletions
@@ -1,19 +1,11 @@
# Tacotron 2 and WaveGlow Inference with TensorRT

This is a subfolder of the Tacotron 2 for PyTorch repository, tested and maintained by NVIDIA, and provides scripts to perform high-performance inference using NVIDIA TensorRT.

The Tacotron 2 and WaveGlow models form a text-to-speech (TTS) system that enables users to synthesize natural-sounding speech from raw transcripts without any additional information such as patterns and/or rhythms of speech. More information about the TTS system and its training can be found in the [Tacotron 2 PyTorch README](../README.md).

NVIDIA TensorRT is a platform for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. After optimizing the compute-intensive acoustic model with NVIDIA TensorRT, inference throughput increased by up to 1.4x over native PyTorch in mixed precision.

## Quick Start Guide
@@ -23,17 +15,16 @@ precision.
```bash
git clone https://github.com/NVIDIA/DeepLearningExamples
cd DeepLearningExamples/PyTorch/SpeechSynthesis/Tacotron2
```

2. Download pretrained checkpoints from [NGC](https://ngc.nvidia.com/catalog/models) and copy them to the `./checkpoints` directory:

- [Tacotron2 checkpoint](https://ngc.nvidia.com/models/nvidia:tacotron2pyt_fp16)
- [WaveGlow checkpoint](https://ngc.nvidia.com/models/nvidia:waveglow256pyt_fp16)

```bash
mkdir -p checkpoints
cp <Tacotron2_and_WaveGlow_checkpoints> ./checkpoints/
```

3. Build the Tacotron 2 and WaveGlow PyTorch NGC container.
@@ -49,10 +40,18 @@ and store them in `./checkpoints` directory:
bash scripts/docker/interactive.sh
```
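
As a quick sanity check before moving on, confirm that the GPU is visible inside the container; `nvidia-smi` ships with NGC containers:

```bash
# Should list the available GPU(s) and the driver version;
# if this fails, the container was started without GPU access.
nvidia-smi
```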

5. Verify that the installed TensorRT version is 7.0 or greater. If necessary, download and install the latest release from https://developer.nvidia.com/nvidia-tensorrt-download

```bash
pip list | grep tensorrt
dpkg -l | grep TensorRT
```
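
If the `tensorrt` pip package is present, the version can also be read directly from Python; a minimal check along the same lines:

```bash
# Prints the TensorRT Python package version; anything below 7.0 needs an upgrade.
python -c "import tensorrt; print(tensorrt.__version__)"
```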

6. Export the models to ONNX intermediate representation (ONNX IR). Export Tacotron 2 to three ONNX parts: Encoder, Decoder, and Postnet:

```bash
mkdir -p output
python exports/export_tacotron2_onnx.py --tacotron2 ./checkpoints/nvidia_tacotron2pyt_fp16_20190427 -o output/
```

@@ -62,32 +61,27 @@ and store them in `./checkpoints` directory:
python exports/export_waveglow_onnx.py --waveglow ./checkpoints/nvidia_waveglow256pyt_fp16 --wn-channels 256 -o output/
```

After running the above commands, there should be four new ONNX files in the `./output/` directory: `encoder.onnx`, `decoder_iter.onnx`, `postnet.onnx`, and `waveglow.onnx`.
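
As an optional sanity check, the exported graphs can be validated with the `onnx` Python package, assuming it is installed in the container; this verifies graph structure only, not numerical correctness:

```bash
python -c "
import onnx
# check_model raises an exception if a graph is malformed.
for name in ('encoder', 'decoder_iter', 'postnet', 'waveglow'):
    onnx.checker.check_model(onnx.load('output/%s.onnx' % name))
    print(name + '.onnx: OK')
"
```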

7. Export the ONNX IRs to TensorRT engines with fp16 mode enabled:

```bash
python trt/export_onnx2trt.py --encoder output/encoder.onnx --decoder output/decoder_iter.onnx --postnet output/postnet.onnx --waveglow output/waveglow.onnx -o output/ --fp16
```

After running the command, there should be four new engine files in the `./output/` directory: `encoder_fp16.engine`, `decoder_iter_fp16.engine`, `postnet_fp16.engine`, and `waveglow_fp16.engine`.
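
To confirm that an engine deserializes under the installed TensorRT runtime, here is a minimal sketch using the TensorRT 7 Python bindings; engines are version-specific, so a TensorRT mismatch shows up here as a failed deserialization:

```bash
python -c "
import tensorrt as trt
# Deserialize one engine and report its number of I/O bindings.
logger = trt.Logger(trt.Logger.WARNING)
with open('output/encoder_fp16.engine', 'rb') as f, trt.Runtime(logger) as runtime:
    engine = runtime.deserialize_cuda_engine(f.read())
print('encoder_fp16.engine: %d bindings' % engine.num_bindings)
"
```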

8. Run the TTS inference pipeline with fp16:

```bash
python trt/inference_trt.py -i phrases/phrase.txt --encoder output/encoder_fp16.engine --decoder output/decoder_iter_fp16.engine --postnet output/postnet_fp16.engine --waveglow output/waveglow_fp16.engine -o output/
```
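
The pipeline writes the synthesized speech as WAV files into `./output/`. To inspect a result, the sample rate and duration can be read with `scipy`, assuming it is installed; the filename below is a placeholder, so substitute whatever the run actually produced:

```bash
python -c "
from scipy.io import wavfile
# 'output/audio_0.wav' is a hypothetical name; use the actual output file.
rate, data = wavfile.read('output/audio_0.wav')
print('%d Hz, %.2f s of audio' % (rate, len(data) / rate))
"
```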
## Inference performance: NVIDIA T4

Our results were obtained by running the `./trt/run_latency_tests_trt.sh` script in the PyTorch-19.11-py3 NGC container. Note that to reproduce the results, you need to provide pretrained checkpoints for Tacotron 2 and WaveGlow and edit the script to use your checkpoint filenames. For all tests in this table, we used WaveGlow with 256 residual channels.

|Framework|Batch size|Input length|Precision|Avg latency (s)|Latency std (s)|Latency confidence interval 90% (s)|Latency confidence interval 95% (s)|Latency confidence interval 99% (s)|Throughput (samples/sec)|Speed-up PyT+TRT/TRT|Avg mels generated (81 mels=1 sec of speech)|Avg audio length (s)|Avg RTF|
|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|---:|
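
To read the last three columns together: the mel count converts to audio length via the 81-mels-per-second rule stated in the header, and one common definition of the real-time factor (RTF) is audio duration divided by generation latency. An illustrative calculation with made-up numbers, not values from the table:

```bash
python -c "
# Hypothetical values for illustration only.
mels = 600                  # avg mels generated
latency = 1.0               # avg latency (s)
audio_len = mels / 81.0     # 81 mels correspond to 1 s of speech
print('audio length: %.2f s, RTF: %.2f' % (audio_len, audio_len / latency))
"
```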
