@yuekaizhang yuekaizhang commented Oct 22, 2025

This PR uses NVIDIA TensorRT (TRT) to accelerate StepAudio2's token2wav module.

  • TensorRT support for both streaming and offline modes
  • Batch size > 1 inference in offline mode
  • TensorRT support for the speaker embedding model

The following benchmark was conducted on an NVIDIA L20 GPU, generating 26 audio clips with a total length of 170 seconds.

| Method | Note | Cost time | RTF |
| --- | --- | --- | --- |
| Offline | batch=1, PyTorch | 4.32 seconds | 0.025 |
| Offline | batch=1, TensorRT enabled | 2.09 seconds | 0.012 |
| Offline | batch=2, PyTorch | 3.77 seconds | 0.022 |
| Offline | batch=2, TensorRT enabled | 1.97 seconds | 0.012 |
| Streaming | batch=1, chunk_size = 1 second, PyTorch | 20.3 seconds | 0.119 |
| Streaming | batch=1, chunk_size = 1 second, TensorRT | 12.96 seconds | 0.076 |
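
The RTF column appears to be the cost time divided by the 170 seconds of generated audio. The following is a small sketch that recomputes it from the reported numbers and derives the TensorRT speedup per configuration. The dictionary layout and the function names `rtf` and `speedup` are illustrative assumptions, not part of the PR.

```python
# Recompute RTF (real-time factor) from the benchmark numbers above.
# Assumption: RTF = processing time / total audio duration.
TOTAL_AUDIO_SECONDS = 170.0  # total length of the 26 generated clips

# configuration -> {backend: cost time in seconds}, copied from the table
COST_SECONDS = {
    "offline, batch=1": {"pytorch": 4.32, "tensorrt": 2.09},
    "offline, batch=2": {"pytorch": 3.77, "tensorrt": 1.97},
    "streaming, batch=1, chunk=1s": {"pytorch": 20.3, "tensorrt": 12.96},
}


def rtf(cost_seconds, audio_seconds=TOTAL_AUDIO_SECONDS):
    """Real-time factor: lower is faster; below 1.0 is faster than real time."""
    return cost_seconds / audio_seconds


def speedup(config):
    """Ratio of PyTorch cost time to TensorRT cost time for one configuration."""
    costs = COST_SECONDS[config]
    return costs["pytorch"] / costs["tensorrt"]


if __name__ == "__main__":
    for config, costs in COST_SECONDS.items():
        print(
            f"{config}: "
            f"PyTorch RTF={rtf(costs['pytorch']):.3f}, "
            f"TensorRT RTF={rtf(costs['tensorrt']):.3f}, "
            f"speedup={speedup(config):.2f}x"
        )
```

Rounded to three decimals, the recomputed RTF values agree with the table, which supports reading RTF as cost time over audio length.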

For more details, see https://github.com/yuekaizhang/Step-Audio2/blob/trt/tools/tensorrt_token2wav.md

@yuekaizhang changed the title from "Trt" to "Add TensorRT Token2wav" on Oct 22, 2025
@yuekaizhang
Author

See also CosyVoice2 LLM + StepAudio2 Token2wav: https://github.com/FunAudioLLM/CosyVoice/blob/main/runtime/triton_trtllm/README.DIT.md

@light1726

Hi! Great work! I ran some tests, and they consistently failed with an AssertionError at this line: torch.testing.assert_allclose(output_pytorch, torch.from_numpy(output_onnx).to(device), rtol=1e-2, atol=1e-4). Could you please share the versions of the key dependencies you used, such as:

  • TensorRT
  • ONNX Runtime GPU
  • CUDA
  • PyTorch

This would help me align my environment with yours. Or do you have any insights on what might be causing this discrepancy?

@yuekaizhang
Author

@light1726 The error should be harmless. Go ahead please.
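
For anyone who wants to judge how large the mismatch is before going ahead, here is a minimal sketch of the elementwise tolerance rule that, as far as I understand it, the assertion applies: `|actual - expected| <= atol + rtol * |expected|`. It is written in plain Python over flat lists so it runs without PyTorch. The function name `tolerance_report` and the sample values are hypothetical and are not taken from the PR.

```python
def tolerance_report(actual, expected, rtol=1e-2, atol=1e-4):
    """Count elements violating |a - e| <= atol + rtol * |e|.

    `actual` and `expected` are flat sequences of floats, for example a
    flattened PyTorch output and a flattened ONNX output.
    """
    if len(actual) != len(expected):
        raise ValueError("inputs must have the same length")

    mismatches = 0
    max_abs_diff = 0.0
    for a, e in zip(actual, expected):
        diff = abs(a - e)
        max_abs_diff = max(max_abs_diff, diff)
        # The allowed error grows with the magnitude of the expected value,
        # so values near zero are effectively limited by atol alone.
        if diff > atol + rtol * abs(e):
            mismatches += 1

    return {
        "mismatches": mismatches,
        "total": len(actual),
        "max_abs_diff": max_abs_diff,
    }


if __name__ == "__main__":
    # Hypothetical values: the last pair is tiny in absolute terms but
    # still exceeds the tolerance because the expected value is near zero.
    pytorch_out = [0.502, -0.2001, 0.0006]
    onnx_out = [0.5, -0.2, 0.0003]
    print(tolerance_report(pytorch_out, onnx_out))
```

With `atol=1e-4`, near-zero outputs can fail the check even when the absolute difference is small, which is one possible reason the assertion can fail while the output is still fine. Whether that is what happens in this case is an assumption, not something confirmed in this thread.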

@yuekaizhang
Author

yuekaizhang commented Nov 21, 2025
