Notice: In order to resolve issues more efficiently, please raise issues following the template.
❓ Questions and Help

What is your question?

In 2pass mode the online pass is fast, but its recognition accuracy is worse than the offline pass, so the offline result is what actually gets used; the problem is that the offline pass takes too long. I tried enabling CUDA with the OrtSessionOptionsAppendExecutionProvider_CUDA API, and the computation does run on the GPU, but the speed barely changes. My test audio is a single clip of continuous speech about 10.84 seconds long, and decoding it still takes more than 1.5 s.

Is there any way to speed up ONNX inference for the offline model?
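For reference, here is roughly how the CUDA provider is being attached. This is a minimal sketch of the setup rather than the actual server code, and "model.onnx" is a placeholder path:

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env(ORT_LOGGING_LEVEL_WARNING, "offline-asr");
  Ort::SessionOptions opts;

  // Attach the CUDA execution provider on device 0 via the legacy C API
  // mentioned above. Newer onnxruntime releases also expose
  // opts.AppendExecutionProvider_CUDA(OrtCUDAProviderOptions{}).
  Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CUDA(opts, 0));

  // Graph-level optimizations; these mainly affect the CPU-resident parts.
  opts.SetGraphOptimizationLevel(GraphOptimizationLevel::ORT_ENABLE_ALL);
  opts.SetIntraOpNumThreads(4);

  // "model.onnx" is a placeholder, not the real offline model file.
  Ort::Session session(env, "model.onnx", opts);
  return 0;
}
```

Even with the provider attached the per-utterance time barely changes, so I suspect much of it is spent outside the GPU kernels (feature extraction, host-device copies, or operators falling back to CPU).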
What have you tried?

I also tried the offline Docker image, but it returns all results at once, which does not fit my use case. As I understand it, the offline image uses libtorch when running on the GPU and onnxruntime when running on the CPU, and the two do differ noticeably in speed: with batch_size set to 1, libtorch decodes the audio mentioned above about 800 ms faster than ONNX does.
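For completeness, the per-clip numbers above are wall-clock timings around a single Run() call, along these lines. The tensor names ("speech", "logits") and the dummy input shape are placeholders for illustration, not the model's real interface:

```cpp
#include <array>
#include <chrono>
#include <vector>
#include <onnxruntime_cxx_api.h>

// Times one Run() call on an already-created session. Note that the first
// call after loading includes provider warm-up, so steady-state numbers
// should come from a later call.
double TimeSingleRunMs(Ort::Session& session) {
  std::vector<float> feats(1 * 1084 * 80, 0.0f);  // dummy [batch, frames, dims]
  std::array<int64_t, 3> shape{1, 1084, 80};

  auto mem = Ort::MemoryInfo::CreateCpu(OrtArenaAllocator, OrtMemTypeDefault);
  Ort::Value input = Ort::Value::CreateTensor<float>(
      mem, feats.data(), feats.size(), shape.data(), shape.size());

  const char* input_names[] = {"speech"};
  const char* output_names[] = {"logits"};

  auto t0 = std::chrono::steady_clock::now();
  auto outputs = session.Run(Ort::RunOptions{nullptr},
                             input_names, &input, 1, output_names, 1);
  auto t1 = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::milli>(t1 - t0).count();
}
```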
What's your environment?

Installed via pip.
Ongoing.