Add sherpa-onnx support #50

Merged 3 commits on Dec 13, 2024

Neil2893 (Contributor) commented Dec 8, 2024

This pull request adds support for sherpa-onnx (https://github.com/k2-fsa/sherpa-onnx) ASR models. It allows users to easily integrate various sherpa-onnx models, including transducer, Paraformer, NeMo CTC, WeNet CTC, Whisper, TDNN CTC, and SenseVoice models.
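
As a rough illustration of how a single config value could select among these model families, the sketch below maps a `model_type` string to a constructor on `sherpa_onnx.OfflineRecognizer`. Only the `"sense_voice"` key appears in this PR's config excerpt; the other keys, the `FACTORY_BY_MODEL_TYPE` table, and the `factory_name` and `build_recognizer` helpers are hypothetical names, not the code merged here. The `from_*` classmethod names are assumed to match the installed sherpa-onnx Python package and should be checked against it.

```python
from typing import Any

# Hypothetical table: config `model_type` -> name of the classmethod on
# sherpa_onnx.OfflineRecognizer that is assumed to build that model family.
FACTORY_BY_MODEL_TYPE = {
    "transducer": "from_transducer",
    "paraformer": "from_paraformer",
    "nemo_ctc": "from_nemo_ctc",
    "wenet_ctc": "from_wenet_ctc",
    "whisper": "from_whisper",
    "tdnn_ctc": "from_tdnn_ctc",
    "sense_voice": "from_sense_voice",
}


def factory_name(model_type: str) -> str:
    """Return the constructor name for a model type, or raise ValueError."""
    try:
        return FACTORY_BY_MODEL_TYPE[model_type]
    except KeyError:
        known = ", ".join(sorted(FACTORY_BY_MODEL_TYPE))
        raise ValueError(
            f"unknown model_type {model_type!r}; expected one of: {known}"
        ) from None


def build_recognizer(model_type: str, **kwargs: Any):
    """Build an offline recognizer; kwargs are passed through unchanged."""
    # Imported lazily so the lookup above works without sherpa-onnx installed.
    import sherpa_onnx

    factory = getattr(sherpa_onnx.OfflineRecognizer, factory_name(model_type))
    return factory(**kwargs)
```

For SenseVoice this might be called as `build_recognizer("sense_voice", model="path/to/model.onnx", tokens="path/to/tokens.txt")`. The keyword arguments each constructor accepts differ by model family, so the helper deliberately does not validate them.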

Note: Only SenseVoice and Paraformer models have been tested at this time. Further testing with other model types is encouraged.
(This implementation was developed with the assistance of an AI language model.)

sherpa-onnx offers great performance and is significantly lighter than FunASR, since it does not depend on torch.

Neil2893 (Contributor Author) commented Dec 9, 2024

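I've added a TTS engine, so sherpa-onnx TTS can now be used. Some sample configurations have been added to config_alts for reference.
Using sherpa-onnx takes only a few steps:

  1. pip install sherpa-onnx
  2. Pick the model you want from https://github.com/k2-fsa/sherpa-onnx/releases/tag/asr-models or https://github.com/k2-fsa/sherpa-onnx/releases/tag/tts-models
  3. Configure it by following the examples in config_alts; you need to change the path to point at the extracted model.
  4. Run it. The performance is good.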
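
Steps 2 and 3 involve unpacking a downloaded `.tar.bz2` archive and pointing the config at the resulting directory. A minimal sketch of that unpacking step, using only the Python standard library, is below. The `extract_model` helper is a hypothetical name and is not part of this PR or of sherpa-onnx; it assumes the archive has already been downloaded and requires Python 3.9 or later.

```python
import tarfile
from pathlib import Path


def extract_model(archive: str, dest: str) -> Path:
    """Extract a .tar.bz2 model archive and return the extracted model directory.

    If the archive has a single top-level directory, that directory is
    returned; otherwise the destination directory itself is returned.
    """
    dest_dir = Path(dest).resolve()
    dest_dir.mkdir(parents=True, exist_ok=True)

    with tarfile.open(archive, "r:bz2") as tar:
        members = tar.getmembers()
        # Basic check: reject member names that would land outside dest_dir.
        for member in members:
            target = (dest_dir / member.name).resolve()
            if not target.is_relative_to(dest_dir):
                raise ValueError(f"unsafe path in archive: {member.name}")
        tar.extractall(dest_dir)

    # Collect the first path component of every member.
    top_levels = set()
    for member in members:
        parts = Path(member.name).parts
        if parts:
            top_levels.add(parts[0])

    if len(top_levels) == 1:
        return dest_dir / top_levels.pop()
    return dest_dir
```

The returned path is what would go into the model path fields of the config, for example the directory containing `model.onnx` for a SenseVoice model.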

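Some model recommendations:
For English, piper is good.
For Chinese only: https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/sherpa-onnx-vits-zh-ll.tar.bz2
For mixed Chinese and English, melo is the only option: https://github.com/k2-fsa/sherpa-onnx/releases/download/tts-models/vits-melo-tts-zh_en.tar.bz2
melo's English pronunciation is... a bit hard on the ears.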

t41372 (Owner) commented Dec 9, 2024

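Awesome 👍 I'll take a look.
By the way, could you add some documentation to README.md? (README.CN.md is currently not well synced with README.md; I'll update both together later.)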

t41372 (Owner) commented Dec 9, 2024

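I've been looking at sherpa-onnx too, and it seems really good. I was just thinking that its TTS part could replace Melo when you pushed this update. Great work 👍.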

Neil2893 (Contributor Author) commented Dec 10, 2024

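I've updated README.md with brief installation and usage instructions. I think sherpa-onnx could also replace funasr as the default recommendation, but let's wait for feedback from users first.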

t41372 (Owner) commented Dec 11, 2024

Sorry, I've been a bit busy with final exams these past few days. I'll take a look in a day or two...

Neil2893 (Contributor Author) commented Dec 12, 2024

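> Sorry, I've been a bit busy with final exams these past few days. I'll take a look in a day or two...

No worries, please focus on preparing for your exams first. I understand it's your sense of responsibility at work, but there's really no need to feel pressured.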

# whisper_encoder: "" # Path to the Whisper encoder model (e.g., "path/to/encoder.onnx")
# whisper_decoder: "" # Path to the Whisper decoder model (e.g., "path/to/decoder.onnx")
# --- For model_type: "sense_voice" ---
sense_voice: "/path/to/sherpa-onnx-sense-voice-zh-en-ja-ko-yue-2024-07-17/model.onnx" # Path to the SenseVoice model (e.g., "path/to/model.onnx")

I suggest using model.int8.onnx, which is much smaller in file size than model.onnx.

Contributor Author

Thanks for the heads up! I appreciate the suggestion to use model.int8.onnx. Do you have any insights into how the int8 model performs compared to the original model.onnx in terms of recognition accuracy? Have you had a chance to evaluate its performance?

I only tested it on fewer than 4 test wave files, and the quantized model produced results identical to those of the non-quantized model.
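
One way to act on this thread is to prefer `model.int8.onnx` when it is present and to check agreement between the two variants on your own audio before switching. The sketch below is only an illustration: `pick_model_file` and `mismatches` are hypothetical helpers that are not part of this PR or of sherpa-onnx, and the two `transcribe_*` callables stand in for whatever function turns a wave file path into text in your setup.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def pick_model_file(model_dir: str, prefer_int8: bool = True) -> Path:
    """Return the preferred model variant if present, else the other one."""
    base = Path(model_dir)
    names = ["model.int8.onnx", "model.onnx"]
    if not prefer_int8:
        names.reverse()
    for name in names:
        path = base / name
        if path.is_file():
            return path
    raise FileNotFoundError(f"no model.onnx or model.int8.onnx in {base}")


def mismatches(
    transcribe_a: Callable[[str], str],
    transcribe_b: Callable[[str], str],
    wav_paths: Iterable[str],
) -> List[str]:
    """Return the wave file paths on which the two transcribers disagree."""
    return [
        path
        for path in wav_paths
        if transcribe_a(path).strip() != transcribe_b(path).strip()
    ]
```

An empty list from `mismatches` only says the two models agreed on the files supplied; a handful of files, as in the test described above, is not a general accuracy evaluation.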
