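Why is the punctuation for the first sentence only computed once the second sentence is spoken? Doesn't that mean the last sentence never gets any punctuation? #2143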

Open
yfqvip opened this issue Oct 15, 2024 · 2 comments
Labels
question Further information is requested

Comments

@yfqvip

yfqvip commented Oct 15, 2024

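As the title says, this problem has bothered me for a long time. Is there some configuration I need to set?
For example, the first sentence I say is “你好吗” (the result comes back without a question mark), and the second is “我很好” (the result comes back as “?我很好”). So if I say only one sentence, it gets no punctuation, and with several sentences the last one never gets punctuation either.
Thanks!
The exact configuration and logs are below: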
sudo docker pull registry.cn-hangzhou.aliyuncs.com/funasr_repo/funasr:funasr-runtime-sdk-online-cpu-0.1.11

nohup bash run_server_2pass.sh \
  --download-model-dir /workspace/models \
  --vad-dir damo/speech_fsmn_vad_zh-cn-16k-common-onnx \
  --model-dir damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-onnx \
  --online-model-dir damo/speech_paraformer-large_asr_nat-zh-cn-16k-common-vocab8404-online-onnx \
  --punc-dir damo/punc_ct-transformer_zh-cn-common-vad_realtime-vocab272727-onnx \
  --lm-dir damo/speech_ngram_lm_zh-cn-ai-wesp-fst \
  --itn-dir thuduj12/fst_itn_zh \
  --certfile 0 \
  --hotword /workspace/models/hotwords.txt > log.txt 2>&1 &

I20241015 11:04:38.081081 978 websocket-server-2pass.cpp:478] jsonresult={"chunk_interval":10,"chunk_size":[5,10,5],"is_speaking":false,"mode":"2pass","wav_name":"h5"}, msg_data->msg={"access_num":0,"audio_fs":16000,"is_eof":false,"itn":false,"mode":"2pass","wav_format":"pcm","wav_name":"h5"}
I20241015 11:04:38.081166 978 websocket-server-2pass.cpp:484] client done
I20241015 11:04:43.084945 977 websocket-server-2pass.cpp:342] connection is closed.
I20241015 11:04:51.849478 978 websocket-server-2pass.cpp:444] hotwords:
I20241015 11:04:51.849568 978 websocket-server-2pass.cpp:447] ͨ��ǧ�� : 100
I20241015 11:04:51.849576 978 websocket-server-2pass.cpp:447] ħ�� : 100
I20241015 11:04:51.849582 978 websocket-server-2pass.cpp:447] "max_tokens": : 400
I20241015 11:04:51.849588 978 websocket-server-2pass.cpp:447] "temperature": : 0
I20241015 11:04:51.849593 978 websocket-server-2pass.cpp:447] hello world : 40
I20241015 11:04:51.849599 978 websocket-server-2pass.cpp:447] 阿里巴巴 : 20
I20241015 11:04:51.849725 978 bias-lm.h:134] Build bias lm takes 0.000115 s
I20241015 11:04:51.849825 978 websocket-server-2pass.cpp:478] jsonresult={"chunk_interval":10,"chunk_size":[5,10,5],"hotwords":"{"阿里巴巴":20,"hello world":40}","is_speaking":true,"itn":true,"mode":"2pass","wav_name":"h5"}, msg_data->msg={"access_num":0,"audio_fs":16000,"is_eof":false,"itn":true,"mode":"2pass","wav_format":"pcm","wav_name":"h5"}
I20241015 11:04:57.494354 949 websocket-server-2pass.cpp:67] online_res :你好
I20241015 11:04:58.202082 955 websocket-server-2pass.cpp:67] online_res :吗
I20241015 11:04:58.202142 955 websocket-server-2pass.cpp:73] offline results : 你好吗
I20241015 11:04:58.202149 955 websocket-server-2pass.cpp:80] offline stamps : [[2270,2470],[2470,2629],[2629,2965]]
I20241015 11:04:58.202169 955 websocket-server-2pass.cpp:88] offline stamp_sents : [{"end":2965,"punc":"","start":2270,"text_seg":"你 好 吗","ts_list":[[2270,2470],[2470,2629],[2629,2965]]}]
I20241015 11:05:01.354351 942 websocket-server-2pass.cpp:67] online_res :我
I20241015 11:05:01.975816 949 websocket-server-2pass.cpp:67] online_res :很
I20241015 11:05:02.760255 956 websocket-server-2pass.cpp:67] online_res :好
I20241015 11:05:02.760314 956 websocket-server-2pass.cpp:73] offline results : ?我很好
I20241015 11:05:02.760320 956 websocket-server-2pass.cpp:80] offline stamps : [[6260,6620],[6620,7000],[7000,7395]]
I20241015 11:05:02.760339 956 websocket-server-2pass.cpp:88] offline stamp_sents : [{"end":-1,"punc":"?","start":-1,"text_seg":"","ts_list":[]},{"end":7395,"punc":"","start":6260,"text_seg":"我 很 好","ts_list":[[6260,6620],[6620,7000],[7000,7395]]}]
I20241015 11:05:53.089830 979 websocket-server-2pass.cpp:478] jsonresult={"chunk_interval":10,"chunk_size":[5,10,5],"is_speaking":false,"mode":"2pass","wav_name":"h5"}, msg_data->msg={"access_num":1,"audio_fs":16000,"is_eof":false,"itn":true,"mode":"2pass","wav_format":"pcm","wav_name":"h5"}
I20241015 11:05:53.089934 979 websocket-server-2pass.cpp:484] client done
I20241015 11:05:58.088874 977 websocket-server-2pass.cpp:342] connection is closed.

@yfqvip yfqvip added the question Further information is requested label Oct 15, 2024
@LauraGPT
Collaborator

Yes, that is how it works at the moment; it follows from the way the current model is trained.

@duj12
Collaborator

duj12 commented Nov 12, 2024

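This can be solved by using the offline punctuation model (the punctuation model without "realtime" in its name) in 2pass mode, but it requires some code changes; see duj12/ASR-2Pass@727f9a8 for reference.
The advantage is that when a single sentence is displayed on screen, the previous sentence's punctuation no longer shows up at the start of the next sentence. The drawback is that incorrect segmentation may produce extra punctuation.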
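
For anyone who cannot change the server code, the display issue alone (the previous sentence's punctuation showing up at the start of the next result) could also be patched on the client side. The following is only a minimal sketch and not part of FunASR: the function name `reattach_leading_punc` and the `LEADING_PUNC` set are hypothetical, and the input is assumed to be the list of "offline results" strings in arrival order, as in the log above. It does not give the final sentence any punctuation, since the server never sends one for it.

```python
# Punctuation that may lead an offline result.
# Assumed set for illustration; not taken from FunASR.
LEADING_PUNC = set(",。?!、;:,.?!;:")


def reattach_leading_punc(offline_results):
    """Move punctuation found at the start of a result to the end of the previous one.

    Leading punctuation with no previous sentence to attach to is dropped.
    """
    sentences = []
    for text in offline_results:
        # Find the end of the run of punctuation at the start of this result.
        i = 0
        while i < len(text) and text[i] in LEADING_PUNC:
            i += 1
        lead, body = text[:i], text[i:]
        if lead and sentences:
            # The leading marks belong to the sentence that came before.
            sentences[-1] += lead
        if body:
            sentences.append(body)
    return sentences


if __name__ == "__main__":
    # The two offline results from the log in this issue.
    print(reattach_leading_punc(["你好吗", "?我很好"]))
```

With the two results from the log, the question mark moves back onto “你好吗” and “我很好” is left unpunctuated, which is the remaining limitation described in this issue.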
