Why does reading a local audio file raise an error? #2213

Open
deict opened this issue Nov 18, 2024 · 2 comments
Labels
question Further information is requested

Comments

deict commented Nov 18, 2024

from funasr import AutoModel
from funasr.utils.postprocess_utils import rich_transcription_postprocess

model_dir = "iic/SenseVoiceSmall"

model = AutoModel(
    model=model_dir,
    vad_model="fsmn-vad",
    vad_kwargs={"max_single_segment_time": 30000},
    device="cuda:0",
)

# en
res = model.generate(
    # Raw string: in a normal (or f-) string, "\r" in "\recording.wav" is a carriage return.
    input=r"D:\demo\demo_1\recording.wav",
    cache={},
    language="auto",  # "zh", "en", "yue", "ja", "ko", "nospeech"
    use_itn=True,
    batch_size_s=60,
    merge_vad=True,
    merge_length_s=15,
)
text = rich_transcription_postprocess(res[0]["text"])
print(text)
The above is the code I ran.

Notice: ffmpeg is not installed. torchaudio is used to load audio
If you want to use ffmpeg backend to load audio, please install it by:
sudo apt install ffmpeg # ubuntu
# brew install ffmpeg # mac
Key Conformer already exists in model_classes, re-register
Key Linear already exists in adaptor_classes, re-register
Key TransformerDecoder already exists in decoder_classes, re-register
Key LightweightConvolutionTransformerDecoder already exists in decoder_classes, re-register
Key LightweightConvolution2DTransformerDecoder already exists in decoder_classes, re-register
Key DynamicConvolutionTransformerDecoder already exists in decoder_classes, re-register
Key DynamicConvolution2DTransformerDecoder already exists in decoder_classes, re-register
funasr version: 1.1.14.
Check update of funasr, and it would cost few times. You may disable it by set disable_update=True in AutoModel
You are using the latest version of funasr-1.1.14
2024-11-18 15:23:47,114 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
2024-11-18 15:23:50,544 - modelscope - WARNING - Using branch: master as version is unstable, use with caution
0%| | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "D:\Python\lib\site-packages\funasr\utils\load_utils.py", line 93, in load_audio_text_image_video
    data_or_path_or_list, audio_fs = torchaudio.load(data_or_path_or_list)
  File "D:\Python\lib\site-packages\torchaudio\_backend\utils.py", line 203, in load
    return backend.load(uri, frame_offset, num_frames, normalize, channels_first, format, buffer_size)
  File "D:\Python\lib\site-packages\torchaudio\_backend\soundfile.py", line 26, in load
    return soundfile_backend.load(uri, frame_offset, num_frames, normalize, channels_first, format)
  File "D:\Python\lib\site-packages\torchaudio\_backend\soundfile_backend.py", line 221, in load
    with soundfile.SoundFile(filepath, "r") as file_:
  File "D:\Python\lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "D:\Python\lib\site-packages\soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'D:\demo\demo_1\recording.wav': Format not recognised.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "D:\demo\语音识别.py", line 60, in <module>
    res = model.generate(
  File "D:\Python\lib\site-packages\funasr\auto\auto_model.py", line 304, in generate
    return self.inference_with_vad(input, input_len=input_len, **cfg)
  File "D:\Python\lib\site-packages\funasr\auto\auto_model.py", line 377, in inference_with_vad
    res = self.inference(
  File "D:\Python\lib\site-packages\funasr\auto\auto_model.py", line 343, in inference
    res = model.inference(**batch, **kwargs)
  File "D:\Python\lib\site-packages\funasr\models\fsmn_vad_streaming\model.py", line 676, in inference
    audio_sample_list = load_audio_text_image_video(
  File "D:\Python\lib\site-packages\funasr\utils\load_utils.py", line 72, in load_audio_text_image_video
    return [
  File "D:\Python\lib\site-packages\funasr\utils\load_utils.py", line 73, in <listcomp>
    load_audio_text_image_video(
  File "D:\Python\lib\site-packages\funasr\utils\load_utils.py", line 97, in load_audio_text_image_video
    data_or_path_or_list = _load_audio_ffmpeg(data_or_path_or_list, sr=fs)
  File "D:\Python\lib\site-packages\funasr\utils\load_utils.py", line 213, in _load_audio_ffmpeg
    out = run(cmd, capture_output=True, check=True).stdout
  File "D:\Python\lib\subprocess.py", line 501, in run
    with Popen(*popenargs, **kwargs) as process:
  File "D:\Python\lib\subprocess.py", line 969, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "D:\Python\lib\subprocess.py", line 1438, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.
0%| | 0/1 [00:00<?, ?it/s]
This is the error.
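The first traceback ends with "Format not recognised", which suggests libsndfile opened something but did not see a header it understands. One possible cause, not confirmed anywhere in this thread, is a file that carries a .wav extension but holds another format. A quick way to check is to look at the first 12 bytes. The sketch below is only a rough diagnostic: the helper name looks_like_wav is made up for illustration, and it only recognises the plain RIFF/WAVE layout (variants such as RF64 would be reported as False).

```python
def looks_like_wav(path):
    """Return True if the file starts with a plain RIFF/WAVE header.

    Hypothetical helper for diagnosis only; it does not validate the
    rest of the file and does not recognise RF64 or RIFX variants.
    """
    with open(path, "rb") as f:
        header = f.read(12)
    # Bytes 0-3 are the container tag, bytes 8-11 the form type.
    return len(header) == 12 and header[:4] == b"RIFF" and header[8:12] == b"WAVE"
```

Calling looks_like_wav(r"D:\demo\demo_1\recording.wav") and getting False would point to the file contents (or the path) rather than to FunASR itself.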

deict added the question label Nov 18, 2024

michsiu commented Dec 5, 2024

Same problem here. Did you manage to solve it?

michsiu commented Dec 6, 2024

I found a solution: installing ffmpeg fixes it.
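This matches the second traceback: after torchaudio.load fails, load_utils.py falls back to _load_audio_ffmpeg, which starts ffmpeg through subprocess.run, and the FileNotFoundError means Windows could not find that executable. A minimal sketch of a check to run before model.generate is shown below. It assumes the executable is named ffmpeg and is looked up on PATH; the function names ffmpeg_available and require_ffmpeg are hypothetical and not part of FunASR.

```python
import shutil


def ffmpeg_available():
    """Return True if an executable named `ffmpeg` can be found on PATH."""
    return shutil.which("ffmpeg") is not None


def require_ffmpeg():
    """Raise a readable error early instead of a bare WinError 2 later."""
    if not ffmpeg_available():
        raise RuntimeError(
            "ffmpeg was not found on PATH; install it and make sure its "
            "directory is on PATH before loading audio."
        )
```

Calling require_ffmpeg() once at the top of the script turns the late failure inside subprocess into an explicit message. After installing ffmpeg on Windows, a new terminal session is usually needed for the updated PATH to take effect.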
