RuntimeError during inference: Expected 2D or 3D (batch mode) tensor with possibly 0 batch size and other non-zero dimensions for input, but got: [1, 0, 0] #1974
Labels: question
Traceback (most recent call last):
File "/root/workspace/FunASR/examples/industrial_data_pretraining/sense_voice/deno2.py", line 28, in <module>
res = model.generate(
^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/auto/auto_model.py", line 263, in generate
return self.inference_with_vad(input, input_len=input_len, **cfg)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/auto/auto_model.py", line 417, in inference_with_vad
results = self.inference(
^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/auto/auto_model.py", line 302, in inference
res = model.inference(**batch, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/models/sense_voice/model.py", line 832, in inference
speech, speech_lengths = extract_fbank(
^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/utils/load_utils.py", line 173, in extract_fbank
data, data_len = frontend(data, data_len, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1518, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1527, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/funasr/frontends/wav_frontend.py", line 134, in forward
mat = kaldi.fbank(
^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/torchaudio/compliance/kaldi.py", line 600, in fbank
strided_input, signal_log_energy = _get_window(
^^^^^^^^^^^^
File "/usr/bin/anaconda3/envs/Whisper-Finetune/lib/python3.11/site-packages/torchaudio/compliance/kaldi.py", line 195, in _get_window
offset_strided_input = torch.nn.functional.pad(strided_input.unsqueeze(0), (1, 0), mode="replicate").squeeze(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Expected 2D or 3D (batch mode) tensor with possibly 0 batch size and other non-zero dimensions for input, but got: [1, 0, 0]
How can I fix this? Some audio files are processed fine, while others raise this error.
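One possible reading of the traceback, offered as a guess rather than a confirmed diagnosis: the shape `[1, 0, 0]` reaching `torch.nn.functional.pad` inside `_get_window` suggests that the waveform handed to `kaldi.fbank` produced zero frames, which would happen if a clip (or a segment cut out by VAD) is empty or shorter than one analysis window. The sketch below flags such files before they are passed to `model.generate`. The helper names are hypothetical, the 25 ms frame length is an assumed default that is not stated in this report, and the check only reads WAV files through the standard-library `wave` module.

```python
import wave


def min_samples_for_one_frame(sample_rate: int = 16000,
                              frame_length_ms: float = 25.0) -> int:
    """Number of samples needed to fill one analysis window.

    Both defaults are assumptions (16 kHz audio, 25 ms window).
    """
    return int(sample_rate * frame_length_ms / 1000)


def is_long_enough(num_samples: int,
                   sample_rate: int = 16000,
                   frame_length_ms: float = 25.0) -> bool:
    """True if the clip can yield at least one fbank frame."""
    return num_samples >= min_samples_for_one_frame(sample_rate, frame_length_ms)


def wav_info(path: str) -> tuple[int, int]:
    """Return (samples per channel, sample rate) for a WAV file."""
    with wave.open(path, "rb") as f:
        return f.getnframes(), f.getframerate()


def find_too_short_wavs(paths: list[str],
                        frame_length_ms: float = 25.0) -> list[str]:
    """Return the paths whose audio is shorter than one analysis window."""
    too_short = []
    for path in paths:
        num_samples, rate = wav_info(path)
        if not is_long_enough(num_samples, rate, frame_length_ms):
            too_short.append(path)
    return too_short
```

Running `find_too_short_wavs` over the files that fail and the files that succeed would show whether the failures line up with empty or very short clips. If they do not, the short input may instead be a segment produced by the VAD step, which this file-level check cannot see.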