IndexError: index 3 is out of bounds for dimension 1 with size 2 #2211

JonneryR · 2024-11-15T14:01:20Z

Question:
[rank0]: File "/home/jovyan/.conda/envs/funasr/lib/python3.11/site-packages/funasr/models/sense_voice/model.py", line 758, in encode
[rank0]: [[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]
[rank0]: ~~~~^^^^^^
[rank0]: IndexError: index 3 is out of bounds for dimension 1 with size 2

What have you tried?

1. I printed the shape of text:
torch.Size([82, 3])
torch.Size([89, 2])
2. I added text_language, emo and event fields to the jsonl file:
e.g.
{"key": "wechat_msg_chunk_12135561624217528293_1709826135802_external_left_chunk_32_16k_real_8k_alaw__629597", "source": "/home/jovyan/jonneryr/asr/sales/data/train/label/train_wavs_wer_less_than_5_16k/wechat_msg_chunk_12135561624217528293_1709826135802_external_left_chunk_32_16k_real_8k_alaw_.wav", "source_len": 261, "target": "零六没有", "target_len": 4, "emo_target": "<|NEUTRAL|>", "emo_target_len": 11, "event_target": "<|Speech|>", "event_target_len": 10, "text_language": "<|zh|>", "text_language_len": 6}

What's your environment?

OS (e.g., Linux): Linux
FunASR Version (e.g., 1.0.0): 1.1.14
ModelScope Version (e.g., 1.11.0): 1.19.2
PyTorch Version (e.g., 2.0.0): 2.5.1
How you installed funasr (pip, source): pip
Python version: 3.11
GPU (e.g., V100M32): A800
CUDA/cuDNN version (e.g., cuda11.7): cuda12.4
Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1)
Any other relevant information:

JonneryR · 2024-11-15T15:17:21Z

It looks like speech gets padded, so shouldn't text get some padding as well?
But there is no code that does this.

JonneryR · 2024-11-18T05:53:03Z

Solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because it is the only dataset with the code that pads the front of text.
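
As a rough illustration of why prepending to the front of text fixes the index error, here is a plain-Python sketch. It is not the SenseVoiceCTCDataset implementation: the helper names, the token ids, and the slot names are all assumptions. The only thing taken from the traceback is that column 3 is read as a text-normalization style id.

```python
# Hypothetical ids for illustration only; real ids come from the model's own dictionaries.
LANGUAGE_ID, EMO_ID, EVENT_ID, TEXTNORM_ID = 1, 2, 3, 4


def prepend_prefix(token_ids, prefix=(LANGUAGE_ID, EMO_ID, EVENT_ID, TEXTNORM_ID)):
    """Return a new list with four prefix slots placed before the transcript tokens."""
    return list(prefix) + list(token_ids)


def pad_batch(rows, pad_id=-1):
    """Right-pad every row to the longest row so the batch is rectangular."""
    width = max(len(row) for row in rows)
    return [row + [pad_id] * (width - len(row)) for row in rows]


transcripts = [[10, 11], [12]]  # short targets, like the size-2 batch in the issue

without_prefix = pad_batch(transcripts)
with_prefix = pad_batch([prepend_prefix(t) for t in transcripts])

print(len(without_prefix[0]))        # width 2: column 3 does not exist
print(len(with_prefix[0]))           # width 6: column 3 exists
print([row[3] for row in with_prefix])  # the style slot of every row
```

With the prefix in place every row has at least four columns, no matter how short the transcript is.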

ouyang982020 · 2024-12-03T08:39:21Z

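> Solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because it is the only dataset with the code that pads the front of text.

I've run into the same problem. Could you tell me where to make the change and select the dataset?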

JonneryR · 2024-12-03T08:51:30Z

Just set dataset to SenseVoiceCTCDataset when training.

ouyang982020 · 2024-12-03T09:30:04Z

> Just set dataset to SenseVoiceCTCDataset when training.

OK, I think I understand now. It can be added on the command line.

JonneryR added the "question" label (Further information is requested) on Nov 15, 2024
