IndexError: index 3 is out of bounds for dimension 1 with size 2 #2211

Open
JonneryR opened this issue Nov 15, 2024 · 5 comments
Labels
question Further information is requested

Comments

@JonneryR

Question:
[rank0]: File "/home/jovyan/.conda/envs/funasr/lib/python3.11/site-packages/funasr/models/sense_voice/model.py", line 758, in encode
[rank0]: [[self.textnorm_int_dict[int(style)]] for style in text[:, 3]]
[rank0]: ~~~~^^^^^^
[rank0]: IndexError: index 3 is out of bounds for dimension 1 with size 2

What have you tried?

1. I printed the shape of text for two batches:
torch.Size([82, 3])
torch.Size([89, 2])
2. I tried adding text_language, emo and event fields to the jsonl file, but the error persists:
e.g.
{"key": "wechat_msg_chunk_12135561624217528293_1709826135802_external_left_chunk_32_16k_real_8k_alaw__629597", "source": "/home/jovyan/jonneryr/asr/sales/data/train/label/train_wavs_wer_less_than_5_16k/wechat_msg_chunk_12135561624217528293_1709826135802_external_left_chunk_32_16k_real_8k_alaw_.wav", "source_len": 261, "target": "零六没有", "target_len": 4, "emo_target": "<|NEUTRAL|>", "emo_target_len": 11, "event_target": "<|Speech|>", "event_target_len": 10, "text_language": "<|zh|>", "text_language_len": 6}
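A note on why the printed shapes trigger the traceback: `text[:, 3]` reads the fourth column, so the second dimension must be at least 4, and both reported shapes (3 and 2 columns) are too narrow. The sketch below is a pure-Python stand-in for that check, not FunASR code; the helper name `can_index_column` and the sample token ids are made up for illustration.

```python
def can_index_column(shape, column):
    """Return True if `column` is a valid index into dimension 1 of a 2-D shape."""
    _rows, cols = shape
    return 0 <= column < cols


# Shapes reported in this issue, plus a hypothetical shape that would be wide enough.
for shape in [(82, 3), (89, 2), (82, 4)]:
    print(shape, "column 3 readable:", can_index_column(shape, 3))

# The same failure with nested lists standing in for the tensor.
batch = [[11, 12], [21, 22]]  # hypothetical token ids, 2 columns per row
try:
    styles = [row[3] for row in batch]  # analogous to text[:, 3]
except IndexError as exc:
    styles = None
    print("IndexError:", exc)
```

Since the width of `text` comes from the token sequence itself, adding extra fields to the jsonl file does not help unless the dataset class actually turns them into leading tokens.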

What's your environment?

  • OS (e.g., Linux): Linux
  • FunASR Version (e.g., 1.0.0): 1.1.14
  • ModelScope Version (e.g., 1.11.0): 1.19.2
  • PyTorch Version (e.g., 2.0.0): 2.5.1
  • How you installed funasr (pip, source): pip
  • Python version: 3.11
  • GPU (e.g., V100M32): A800
  • CUDA/cuDNN version (e.g., cuda11.7): cuda12.4
  • Docker version (e.g., funasr-runtime-sdk-cpu-0.4.1):
  • Any other relevant information:
@JonneryR JonneryR added the question Further information is requested label Nov 15, 2024
@JonneryR
Author

It looks like speech gets padded, so shouldn't text get some padding as well?
But there is no code for that part.

@JonneryR
Author

Problem solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because that is the only dataset with the code that pads the front of text.
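To illustrate what padding the front of text means here, below is a minimal pure-Python sketch, not the actual SenseVoiceCTCDataset implementation. It assumes four leading slots (language, emotion, event, text normalization) so that column 3 exists; that slot order, the placeholder ids, and the helper names `prepend_prefix` and `pad_batch` are all assumptions made for illustration.

```python
# Hypothetical placeholder ids; real values would come from the model's tokenizer.
LANG_ID, EMO_ID, EVENT_ID, TEXTNORM_ID = 1, 2, 3, 4


def prepend_prefix(token_ids, prefix=(LANG_ID, EMO_ID, EVENT_ID, TEXTNORM_ID)):
    """Return a new list with the four prefix ids placed before the text tokens."""
    return list(prefix) + list(token_ids)


def pad_batch(rows, pad_id=-1):
    """Right-pad rows to a common length so the batch is rectangular."""
    width = max(len(row) for row in rows)
    return [row + [pad_id] * (width - len(row)) for row in rows]


raw = [[101, 102], [201]]  # hypothetical text token ids of different lengths
batch = pad_batch([prepend_prefix(row) for row in raw])
styles = [row[3] for row in batch]  # analogous to text[:, 3] in encode()
print(batch)
print(styles)
```

With the prefix in place, every row has at least four columns, so the lookup at index 3 no longer goes out of bounds, even for very short transcripts.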

@ouyang982020

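> Problem solved. It was a dataset selection issue: you need to select SenseVoiceCTCDataset, because that is the only dataset with the code that pads the front of text.

I've run into the same problem. Could you tell me where to make the change and select the dataset?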

@JonneryR
Author

JonneryR commented Dec 3, 2024

Just set the dataset to SenseVoiceCTCDataset when training.

@ouyang982020

> Just set the dataset to SenseVoiceCTCDataset when training.

OK, I think I understand now; it can be added on the command line.

2 participants