Empty Recordings File in utils/fix_data_dir.sh Script #4920
Description
I am using the WeSpeaker pipeline and the Kaldi toolkit for a speaker diarization task, with ResNet as the feature extractor. When I filter my segments file with the script utils/fix_data_dir.sh, the script reduces it to zero lines because the temporary file /tmp/kaldi.XXXX/recordings has no entries.
The script is available here: https://github.com/kaldi-asr/kaldi/blob/master/egs/wsj/s5/utils/fix_data_dir.sh
This is part of the error output:
utils/fix_data_dir.sh: filtered /data1/XYZ/ABC/speaker_diarization/SHARC_check/tools_wespk/data/ABC_dev_fbank_seg/old_dir/segments from 8310 to 0 lines based on filter /tmp/kaldi.3oKA/recordings.
I found that the file /tmp/kaldi.XXXX/recordings generated by the script is empty, which causes the script to filter out all lines from the segments file.
- What might be causing the /tmp/kaldi.XXXX/recordings file to be empty?
- Are there any known issues or additional steps required to ensure the recordings file is correctly populated?
If required, I can provide the formats of the relevant files so they can be checked for formatting mismatches between segments and wav.scp, which are used to generate the /tmp/kaldi.XXXX/recordings file.
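As a quick sanity check, the recording IDs in the two files can be compared directly. The sketch below is not part of Kaldi or WeSpeaker. It assumes the standard Kaldi layouts, where each segments line is `<utterance-id> <recording-id> <start> <end>` and each wav.scp line begins with `<recording-id>`, and it assumes that the recordings list ends up empty when these two sets of IDs do not overlap. The names `read_column` and `compare_recording_ids` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path


def read_column(path, index):
    """Collect the whitespace-separated field at `index` from every line that has one."""
    ids = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        fields = line.split()
        if len(fields) > index:
            ids.add(fields[index])
    return ids


def compare_recording_ids(segments_path, wav_scp_path):
    """Compare recording IDs: field 2 of segments against field 1 of wav.scp.

    Assumption: an empty "common" set is what would leave the filter list empty.
    """
    seg_ids = read_column(segments_path, 1)  # second field: recording-id
    wav_ids = read_column(wav_scp_path, 0)   # first field: recording-id
    return {
        "common": seg_ids & wav_ids,
        "only_in_segments": seg_ids - wav_ids,
        "only_in_wav_scp": wav_ids - seg_ids,
    }
```

Calling `compare_recording_ids` on the segments and wav.scp files of the data directory and printing a few entries from `only_in_segments` and `only_in_wav_scp` should show whether the IDs differ, for example by a prefix, a file extension, or an extra field in the segments file.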
Thanks