AssertionError: choose a window size 400 that is [2, 1] #133

Open
@GrafKnusprig

I am trying to use the feature extractor on my audio files.
They are all sampled at 16000 Hz and are 5 seconds long, so waveform.shape[1] is 80000.

input_values = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt").input_values

I get the error:
AssertionError: choose a window size 400 that is [2, 1]
and I don't really know what to do with it.

Here is the whole thing:

```python
import torch
import torchaudio
from tqdm import tqdm

# feature_extractor and dataset_dict are created elsewhere (not shown here).


def preprocess_function(examples):
    audio_files = examples['file_path']
    inputs = {'input_values': []}
    for audio_file in tqdm(audio_files, desc="Preprocessing dataset"):
        # torchaudio.load returns a (channels, samples) tensor and the sample rate
        waveform, sample_rate = torchaudio.load(audio_file)
        # Ensure sample rate is 16000 Hz
        assert sample_rate == 16000, f"Expected sample rate of 16000 Hz, but got {sample_rate} Hz"
        # Assuming all audio files are 5 seconds long
        max_len = 16000 * 5  # 5 seconds at 16000 Hz
        # Pad or truncate to the maximum length
        print(waveform.shape[1])
        if waveform.shape[1] > max_len:
            waveform = waveform[:, :max_len]
        else:
            waveform = torch.nn.functional.pad(waveform, (0, max_len - waveform.shape[1]), "constant", 0)
        # This is the call that raises the AssertionError
        input_values = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt").input_values
        inputs['input_values'].append(input_values.squeeze(0))
    return inputs


processed_dataset = dataset_dict.map(preprocess_function, batched=True, remove_columns=['file_path'])
```
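
One possible reading of the message, offered as a sketch and not a confirmed diagnosis: the `[2, 1]` looks like an allowed range whose upper bound is the length of the input, and the length of a `(1, 80000)` waveform along its first axis is 1, not 80000. The snippet below uses plain Python lists and no audio library. `window_size_in_samples` and `check_window` are hypothetical helpers written for illustration, and the 25 ms frame length (which gives 400 samples at 16000 Hz) is an assumption, since the post does not show the extractor's internals.

```python
# Hypothetical stand-in for a framing check that would produce the reported message.
# Assumption: the analysis window is 25 ms long, i.e. 0.025 * 16000 = 400 samples.


def window_size_in_samples(sampling_rate, frame_length_ms=25.0):
    """Convert a frame length in milliseconds to a number of samples."""
    return int(sampling_rate * frame_length_ms / 1000.0)


def check_window(waveform, sampling_rate=16000):
    """Return the message such a check would raise, or None if the check passes."""
    window_size = window_size_in_samples(sampling_rate)
    # For a nested (channels, samples) list, len() is the channel count,
    # not the number of samples.
    n = len(waveform)
    if not 2 <= window_size <= n:
        return f"choose a window size {window_size} that is [2, {n}]"
    return None


mono_2d = [[0.0] * 80000]  # shape (1, 80000), like the padded waveform in the post
mono_1d = mono_2d[0]       # shape (80000,), the channel axis removed

print(check_window(mono_2d))  # the 2-D input fails the range check
print(check_window(mono_1d))  # the 1-D input passes
```

If this reading applies, passing a 1-D array to the extractor, for example `waveform.squeeze(0).numpy()`, would be worth trying. That is a guess based on the shapes alone.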

Labels: bug