RuntimeError: upper bound and larger bound inconsistent with step sign #54

ecooper7 · 2024-11-28T00:20:29Z

Hello, thanks very much for sharing your code and pretrained models! I have already run prediction successfully with the nisqa_tts.tar pretrained model on several datasets, but one dataset produced this error (the full error text follows):

/home/SLC/users/ecooper/miniconda3/envs/nisqa/lib/python3.9/site-packages/librosa/core/spectrum.py:222: UserWarning: n_fft=4096 is too small for input signal of length=743
  warnings.warn(
Traceback (most recent call last):
  File "/share02/SLC/users/ecooper/workspace/NISQA/run_predict.py", line 43, in <module>
    nisqa.predict()
  File "/share02/SLC/users/ecooper/workspace/NISQA/nisqa/NISQA_model.py", line 67, in predict
    y_val_hat, y_val = NL.predict_mos(
  File "/share02/SLC/users/ecooper/workspace/NISQA/nisqa/NISQA_lib.py", line 1434, in predict_mos
    y_hat_list = [ [model(xb.to(dev), n_wins.to(dev)).cpu().numpy(), yb.cpu().numpy()] for xb, yb, (idx, n_wins) in dl]
  File "/share02/SLC/users/ecooper/workspace/NISQA/nisqa/NISQA_lib.py", line 1434, in <listcomp>
    y_hat_list = [ [model(xb.to(dev), n_wins.to(dev)).cpu().numpy(), yb.cpu().numpy()] for xb, yb, (idx, n_wins) in dl]
  File "/home/SLC/users/ecooper/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 521, in __next__
    data = self._next_data()
  File "/home/SLC/users/ecooper/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 561, in _next_data
    data = self._dataset_fetcher.fetch(index)  # may raise StopIteration
  File "/home/SLC/users/ecooper/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/SLC/users/ecooper/miniconda3/envs/nisqa/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 49, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/share02/SLC/users/ecooper/workspace/NISQA/nisqa/NISQA_lib.py", line 2180, in __getitem__
    x_spec_seg, n_wins = segment_specs(file_path,
  File "/share02/SLC/users/ecooper/workspace/NISQA/nisqa/NISQA_lib.py", line 2262, in segment_specs
    idx2 = torch.arange(n_wins)
RuntimeError: upper bound and larger bound inconsistent with step sign

This appears to be caused by audio samples that are too short, so I removed some of the shortest samples, but I still get this error. My other datasets contain samples that are equally short or shorter (I am guessing that some batch-wise padding is happening which avoids the problem there?). In any case, I just wanted to ask whether there is a minimum audio sample length required for prediction. Thanks very much in advance for any advice!


gabrielmittag · 2024-12-01T00:44:08Z

Hi,

Thanks for bringing it up. This seems to happen when fewer segments (windows) are available than the model requires. The TTS model requires 15 segments, and each segment is 0.02 seconds long with a hop size of 0.01 seconds, so the minimum length should be 160 ms, but in my tests 140 ms seems to be sufficient. That might be because the actual FFT window size is larger and some padding is applied.
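For reference, the arithmetic behind those numbers can be sketched as follows. This is only an illustration, not NISQA code: the function names are hypothetical, and the centered-padding case assumes the spectrogram is framed the way librosa frames a signal by default (`center=True`), which may or may not match what NISQA does internally.

```python
def min_duration_ms(n_windows: int, win_ms: int = 20, hop_ms: int = 10) -> int:
    """Shortest signal that yields n_windows full windows without padding."""
    return (n_windows - 1) * hop_ms + win_ms


def windows_without_padding(duration_ms: int, win_ms: int = 20, hop_ms: int = 10) -> int:
    """Number of full windows that fit when the signal is not padded."""
    if duration_ms < win_ms:
        return 0
    return (duration_ms - win_ms) // hop_ms + 1


def windows_with_center_padding(duration_ms: int, hop_ms: int = 10) -> int:
    """Number of frames when the signal is padded so frames are centered
    (assumption: librosa-style framing, 1 + n_samples // hop_length)."""
    return duration_ms // hop_ms + 1


print(min_duration_ms(15))               # 160
print(windows_without_padding(140))      # 13
print(windows_with_center_padding(140))  # 15
```

Under that assumption, a 140 ms sample already produces 15 frames, which would be consistent with the observation that 140 ms is enough in practice.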

Overall I don't know how reliable the model would be for such short samples. It was mostly trained on the Blizzard challenge datasets and they did not contain such short samples.

A workaround could be to add zero padding to the audio sample directly. This could also be done within the code after the sample is loaded here:

NISQA/nisqa/NISQA_lib.py

Line 2295 in ac83137

y, sr = lb.load(file_path, sr=sr, mono=False)
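A minimal sketch of that zero-padding workaround, assuming the audio is a NumPy array as returned by `lb.load`. The helper `pad_to_min_duration` is hypothetical and not part of NISQA, and the 160 ms default is simply the figure mentioned above. It pads with trailing zeros along the last axis so it also handles the `(channels, samples)` shape that `mono=False` can produce.

```python
import numpy as np


def pad_to_min_duration(y: np.ndarray, sr: int, min_ms: int = 160) -> np.ndarray:
    """Append trailing zeros so the signal is at least min_ms long.

    Hypothetical helper, not part of NISQA. Works for 1-D (samples,)
    and 2-D (channels, samples) arrays; longer signals are returned as-is.
    """
    # Ceiling division so the padded signal is never shorter than min_ms.
    min_samples = -(-min_ms * sr // 1000)
    missing = min_samples - y.shape[-1]
    if missing <= 0:
        return y
    # Pad only the last (time) axis, and only at the end.
    pad_width = [(0, 0)] * (y.ndim - 1) + [(0, missing)]
    return np.pad(y, pad_width, mode="constant")


# A 50 ms sample at 16 kHz (800 samples) is extended to 160 ms (2560 samples).
short = np.ones(800, dtype=np.float32)
print(pad_to_min_duration(short, sr=16000).shape)  # (2560,)
```

It could be called right after the load line, for example `y = pad_to_min_duration(y, sr)`. Whether predictions on heavily padded samples are meaningful is a separate question, as noted above.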

BTW, I have now added an explicit error for this case to make it clearer what the issue is. I am not sure why it wouldn't happen for some samples that are shorter. Do you have samples longer than 160 ms that fail?

ecooper7 · 2024-12-03T05:06:43Z

Hi, thanks so much for the helpful information. I actually did have one super short audio sample (50 ms!) that I had somehow missed earlier, and it was causing the failure. After removing it, I was able to run prediction successfully.

ecooper7 closed this as completed Dec 3, 2024
