Description
🐛 Bug
For single-channel MP3 files, the length returned by torchaudio.info() is slightly longer than the length of the tensor returned by torchaudio.load(). For files with a sample rate of 16 kHz, the length reported by info is always 576 samples longer; for a sample rate of 48 kHz, it is 1152 samples longer.
To Reproduce
Steps to reproduce the behavior:
import torchaudio
path = ... # some MP3 file location
print(torchaudio.info(path)[0].channels) # Verify that file is single channel
print(torchaudio.info(path)[0].length)
print(torchaudio.load(path)[0].size(1))
Expected behavior
The above code should output identical lengths, e.g.
1
96768
96768
Environment
- What commands did you use to install torchaudio (conda/pip/build from source)?
Installed with pip using conda env create -f environment.yml, where environment.yml contains the dependencies for the project.
- What does torchaudio.__version__ print? (If applicable)
0.3.0
PyTorch version: 1.2.0
Is debug build: No
CUDA used to build PyTorch: 10.0.130
OS: Ubuntu 18.04.3 LTS
GCC version: (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0
CMake version: version 3.10.2
Python version: 3.7
Is CUDA available: Yes
CUDA runtime version: Could not collect
GPU models and configuration: GPU 0: GeForce GTX 1050 Ti
Nvidia driver version: 430.50
cuDNN version: Could not collect
Versions of relevant libraries:
[pip] numpy==1.16.4
[pip] torch==1.2.0
[pip] torchaudio==0.3.0
[pip] torchvision==0.4.0a0+6b959ee
[conda] _pytorch_select 0.1 cpu_0 anaconda
[conda] blas 1.0 mkl conda-forge/label/cf201901
[conda] mkl 2019.4 243 anaconda
[conda] mkl-service 2.3.0 py37he904b0f_0 anaconda
[conda] mkl_fft 1.0.12 py37ha843d7b_0 anaconda
[conda] mkl_random 1.0.2 py37hd81dba3_0 anaconda
[conda] pytorch 1.2.0 py3.7_cuda10.0.130_cudnn7.6.2_0 pytorch
[conda] torchaudio 0.3.0 pypi_0 pypi
[conda] torchvision 0.4.0 py37_cu100 pytorch
Additional context
Checking the length of the files with soxi -D filename suggests that the length reported by info is the correct one. The example code above behaves as expected (i.e. info and load agree on lengths) when using a FLAC file in my environment.
The MP3 files tested came from the Common Voice data set and from converting some of the FLAC files in the LibriSpeech data set to MP3.
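One possibly relevant observation: the two offsets reported above (576 at 16 kHz, 1152 at 48 kHz) match the number of samples in a single MPEG Layer III frame at those sample rates, so the difference may correspond to exactly one MP3 frame. This is only a guess about the cause and has not been verified against the torchaudio or SoX source. The sketch below is plain Python with no torchaudio dependency; the helper names layer3_samples_per_frame and extra_frames are hypothetical and exist only for illustration.

```python
# Samples per frame for MPEG audio Layer III.
# MPEG-1 covers 32/44.1/48 kHz and uses 1152 samples per frame;
# MPEG-2 and MPEG-2.5 cover the lower rates and use 576 samples per frame.
MPEG1_RATES = {32000, 44100, 48000}
LOWER_RATES = {8000, 11025, 12000, 16000, 22050, 24000}


def layer3_samples_per_frame(sample_rate):
    """Return the Layer III frame size in samples for a given sample rate."""
    if sample_rate in MPEG1_RATES:
        return 1152
    if sample_rate in LOWER_RATES:
        return 576
    raise ValueError("not a standard MPEG audio sample rate: %r" % sample_rate)


def extra_frames(info_length, load_length, sample_rate):
    """Hypothetical helper: express the info/load mismatch in MP3 frames.

    info_length is the length reported by info, load_length is the number
    of samples returned by load (single-channel file assumed).
    """
    diff = info_length - load_length
    return diff / layer3_samples_per_frame(sample_rate)


# The offsets from this report, expressed in frames.
print(extra_frames(576, 0, 16000))   # offset seen at 16 kHz
print(extra_frames(1152, 0, 48000))  # offset seen at 48 kHz
```

Both calls come out to one frame, which would be consistent with one frame being counted by info but not returned by load (or the reverse), although other explanations are possible.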