Description
I am unsure whether this is a bug or a feature request.
When junk data is between an ID3 data block and the following MP3 data, the Probe
does not report the file type MP3
.
Most other software I tested with simply skips the junk bytes (e.g. VLC, FFmpeg, MPV) without issues. Note that junk, in this case, does not mean the optional padding after ID3 data. The issue is with a "wrong" length field of the ID3 header.
Example code that panics on guess_file_type()
:
// test data that contains 4 bytes of junk (0x20) between the ID3 portion and the first MP3 frame
let data: [&[u8]; 4] = [
// ID3v2.3 header (10 bytes)
&[0x49, 0x44, 0x33, 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x23],
// TALB frame
&[
0x54, 0x41, 0x4C, 0x42, 0x00, 0x00, 0x00, 0x19, 0x00, 0x00, 0x01, 0xFF, 0xFE, 0x61,
0x00, 0x61, 0x00, 0x61, 0x00, 0x61, 0x00, 0x61, 0x00, 0x61, 0x00, 0x61, 0x00, 0x61,
0x00, 0x61, 0x00, 0x61, 0x00, 0x61, 0x00,
],
// 4 bytes of junk
&[0x20, 0x20, 0x20, 0x20],
// start of MP3 frame (not all bytes are shown in this slice)
&[
0xFF, 0xFB, 0x50, 0xC4, 0x00, 0x03, 0xC0, 0x00, 0x01, 0xA4, 0x00, 0x00, 0x00, 0x20,
0x00, 0x00, 0x34, 0x80, 0x00, 0x00, 0x04,
],
];
let data: Vec<u8> = data.into_iter().flatten().cloned().collect();
let data = std::io::Cursor::new(&data);
let probe = Probe::new(data);
let probe = probe.guess_file_type().unwrap();
matches!(probe.file_type(), Some(crate::FileType::MP3));
The issue is that only a simple 'if'-match is implemented for probing the sync bits of an MP3 frame:
Lines 180 to 186 in c27264f
Instead, a search for the 11 sync bits should be implemented. Is this something this library wants to do or is this something that is out of scope? I would offer to try to implement this (I suspect the search would need to be implemented somewhere else as well; e.g. for the properties).