FEATURE: Add V_MPEG2 track support in MKV demuxer for CC extraction #2152
MKV files with MPEG-2 video (common in DVD sources) were silently skipped. Add V_MPEG2 track detection and processing using the existing process_m2v() infrastructure, matching how mp4.c handles MPEG-2 streams. Fixes CCExtractor#2149
Pull request overview
Adds Matroska (MKV) demuxer support for V_MPEG2 video tracks so CCExtractor can extract EIA-608/708 captions from MPEG-2-in-MKV content (e.g., DVD-sourced rips), addressing the “no output” scenario reported in #2149.
Changes:
- Add `"V_MPEG2"` codec-id recognition and track-number tracking in the Matroska parser.
- Dispatch MPEG-2 SimpleBlock frames through a new `process_mpeg2_frame_mkv()` that reuses the existing MPEG-2 elementary stream processing path (`process_m2v()`).
- Update Matroska loop reporting to include MPEG-2 track detection.
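The codec-id recognition described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the enum and the helper `sketch_classify_codec()` are hypothetical names, and only the three codec-id strings come from the PR description. The PR itself records a track number in a context field rather than returning an enum.

```c
#include <string.h>

/* Hypothetical enum and helper, for illustration only. The PR's real code
   stores per-codec track numbers in its Matroska context struct instead. */
enum sketch_video_codec
{
	SKETCH_CODEC_OTHER = 0,
	SKETCH_CODEC_AVC,
	SKETCH_CODEC_HEVC,
	SKETCH_CODEC_MPEG2
};

/* Map a Matroska CodecID string to one of the video codecs that can carry
   captions; anything else is reported as SKETCH_CODEC_OTHER. */
static enum sketch_video_codec sketch_classify_codec(const char *codec_id)
{
	if (codec_id == NULL)
		return SKETCH_CODEC_OTHER;
	if (strcmp(codec_id, "V_MPEG4/ISO/AVC") == 0)
		return SKETCH_CODEC_AVC;
	if (strcmp(codec_id, "V_MPEGH/ISO/HEVC") == 0)
		return SKETCH_CODEC_HEVC;
	if (strcmp(codec_id, "V_MPEG2") == 0)
		return SKETCH_CODEC_MPEG2;
	return SKETCH_CODEC_OTHER;
}
```

Before this change, a `V_MPEG2` track would have fallen into the "other" bucket and its blocks would have been skipped, which matches the "no output" symptom from #2149.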
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/lib_ccx/matroska.h | Adds MPEG-2 codec id constant, a new mpeg2_track_number field, and a new frame-processing function prototype. |
| src/lib_ccx/matroska.c | Detects MPEG-2 tracks, dispatches MPEG-2 frames for processing, initializes/reports MPEG-2 track presence, and adjusts return behavior. |
Comments suppressed due to low confidence (2)
src/lib_ccx/matroska.c:699
- In the non-video-track fast-path, `skip_bytes(file, len - 1)` assumes the TrackNumber VINT is always 1 byte. Track numbers in Matroska are variable-length VINTs, so this can seek to the wrong position and desynchronize parsing. Compute the remaining bytes to skip based on the block start (pos) and current file position instead of subtracting 1.
int is_mpeg2 = (track == mkv_ctx->mpeg2_track_number);
if (!is_avc && !is_hevc && !is_mpeg2)
{
// Skip everything except AVC/HEVC tracks
skip_bytes(file, len - 1); // 1 byte for track
return;
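The suggestion in this review comment can be illustrated with a small sketch. It assumes the caller knows the block's start offset, its length, and the current read position; both helper names are hypothetical, and how CCExtractor actually obtains the file position is not shown here. The first helper shows why a track number is not always one byte: an EBML VINT's length is given by the position of the first set bit in its first byte.

```c
#include <stdint.h>

/* Length in bytes (1..8) of an EBML variable-length integer, derived from
   the position of the first set bit in its first byte. Returns 0 when the
   first byte is 0x00, which is not a valid start of a VINT of up to 8 bytes. */
static int sketch_vint_length(uint8_t first_byte)
{
	int length = 1;
	uint8_t mask = 0x80;

	if (first_byte == 0)
		return 0;
	while ((first_byte & mask) == 0)
	{
		mask >>= 1;
		length++;
	}
	return length;
}

/* Bytes still unread in a block of block_len bytes that began at
   block_start, given the current read position. Returns 0 when the
   position lies outside the block. */
static uint64_t sketch_remaining_in_block(uint64_t block_start,
					  uint64_t block_len,
					  uint64_t current_pos)
{
	uint64_t consumed;

	if (current_pos < block_start)
		return 0;
	consumed = current_pos - block_start;
	if (consumed >= block_len)
		return 0;
	return block_len - consumed;
}
```

With a two-byte track number, skipping `len - 1` bytes would overshoot by one byte, whereas skipping the value returned by `sketch_remaining_in_block()` lands exactly on the end of the block regardless of how many bytes the track number occupied.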
src/lib_ccx/matroska.c:699
- The comment says "Skip everything except AVC/HEVC tracks" but this block now also allows MPEG-2. Please update the comment to match the behavior so future changes don’t reintroduce accidental skips.
{
// Skip everything except AVC/HEVC tracks
skip_bytes(file, len - 1); // 1 byte for track
return;
CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit 733ed89...:
Congratulations: Merging this PR would fix the following tests:
All tests passed completely. Check the result page for more info.
CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit 733ed89...:
Congratulations: Merging this PR would fix the following tests:
All tests passed completely. Check the result page for more info.
Summary
CCExtractor's MKV demuxer only recognized `V_MPEG4/ISO/AVC` and `V_MPEGH/ISO/HEVC` tracks, silently skipping `V_MPEG2` tracks. MKV files with MPEG-2 video, common in DVD-sourced content, produced no output at all.
Reported in #2149 by a user trying to extract CC3 captions from a Fairly OddParents DVD.
Changes
- Add `mpeg2_codec_id = "V_MPEG2"` to `matroska.h`
- Add `mpeg2_track_number` field to `matroska_ctx` struct
- Detect `V_MPEG2` track during track entry parsing alongside AVC/HEVC
- Add `process_mpeg2_frame_mkv()` reusing the existing `process_m2v()` infrastructure (same path used by `mp4.c` and `general_loop.c`)
- Dispatch MPEG-2 frames in `parse_simple_block()`
- Initialize and report `mpeg2_track_number` alongside AVC/HEVC

Testing
Tested with the sample from #2149 (`cc3.mkv`, V_MPEG2, captions on CC3/Field 2).

Closes #2149