[FIX]: Restore XMLTV generation for ATSC EIT/VCT streams and correct EIT bounds checks #1773
CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit b293017...:
Congratulations: Merging this PR would fix the following tests:
All tests passing on the master branch were passed completely. Check the result page for more info. |
CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit b293017...:
Congratulations: Merging this PR would fix the following tests:
All tests passing on the master branch were passed completely. Check the result page for more info. |
|
Hi @cfsmp3, I noticed that master now includes changes to In this PR, the logic follows ATSC A/65 semantics:
I see that master currently duplicates segment 0 into both Before resolving the conflict, could you share your perspective on whether duplicating segment 0 into both fields was intentional (for example, as a fallback for single-segment streams), or whether preserving the multi-segment separation from this PR would be acceptable? I’d really appreciate your guidance on the intended behavior here so I can resolve the conflict correctly. Thanks! |
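For readers following the segment question, here is a minimal sketch of what "multi-segment separation" means at the byte level. It walks the first string of an ATSC A/65 multiple_string_structure and copies segment 0 and segment 1 into separate buffers instead of duplicating segment 0. The function name `atsc_split_segments` and the buffer handling are hypothetical and are not CCExtractor's code; the sketch assumes uncompressed text (compression_type 0, mode 0) and ignores any additional strings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not CCExtractor's implementation.
 * Copies up to two uncompressed segments of the FIRST string in a
 * multiple_string_structure: segment 0 -> title, segment 1 -> text.
 * Returns the number of segments copied, or -1 if the data is truncated. */
int atsc_split_segments(const uint8_t *buf, size_t len,
                        char *title, size_t title_size,
                        char *text, size_t text_size)
{
    size_t pos = 0;
    int copied = 0;

    assert(title_size > 0 && text_size > 0);
    title[0] = '\0';
    text[0] = '\0';

    if (len < 1)
        return -1;
    uint8_t number_strings = buf[pos++];
    if (number_strings == 0)
        return 0;

    /* ISO_639_language_code (3 bytes) + number_segments (1 byte) */
    if (len - pos < 4)
        return -1;
    pos += 3;
    uint8_t number_segments = buf[pos++];

    for (uint8_t seg = 0; seg < number_segments; seg++) {
        /* compression_type, mode, number_bytes: one byte each */
        if (len - pos < 3)
            return -1;
        uint8_t compression_type = buf[pos];
        uint8_t mode = buf[pos + 1];
        size_t number_bytes = buf[pos + 2];
        pos += 3;
        if (len - pos < number_bytes)
            return -1;

        /* Assumption: only uncompressed mode-0 text is handled; bytes are copied as-is. */
        if (seg < 2 && compression_type == 0 && mode == 0) {
            char *dst = (seg == 0) ? title : text;
            size_t dst_size = (seg == 0) ? title_size : text_size;
            size_t n = number_bytes < dst_size - 1 ? number_bytes : dst_size - 1;
            memcpy(dst, buf + pos, n);
            dst[n] = '\0';
            copied++;
        }
        pos += number_bytes;
    }
    return copied;
}
```

With a single-segment stream this leaves `text` empty rather than repeating the title, which is the behavioral difference the question above is about.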
|
@x15sr71 I'm working on fixing all our timing problems as well as the vulnerabilities that a linter can easily catch (use of memory without checking malloc, use of sprintf instead of snprintf, etc.). So many files are being modified over the weekend. Take a look at the PR that modified ts_tables_epg.c to see what happened. You can add a comment there if you think a change is wrong and needs work. I have two more large PRs in flight that I'm currently testing, and once those are merged you can expect calm again. I'm experimenting with Claude to address some of the long-standing issues and it has been an extremely productive session (timing is pretty much identical to FFmpeg in the reference samples I'm using), but at the same time it's possible this high speed is leading to some incorrect changes. BTW, if you are using input files that we don't have in our sample platform, it would be great if you could upload them so they become part of the official tests. |
|
I have documented in Issue #1759 that the existing PR1773 does not in fact include in the XMLTV output all ETT strings that match EIT events in the test stream. This description of the ETT format explains how to relate the EIT events to the ETT strings by bitwise combining the 16-bit Source ID, the 14-bit EIT Event ID, and 0x02. All of the EIT entries in my sample files have "ETM location: 1", meaning that the corresponding ETT string is present. I'll be happy to upload more TS samples if you tell me where to put them. |
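The bitwise combination described above can be sketched as follows. This follows the ETM_id layout in ATSC A/65 (source_id in the upper 16 bits, the 14-bit event_id shifted left by two, and 0b10 in the low two bits). The function names are hypothetical and are not taken from this PR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers illustrating the A/65 ETM_id layout. */

/* ETM_id for an event: source_id in bits 31..16, event_id in bits 15..2,
 * and the constant 0b10 in bits 1..0. */
uint32_t atsc_event_etm_id(uint16_t source_id, uint16_t event_id)
{
    assert(event_id <= 0x3FFF); /* event_id is a 14-bit field */
    return ((uint32_t)source_id << 16) | ((uint32_t)event_id << 2) | 0x02u;
}

/* ETM_id for a channel-level ETM: same layout with the low 16 bits zero. */
uint32_t atsc_channel_etm_id(uint16_t source_id)
{
    return (uint32_t)source_id << 16;
}

/* Returns 1 if an ETT carrying etm_id describes the given EIT event. */
int atsc_ett_matches_event(uint32_t etm_id, uint16_t source_id, uint16_t event_id)
{
    return etm_id == atsc_event_etm_id(source_id, event_id);
}
```

For example, source_id 3 and event_id 5 give an ETM_id of 0x00030016, so an ETT with that ETM_id carries the extended text for that event.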
Could you upload them to a google drive or dropbox or something like that and share them with me so I can fetch them? (then you can delete them). I can then copy them to our test suite and just have them tested for each pull request. |
|
Thanks for the context @cfsmp3, I’ll hold off on pushing for now and wait for the current EPG and timing-related changes on
That's great to hear, good to know the timing work is refined! |
|
Thanks for the clarification @TPeterson94070 and for pointing out the exact I’ve corrected the I’ve added the currently generated XML file here: 20251206ch29FullTS_epg.xml for 20251206ch29FullTS.ts I’ll be pushing these changes shortly. At the moment, as cfsmp3 mentioned, there are several large EPG and timing-related changes landing on Thanks again for catching this, it helped improve the correctness of the implementation. |
|
Thanks, @x15sr71 , for incorporating the ATSC ETM id "secret decoder ring" into PR1773. 😊 I see that the XML correctly identifies the EIT events by channel ID. However, the initial list of channel IDs at the head of the |
|
@cfsmp3, after confirming that 20-second and even 15-second clips contain all the broadcast EIT/ETT data, I made 20-second clips of the 21 RF channels that I can receive well and put them into this folder on my Google Drive. IIUC, you should be able to copy the files therein. If I'm mistaken about that folder's permissions and you cannot copy them, please let me know and I'll share them individually here. BTW, I didn't examine them all in detail, but I did see that ch12's EIT event start times are off by months! That doesn't affect the EIT-ETT linkage, but it will mess up any EPG that tries to show actual event times for that channel. I'm going to try to contact the station engineer and ask that the times be corrected. |
Maintainer Review
I've done a deep code review of this PR. Excellent work, @x15sr71! This is a high-quality fix for a real, important bug.
Critical Bug Fix Confirmed ✅
The original macro:
    #define CHECK_OFFSET(val) \
        if (offset + val < offset_end) \
            return
This returns when we're within bounds (wrong!). Your fix correctly returns when outside bounds:
    #define CHECK_OFFSET(val) \
        if (offset + (val) > offset_end) \
            return
This prevents potential buffer overreads.
Root Cause Analysis Confirmed ✅
Your diagnosis is correct: ATSC events end up in fallback storage (
Other Improvements ✅
CI Status ✅
All builds and all 237 sample platform tests pass. The sample platform indicates this would actually fix some currently failing tests.
Request
Please rebase on master when you're ready. Master has stabilized from the recent changes I was making. During the rebase, there's one minor cleanup: the diff shows what appears to be duplicate lines in
Once rebased, this should be ready to merge. Thank you for this thorough fix! |
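To make the corrected comparison concrete, here is a minimal sketch of the macro guarding a small reader. The record format (one length byte followed by that many payload bytes) and the function `read_record` are hypothetical, and `offset`/`offset_end` are modeled as byte indexes, which may differ from how the real parser tracks its position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Corrected form: bail out when reading (val) bytes would pass offset_end. */
#define CHECK_OFFSET(val)            \
    if (offset + (val) > offset_end) \
        return

/* Hypothetical reader: one length byte followed by that many payload bytes.
 * *payload_len is left untouched when the record is truncated. */
void read_record(const uint8_t *buf, size_t offset, size_t offset_end, size_t *payload_len)
{
    assert(buf != NULL && payload_len != NULL);

    CHECK_OFFSET(1);          /* need the length byte itself */
    size_t len = buf[offset];
    offset += 1;

    CHECK_OFFSET(len);        /* need the whole payload */
    *payload_len = len;
}
```

With a 4-byte buffer holding a length of 3, the second check evaluates 1 + 3 > 4, which is false, so a record that ends exactly at the buffer boundary is accepted. With the old `<` comparison the same in-bounds record would have triggered the early return.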
|
Thanks for the review and the kind words @cfsmp3! I’ve rebased the PR onto the current master and resolved the conflicts. During the rebase, I also cleaned up the duplicate The PR is now up to date with Thanks again for the thorough review and guidance. |
|
@cfsmp3 , I'd like to test my EPG generator using the improved ccextractor xmltv code on a DVB-T full TS file. Do you have any here? If not, can you suggest a source for my testing? P.S.: I've scanned through the ts files listed in CCExtractor/sample-platform and didn't see any that evidently contained DVB-T full-ts. |
https://drive.google.com/file/d/1jgTFaJelAaGTNYZCnBZSfaruVqDXJIaq/view?usp=drive_link I recorded this just now from Spanish TV. Haven't tested it (other than to verify that it does play). |
Description
Fixes #1759 - This PR restores functional XMLTV generation for ATSC broadcast streams and adds comprehensive EPG parsing capabilities. ATSC streams with EIT/VCT/ETT tables now generate complete XMLTV output with program titles, descriptions, and extended text metadata.
Problem
The -xmltv parameter was completely non-functional for ATSC broadcast streams. When processing ATSC transport streams containing valid EPG data (EIT tables), channel information (VCT/TVCT tables), and extended text (ETT tables), CCExtractor would:
This made it impossible to extract Electronic Program Guide data from ATSC streams, despite the -xmltv parameter being specified.
Root causes identified:
- TS_PMT_MAP_SIZE) were never output to XMLTV
- CHECK_OFFSET macro) caused parser failures and potential buffer overruns
Solution
Core Fixes
Fixed EPG output logic (EPG_output() function)
- nb_program value
Fixed critical buffer boundary check (CHECK_OFFSET macro)
- < to > in boundary validation
- if (offset + val < offset_end) (incorrect - allowed overruns)
- if (offset + (val) > offset_end) (correct - prevents overruns)
Extended ATSC table support (EPG_parse_table() function)
New Features
Implemented ATSC ETT (Extended Text Table) parsing
- EPG_ATSC_decode_ETT() function to parse ETT table structures
- EPG_ATSC_decode_ETT_text() to extract multiple string format extended descriptions
- <desc> tags in XMLTV output with detailed program information
Enhanced ATSC multiple_string decoder (EPG_ATSC_decode_multiple_string())
- event_name (title), second segment → text (subtitle/description)
Improved XMLTV output formatting
- <desc> tags (correct XMLTV placement)
Testing
Tested with sample files provided by @TPeterson94070 in issue #1759:
- channel5FullTS.ts - 5 channels with VCT/TVCT tables
- ch12FullTS.ts - Additional ATSC test case
- ch29FullTS.ts - 5 programs with extended EIT data (Nov 26-28, 2025)
Before this PR:
./ccextractor channel5FullTS.ts --xmltv 1
- .srt file generated
After this PR:
./ccextractor channel5FullTS.ts --xmltv 1
- .srt AND .xml files generated successfully
- ts-meta-id values matching EIT event IDs
Sample XMLTV output (after ETT parsing):
Known Limitations
ATSC date/time conversion issues: ATSC date/time conversion occasionally produces incorrect years in some streams (pre-existing behavior).
Channel naming: XMLTV output uses numeric channel IDs (source_id) instead of human-readable names. VCT short_name and major/minor channel numbers are not currently mapped to XMLTV display-name elements.
Orphaned events: Some EIT events may appear under channel="0" when their service_id does not match any VCT-defined program. This occurs with malformed streams or when VCT data is incomplete.
These three accuracy issues mentioned above (incorrect dates, channel naming, orphaned programs) are data quality problems that existed in the codebase previously and are not directly caused by or related to the primary bug fix in this PR.
I believe these should be addressed in follow-up PRs for better separation of concerns. However, if maintainers prefer these issues to be fixed in this PR, I'm happy to include them.
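For anyone picking up the date/time limitation in a follow-up, the reference arithmetic from ATSC A/65 is sketched below. EIT start_time counts GPS seconds since 1980-01-06 00:00:00 UTC, and the System Time Table carries the GPS-to-UTC leap-second offset. This is not a diagnosis of the incorrect years, only a baseline for checking output; the function names are hypothetical and do not refer to existing CCExtractor functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Seconds from the Unix epoch (1970-01-01) to the GPS epoch (1980-01-06). */
#define GPS_EPOCH_UNIX_OFFSET 315964800LL

/* Hypothetical helper: convert an ATSC start_time (GPS seconds) to Unix time.
 * gps_utc_offset is the leap-second count signalled in the STT. */
int64_t atsc_gps_to_unix(uint32_t start_time, uint8_t gps_utc_offset)
{
    return (int64_t)start_time + GPS_EPOCH_UNIX_OFFSET - gps_utc_offset;
}

/* Hypothetical helper: format a Unix time as an XMLTV timestamp,
 * "YYYYMMDDhhmmss +0000". Returns 1 on success, 0 on failure.
 * Note: gmtime() uses a shared static buffer and is not thread-safe. */
int atsc_format_xmltv_time(int64_t unix_time, char *out, size_t out_size)
{
    time_t t = (time_t)unix_time;
    struct tm *utc = gmtime(&t);
    if (utc == NULL)
        return 0;
    return strftime(out, out_size, "%Y%m%d%H%M%S +0000", utc) != 0;
}
```

A start_time of 0 with a zero offset maps to 1980-01-06 00:00:00 UTC, which gives a quick sanity check when comparing generated XMLTV start and stop attributes against the broadcast tables.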