mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-04 05:44:53 +00:00
[PR #1773] [FIX]: Restore XMLTV generation for ATSC EIT/VCT streams and correct EIT bounds checks #2503
Original Pull Request: https://github.com/CCExtractor/ccextractor/pull/1773
State: closed
Merged: Yes
Description
Fixes #1759 - This PR restores functional XMLTV generation for ATSC broadcast streams and adds comprehensive EPG parsing capabilities. ATSC streams with EIT/VCT/ETT tables now generate complete XMLTV output with program titles, descriptions, and extended text metadata.
Problem
The `-xmltv` parameter was completely non-functional for ATSC broadcast streams. When processing ATSC transport streams containing valid EPG data (EIT tables), channel information (VCT/TVCT tables), and extended text (ETT tables), CCExtractor would:
This made it impossible to extract Electronic Program Guide data from ATSC streams, despite the `-xmltv` parameter being specified.
Root causes identified:
- ... `TS_PMT_MAP_SIZE`) were never output to XMLTV
- ... `CHECK_OFFSET` macro) caused parser failures and potential buffer overruns
Solution
Core Fixes
Fixed EPG output logic (`EPG_output()` function)
- ... `nb_program` value
Fixed critical buffer boundary check (`CHECK_OFFSET` macro)
- ... `<` to `>` in boundary validation
- `if (offset + val < offset_end)` (incorrect - allowed overruns)
- `if (offset + (val) > offset_end)` (correct - prevents overruns)
Extended ATSC table support (`EPG_parse_table()` function)
New Features
Implemented ATSC ETT (Extended Text Table) parsing
- ... `EPG_ATSC_decode_ETT()` function to parse ETT table structures
- ... `EPG_ATSC_decode_ETT_text()` to extract multiple string format extended descriptions
- ... `<desc>` tags in XMLTV output with detailed program information
Enhanced ATSC multiple_string decoder (`EPG_ATSC_decode_multiple_string()`)
- ... `event_name` (title), second segment → `text` (subtitle/description)
Improved XMLTV output formatting
- ... `<desc>` tags (correct XMLTV placement)
Testing
Tested with sample files provided by @TPeterson94070 in issue #1759:
- `channel5FullTS.ts` - 5 channels with VCT/TVCT tables
- `ch12FullTS.ts` - Additional ATSC test case
- `ch29FullTS.ts` - 5 programs with extended EIT data (Nov 26-28, 2025)
Before this PR:
- `./ccextractor channel5FullTS.ts --xmltv 1`
- ... `.srt` file generated
After this PR:
- `./ccextractor channel5FullTS.ts --xmltv 1`
- ... `.srt` AND `.xml` files generated successfully
- ... `ts-meta-id` values matching EIT event IDs
Sample XMLTV output (after ETT parsing):
Known Limitations
ATSC date/time conversion issues: ATSC date/time conversion occasionally produces incorrect years in some streams (pre-existing behavior).
Channel naming: XMLTV output uses numeric channel IDs (source_id) instead of human-readable names. VCT short_name and major/minor channel numbers are not currently mapped to XMLTV display-name elements.
Orphaned events: Some EIT events may appear under channel="0" when their service_id does not match any VCT-defined program. This occurs with malformed streams or when VCT data is incomplete.
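For context on the date/time limitation: ATSC PSIP carries event start times as a count of GPS seconds since 1980-01-06 00:00:00 UTC, with the GPS-to-UTC leap-second offset signalled separately. The sketch below only shows the expected arithmetic; the helper name `atsc_to_unix` is hypothetical, it is not CCExtractor's code, and it is not a diagnosis of the incorrect-year behavior.

```c
#include <stdint.h>

/* Unix timestamp of the GPS epoch, 1980-01-06 00:00:00 UTC. */
#define GPS_EPOCH_UNIX 315964800LL

/* Hypothetical helper (not part of CCExtractor): converts an ATSC time
 * expressed in GPS seconds to Unix time. gps_utc_offset is the leap-second
 * offset between GPS and UTC as signalled in the stream. */
static int64_t atsc_to_unix(uint32_t gps_seconds, uint8_t gps_utc_offset)
{
    /* Widen to 64 bits before adding so the sum cannot wrap. */
    return (int64_t)gps_seconds + GPS_EPOCH_UNIX - (int64_t)gps_utc_offset;
}
```

Doing the addition in a 64-bit type avoids wraparound when the 32-bit GPS count is combined with the epoch constant.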
The three issues above (incorrect dates, channel naming, orphaned events) are data quality problems that predate this PR and are not caused by the primary bug fix.
I believe these should be addressed in follow-up PRs for better separation of concerns. However, if maintainers prefer these issues to be fixed in this PR, I'm happy to include them.
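As an illustration of the boundary check corrected under Core Fixes, here is a minimal standalone sketch. It is not the actual `CHECK_OFFSET` macro; `span_fits` and `read_u16` are hypothetical names used only to show the corrected condition, where a read is rejected when `offset + val > offset_end`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the CHECK_OFFSET idea (not CCExtractor code):
 * returns 1 when reading `len` bytes at `offset` stays inside the buffer
 * that ends at `offset_end`, and 0 otherwise. */
static int span_fits(const uint8_t *offset, const uint8_t *offset_end, size_t len)
{
    if (offset > offset_end)
        return 0;
    /* Equivalent to rejecting offset + len > offset_end, written as a
     * subtraction so no pointer past the buffer is ever formed. */
    return len <= (size_t)(offset_end - offset);
}

/* Hypothetical reader: fetches a big-endian 16-bit field at *pos and
 * advances the cursor, or returns -1 if the field would be truncated. */
static int read_u16(const uint8_t **pos, const uint8_t *end, uint16_t *out)
{
    if (!span_fits(*pos, end, 2))
        return -1;
    *out = (uint16_t)(((*pos)[0] << 8) | (*pos)[1]);
    *pos += 2;
    return 0;
}
```

With the inverted comparison, a read that fits would be rejected and a read that overruns would be allowed, which matches the parser failures and potential overruns described above. A read that ends exactly at `offset_end` is accepted.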