mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-03 21:23:48 +00:00
[PROPOSAL] Validate TTML subtitles in ISO-BMFF #626
Originally created by @donmartin00 on GitHub (Apr 12, 2021).
CCExtractor Version: 0.88
Necessary information
No arguments. We did not attempt to run this type of file through the GUI because we did not see an option for ISO-BMFF.
Additional information
Add a feature to validate TTML subtitles in ISO-BMFF per ISO/IEC 14496-30 (the format used in DVB-DASH and ATSC 3.0).
@bubbaprog commented on GitHub (Jul 24, 2021):
Not sure what method they're using to do it, but I just uploaded a raw ATSC 3.0 broadcast to Twitter and the captions appeared in the tweet, so a method for extracting and converting them already exists, presumably in something avconv/ffmpeg-related, and could possibly be used to develop it here.
@bubbaprog commented on GitHub (Jan 16, 2024):
FWIW, VLC can decode captions in ATSC 3.0 video, and the code can likely be ported over. https://code.videolan.org/videolan/vlc/-/tree/master/modules/codec
@x15sr71 commented on GitHub (Jan 18, 2026):
I've been looking into CCExtractor's MP4/ISO-BMFF handling recently (related to GPAC and packaging work) and wanted to add some context that might be useful here.
TTML subtitle tracks in MP4 use the `stpp` sample entry (defined in ISO/IEC 14496-30). CCExtractor's current MP4 demuxer handles similar subtitle formats like `tx3g` (QuickTime timed text) and `clcp` (CEA-608/708), so the track detection pattern exists. The main complexity is TTML validation: unlike the tools mentioned above (VLC, FFmpeg, avconv), which can rely on larger XML parsing libraries, adding libxml2 or similar to CCExtractor just for TTML validation might not be desirable. A minimal approach (namespace detection, well-formedness checks via string parsing) could work for detection and reporting purposes, but would need careful implementation to avoid false positives. Tools like Bento4 are useful references for understanding `stpp` sample entries at the container level.
I am currently assessing the scope of this to understand what a lightweight implementation would look like, as this seems to require a structural addition rather than a quick patch.
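To make the "minimal approach" above more concrete, here is a rough sketch in C of what detection plus a string-based plausibility check might look like. This is not existing CCExtractor code: the names `is_stpp_sample_entry`, `ttml_looks_plausible`, `tags_balanced`, and `find_bytes` are hypothetical, and the check is only a heuristic. It assumes the sample payload is a single TTML document already in memory, it does not verify that open and close tag names match, and CDATA sections containing angle brackets would confuse it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* TTML namespace URI as defined by the W3C TTML specifications. */
static const char TTML_NS[] = "http://www.w3.org/ns/ttml";

/* Hypothetical helper: bounded substring search over a non-terminated buffer. */
static const char *find_bytes(const char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle);
    if (n == 0 || len < n)
        return NULL;
    for (size_t i = 0; i + n <= len; i++)
        if (memcmp(buf + i, needle, n) == 0)
            return buf + i;
    return NULL;
}

/* An ISO-BMFF box starts with a 32-bit size followed by a 4-byte type.
 * Returns true if the box type is 'stpp' (the XML subtitle sample entry). */
bool is_stpp_sample_entry(const unsigned char *box, size_t len)
{
    return len >= 8 && memcmp(box + 4, "stpp", 4) == 0;
}

/* Naive well-formedness heuristic: counts opening and closing tags and
 * requires them to balance. Tag names are NOT compared. */
static bool tags_balanced(const char *buf, size_t len)
{
    long depth = 0;
    bool seen_element = false;
    size_t i = 0;

    while (i < len) {
        if (buf[i] != '<') {
            i++;
            continue;
        }
        /* Skip comments as a whole, since they may contain '<' or '>'. */
        if (i + 3 < len && memcmp(buf + i, "<!--", 4) == 0) {
            const char *end = find_bytes(buf + i + 4, len - i - 4, "-->");
            if (end == NULL)
                return false; /* unterminated comment */
            i = (size_t)(end - buf) + 3;
            continue;
        }
        /* Find the closing '>' while skipping quoted attribute values. */
        size_t j = i + 1;
        char quote = 0;
        while (j < len && (quote || buf[j] != '>')) {
            if (quote) {
                if (buf[j] == quote)
                    quote = 0;
            } else if (buf[j] == '"' || buf[j] == '\'') {
                quote = buf[j];
            }
            j++;
        }
        if (j >= len)
            return false; /* unterminated tag */

        char first = buf[i + 1];
        if (first == '?' || first == '!') {
            /* XML declaration, processing instruction, or DOCTYPE: ignored. */
        } else if (first == '/') {
            if (--depth < 0)
                return false; /* closing tag without a matching opener */
        } else {
            seen_element = true;
            if (buf[j - 1] != '/') /* not a self-closing element */
                depth++;
        }
        i = j + 1;
    }
    return seen_element && depth == 0;
}

/* Heuristic: the payload mentions the TTML namespace and its tags balance. */
bool ttml_looks_plausible(const char *buf, size_t len)
{
    return find_bytes(buf, len, TTML_NS) != NULL && tags_balanced(buf, len);
}
```

A caller would first check the sample description with `is_stpp_sample_entry`, then run `ttml_looks_plausible` on each sample payload and report failures. Anything beyond this level (matching tag names, attribute validation, timing syntax) is where a real XML parser starts to look more attractive than string parsing.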
Leaving this note here in case the context is helpful.