mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-03 21:23:48 +00:00
[PR #1915] [MERGED] fix(mp4): Fix 200ms timing offset for MOV/MP4 caption extraction #2707
📋 Pull Request Information
Original PR: https://github.com/CCExtractor/ccextractor/pull/1915
Author: @cfsmp3
Created: 12/27/2025
Status: ✅ Merged
Merged: 12/28/2025
Merged by: @cfsmp3
Base: master ← Head: fix/mp4-mov-200ms-timing-offset
📝 Commits (1)
c061026 fix(mp4): Fix 200ms timing offset for MOV/MP4 caption extraction
📊 Changes
1 file changed (+19 additions, -0 deletions)
📝 src/lib_ccx/mp4.c (+19 -0)
📄 Description
Summary
This PR fixes a 200ms timing offset that was affecting caption extraction from MOV/MP4 files, causing sample platform tests 226-230 to fail.
Before fix:
After fix:
Root Cause Analysis
The Bug
The in_bufferdatatype variable was never set in mp4.c for MP4/MOV container tracks. It remained at its default value CCX_UNKNOWN, which caused incorrect behavior in the caption processing pipeline.
How in_bufferdatatype Affects Timing
In src/lib_ccx/ccx_decoders_common.c, the do_cb() function has a check (lines 150-154):
This check is designed to skip cb_field counter increments for container formats. However, with in_bufferdatatype == CCX_UNKNOWN, this condition evaluated to true, causing cb_field1 to be incremented for every CEA-608 caption block.
The Timing Math
When get_fts() is called to timestamp captions, it calculates:
The cb_field1 * 1001/30 term adds ~33.37ms per caption block. With a typical roll-up caption having ~6 blocks per frame:
This explains the consistent 200ms timing offset observed in MOV/MP4 files.
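As a rough illustration of the arithmetic, the sketch below models how the counter and the 1001/30 term combine. It is not the code from ccx_decoders_common.c: the enum, counts_cb_field() and cb_field_offset_ms() are hypothetical stand-ins, and the exact condition inside do_cb() is assumed from the description here (increment unless the buffer type is one of the container-style values).

```c
/* Hypothetical stand-ins for CCExtractor's buffer data types. Only the
 * names CCX_UNKNOWN, CCX_H264 and CCX_PES are taken from this PR. */
enum bufferdatatype { CCX_UNKNOWN, CCX_PES, CCX_H264 };

/* Assumed shape of the do_cb() check: the field counter advances unless
 * the data comes from a container-style buffer. The real condition in
 * ccx_decoders_common.c may differ. */
int counts_cb_field(enum bufferdatatype type)
{
    return type != CCX_PES && type != CCX_H264;
}

/* Extra milliseconds contributed by the cb_field1 * 1001 / 30 term after
 * `blocks` CEA-608 caption blocks of a single frame have been processed. */
long cb_field_offset_ms(enum bufferdatatype type, int blocks)
{
    long cb_field1 = 0;

    for (int i = 0; i < blocks; i++) {
        if (counts_cb_field(type))
            cb_field1++; /* one increment per caption block */
    }
    /* Integer math: 6 * 1001 / 30 == 200 */
    return cb_field1 * 1001 / 30;
}
```

Under these assumptions, cb_field_offset_ms(CCX_UNKNOWN, 6) gives 200, matching the observed offset, while cb_field_offset_ms(CCX_H264, 6) and cb_field_offset_ms(CCX_PES, 6) give 0 because the counter never advances.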
Why Container Formats Don't Need cb_field
For container formats (MP4, MOV, MKV, TS with PES), all caption data for a video frame is bundled together and associated with the frame's PTS. The caption blocks within a frame don't have sub-frame timing - they all belong to the same presentation timestamp.
In contrast, raw/elementary streams may have caption data arriving at field rate (59.94 Hz for NTSC), where each CEA-608 byte pair has its own timing. The cb_field offset accounts for this sub-frame timing.
The Fix
Set in_bufferdatatype correctly in the three MP4 track processing functions:
process_avc_track(): CCX_H264
process_hevc_track(): CCX_H264
process_xdvb_track(): CCX_PES
Verification
Test Sample
/home/cfsmp3/media_samples/completed/1974a299f0502fc8199dabcaadb20e422e79df45972e554d58d1d025ef7d0686.mov
Before Fix (ttxt output)
After Fix (ttxt output)
FFmpeg Reference
The fix aligns CCExtractor's output exactly with FFmpeg's authoritative timing.
Regression Check
Verified that TS files still work correctly with the same timing (no regressions introduced).
Test Plan
Files Changed
src/lib_ccx/mp4.c - Set in_bufferdatatype in 3 track processing functions
🤖 Generated with Claude Code
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.