[BUG] ccextractor skips many captions (CEA-708) #793

Closed
opened 2026-01-29 16:53:37 +00:00 by claunia · 2 comments
Owner

Originally created by @svlobanov on GitHub (Dec 30, 2023).

CCExtractor version: 0.94 376ff831616919e092b53353f3799654d86d0759
CEA-708 decoder: C

Necessary information

  • Is this a regression (i.e. did it work before)? NO
  • What platform did you use? Mac
  • What were the used arguments? ./ccextractor -pn 4 503.ts

Video links

TS file: https://tsduck.io/streams/usa-atsc/503.ts (index page is here: https://tsduck.io/streams/?name=usa-atsc )

Additional information

The result file 503.p4.svc01.srt contains only two captions, but there are many more captions in the CC708 Service1 of program 4. The same issue occurs when I use ccextractor with the CEA-708 Rust decoder.

VLC 4 decodes all CC708 Service1 captions for Program 4 (Charge!). Also, caption-inspector (https://github.com/Comcast/caption-inspector ) decodes many more captions in the CC708 Service1 of program 4. I'm attaching caption-inspector's output for PID=0x61 (97): caption-inspector-output.zip (https://github.com/CCExtractor/ccextractor/files/13799033/caption-inspector-output.zip)

ccextractor CC708 Service1 Program4 output:

1
00:00:01,067 --> 00:00:23,856
     - [Chris A.] Thank you.
    none here is amazing.

2
00:00:25,358 --> 00:01:03,096
This is accomplished easily
with the graduating guide combs

@cfsmp3 commented on GitHub (Dec 14, 2025):

Issue Status: Fixed

I've tested this issue with the current master branch and the problem appears to be resolved.

Test Results

  • Sample file: https://tsduck.io/streams/usa-atsc/503.ts
  • Command: ./ccextractor --program-number 4 503.ts
  • Original report: Only 2 captions extracted
  • Current version: 30 captions extracted (matching caption-inspector output)
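One way to reproduce the caption counts quoted above is to count the timing lines in the generated SRT file. The following is a minimal sketch, not part of CCExtractor; the helper name `count_srt_cues` is hypothetical, and the sample text is the two-caption output from the original report.

```python
import re

# Matches an SRT timing line such as "00:00:01,067 --> 00:00:23,856".
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3} --> \d{2}:\d{2}:\d{2},\d{3}\s*$")


def count_srt_cues(text: str) -> int:
    """Count cues by counting timing lines (hypothetical helper).

    Counting timing lines avoids depending on blank-line separators
    or on the cue index numbers being consecutive.
    """
    return sum(1 for line in text.splitlines() if TIMING.match(line))


# The output attached to the original report.
sample = """1
00:00:01,067 --> 00:00:23,856
     - [Chris A.] Thank you.
    none here is amazing.

2
00:00:25,358 --> 00:01:03,096
This is accomplished easily
with the graduating guide combs
"""

print(count_srt_cues(sample))
```

To check a real run, read the output file instead of the inline sample, for example `count_srt_cues(open("503.p4.svc01.srt", encoding="utf-8").read())`.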

Caption Comparison

The captions now match the expected output from caption-inspector:

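| #   | CCExtractor Output                                            | Caption-Inspector Reference |
|-----|---------------------------------------------------------------|-----------------------------|
| 1   | 00:00:00,601 - "d so every time / you start trimming,"        | Match                       |
| 2   | 00:00:02,403 - "all of those hairs are being..."              | Match                       |
| ... | ...                                                           | ...                         |
| 30  | "This is accomplished easily with the graduating guide combs" | Match                       |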

Likely Fixes

Several commits since the original report (Dec 2023) have improved the CEA-708 decoder:

  • 3c51fb65 - Handle row_count decrease in CEA-708 C decoder
  • d6ccf1bf - Port 708 decoder encoding module to Rust
  • Various window handling and display logic improvements

The issue was likely related to improper handling of the CEA-708 ToggleWindows (TGW) and DeleteWindows (DLW) commands, which caused captions to accumulate instead of being flushed at the proper display boundaries.
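To illustrate the suspected failure mode, here is a toy model, not CCExtractor's decoder. It assumes a simplified event stream in which a `"hide"` event stands in for any command that takes a window off screen (such as TGW or DLW); the `Cue` type, the `decode` function, and the sample events are all hypothetical. When hide events are ignored, text keeps accumulating and ends up in one long cue, which resembles the two long cues in the original report.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    start: float
    end: float
    text: str


def decode(events, flush_on_hide):
    """Turn (time_seconds, kind, payload) events into cues.

    kind is "text" (append to the visible window) or "hide" (window
    leaves the screen). This is a simplification for illustration only.
    """
    cues, buffer, start = [], [], None
    for t, kind, payload in events:
        if kind == "text":
            if start is None:
                start = t  # first text of a new cue sets its start time
            buffer.append(payload)
        elif kind == "hide" and flush_on_hide and buffer:
            # Correct behaviour in this model: emit the cue at the boundary.
            cues.append(Cue(start, t, " ".join(buffer)))
            buffer, start = [], None
    if buffer:
        # Whatever is left is emitted at the end of the stream.
        cues.append(Cue(start, events[-1][0], " ".join(buffer)))
    return cues


events = [
    (0.0, "text", "first"), (2.0, "hide", None),
    (2.5, "text", "second"), (4.0, "hide", None),
    (4.5, "text", "third"), (6.0, "hide", None),
]

print(len(decode(events, flush_on_hide=True)))   # cues when boundaries are honoured
print(len(decode(events, flush_on_hide=False)))  # cues when boundaries are ignored
```

In this model, honouring the hide events yields one cue per caption, while ignoring them merges everything into a single cue that spans the whole stream.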

This issue can be closed as fixed.

🤖 Generated with Claude Code


@cfsmp3 commented on GitHub (Dec 14, 2025):

Closing as fixed in master. The CEA-708 decoder improvements since v0.94 now correctly extract all 30 captions from the test file.


Reference: starred/ccextractor#793