[BUG] Captions fail to extract on HEVC video stream #829

Closed
opened 2026-01-29 16:54:36 +00:00 by claunia · 6 comments

Originally created by @shirt-dev on GitHub (Apr 15, 2025).

CCExtractor version: 0.94

In raising this issue, I confirm the following:

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.
  • I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? NO
  • What platform did you use? Windows
  • What were the used arguments? No arguments

Video links

  • https://drive.google.com/file/d/1vmt8yBWGohL45LCRTCZNChxEINDuSVY8/view?usp=sharing

Additional information

HEVC video files with EIA-608 captions fail to extract; however, MPV and VLC display the captions.

Opening file: caption_test.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Done, processing time = 0 seconds

No captions were found in input.
Issues? Open a ticket here

MediaInfo

General
ID                                       : 1 (0x1)
Complete name                            : caption_test.ts
Format                                   : MPEG-TS
File size                                : 65.2 MiB
Duration                                 : 4 min 59 s
Overall bit rate mode                    : Variable
Overall bit rate                         : 1 827 kb/s
Frame rate                               : 59.940 FPS
Law rating                               : C8+

Video
ID                                       : 257 (0x101)
Menu ID                                  : 1 (0x1)
Format                                   : HEVC
Format/Info                              : High Efficiency Video Coding
Format profile                           : Main@L4.1@Main
Codec ID                                 : 36
Duration                                 : 4 min 59 s
Width                                    : 1 280 pixels
Height                                   : 720 pixels
Display aspect ratio                     : 16:9
Active Format Description                : Letterbox 16:9 image
Frame rate                               : 59.940 (60000/1001) FPS
Color space                              : YUV
Chroma subsampling                       : 4:2:0 (Type 0)
Bit depth                                : 8 bits
Color range                              : Limited
Color primaries                          : BT.709
Transfer characteristics                 : BT.709
Matrix coefficients                      : BT.709

Audio #1
ID                                       : 256 (0x100)
Menu ID                                  : 1 (0x1)
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Format version                           : Version 4
Muxing mode                              : ADTS
Codec ID                                 : 15-2
Duration                                 : 4 min 59 s
Bit rate mode                            : Variable
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Delay relative to video                  : -820 ms
Language                                 : English

Audio #2
ID                                       : 258 (0x102)
Menu ID                                  : 1 (0x1)
Format                                   : AAC LC
Format/Info                              : Advanced Audio Codec Low Complexity
Format version                           : Version 4
Muxing mode                              : ADTS
Codec ID                                 : 15-2
Duration                                 : 4 min 59 s
Bit rate mode                            : Variable
Channel(s)                               : 2 channels
Channel layout                           : L R
Sampling rate                            : 48.0 kHz
Frame rate                               : 46.875 FPS (1024 SPF)
Compression mode                         : Lossy
Delay relative to video                  : -820 ms

Text #1
ID                                       : 257 (0x101)-CC1
Menu ID                                  : 1 (0x1)
Format                                   : EIA-608
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 4 min 59 s
Duration of the visible content          : 4 min 54 s
Start time                               : 4 s 875 ms
End time                                 : 4 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)
Count of frames before first event       : 161
Type of the first event                  : PopOn

Text #2
ID                                       : 257 (0x101)-1
Menu ID                                  : 1 (0x1)
Format                                   : EIA-708
Muxing mode                              : SCTE 128 / DTVCC Transport
Muxing mode, more info                   : Muxed in Video #1
Duration                                 : 4 min 59 s
Bit rate mode                            : Constant
Stream size                              : 0.00 Byte (0%)

Menu
ID                                       : 4096 (0x1000)
Menu ID                                  : 1 (0x1)
Format                                   : AAC / HEVC / AAC
Duration                                 : 4 min 59 s
List                                     : 256 (0x100) (AAC, English) / 257 (0x101) (HEVC) / 258 (0x102) (AAC)
Language                                 : English
Service name                             : Service01
Service provider                         : FFmpeg
Service type                             : digital television
Law rating                               : C8+

@VivianVRodrigues commented on GitHub (Aug 4, 2025):

I tried to extract the captions on Linux, and yes, CCExtractor does not recognize them in the HEVC stream. After converting the HEVC video to H.264 with ffmpeg, extraction was successful, so the problem is in CCExtractor's handling of HEVC.
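
The workaround described above could look roughly like the following sketch. It assumes an ffmpeg build with libx264; `-a53cc 1` is the libx264 encoder option for carrying A53 closed captions (believed to be enabled by default, spelled out here only for clarity). The helper names `run` and `extract_via_h264` and the output file names are hypothetical, not something from the thread.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: re-encode the HEVC video to H.264, then extract captions.
extract_via_h264() {
    src=$1                      # input file with HEVC video and embedded captions
    tmp=${src%.*}.h264.ts       # intermediate H.264 transport stream
    # Re-encode video only; audio is copied unchanged.
    run ffmpeg -i "$src" -c:v libx264 -a53cc 1 -c:a copy "$tmp" || return 1
    # Extract captions from the re-encoded file into an SRT file.
    run ccextractor "$tmp" -o "${src%.*}.srt"
}
```

Running `DRY_RUN=1 extract_via_h264 caption_test.ts` in the same shell prints the two commands without executing them, which is a cheap way to check the file names before a long re-encode.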


@VivianVRodrigues commented on GitHub (Aug 4, 2025):

So is anyone working on it, or is this problem solved but not merged yet?


@shirt-dev commented on GitHub (Dec 21, 2025):

Appreciate the help with this! MP4 and MKV files are still failing.
MP4: https://drive.google.com/file/d/1w36ic-gbLAc6_fjMB5TDpBV8Pcs40bGC/view?usp=sharing
MKV: https://drive.google.com/file/d/10X8R95TUnFzAZZkP_M2letRJzvz9OPSm/view?usp=sharing


@cfsmp3 commented on GitHub (Dec 21, 2025):

Hi @shirt-dev,

I analyzed the MP4 and MKV files you provided. The issue is that the caption data is not present in these files - it was stripped during the remuxing process.

Analysis with mediainfo:

| File | Container | Created with | Caption tracks |
|------|-----------|--------------|----------------|
| Original TS (from issue) | MPEG-TS | - | EIA-608 + EIA-708 |
| hevc_test.mp4 | Matroska | mkvmerge v95.0 | None |
| hevc_test.mkv | MP4 | FFmpeg Lavf62.6.101 | None |

When you remux an HEVC stream from TS to MP4/MKV using FFmpeg or mkvmerge, the CEA-608/708 captions embedded in the video SEI NAL units are typically not preserved unless you use specific options.

The PR #1852 fix works correctly - it extracts captions from HEVC streams in MPEG-TS containers where the caption data exists. The issue with your new samples is that they simply don't contain caption data to extract.

To verify this yourself:

# Check original TS - shows Text tracks
mediainfo caption_test.ts | grep -A5 "Text"

# Check remuxed files - no Text tracks
mediainfo hevc_test.mp4 | grep -A5 "Text"
mediainfo hevc_test.mkv | grep -A5 "Text"
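
For scripting the same check, a stricter variant might be useful, since `grep "Text"` matches any line that merely contains the word. The sketch below assumes MediaInfo's default text report, where each caption track starts with a section heading such as `Text` or `Text #1` on its own line (as in the report in the issue body). The function name `has_text_track` is hypothetical.

```shell
# Hypothetical helper: reads a MediaInfo text report on stdin and succeeds
# only if it contains a "Text" or "Text #N" section heading on its own line.
has_text_track() {
    grep -Eq '^Text( #[0-9]+)?[[:space:]]*$'
}
```

It could be used as `mediainfo caption_test.ts | has_text_track && echo "caption track listed"`. Note that this only reports what MediaInfo detects; it says nothing about whether CCExtractor can read the same data.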

If you need to preserve captions when remuxing:
Unfortunately, FFmpeg and mkvmerge don't have built-in support for preserving CEA-608/708 captions during remux. You would need to:

  1. Extract captions first with CCExtractor (from the original TS)
  2. Remux the video
  3. Mux the captions as a separate subtitle track
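
The three steps above might be scripted along these lines. This is only a sketch: the helper names `run` and `remux_with_captions` and the intermediate file names are hypothetical, Matroska with an SRT track is just one possible target, and step 1 assumes a CCExtractor build that already includes the HEVC fix mentioned in this thread.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: extract captions, remux A/V, then mux the subtitle track.
remux_with_captions() {
    src=$1            # original transport stream that still carries the captions
    out=$2            # final Matroska file
    base=${out%.*}
    # 1. Extract captions from the original TS into an SRT file.
    run ccextractor "$src" -o "$base.srt" || return 1
    # 2. Remux video and audio without re-encoding.
    run ffmpeg -i "$src" -map 0:v -map 0:a -c copy "$base.video.mkv" || return 1
    # 3. Mux the extracted captions as a separate subtitle track.
    run ffmpeg -i "$base.video.mkv" -i "$base.srt" -map 0 -map 1 -c copy -c:s srt "$out"
}
```

Setting `DRY_RUN=1` before calling `remux_with_captions caption_test.ts out.mkv` prints the three commands instead of running them.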

Is there a specific use case where you need HEVC captions in MP4/MKV format? If you have original MP4/MKV files with embedded captions (not remuxed from TS), please share those and I'll investigate.


@shirt-dev commented on GitHub (Dec 21, 2025):

Appreciate the follow-up on this. The original source files for this actually were MP4, but I no longer have them. When I attempt to play the samples I submitted with VLC or MPV, EIA-608 data is detected and playable though, which is interesting to me.

Image: https://github.com/user-attachments/assets/62e81484-d07b-4383-8966-a3ad59575cf6

@cfsmp3 commented on GitHub (Dec 21, 2025):

> Appreciate the follow-up on this. The original source files for this actually were MP4, but I no longer have them. When I attempt to play the samples I submitted with VLC or MPV, EIA-608 data is detected and playable though, which is interesting to me.

Maybe you're right and we failed to find them :-) Can you open a separate issue with details? (since we closed this already based on what we already implemented)

Reference: starred/ccextractor#829