QUESTION: Can't get MCC output to work with known good input #767

Open
opened 2026-01-29 16:53:08 +00:00 by claunia · 0 comments
Owner

Originally created by @PyCoder040 on GitHub (Jun 7, 2023).

CCExtractor detailed version info
Version: 0.94
Git commit: 5b7666965f
Compilation date: 2023-06-07
CEA-708 decoder: Rust
File SHA256: Could not open file

Libraries used by CCExtractor
Tesseract Version: 5.2.0
Leptonica Version: leptonica-1.82.0
libGPAC Version: 1.0.1
zlib: 1.2.11
utf8proc Version: 2.4.0
protobuf-c Version: 1.3.1
libpng Version: 1.6.37
FreeType
libhash
nuklear
libzvbi

In raising this issue, I confirm the following:

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.
  • I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? {notsure}
  • What platform did you use? {Linux}
  • What were the used arguments? ccextractor -debug -in=raw -out=mcc 3.bin -o 4.mcc

When I test my input with the default output format, I get valid captions:

[user@p5810r hyperdeck]# ccextractor -debug -in=raw 3.bin -stdout

CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): 3.bin
[Extract: 1] [Stream mode: McPoodle's raw]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: 3.bin
Analyzing data in McPoodle raw mode
Sending captions to stdout.
1
00:00:03,404 --> 00:00:06,171
 In this lesson, we're going to 
 be talking about finance. And  

2
00:00:06,173 --> 00:00:10,009
    one of the most important   
             aspects            
    of finance is interest.     



Total frames time:	  00:00:00:000  (0 frames at 29.97fps)

Min PTS:				00:00:00:001
Max PTS:				00:00:10:111
Length:				 00:00:10:110
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

But when I try to switch the output format to MCC, I get a fatal error message:

[user@p5810r hyperdeck]# ccextractor -debug -in=raw -out=mcc 3.bin -o 4.mcc 
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): 3.bin
[Extract: 1] [Stream mode: McPoodle's raw]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .mcc] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: 3.bin
Analyzing data in McPoodle raw mode
Output format not supported
Output format not supported


Total frames time:	  00:00:00:000  (0 frames at 29.97fps)

Min PTS:				00:00:00:001
Max PTS:				00:00:10:111
Length:				 00:00:10:110
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

After reviewing the code (ccx_encoders_common.c), I see that this error can be generated when a 608 caption is to be written in MCC format. This leads me to believe that I can output MCC files as long as the input is not treated as 608. I don't see how to declare my payload as non-608, so I think I'm just doing this wrong. Do I need to provide 708 input, or can I tell ccextractor to "convert" to 708?

My input file was created by using subrip2scc.pl to convert an SRT into SCC. Then I used scc2raw.pl to convert the SCC into a McPoodle RAW file. I think that maybe this is creating a standard-definition (608) caption file? Anyway, that's what I fed into ccextractor as shown above.

If I'm unknowingly making a 608-only file, this may be where I'm going wrong: it would mean I need 708 captions to feed into ccextractor to get an MCC file.
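
A minimal sketch of why this pipeline would be expected to yield 608-only data: an SCC file stores CEA-608 byte pairs as four-hex-digit words after each timecode, and each byte carries an odd-parity bit. The sample line and the helper names below are illustrative assumptions, not content taken from 3.bin or from the Perl tools.

```python
import re

# Assumed SCC line layout: "HH:MM:SS:FF<tab>xxxx xxxx ..." (";" marks drop-frame).
SCC_LINE = re.compile(r"^(\d{2}:\d{2}:\d{2}[:;]\d{2})\s+((?:[0-9a-fA-F]{4}\s*)+)$")


def parse_scc_line(line):
    """Return (timecode, [(byte1, byte2), ...]) or None if the line is not caption data."""
    match = SCC_LINE.match(line.strip())
    if match is None:
        return None
    timecode, words = match.group(1), match.group(2).split()
    pairs = [(int(word[:2], 16), int(word[2:], 16)) for word in words]
    return timecode, pairs


def has_odd_parity(byte):
    """CEA-608 bytes are transmitted with odd parity in the top bit."""
    return bin(byte).count("1") % 2 == 1


def strip_parity(byte):
    """Drop the parity bit to get the 7-bit 608 code."""
    return byte & 0x7F


# Hypothetical sample line: two 608 control codes, each sent twice.
sample = "00:00:03:12\t9420 9420 94ae 94ae"
timecode, pairs = parse_scc_line(sample)
for first, second in pairs:
    print(timecode, hex(strip_parity(first)), hex(strip_parity(second)),
          has_odd_parity(first) and has_odd_parity(second))
```

Running `parse_scc_line` over the lines of the intermediate SCC file would show whether it contains anything other than such byte pairs; if not, the RAW file derived from it carries 608 data only.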

Is it possible for someone to post a working example of a command line that produces a valid MCC output file?

P.S. I have also tried this with the 0.94 release version. No difference.


As a second test, I took an entirely different approach. I injected an SRT into an MP4 using ffmpeg. VLC will show these captions all day long. But when the file is fed to ccextractor, the captions are shown if -stdout is specified, whereas adding the MCC output produces a zero-length MCC file.

ffmpeg -i HyperDeck_0002.mp4 -t 15 -i 0.srt -t 15 -c:v libx264 -profile:v main -b:v 53000k -pix_fmt yuv420p -c:s mov_text -metadata:s:s:0 language=eng HyperDeck_0004srt.mp4


[user@p5810r hyperdeck]# ccextractor -debug HyperDeck_0004srt.mp4 -out=mcc -o HyperDeck_0004srt.mcc
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: Files (1): HyperDeck_0004srt.mp4
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: Yes] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .mcc] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: HyperDeck_0004srt.mp4
Detected MP4 box with name: ftyp
Detected MP4 box with name: free
Detected MP4 box with name: mdat
File seems to be a MP4
Analyzing data with GPAC (MP4 library)
Opening 'HyperDeck_0004srt.mp4': ok
Track 1, type=vide subtype=avc1
Track 2, type=soun subtype=MPEG
Track 3, type=sbtl subtype=tx3g
MP4: found 3 tracks: 1 avc and 1 cc
Processing track 1, type=vide subtype=avc1
Processing track 2, type=soun subtype=MPEG
Processing track 3, type=sbtl subtype=tx3g
100%  |  00:17
Closing media: ok
Found 1 AVC track(s). Found 1 CC track(s).


Total frames time:        00:00:00:000  (0 frames at 29.97fps)

Min PTS:                                00:00:00:000
Max PTS:                                00:00:17:720
Length:                          00:00:17:720
Done, processing time = 0 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues
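
The track listing in the log above may hint at why the MCC file is empty: `-c:s mov_text` makes ffmpeg write a tx3g timed-text track, which holds plain text samples rather than CEA-608/708 caption data. Whether ccextractor's MCC writer can be fed from such a track is not confirmed here. The sketch below only classifies the `Track N, type=... subtype=...` lines copied from the log; the function name and the subtype set are assumptions made for illustration.

```python
import re

# Matches the "Track N, type=... subtype=..." lines printed in the log above.
TRACK_LINE = re.compile(r"^Track (\d+), type=(\w+) subtype=(\w+)$")

# Assumption for this sketch: tx3g (3GPP timed text, ffmpeg's mov_text) is the
# only subtype treated as a plain-text subtitle track.
TEXT_SUBTITLE_SUBTYPES = {"tx3g"}


def classify_tracks(log_text):
    """Return (number, type, subtype, is_text_subtitle) for each track line."""
    tracks = []
    for line in log_text.splitlines():
        match = TRACK_LINE.match(line.strip())
        if match:
            number, kind, subtype = int(match.group(1)), match.group(2), match.group(3)
            tracks.append((number, kind, subtype, subtype in TEXT_SUBTITLE_SUBTYPES))
    return tracks


# Track lines copied from the ccextractor output above.
log = """\
Track 1, type=vide subtype=avc1
Track 2, type=soun subtype=MPEG
Track 3, type=sbtl subtype=tx3g
"""

for number, kind, subtype, is_text in classify_tracks(log):
    note = "timed text, no 608/708 byte pairs" if is_text else "not a text subtitle track"
    print(f"track {number}: {kind}/{subtype} -> {note}")
```

If the only caption-like track is flagged as timed text, the file likely contains no raw 608/708 data for an MCC writer to store.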

Reference: starred/ccextractor#767