[BUG] Odd output when inputting an MKV file #657

Closed
opened 2026-01-29 16:50:16 +00:00 by claunia · 4 comments

Originally created by @Southpaw1496 on GitHub (Aug 6, 2021).

CCExtractor version:

CCExtractor detailed version info
	Version: 0.91
	Git commit: Unknown
	Compilation date: 2021-07-26
	File SHA256: Could not open file
Libraries used by CCExtractor
	Tesseract Version: 4.1.1
	Leptonica Version: leptonica-1.81.1
	libGPAC Version: 1.0.1
	zlib: 1.2.11
	utf8proc Version: 2.6.1
	protobuf-c Version: 1.4.0
	libpng Version: 1.6.37
	FreeType 
	libhash
	nuklear
	libzvbi

Necessary information

  • Is this a regression (i.e. did it work before)? No
  • What platform did you use? Mac (with Homebrew)
  • What were the used arguments? None

Video links

  • https://drive.google.com/file/d/1B9ZeUZ-Pv1dUeu0aJP5G1s_8-_SzA_xU/view?usp=sharing (ZIP encrypted, password shared with relevant people).

Additional information

The MKV was ripped from a disc using MakeMKV. I've included the output of running ccextractor [filename] in the Output.zip file below.
Output.zip (https://github.com/CCExtractor/ccextractor/files/6945015/Output.zip)
Please let me know if any more information is required.

@sheharyaar commented on GitHub (Aug 26, 2021):

Hi @Southpaw1496, I would like to look into the issue. Could you please send the password for the archive to sheharyaar48@gmail.com?


@Southpaw1496 commented on GitHub (Jan 3, 2022):

Here is another file, a Bugs Bunny short, unencrypted this time: https://drive.google.com/file/d/1cmntXqJFZGRdoNGLljPBYqFBgeckJlO7/view?usp=sharing

Output for this file:
Output-bugs.zip (https://github.com/CCExtractor/ccextractor/files/7801404/Output-bugs.zip)

@canihavesomecoffee commented on GitHub (Jan 3, 2022):

Bugs file has been made available here: https://sampleplatform.ccextractor.org/sample/178

Logs from output zip above:

CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
Input: bugs.mkv
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[CEA-708: 63 decoders active]
[CEA-708: using charset "none" for all services]
[Timing mode: Auto] [Debug: No] [Buffer input: No]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: UTF-8] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No][Filter profanity: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
[Quantisation-mode: CCExtractor's internal function]

-----------------------------------------------------------------
Opening file: bugs.mkv
File seems to be a Matroska/WebM container
Analyzing data in Matroska mode


Document type: matroska
Timecode scale: 1000000
Muxing app: libmakemkv v1.16.4 (1.3.10/1.5.2) darwin(x64-release)
Writing app: MakeMKV v1.16.4 darwin(x64-release)

Track entry:
    Track number: 1
    UID: 1
    Type: video
    Codec ID: V_MPEG2

Track entry:
    Track number: 2
    UID: 2
    Type: audio
    Codec ID: A_AC3
    Language: eng
    Name: Stereo

Track entry:
    Track number: 3
    UID: 3
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng

Track entry:
    Track number: 4
    UID: 4
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng
 99%  |  06:50
100%  |  06:50
Output file: bugs_eng.(null)
Output file: bugs_eng_1.(null)

Found no AVC track. 

Total frames time:	  00:00:00:000  (0 frames at 29.97fps)
Done, processing time = 1 seconds
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

It creates an empty .srt and two files for the VOBSUB tracks (albeit with a "(null)" extension?), but none of the files has any content.
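
For triage, the track listing in a log like the one above can be scanned to see which tracks are VOBSUB. This is only a rough sketch: `parse_tracks` and `SAMPLE` are hypothetical names, not part of CCExtractor, and the parser assumes the "Track entry:" layout shown in this particular log.

```python
import re

def parse_tracks(log_text):
    """Return one dict per 'Track entry:' block found in the log text."""
    tracks = []
    current = None
    for line in log_text.splitlines():
        stripped = line.strip()
        if stripped == "Track entry:":
            current = {}
            tracks.append(current)
        elif current is not None:
            match = re.fullmatch(r"([A-Za-z ]+): (.+)", stripped)
            if match:
                current[match.group(1)] = match.group(2)
            else:
                # A blank line or unrelated output ends the current block.
                current = None
    return tracks

# Shortened excerpt of the log above (assumed layout).
SAMPLE = """\
Track entry:
    Track number: 1
    UID: 1
    Type: video
    Codec ID: V_MPEG2

Track entry:
    Track number: 3
    UID: 3
    Type: subtitle
    Codec ID: S_VOBSUB
    Language: eng
 99%  |  06:50
Output file: bugs_eng.(null)
"""

tracks = parse_tracks(SAMPLE)
vobsub = [t["Track number"] for t in tracks if t.get("Codec ID") == "S_VOBSUB"]
print(vobsub)
```

On the full log this would report tracks 3 and 4, the two subtitle tracks that end up as the "(null)" output files.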


@PunitLodha commented on GitHub (Jul 4, 2022):

So, the issue here is that we don't support VOBSUB subtitles.
To support them, we need to create two files, .idx and .sub. We currently generate the .idx file (although incorrectly), but no .sub file.

Current .idx file (lines prefixed with + are the entries that are currently missing):

# VobSub index file, v7 (do not modify this line!)
Headers...

+ timestamp: 00:00:01:101, filepos: 000000000
+ timestamp: 00:00:08:708, filepos: 000001000
  • Header: correct
  • timestamp: missing from the file, but the correct value is available (stored in time_start)
  • filepos: missing. We need to get the correct file positions from the .sub file
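
As a rough illustration of the missing entries, one line of the index could be formatted as follows. This is a sketch under assumptions: `format_idx_entry` is a hypothetical helper, `time_start_ms` is assumed to be the `time_start` value in milliseconds, and `filepos` is written as nine hexadecimal digits, which is how VobSub tools commonly read it.

```python
def format_idx_entry(time_start_ms, filepos):
    """Format one .idx line as 'timestamp: HH:MM:SS:mmm, filepos: XXXXXXXXX'."""
    hours, rem = divmod(time_start_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return (
        f"timestamp: {hours:02d}:{minutes:02d}:{seconds:02d}:{millis:03d}, "
        f"filepos: {filepos:09x}"  # assumed: byte offset into the .sub file, in hex
    )

print(format_idx_entry(1101, 0x0))
print(format_idx_entry(8708, 0x1000))
```

With these inputs the two printed lines match the `timestamp`/`filepos` entries in the example above.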

We also need to write correct data to the .sub file.
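
The file positions depend on how the subtitle packets are laid out in the .sub file. A minimal sketch, assuming each subtitle is padded to a 2048-byte (0x800) boundary as other VobSub writers appear to do; `file_positions`, `SECTOR`, and the packet sizes are made up for illustration:

```python
SECTOR = 0x800  # assumed padding unit for each subtitle in the .sub file

def file_positions(packet_sizes, sector=SECTOR):
    """Return the starting byte offset of each subtitle in the .sub file."""
    positions = []
    offset = 0
    for size in packet_sizes:
        positions.append(offset)
        # Round the end of this subtitle up to the next sector boundary.
        offset += -(-size // sector) * sector
    return positions

# Hypothetical sizes in bytes for three subtitles.
for pos in file_positions([3000, 1500, 2048]):
    print(f"filepos: {pos:09x}")
```

Each offset would be recorded while writing the .sub file and then used as the `filepos` value of the matching .idx entry.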

References:

  • https://www.matroska.org/technical/subtitles.html
  • mkvextract (https://mkvtoolnix.download/doc/mkvextract.html)

Reference: starred/ccextractor#657