[BUG] Burned-in subtitle extraction fails (process killed) when italic subtitles are detected #635

Open
opened 2026-01-29 16:49:41 +00:00 by claunia · 0 comments
Owner

Originally created by @Hayholten on GitHub (Jun 8, 2021).

Hello everyone!

Hope you are well ;)

CCExtractor version:

CCExtractor detailed version info
Version: 0.87
Git commit: Unknown
Compilation date: 2018-11-30
File SHA256: Could not open file
Libraries used by CCExtractor
Tesseract Version: 4.1.1
Leptonica Version: leptonica-1.79.0
libGPAC Version: 0.7.2-DEV
zlib: 1.2.11
utf8proc Version: 2.5.0
protobuf-c Version: 1.1.1
libpng Version: 1.6.34
FreeType
libhash
nuklear
libzvbi

In raising this issue, I confirm the following:

  • [ OK ] I have read and understood the contributors guide.
  • [ NOT SURE ] I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • [ OK ] I have checked that the issue I'm posting isn't already reported.
  • [ NOT SURE ] I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
  • [ OK ] I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • [ OK ] I have used the latest available version of CCExtractor to verify this issue exists.
  • [ OK ] I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? {NO}
  • What platform did you use? {Linux}
  • What arguments were used?

ccextractor input.mp4 -ocrlang fra -out=webvtt -nobom -utf8 --nofontcolor -hardsubx -subcolor white -detect_italics -conf_thresh 60

Video links

https://drive.google.com/file/d/1ef-iiBtucK7qZQGzMuKMPsPn7ZBJp203/view?usp=sharing

Additional information

So here's the problem.

Everything goes well with CCExtractor until it encounters italic subtitles. In my example, this happens at 06:41: every time, regardless of the settings, the process appears to be killed at that point. Here is the log from my command:

HardsubX (Hard Subtitle Extractor) - Burned-in subtitle extraction subsystem
Input : ./Shadowz/WIP/Lifechanger.mp4
Subtitle Color : White
OCR Mode : Word-wise
OCR Confidence Threshold : 60.00
OCR Luminance Threshold : 95.00 (Default)
OCR Italic Detection : Off
Minimum subtitle duration : 0.5 seconds (Default)
FFMpeg Media Information:-
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from './Shadowz/WIP/Lifechanger.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.45.100
Duration: 01:23:55.68, start: 0.000000, bitrate: 3335 kb/s
Stream #0:0(und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709), 1920x800, 2689 kb/s, SAR 1:1 DAR 12:5, 23.98 fps, 23.98 tbr, 16k tbn, 47.95 tbc (default)
Metadata:
handler_name : VideoHandler
Stream #0:1(eng): Audio: eac3 (ec-3 / 0x332D6365), 48000 Hz, 5.1(side), fltp, 640 kb/s (default)
Metadata:
handler_name : SoundHandler
Side data:
audio service type: main
Beginning burned-in subtitle detection...
7% | 06:41Killed

Can you help me with this problem? Note that the issue still occurs even when I don't request italic detection.
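A minimal sketch for narrowing this down, assuming the trailing `Killed` in the log means the process received a signal (on Linux this is often SIGKILL, for instance from the kernel's out-of-memory killer, but nothing in the log confirms that). The helper names `describe_exit` and `run_and_describe` are hypothetical and not part of CCExtractor; the script only wraps the same command line and reports how the process ended.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a readable description.

    On POSIX, subprocess reports termination by signal N as return code -N.
    """
    if returncode == 0:
        return "exited normally"
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number not known to this platform's signal module.
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    return f"exited with status {returncode}"


def run_and_describe(cmd: list[str]) -> str:
    """Run cmd, let its output pass through, and describe how it ended."""
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_exit.py ccextractor input.mp4 -hardsubx ...
    print(run_and_describe(sys.argv[1:]))
```

If this reports SIGKILL, the kernel log (for example via `dmesg`) may show whether the out-of-memory killer was responsible; that would point at memory growth during OCR rather than at italic detection specifically, but this is only a hypothesis.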

Thank you very much for your feedback.


Reference: starred/ccextractor#635