[BUG] or [QUESTION] : Hardsubx didn't extract burn-in subs exactly as expected #680

Closed
opened 2026-01-29 16:50:51 +00:00 by claunia · 4 comments

Originally created by @brebetez on GitHub (Feb 1, 2022).

Originally assigned to: @shashwat1002 on GitHub.

Please prefix your issue with one of the following: [BUG], [QUESTION].

CCExtractor version: 0.94
CCExtractor detailed version info
Git commit: 290e2f10f9e681c0ba1d53df5ba29166622b0a20
Compilation date: 2021-12-27
Libraries used by CCExtractor
Tesseract Version: 4.1.1
Leptonica Version: leptonica-1.79.0
libGPAC Version: 1.0.1
zlib: 1.2.11
utf8proc Version: 2.4.0
protobuf-c Version: 1.3.1
libpng Version: 1.6.37
FreeType
libhash
nuklear
libzvbi

In raising this issue, I confirm the following:

  • I have read and understood the contributors guide.
  • I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • I have checked that the issue I'm posting isn't already reported.
  • I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues
  • I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • I have used the latest available version of CCExtractor to verify this issue exists.
  • I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.

Necessary information

  • Is this a regression (i.e. did it work before)? {DON'T KNOW}
  • What platform did you use? {Linux}
  • What were the used arguments? myVideo.mp4 -ocrlang fra -hardsubx -ocr_mode frame -subcolor white -min_sub_duration 0.01 -detect_italics -whiteness_thresh 97 -conf_thresh 75

Video links

  • {You can ask for an example if needed. I ran the tool on 24 videos with burned-in subtitles in several different styles, and the same issues appeared every time.}

Additional information

{I have several issues:

  • Subtitle timing: the extracted subtitles are not in sync with the burned-in subtitles. Sometimes a subtitle starts slightly before the burned-in subtitle appears (anywhere from a few frames to about 2 seconds early). I tried changing the -ocr_mode parameter, but the offset stayed the same.
  • Subtitle duration: every extracted subtitle has a fixed duration of either 1 or 2 seconds, never anything shorter, longer, or in between. As a result, the extracted subtitle disappears before the burned-in subtitle does.
  • Two-line burned-in subtitles: in most cases only the second line is recognized. Otherwise the tool creates 2 or 3 subtitles, each containing a bit more of the text, but only ever on a single line, not laid out as in the burned-in subtitles.

I don't know whether this is a bug or whether I am using the parameters incorrectly, but as I said, I tried many different combinations and always got the same result. That is why I'm opening an issue.

Thanks for your feedback and help.}

claunia added the HardsubX label 2026-01-29 16:50:51 +00:00

@PunitLodha commented on GitHub (Jun 1, 2022):

Please share the videos so that we can look into this issue.


@shashwat1002 commented on GitHub (Jun 2, 2022):

On investigating, at least with the files I have, some subtitles are extracted with a duration of less than a second.
So I guess the second point is not entirely general: either the behaviour has changed since the issue was filed, or it's an artefact of the files used.
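
For anyone who wants to check the duration claim against their own output, here is a minimal sketch that tallies cue durations from an extracted subtitle file. It assumes the output is a standard SubRip (.srt) file with `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines. The function names and the 500 ms bucket size are illustrative choices, not part of CCExtractor.

```python
import re
from collections import Counter

# SubRip timing line: "HH:MM:SS,mmm --> HH:MM:SS,mmm"
# (a "." separator before the milliseconds is tolerated as well)
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def to_ms(hours, minutes, seconds, millis):
    """Convert the four captured timestamp fields to milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def cue_durations_ms(srt_text):
    """Return the duration in milliseconds of every cue found in the text."""
    durations = []
    for line in srt_text.splitlines():
        match = TIMING.search(line)
        if match:
            fields = match.groups()
            durations.append(to_ms(*fields[4:]) - to_ms(*fields[:4]))
    return durations


def duration_histogram(durations_ms, bucket_ms=500):
    """Count cues per duration bucket, keyed by the bucket's lower bound in ms."""
    return Counter((d // bucket_ms) * bucket_ms for d in durations_ms)
```

Calling `duration_histogram(cue_durations_ms(text))` on the contents of the extracted file shows at a glance whether every cue falls into the 1000 ms and 2000 ms buckets, as described in the report, or whether shorter and intermediate durations also occur.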

@brebetez please consider sharing the file you used

cc: @PunitLodha


@cfsmp3 commented on GitHub (Dec 20, 2025):

@brebetez We do need a sample and exact instructions to reproduce the problem.


@cfsmp3 commented on GitHub (Jan 1, 2026):

Closing since we didn't hear back from @brebetez - @brebetez , feel free to reply if you still have problems.


Reference: starred/ccextractor#680