mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-04 05:44:53 +00:00
OCR status summary - tests not passing, a fix broke something else #643
Originally created by @cfsmp3 on GitHub (Jun 11, 2021).
Originally assigned to: @PunitLodha on GitHub.
Summarizing the situation here so we have all the information handy.
One of the tests has been failing for a while. Specifically, we're getting garbage in some of the subtitle frames (but not all) for one specific sample. The failing test is here:
https://sampleplatform.ccextractor.org/test/3308
We know that the guilty commit is 84a9ea5572, which itself fixed something else, so just reverting it would probably fix this test at the expense of breaking the original sample again.
I spent a bit of time yesterday on it, and it's clearly a problem with the OCR; the input images, however, are correct. Enabling DEBUG_OCR (which writes the massaged images exactly as the OCR engine, tesseract, gets them) shows that the input contains what we expect.
So currently I suspect a problem with the internal state of the OCR engine (possibly we're not reinitializing something between frames, who knows).
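One way to check that suspicion is to OCR every frame twice, once with the long-lived engine and once with a freshly initialized one, and look for the first frame where the two outputs differ. The following is only a minimal sketch of that comparison: `fake_ocr`, `fake_ocr_init`, `fake_ocr_run` and `first_divergent_frame` are hypothetical stand-ins invented for illustration, not ccextractor or tesseract code, and the "garbage after two frames" behaviour is simulated.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an OCR engine that keeps state between calls.
 * This is NOT the tesseract API; it only simulates a state leak. */
typedef struct {
    int frames_seen; /* internal state that survives across frames */
} fake_ocr;

static void fake_ocr_init(fake_ocr *e) { e->frames_seen = 0; }

/* Pretend recognition: copies the expected text, then corrupts the first
 * character once the engine has already processed two frames. */
static void fake_ocr_run(fake_ocr *e, const char *image_text,
                         char *out, size_t out_len)
{
    snprintf(out, out_len, "%s", image_text);
    if (e->frames_seen >= 2 && out[0] != '\0')
        out[0] = '#'; /* simulated garbage caused by stale state */
    e->frames_seen++;
}

/* Returns the index of the first frame where the reused engine and a
 * freshly initialized engine disagree, or -1 if they always agree. */
int first_divergent_frame(const char *const *frames, int n)
{
    fake_ocr reused;
    fake_ocr_init(&reused);
    for (int i = 0; i < n; i++) {
        char from_reused[128], from_fresh[128];
        fake_ocr fresh;
        fake_ocr_init(&fresh); /* brand-new state for every frame */
        fake_ocr_run(&reused, frames[i], from_reused, sizeof from_reused);
        fake_ocr_run(&fresh, frames[i], from_fresh, sizeof from_fresh);
        if (strcmp(from_reused, from_fresh) != 0)
            return i;
    }
    return -1;
}
```

If the real engine showed the same pattern (identical input image, different output depending on whether the engine was reused), that would point at leftover state rather than at the image preprocessing.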
Since we have all samples, the previous code, the new code, etc, I think troubleshooting this should take a reasonable amount of time (and patience).
We want to release 0.89 in the next couple of days, with 0.90 following shortly after. This should be fixed (properly) in one of the two releases.
@cfsmp3 commented on GitHub (Jun 11, 2021):
I'm assigning this to @harrynull (don't know if he's around though, haven't seen him in a while) because he sent that commit, and to @PunitLodha since at some point this code will be rewritten in Rust anyway and Punit is working on preparing things for the Rust work.
@MauryaRitesh commented on GitHub (Dec 4, 2021):
Is the issue still not resolved? I would like to work on this issue (or any other issue).
@cfsmp3 commented on GitHub (Dec 4, 2021):
Not solved, go for it
@cfsmp3 commented on GitHub (Mar 22, 2023):
Tested it just now. Unfortunately, still broken.