mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-15 21:23:10 +00:00
[QUESTION] Instructions for using ccextractor with Tesseract 4 #499
Originally created by @rboy1 on GitHub (Jun 24, 2019).
Please prefix your issue with one of the following: [QUESTION].
CCExtractor version (using the --version parameter preferably) : 0.88
My familiarity with the project is as follows (check one, eg [X] - and delete unchecked ones):
Necessary information
What are the instructions for using CCExtractor 0.88 with Tesseract 4? Does it mean Tesseract 3.04 won't work anymore?
The docs page doesn't seem to cover this. Can it be updated to include that information, please?
https://github.com/CCExtractor/ccextractor/blob/master/docs/OCR.md
@soulspark666 commented on GitHub (Jun 25, 2019):
According to https://github.com/UB-Mannheim/tesseract/wiki, Tesseract 5.0 exists. I think updating ccextractor to build against 5.0 and then updating OCR.md would be a better option.
@rboy1 commented on GitHub (Jun 26, 2019):
In the meantime, can someone confirm whether 0.88 works with Tesseract v3, only v4, or both, and whether there is any difference in stability or quality between the two?
@rboy1 commented on GitHub (Sep 26, 2019):
Any thoughts, anyone?
@NilsIrl commented on GitHub (Jan 2, 2020):
I can confirm it works with both v3 and v4.
However, I've had terrible results with v4.
@NilsIrl commented on GitHub (Jan 2, 2020):
What I did was install tesseract4 instead of tesseract3 and then build ccextractor. Everything worked without a problem.
There is already some code in ccextractor related to tesseract4:
3a1815163f/src/lib_ccx/ocr.c (L202)
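For anyone who wants to see what version-dependent handling can look like, here is a minimal sketch, not the code from ocr.c. It assumes the version string comes from `TessVersion()` in the Tesseract C API (`tesseract/capi.h`), which returns something like "4.1.1". The helper name `tess_major_version` is hypothetical and is not part of ccextractor or Tesseract.

```c
#include <stdlib.h>

/* Hypothetical helper (not from ccextractor): extract the major version
 * from a Tesseract-style version string such as "4.1.1" or "3.04".
 * Returns -1 if the string is NULL, does not begin with a number,
 * or begins with a negative number. */
int tess_major_version(const char *version)
{
    char *end = NULL;
    long major;

    if (version == NULL)
        return -1;

    /* strtol stops at the first non-digit, e.g. the '.' in "4.1.1". */
    major = strtol(version, &end, 10);
    if (end == version || major < 0)
        return -1;

    return (int)major;
}
```

In a program linked against Tesseract, this could be called as `tess_major_version(TessVersion())`, with the result used to choose between a v3 and a v4 code path.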