Mirror of https://github.com/CCExtractor/ccextractor.git (synced 2026-02-03 21:23:48 +00:00)
[PR #759] [MERGED] [IMPROVEMENT] Adding grayscale conversion for better OCR #1572
📋 Pull Request Information
Original PR: https://github.com/CCExtractor/ccextractor/pull/759
Author: @Abhinav95
Created: 7/21/2017
Status: ✅ Merged
Merged: 7/21/2017
Merged by: @cfsmp3
Base: master ← Head: master
📝 Commits (1)
b1cc95d Adding grayscale conversion for better OCR
📊 Changes
1 file changed (+8 additions, -3 deletions)
📝 src/lib_ccx/ocr.c (+8 -3)
📄 Description
When DVB subtitle bitmaps had transparent backgrounds, Tesseract OCR sometimes failed to recognize the text accurately and produced nonsensical output. This PR fixes that by first converting the Leptonica pix passed to Tesseract to grayscale, which avoids the problems caused by the transparent elements.
This is a well-documented problem in the Tesseract repository.
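To illustrate the idea of the conversion described above, here is a minimal plain C sketch that reduces RGBA pixels to 8-bit grayscale and discards the alpha channel. It is not the code from this PR and does not use Leptonica: the function names `rgb_to_gray` and `rgba_to_gray8`, the R, G, B, A byte order, and the integer luma weights (77/150/29) are all assumptions chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from ocr.c): integer luma approximation.
 * The weights 77 + 150 + 29 sum to 256, so the result fits in 8 bits. */
uint8_t rgb_to_gray(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint8_t)((77u * r + 150u * g + 29u * b) >> 8);
}

/* Hypothetical helper: convert npixels RGBA pixels (assumed byte order
 * R, G, B, A) into one gray byte per pixel. The alpha byte is ignored,
 * so the output image carries no transparency at all. */
void rgba_to_gray8(const uint8_t *rgba, uint8_t *gray, size_t npixels)
{
    assert(rgba != NULL && gray != NULL);
    for (size_t i = 0; i < npixels; i++) {
        const uint8_t *p = rgba + 4 * i;
        gray[i] = rgb_to_gray(p[0], p[1], p[2]); /* p[3] (alpha) dropped */
    }
}
```

The output buffer has a single channel, so whatever consumes it no longer sees transparent pixels. In the actual change the equivalent step is performed on the Leptonica pix before it is handed to Tesseract.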
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.