mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-04-20 13:03:58 +00:00
[PR #1601] [MERGED] Add flag for Page Segmentation Modes control #2314
📋 Pull Request Information
Original PR: https://github.com/CCExtractor/ccextractor/pull/1601
Author: @Neo2SHYAlien
Created: 3/5/2024
Status: ✅ Merged
Merged: 9/3/2024
Merged by: @PunitLodha
Base: master ← Head: master
📝 Commits (10+)
ba09cb4 Add flag for Page Segmentation Modes control
f60841f Merge branch 'master' into master
2121165 feat: add psm for rust parser
1ac6cc7 Merge pull request #1 from prateekmedia/add-psm-rust
47b3757 Merge branch 'master' into master
bc8c86f fix: add psm to options
d5d703c Merge branch 'master' into master
036d34c fix: add default value of psm to 3
b8c24a9 fix: correct type of ocr oem
7222fca fix(rust): use fatal! instead of exit
📊 Changes
9 files changed (+78 additions, -0 deletions)
📝 docs/CHANGES.TXT (+1 -0)
📝 src/lib_ccx/ccx_common_option.c (+1 -0)
📝 src/lib_ccx/ccx_common_option.h (+1 -0)
📝 src/lib_ccx/ocr.c (+3 -0)
📝 src/lib_ccx/params.c (+38 -0)
📝 src/lib_ccx/params_dump.c (+2 -0)
📝 src/rust/lib_ccxr/src/common/options.rs (+3 -0)
📝 src/rust/src/args.rs (+19 -0)
📝 src/rust/src/parser.rs (+10 -0)
📄 Description
In raising this pull request, I confirm the following (please check boxes):
My familiarity with the project is as follows (check one):
I added a flag, -psm, for controlling PSM (Page Segmentation Modes) in Tesseract. The default mode (3) gives me quite bad results. When I use 6, 11, or 12 for Bulgarian, I get much better OCR results. I haven't tested other languages yet, but I expect improvements there as well if another mode is used.
P.S. This PR is a continuation of #1544, which was closed after the rebase 🥲
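The description does not show how the value passed to -psm is checked. Below is a minimal sketch in Rust, assuming the value must be an integer in Tesseract's documented PSM range (0 to 13) and that an omitted flag falls back to the default of 3. `parse_psm`, `DEFAULT_PSM`, and `MAX_PSM` are hypothetical names used for illustration, not the code merged in this PR.

```rust
/// Tesseract's default page segmentation mode (fully automatic, no OSD).
pub const DEFAULT_PSM: u8 = 3;
/// Highest page segmentation mode Tesseract defines (13 = raw line).
pub const MAX_PSM: u8 = 13;

/// Hypothetical helper: turn the optional argument of `-psm` into a
/// validated mode. `None` means the flag was not given on the command line.
pub fn parse_psm(raw: Option<&str>) -> Result<u8, String> {
    match raw {
        // Flag omitted: keep Tesseract's default behaviour.
        None => Ok(DEFAULT_PSM),
        Some(text) => {
            // Parsing as u8 already rejects negative and non-numeric input.
            let value: u8 = text.trim().parse().map_err(|_| {
                format!(
                    "invalid PSM value '{}': expected an integer from 0 to {}",
                    text, MAX_PSM
                )
            })?;
            if value > MAX_PSM {
                Err(format!(
                    "PSM value {} is out of range: expected 0 to {}",
                    value, MAX_PSM
                ))
            } else {
                Ok(value)
            }
        }
    }
}
```

With this sketch, `parse_psm(Some("6"))` yields mode 6 (one of the modes reported above to work better for Bulgarian), while `parse_psm(Some("14"))` returns an error that the caller can report before OCR starts.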
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.