mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-04-22 05:59:48 +00:00
[PR #1925] [MERGED] feat(ocr): Add character blacklist and line-split options for better accuracy #2727
📋 Pull Request Information
Original PR: https://github.com/CCExtractor/ccextractor/pull/1925
Author: @cfsmp3
Created: 12/29/2025
Status: ✅ Merged
Merged: 12/31/2025
Merged by: @cfsmp3
Base: master ← Head: feat/ocr-blacklist-default
📝 Commits (2)
8c586bc feat(ocr): Add character blacklist and line-split options for better accuracy
d28bc4e style: Fix formatting issues in ocr.c and options.rs
📊 Changes
8 files changed (+320 additions, -0 deletions)
📝 src/lib_ccx/ccx_common_option.c (+2 -0)
📝 src/lib_ccx/ccx_common_option.h (+2 -0)
📝 src/lib_ccx/ocr.c (+279 -0)
📝 src/lib_ccx/params.c (+7 -0)
📝 src/rust/lib_ccxr/src/common/options.rs (+6 -0)
📝 src/rust/src/args.rs (+12 -0)
📝 src/rust/src/common.rs (+4 -0)
📝 src/rust/src/parser.rs (+8 -0)
📄 Description
Summary
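- Character blacklist to prevent OCR misrecognitions such as I→|
- --ocr-line-split mode for multi-line subtitle images
New Options
- --no-ocr-blacklist: disables the character blacklist (|, \, `, _, ~)
- --ocr-line-split: enables line-split mode for multi-line subtitle images
Test Results (VOBSUB MKV sample)
[Results table: | errors]
The blacklist completely eliminates pipe character misrecognition, matching subtile-ocr's accuracy.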
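As a rough illustration of how the two options could fit together, here is a minimal Rust sketch. It is not the code from this PR: `OcrOptions`, `parse_ocr_flags`, and `tesseract_blacklist` are hypothetical names, and the defaults (blacklist on, line splitting off) are assumptions inferred from the flag names and the branch name. The only details taken from the description are the two flags and the five blacklisted characters. The sketch also assumes the string would be handed to Tesseract's `tessedit_char_blacklist` variable, which the description does not confirm.

```rust
/// Characters listed in the PR description as blacklisted: | \ ` _ ~
const DEFAULT_OCR_BLACKLIST: &str = "|\\`_~";

/// Hypothetical options struct; the field names are illustrative only.
#[derive(Debug, Clone, PartialEq)]
pub struct OcrOptions {
    pub blacklist_enabled: bool,
    pub line_split: bool,
}

impl Default for OcrOptions {
    fn default() -> Self {
        // Assumption: blacklist is on by default, line splitting is opt-in.
        OcrOptions {
            blacklist_enabled: true,
            line_split: false,
        }
    }
}

/// Parse only the two flags from this PR; every other argument is ignored.
pub fn parse_ocr_flags(args: &[&str]) -> OcrOptions {
    let mut opts = OcrOptions::default();
    for &arg in args {
        match arg {
            "--no-ocr-blacklist" => opts.blacklist_enabled = false,
            "--ocr-line-split" => opts.line_split = true,
            _ => {}
        }
    }
    opts
}

/// Value that would be passed to the OCR engine as its character blacklist
/// (assumed here to be Tesseract's `tessedit_char_blacklist` variable).
/// `None` means no blacklist is applied.
pub fn tesseract_blacklist(opts: &OcrOptions) -> Option<&'static str> {
    if opts.blacklist_enabled {
        Some(DEFAULT_OCR_BLACKLIST)
    } else {
        None
    }
}
```

Under these assumptions, `parse_ocr_flags(&["--no-ocr-blacklist"])` yields options for which `tesseract_blacklist` returns `None`, so the engine is free to emit `|` again.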
Test plan
- Test with --no-ocr-blacklist to verify it can be disabled
- Test the --ocr-line-split option
🤖 Generated with Claude Code
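The description does not say how --ocr-line-split divides a multi-line subtitle image. One common approach, shown here purely as an assumption, is a horizontal projection: scan the bitmap row by row and treat each run of rows containing foreground pixels as one text line. The function name `split_text_lines` and the `Vec<bool>` row representation are hypothetical and are not taken from ocr.c.

```rust
/// Return `(start_row, end_row_exclusive)` for each band of consecutive rows
/// that contain at least one foreground ("ink") pixel.
/// `bitmap[y][x]` is `true` for foreground, `false` for background.
pub fn split_text_lines(bitmap: &[Vec<bool>]) -> Vec<(usize, usize)> {
    let mut bands = Vec::new();
    let mut start: Option<usize> = None;

    for (y, row) in bitmap.iter().enumerate() {
        let has_ink = row.iter().any(|&px| px);
        match (has_ink, start) {
            // First inked row after a gap: a new band begins here.
            (true, None) => start = Some(y),
            // First blank row after a band: close the band.
            (false, Some(s)) => {
                bands.push((s, y));
                start = None;
            }
            _ => {}
        }
    }

    // Close a band that runs to the bottom edge of the image.
    if let Some(s) = start {
        bands.push((s, bitmap.len()));
    }
    bands
}
```

Each returned band could then be cropped and passed to the OCR engine separately. A real implementation would probably also need a minimum band height or gap threshold so that noise and detached diacritics are not treated as lines of their own.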
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.