[PR #1908] [MERGED] fix(windows): Bundle tessdata for OCR support out of the box #2696

Closed
opened 2026-01-29 17:23:28 +00:00 by claunia · 0 comments

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/1908
Author: @cfsmp3
Created: 12/26/2025
Status: Merged
Merged: 12/26/2025
Merged by: @cfsmp3

Base: master ← Head: fix/issue-1578-bundle-tessdata


📝 Commits (1)

  • 20448bf fix(windows): Bundle tessdata for OCR support out of the box

📊 Changes

3 files changed (+91 additions, -2 deletions)


📝 .github/workflows/release.yml (+8 -0)
📝 src/lib_ccx/ocr.c (+72 -2)
📝 windows/installer.wxs (+11 -0)

📄 Description

Summary

The Windows release was missing the Tesseract OCR runtime data (tessdata files) that the HardSubx feature needs. Users had to install Tesseract OCR manually and set the TESSDATA_PREFIX environment variable.

This PR fixes that by:

  • Adding get_executable_directory() to ocr.c that finds the directory where CCExtractor is installed (works on Windows, Linux, and macOS)
  • Updating probe_tessdata_location() to search for tessdata in the executable directory, enabling bundled tessdata to be found automatically
  • Updating release workflow to download eng.traineddata and osd.traineddata from tesseract-ocr/tessdata_fast during release builds
  • Updating WiX installer to include the tessdata/ directory with the traineddata files
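For readers unfamiliar with the technique in the first bullet, here is a minimal sketch of how an executable's directory can be located on the three platforms. It is not the code from this PR: the names `strip_filename` and `exe_dir_sketch` are hypothetical, and the use of `GetModuleFileNameA`, `_NSGetExecutablePath`, and `/proc/self/exe` is an assumption about one common approach, not a description of what `get_executable_directory()` actually does.

```c
/* Sketch only; not the implementation merged in ocr.c. */
#if !defined(_WIN32) && !defined(__APPLE__) && !defined(_POSIX_C_SOURCE)
#define _POSIX_C_SOURCE 200809L /* for readlink() under strict -std=c99 */
#endif

#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(_WIN32)
#include <windows.h>
#elif defined(__APPLE__)
#include <mach-o/dyld.h>
#include <stdint.h>
#else
#include <unistd.h>
#endif

/* Hypothetical helper: truncate a full path in place to its directory part.
   Returns 0 on success, -1 if the path contains no separator. */
int strip_filename(char *path)
{
    char *sep = strrchr(path, '/');
#if defined(_WIN32)
    char *bs = strrchr(path, '\\');
    if (bs != NULL && (sep == NULL || bs > sep))
        sep = bs;
#endif
    if (sep == NULL)
        return -1;
    if (sep == path)
        sep++; /* keep the root "/" itself */
    *sep = '\0';
    return 0;
}

/* Hypothetical helper: write the directory of the running executable
   into buf. Returns 0 on success, -1 on failure or truncation. */
int exe_dir_sketch(char *buf, size_t size)
{
    assert(buf != NULL && size > 1);
#if defined(_WIN32)
    DWORD n = GetModuleFileNameA(NULL, buf, (DWORD)size);
    if (n == 0 || n >= size)
        return -1; /* call failed or the path was truncated */
#elif defined(__APPLE__)
    uint32_t n = (uint32_t)size;
    if (_NSGetExecutablePath(buf, &n) != 0)
        return -1; /* buffer too small */
#else
    /* Linux: /proc/self/exe is a symlink to the running binary. */
    ssize_t n = readlink("/proc/self/exe", buf, size - 1);
    if (n <= 0 || (size_t)n >= size - 1)
        return -1; /* call failed or the path may be truncated */
    buf[n] = '\0'; /* readlink() does not terminate the string */
#endif
    return strip_filename(buf);
}
```

On macOS the path returned by `_NSGetExecutablePath` may still contain symlinks or `..` components, so a real implementation might want to resolve it before use.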

Now the Windows release includes tessdata files, and CCExtractor will automatically find them in the installation directory without requiring users to install Tesseract separately or set environment variables.
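The automatic lookup described above can be pictured as a probe over a list of candidate tessdata directories. The sketch below is illustrative only: `has_traineddata` and `first_tessdata_dir` are hypothetical names, and the candidate order (for example `TESSDATA_PREFIX` first, then `<executable directory>/tessdata`) is an assumption, since the PR description does not state the order used by `probe_tessdata_location()`.

```c
/* Sketch only; not the implementation merged in ocr.c. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: returns 1 if <dir>/<lang>.traineddata can be
   opened for reading, 0 otherwise. */
int has_traineddata(const char *dir, const char *lang)
{
    char path[1024];
    int n;
    FILE *f;

    assert(dir != NULL && lang != NULL);
    n = snprintf(path, sizeof path, "%s/%s.traineddata", dir, lang);
    if (n < 0 || (size_t)n >= sizeof path)
        return 0; /* formatted path would not fit in the buffer */
    f = fopen(path, "rb");
    if (f == NULL)
        return 0;
    fclose(f);
    return 1;
}

/* Hypothetical helper: returns the first candidate directory that holds
   the requested language file, or NULL if none does. */
const char *first_tessdata_dir(const char *const *dirs, size_t count,
                               const char *lang)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (dirs[i] != NULL && has_traineddata(dirs[i], lang))
            return dirs[i];
    }
    return NULL;
}
```

A caller would build the candidate array itself, for instance from `getenv("TESSDATA_PREFIX")` and the executable directory with `/tessdata` appended, then pass the winning directory on to Tesseract.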

Test plan

  • [x] Created test release on fork: https://github.com/cfsmp3/ccextractor/releases/tag/v0.97.0-test-tessdata
  • [x] Verified portable ZIP contains tessdata/eng.traineddata and tessdata/osd.traineddata
  • [x] Tested on Windows VM: HardSubx works out of the box without Tesseract installation

Fixes #1578

🤖 Generated with Claude Code (https://claude.com/claude-code)


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:23:28 +00:00
Reference: starred/ccextractor#2696