[PR #2059] [CLOSED] Fix/dvb subtitle ocr and spupng #2866

Open
opened 2026-01-29 17:24:20 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/2059
Author: @aniketchawardol
Created: 1/24/2026
Status: Closed

Base: master ← Head: fix/dvb-subtitle-ocr-and-spupng


📝 Commits (3)

  • efdff1d Fix DVB subtitle OCR and SPUPNG output for options test on Linux
  • 69c5f38 Update changelog for DVB subtitle fixes
  • 5cda73f Apply clang-format

📊 Changes

9 files changed (+116 additions, -7 deletions)


📝 docs/CHANGES.TXT (+3 -0)
➕ dvb.mpg (+0 -0)
➕ options.p1.svc01.srt (+24 -0)
➕ options.srt (+20 -0)
➕ options.ts (+0 -0)
➕ output-test.p1.svc01.srt (+24 -0)
➕ output-test.srt (+20 -0)
📝 src/lib_ccx/ccx_encoders_spupng.c (+6 -5)
📝 src/lib_ccx/dvb_subtitle_decoder.c (+19 -2)

📄 Description

In raising this pull request, I confirm the following (please check boxes):

  • [x] I have read and understood the contributors guide.
  • [x] I have checked that another pull request for this purpose does not exist.
  • [x] I have considered, and confirmed that this submission will be valuable to others.
  • [x] I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • [x] I give this submission freely, and claim no ownership to its content.
  • [x] I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • [ ] I have never used CCExtractor.
  • [x] I have used CCExtractor just a couple of times.
  • [ ] I absolutely love CCExtractor, but have not contributed previously.
  • [ ] I am an active contributor to CCExtractor.

DVB subtitle OCR extraction was failing in the options test on Linux due to three bugs:

The write_dvb_sub function used an undefined region variable when calling ocr_rect, causing crashes or incorrect OCR results.

The SPUPNG encoder wrote the closing tags immediately after the opening tags in write_spumux_header, so the output file had no subtitle content between the subpictures tags.

DVB subtitle regions were not marked as processed after OCR extraction, causing them to be processed multiple times and creating duplicate subtitle entries.

Solution
Fixed the undefined region variable by finding the first valid region from the display list and using that for the bgcolor parameter in ocr_rect.
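
As a rough illustration of the lookup described above (not the actual patch), the sketch below walks a display list and returns the first entry whose region exists. The `SketchRegion` and `SketchDisplay` types and all function names are hypothetical, simplified stand-ins; the real structures in dvb_subtitle_decoder.c carry more fields.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the decoder's structures. */
typedef struct SketchRegion {
    int id;
    uint32_t bgcolor;
    struct SketchRegion *next;
} SketchRegion;

typedef struct SketchDisplay {
    int region_id;
    struct SketchDisplay *next;
} SketchDisplay;

/* Look up a region by id in the region list; NULL if it is absent. */
static SketchRegion *sketch_find_region(SketchRegion *regions, int id)
{
    for (SketchRegion *r = regions; r != NULL; r = r->next)
        if (r->id == id)
            return r;
    return NULL;
}

/* Return the first displayed region that actually exists, or NULL. */
static SketchRegion *first_valid_region(SketchDisplay *display_list,
                                        SketchRegion *regions)
{
    for (SketchDisplay *d = display_list; d != NULL; d = d->next) {
        SketchRegion *r = sketch_find_region(regions, d->region_id);
        if (r != NULL)
            return r;
    }
    return NULL;
}

/* Background color to hand to OCR, with a fallback when no region is valid. */
static uint32_t ocr_background(SketchDisplay *display_list,
                               SketchRegion *regions, uint32_t fallback)
{
    SketchRegion *r = first_valid_region(display_list, regions);
    return r != NULL ? r->bgcolor : fallback;
}
```

The explicit fallback is an assumption of this sketch: it keeps the caller from reading through a NULL region when the display list is empty or references regions that no longer exist.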

Removed the code that prematurely wrote the footer in write_spumux_header. The footer is now written only once, during normal cleanup in write_spumux_footer.
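
To make the ordering problem concrete, here is a minimal sketch (not the encoder's real code) in which the header writes only the opening tags and the footer writes the closing tags. The `Sink` buffer, the `*_sketch` function names, and the reduced `<spu>` attribute set are assumptions for illustration; the real encoder writes to a file and emits more attributes per entry.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical in-memory sink standing in for the output file. */
typedef struct {
    char buf[512];
    size_t len;
} Sink;

/* Append text to the sink, truncating silently if the buffer is full. */
static void sink_append(Sink *s, const char *text)
{
    size_t room = sizeof(s->buf) - s->len;
    int n = snprintf(s->buf + s->len, room, "%s", text);
    if (n > 0)
        s->len += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Header: opening tags only. The closing tags are NOT written here. */
static void write_header_sketch(Sink *s)
{
    sink_append(s, "<subpictures>\n<stream>\n");
}

/* One subtitle entry referencing its PNG image. */
static void write_entry_sketch(Sink *s, const char *png_name)
{
    sink_append(s, "<spu image=\"");
    sink_append(s, png_name);
    sink_append(s, "\"/>\n");
}

/* Footer: closing tags, written once during cleanup. */
static void write_footer_sketch(Sink *s)
{
    sink_append(s, "</stream>\n</subpictures>\n");
}
```

Calling the three functions in header, entry, footer order places every entry between the wrapper tags. If the header also emitted the closing tags, every entry would land after `</subpictures>`, which is the symptom described above.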

Added a loop at the end of write_dvb_sub to clear the dirty flag for all processed regions, preventing duplicate processing.
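
The dirty-flag loop can be sketched roughly as follows. This is an illustration under assumed, simplified types (`DirtyRegion`, `DirtyDisplay`) and a hypothetical function name, not the code from the patch.

```c
#include <stddef.h>

/* Hypothetical, simplified region with only the fields the loop needs. */
typedef struct DirtyRegion {
    int id;
    int dirty; /* non-zero while the region still has unprocessed content */
    struct DirtyRegion *next;
} DirtyRegion;

typedef struct DirtyDisplay {
    int region_id;
    struct DirtyDisplay *next;
} DirtyDisplay;

/* Look up a region by id; NULL if it is not in the list. */
static DirtyRegion *dirty_find_region(DirtyRegion *regions, int id)
{
    for (DirtyRegion *r = regions; r != NULL; r = r->next)
        if (r->id == id)
            return r;
    return NULL;
}

/* Clear the dirty flag of every displayed region.
 * Returns how many flags were actually cleared. */
static int clear_processed_regions(DirtyDisplay *display_list,
                                   DirtyRegion *regions)
{
    int cleared = 0;
    for (DirtyDisplay *d = display_list; d != NULL; d = d->next) {
        DirtyRegion *r = dirty_find_region(regions, d->region_id);
        if (r != NULL && r->dirty) {
            r->dirty = 0;
            cleared++;
        }
    }
    return cleared;
}
```

A second call over the same display list clears nothing, which is the property that prevents a region from being emitted twice. Regions that are not on the display list keep their flag.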

Added safety code for builds without OCR support to set ocr_text pointers to NULL, preventing use-after-free errors.
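
A minimal sketch of that guard, assuming OCR support is toggled by an `ENABLE_OCR` macro and using a hypothetical `SketchBitmap` type with a single `ocr_text` field (the real bitmap structure has more members):

```c
#include <stddef.h>

/* Hypothetical, simplified bitmap record. */
typedef struct {
    char *ocr_text; /* owned OCR result, or NULL when there is none */
} SketchBitmap;

/* In builds without OCR, make sure no entry carries a stale pointer. */
static void reset_ocr_text(SketchBitmap *rects, size_t count)
{
#ifndef ENABLE_OCR
    for (size_t i = 0; i < count; i++)
        rects[i].ocr_text = NULL;
#else
    /* With OCR enabled the OCR path fills ocr_text itself. */
    (void)rects;
    (void)count;
#endif
}
```

With the pointers set to NULL, cleanup code that calls `free` on `ocr_text` unconditionally stays safe, because `free(NULL)` is a no-op in standard C.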

Testing
Tested with the failing test sample from the Linux platform: https://sampleplatform.ccextractor.org/test/7992# The SPUPNG output now has a proper XML structure, with the subpictures wrapper tags enclosing all subtitle entries and their OCR comments. PNG files are generated correctly.

As far as I know, a PR has to be raised to test the updated code. I will promptly close this PR if my changes prove not to be valuable :)


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:24:20 +00:00
Reference: starred/ccextractor#2866