[PR #858] [MERGED] [IMPROVEMENT] Checks for text before newlines on DVB subtitles #1690

Closed
opened 2026-01-29 17:17:59 +00:00 by claunia · 0 comments

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/858
Author: @ghost
Created: 12/27/2017
Status: Merged
Merged: 12/28/2017
Merged by: @cfsmp3

Base: master ← Head: master


📝 Commits (1)

  • e22f96a reworks scanning newlines to look for content in a line

📊 Changes

1 file changed (+27 additions, -8 deletions)


📝 src/lib_ccx/ocr.c (+27 -8)

📄 Description

Please prefix your pull request with one of the following: [FEATURE] [FIX] [IMPROVEMENT].

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.

My familiarity with the project is as follows (check one):

  • I have used CCExtractor just a couple of times.

Tesseract previously produced OCR text with many consecutive newlines, which left the subtitle output full of empty lines and useless whitespace. This PR checks each line of text for content and skips ahead to the next line when there is none.
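The idea described above can be sketched as follows. This is a minimal illustration only, not the code from commit e22f96a: the function name `drop_blank_lines` and the in-place compaction strategy are assumptions made for the example, and the actual change in `src/lib_ccx/ocr.c` may be structured differently.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not taken from ocr.c): compacts `text` in place,
 * dropping every line that contains only whitespace. Kept lines are
 * joined with a single '\n' and no trailing newline is emitted. */
void drop_blank_lines(char *text)
{
    char *out = text;        /* next write position */
    const char *line = text; /* start of the line being scanned */

    while (*line != '\0')
    {
        const char *end = line;
        int has_content = 0;

        /* Scan to the end of this line, noting any non-whitespace byte. */
        while (*end != '\0' && *end != '\n')
        {
            if (!isspace((unsigned char)*end))
                has_content = 1;
            end++;
        }

        if (has_content)
        {
            size_t len = (size_t)(end - line);

            if (out != text)
                *out++ = '\n'; /* separator between kept lines */
            memmove(out, line, len); /* regions may overlap */
            out += len;
        }

        /* Step past the newline, or stop at the terminator. */
        line = (*end == '\n') ? end + 1 : end;
    }
    *out = '\0';
}
```

Whitespace inside a line that has content is left untouched; only lines that are entirely empty or whitespace are removed.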


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:17:59 +00:00
Reference: starred/ccextractor#1690