[PR #1105] [MERGED] [FIX] Fix several memory leaks using Leptonica API for hardcoded subtitle extraction #1897

Closed
opened 2026-01-29 17:19:08 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/1105
Author: @dbarelop
Created: 9/9/2019
Status: Merged
Merged: 9/12/2019
Merged by: @cfsmp3

Base: master ← Head: master


📝 Commits (6)

  • 5346475 Rewritten Tesseract and Leptonica imports
  • f79d87f Fixed memory leak extracting hardcoded subtitles
  • c1c32a2 Minor code enhancements and cleanups
  • e88bea7 Fixed memory leak using function pixSauvolaBinarize
  • 3d6bd56 Updated changelog
  • 41fd3eb Merge branch 'master' into master

📊 Changes

10 files changed (+189 additions, -146 deletions)


📝 docs/CHANGES.TXT (+1 -0)
📝 src/lib_ccx/hardsubx.c (+2 -2)
📝 src/lib_ccx/hardsubx.h (+4 -4)
📝 src/lib_ccx/hardsubx_classifier.c (+1 -1)
📝 src/lib_ccx/hardsubx_decoder.c (+172 -133)
📝 src/lib_ccx/hardsubx_imgops.c (+1 -1)
📝 src/lib_ccx/hardsubx_utility.c (+1 -1)
📝 src/lib_ccx/ocr.c (+2 -2)
📝 src/lib_ccx/params.c (+2 -2)
📝 src/lib_ccx/utility.h (+3 -0)

📄 Description

In raising this pull request, I confirm the following (please check boxes):

  • [x] I have read and understood the contributors guide.
  • [x] I have checked that another pull request for this purpose does not exist.
  • [x] I have considered, and confirmed that this submission will be valuable to others.
  • [x] I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • [x] I give this submission freely, and claim no ownership to its content.
  • [x] I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • [ ] I have never used CCExtractor.
  • [x] I have used CCExtractor just a couple of times.
  • [ ] I absolutely love CCExtractor, but have not contributed previously.
  • [ ] I am an active contributor to CCExtractor.

Leptonica functions used in hardsubx_decoder.c, such as pixConvertRGBToGray, pixSobelEdgeFilter, pixDilateGray and pixThresholdToBinary, allocate a new Pix struct every time they are called rather than modifying the one passed as an argument.

As a result, memory is leaked every time _process_frame_white_basic, _process_frame_color_basic, _display_frame or _process_frame_tickertext is called. Because these functions run on every frame of the input video, memory consumption grows very large, even for a video of only a few hundred MB.

The proposed fix keeps track of every Pix struct that is created and destroys it once each call to these functions has finished. The include directives for the Leptonica and Tesseract libraries have also been changed so that the headers are looked up in the standard include directories.
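
To illustrate the ownership pattern described above, here is a minimal sketch in plain C. It does not use Leptonica and is not the code from this PR: `Image`, `image_create`, `image_destroy`, `image_filter`, `process_frame_leaky`, `process_frame_fixed` and `live_after_frames` are all hypothetical stand-ins, and the live-allocation counter exists only to make the leak observable. The one assumption carried over from the description is that each filter stage returns a newly allocated image that the caller must release.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for an image object; NOT Leptonica's Pix. */
typedef struct {
    int width, height;
    unsigned char *data;
} Image;

/* Number of Image objects currently allocated, so a leak can be measured. */
static int live_images = 0;

Image *image_create(int width, int height) {
    Image *img = malloc(sizeof *img);
    if (!img) return NULL;
    img->data = calloc((size_t)width * (size_t)height, 1);
    if (!img->data) { free(img); return NULL; }
    img->width = width;
    img->height = height;
    live_images++;
    return img;
}

/* Takes a pointer to the caller's pointer, frees the image and nulls it. */
void image_destroy(Image **pimg) {
    if (!pimg || !*pimg) return;
    free((*pimg)->data);
    free(*pimg);
    *pimg = NULL;
    live_images--;
}

/* A filter stage that returns a NEW image and leaves its input untouched. */
Image *image_filter(const Image *src) {
    if (!src) return NULL;
    Image *dst = image_create(src->width, src->height);
    if (!dst) return NULL;
    size_t n = (size_t)src->width * (size_t)src->height;
    for (size_t i = 0; i < n; i++)
        dst->data[i] = (unsigned char)(255 - src->data[i]);
    return dst;
}

/* Leaky variant: only the final image is released, intermediates are lost. */
int process_frame_leaky(const Image *frame) {
    Image *gray = image_filter(frame);
    Image *edges = image_filter(gray);
    Image *binary = image_filter(edges);
    int status = binary ? 0 : -1;
    image_destroy(&binary);
    return status; /* gray and edges are never destroyed */
}

/* Fixed variant: every image created here is destroyed before returning. */
int process_frame_fixed(const Image *frame) {
    int status = -1;
    Image *gray = NULL, *edges = NULL, *binary = NULL;

    gray = image_filter(frame);
    if (!gray) goto cleanup;
    edges = image_filter(gray);
    if (!edges) goto cleanup;
    binary = image_filter(edges);
    if (!binary) goto cleanup;
    status = 0;

cleanup:
    /* image_destroy ignores NULL, so one cleanup path covers every exit. */
    image_destroy(&binary);
    image_destroy(&edges);
    image_destroy(&gray);
    return status;
}

/* Processes `frames` frames and returns how many images were left allocated. */
int live_after_frames(int frames, int use_fixed) {
    assert(frames >= 0);
    int before = live_images;
    for (int i = 0; i < frames; i++) {
        Image *frame = image_create(8, 8);
        if (!frame) break;
        if (use_fixed) process_frame_fixed(frame);
        else process_frame_leaky(frame);
        image_destroy(&frame);
    }
    return live_images - before;
}
```

In this sketch the leaky variant leaves two images allocated per frame, so the count grows linearly with the number of frames, while the fixed variant returns to zero after every frame. Initialising each pointer to NULL and releasing everything on a single cleanup path is one common way to keep track of intermediates; the actual fix may be structured differently.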


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:19:08 +00:00
Reference: starred/ccextractor#1897