[BUG] Not enough memory to initialize Tesseract #449

Closed
opened 2026-01-29 16:44:10 +00:00 by claunia · 19 comments
Owner

Originally created by @Aradmey on GitHub (Oct 22, 2018).

CCExtractor version (using the --version parameter preferably) : 0.85

In raising this issue, I confirm the following (please check boxes, eg [X] - and delete unchecked ones):

  • [X] I have read and understood the contributors guide (https://github.com/CCExtractor/ccextractor/blob/master/.github/CONTRIBUTING.md).
  • [X] I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • [X] I have checked that the issue I'm posting isn't already reported.
  • [X] I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues.
  • [X] I have used the latest available version of CCExtractor to verify this issue exists.

My familiarity with the project is as follows (check one, eg [X] - and delete unchecked ones):

  • [X] I have never used CCExtractor.

Necessary information

  • Is this a regression (did it work before)? [X] NO
  • What platform did you use? [X] Windows
  • What were the used arguments? E:\CCExtractor\ccextractorwinfull.exe --gui_mode_reports -in=mp4 -autoprogram -out=srt -bom -unicode -hardsubx -subcolor white -conf_thresh 60 [+input files]

Additional information

Hello, I've tried using the program to extract burned-in subtitles from a .mp4 movie, but it seems to always show me this error: "Not enough memory to initialize Tesseract!"
Is there any solution known for this issue?

claunia added the needs-confirmation-of-being-broken and difficulty: medium labels 2026-01-29 16:44:10 +00:00
Author
Owner

@MatejMecka commented on GitHub (Oct 22, 2018):

It seems your computer doesn't have the power to run Tesseract. Therefore there isn't any issue with CCExtractor but with your computer running it.


@saurabhshri commented on GitHub (Oct 22, 2018):

Could you please post the complete logs, along with the procedure you followed to compile CCExtractor?


@Aradmey commented on GitHub (Oct 22, 2018):

> It seems your computer doesn't have the power to run Tesseract. Therefore there isn't any issue with CCExtractor but with your computer running it.

I doubt it, as my PC has 16 GB of RAM.

> Could you please post complete logs along with procedure you followed to compile CCExtractor?

I did not compile CCExtractor. I downloaded the binaries (GUI and command-line programs) and later the installer itself. None of them worked.
All I did was open CCExtractor, select my file, select "With OCR" below, tick "Perform burned-in subtitle extraction", and start; I then received the mentioned error.


@AntonOfTheWoods commented on GitHub (Nov 17, 2018):

This is also happening on Ubuntu 18.04 with ccextractor compiled from master using the tesseract from the normal repos. If I manually extract images using ffmpeg and run tesseract then there is no complaining about memory on my 8GB Dell XPS laptop.

Now, my C++ is almost non-existent, but looking at the Tesseract code, it looks like this may have nothing to do with memory at all.
https://github.com/CCExtractor/ccextractor/blob/master/src/lib_ccx/hardsubx.c#L238 assumes that any non-zero return value means "Not enough memory to initialize Tesseract", but I don't see anything in https://github.com/tesseract-ocr/tesseract/blob/master/src/api/capi.cpp#L241 or https://github.com/tesseract-ocr/tesseract/blob/master/src/api/baseapi.h#L189 that suggests that a non-zero value is guaranteed to be memory related. The comment there simply says:

> * Start tesseract. Returns zero on success and -1 on failure.
> * NOTE that the only members that may be called before Init are those
> * listed above here in the class definition.

I may well not be looking at the right place but it seems to me that this could well be something other than insufficient memory.
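To illustrate the point above, here is a minimal sketch of how a non-zero init result could be reported without assuming a memory problem. It is not CCExtractor's code: the function names `find_traineddata` and `explain_init_result` are hypothetical, and the only facts relied on are that init returns zero on success and non-zero on failure, and that Tesseract looks for `<lang>.traineddata` under the directory named by `TESSDATA_PREFIX` (directly, or in a `tessdata` subdirectory, depending on the Tesseract version).

```python
import os
from pathlib import Path
from typing import Optional


def find_traineddata(lang: str, tessdata_dir: Optional[str] = None) -> Optional[Path]:
    """Return the path of <lang>.traineddata if it exists, else None.

    Hypothetical helper: falls back to the TESSDATA_PREFIX environment
    variable when no directory is passed in.
    """
    base = tessdata_dir or os.environ.get("TESSDATA_PREFIX")
    if not base:
        return None
    # Check both layouts: the file directly in the directory, or in a
    # "tessdata" subdirectory below it.
    candidates = [
        Path(base) / f"{lang}.traineddata",
        Path(base) / "tessdata" / f"{lang}.traineddata",
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None


def explain_init_result(ret: int, lang: str, tessdata_dir: Optional[str] = None) -> str:
    """Map an init return code (0 = success, non-zero = failure) to a message."""
    if ret == 0:
        return "Tesseract initialized"
    if find_traineddata(lang, tessdata_dir) is None:
        return f"Tesseract init failed (code {ret}): {lang}.traineddata not found"
    # The return code alone does not say why init failed, so avoid blaming memory.
    return f"Tesseract init failed (code {ret}): cause unknown, not necessarily memory"
```

Calling `explain_init_result(-1, "eng")` on a machine without English language data would report the missing file rather than a memory shortage, which is closer to what the return value actually tells us.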


@AntonOfTheWoods commented on GitHub (Nov 26, 2018):

@saurabhshri , do you have any ideas about this? Am I completely wrong in my interpretation of the code?


@saurabhshri commented on GitHub (Nov 26, 2018):

@AntonOfTheWoods No, you're not. It's not your machine. It has been reported previously, but they were able to solve it. Happy debugging :)


@cfsmp3 commented on GitHub (Nov 26, 2018):

OK, let's try to figure this one out... @Aradmey first, does it happen with all files, just some, or a specific one? Can you share one?

Have you tried in 0.87?


@AntonOfTheWoods commented on GitHub (Nov 30, 2018):

@Aradmey , was it you that was able to solve it or did you abandon CCExtractor?


@AntonOfTheWoods commented on GitHub (Nov 30, 2018):

@cfsmp3, I am using master rather than an official release version (like 0.87) so I can get support for Tesseract 4 (the version available on Ubuntu 18.04). The git log suggests I need the HEAD of origin/master for that. Could this simply be a matter of Tesseract 4 not being fully supported yet? I have also tried the latest Tesseract version from https://launchpad.net/~alex-p/+archive/ubuntu/tesseract-ocr and get the same error. Is it worth trying to get Tesseract 3 installed and using 0.87? Thanks.


@cfsmp3 commented on GitHub (Nov 30, 2018):

Give Tesseract 3 a try indeed... in any case it's going to be faster. Tesseract 4 seems better at handling handwritten stuff, but for our use it doesn't seem like a great upgrade.

@AntonOfTheWoods commented on GitHub (Dec 1, 2018):

@Aradmey and @cfsmp3 , I can confirm that manually compiling tesseract 3.05 on Ubuntu 18.04 and compiling ccextractor at master and pointing to the tesseract 3 gets rid of the error. I definitely think the error message could do with some improvement though!
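Since the behaviour reported here depends on the Tesseract major version, a quick check of what is installed can help when reproducing. The sketch below is only an assumption-laden helper, not part of CCExtractor: the function names are hypothetical, and it assumes the `tesseract` command-line tool is on the PATH and that the first line of its `--version` output looks like `tesseract 3.05.02` or `tesseract v5.0.0`. Note that the command-line tool may not be the same version as the library CCExtractor was linked against.

```python
import re
import shutil
import subprocess
from typing import Optional


def parse_tesseract_major(version_output: str) -> Optional[int]:
    """Extract the major version from `tesseract --version` output, if present."""
    match = re.search(r"tesseract\s+v?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def installed_tesseract_major() -> Optional[int]:
    """Return the major version of the tesseract CLI on PATH, or None."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        ["tesseract", "--version"],
        capture_output=True,
        text=True,
        check=False,
    )
    # Some builds print the version on stderr, so look at both streams.
    return parse_tesseract_major(result.stdout + result.stderr)
```

If `installed_tesseract_major()` returns 4 and the error appears, building against Tesseract 3.05 as described above is one thing to try.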


@RobJacobson commented on GitHub (Dec 11, 2018):

I'm completely new to CCExtractor. I'm encountering the same issue when running 0.87 on Windows. My steps to reproduce:

  1. Install the Windows installer for CCExtractor on Windows 10.

  2. Run the GUI version with the following options:

C:\Program Files (x86)\CCExtractor\ccextractorwinfull.exe --gui_mode_reports -out=srt -bom -latin1 -hardsubx -subcolor white -conf_thresh 60 [+input files]

When I click the "Start" button, I get the message "Not enough memory to initialize Tesseract."

Could this be a problem with Windows support for Tesseract? I noticed that the Windows version seems to be lagging behind the Linux version.

For what it's worth, I'm trying to use CCExtractor to make some HBO shows more accessible. The show "My Brilliant Friend" is spoken in Italian and has burned-in subtitles in English, but those aren't accessible for English-speaking blind users. Details below. If there's a way to OCR these subtitles, that would be completely amazing.

https://www.huffingtonpost.com/entry/hbo-discriminates-against-blind_us_5be073e1e4b04367a87f1cab


@anonynamja commented on GitHub (Jan 15, 2019):

Same issue as above, is there any solution yet? Run with different options? Many thanks.


@Pi7on commented on GitHub (Jan 22, 2019):

Same issue here


@bioluminesceme commented on GitHub (Mar 8, 2019):

Same issue.
Windows 10; Tesseract 3 is installed and in my system PATH.


@DaniGTA commented on GitHub (Apr 11, 2019):

Is this already fixed?


@thelastpolaris commented on GitHub (Apr 12, 2019):

@DaniGTA I guess that this problem was already fixed by #1083, which changed the way Tesseract is initialized. Previously, if Tesseract failed to initialize for any reason, you got a memory error. #1083 made the initialization more stable. If you had this error, please check again with CCExtractor's master.


@drodz11 commented on GitHub (May 9, 2019):

Hello,

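I was having the same problem (error message while running 0.87 GUI - "Not enough memory to initialize Tesseract") so I cloned the master and compiled on Windows 10 using Visual Studio 2019 (Community) and the instructions given here: https://github.com/CCExtractor/ccextractor/blob/master/docs/COMPILATION.MD. However, when I launch the new GUI I am seeing the following message:

[image: ccext] https://user-images.githubusercontent.com/25777161/57483892-b8e0ef80-7275-11e9-9117-8e1f6637c5ba.PNG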

I have tried compiling with both the Debug and Release configurations. Has anyone else had this problem or have an idea why the library can't be found?


@cfsmp3 commented on GitHub (Nov 21, 2021):

Closing - this seems fixed. Feel free to comment if anyone is having this problem in current master.

Reference: starred/ccextractor#449