mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-12 13:35:15 +00:00
[BUG] Segmentation fault when extracting ts file #558
Originally created by @jamoore5 on GitHub (Feb 19, 2020).
CCExtractor version: 0.88
Necessary information
-out=spupng -quiet
Video links
https://drive.google.com/open?id=1xs7GyYPR-DPd75CP3XJLj7CPPdVJHAVp
Additional information
!strcmp(locale, "C"):Error:Assert failed:in file baseapi.cpp, line 209
Segmentation fault
@NilsIrl commented on GitHub (Feb 19, 2020):
I can't reproduce the issue on master and on 0.88.
@jamoore5 commented on GitHub (Feb 19, 2020):
I am on a raspberry pi 4 if that helps troubleshoot why I am getting this error.
@NilsIrl commented on GitHub (Feb 19, 2020):
How did you install ccextractor?
Did you build it yourself or is it included in the repos (if yes please mention the distribution you're using) or something else?
@jamoore5 commented on GitHub (Feb 19, 2020):
I followed this tutorial https://magpi.raspberrypi.org/articles/make-comics-from-tv-recordings
However, I had to remove tesseract-ocr-dev from the install
@NilsIrl commented on GitHub (Feb 19, 2020):
Why?
tesseract-ocr-dev (AFAIK) is required to have OCR support.
@jamoore5 commented on GitHub (Feb 19, 2020):
@NilsIrl commented on GitHub (Feb 19, 2020):
Could you try with the libtesseract-dev package instead? (They changed the name of the package)
@jamoore5 commented on GitHub (Feb 19, 2020):
installed it but still getting the same error
@NilsIrl commented on GitHub (Feb 19, 2020):
segfault? Also from what I can tell, the .ts file you're using doesn't have any subtitles.
Once you've installed libtesseract-dev, you need to rebuild ccextractor.
@MatejMecka commented on GitHub (Feb 19, 2020):
Both VLC and ffmpeg detect there is a subtitle stream but when playing through VLC no subtitles were shown.
@jamoore5 commented on GitHub (Feb 19, 2020):
just thought the error was strange, but my file is definitely bad.
@NilsIrl commented on GitHub (Feb 19, 2020):
I don't have an error though. It shouldn't seg fault.
@canihavesomecoffee commented on GitHub (Feb 19, 2020):
But why is that assert being triggered? I think that might be the cause of that segfault :)
@cfsmp3 commented on GitHub (Feb 19, 2020):
I can't reproduce on a Raspberry either.
@jamoore5 commented on GitHub (Feb 20, 2020):
To compare
Also, any tips on how to convert an mp4 with subtitles to a ts with subtitles?
@NilsIrl commented on GitHub (Feb 21, 2020):
Seems to segfault inside of tesseract.
@cfsmp3 commented on GitHub (Feb 21, 2020):
@jamoore5 Run it with valgrind, maybe it will give some clue. Also note that you are using tesseract 4, not tesseract 3, which is really likely to be the reason we're seeing different things.
Would tell us more.
@NilsIrl Yes, but let's assume it's our code causing that segfault somehow :-)
@NilsIrl commented on GitHub (Feb 21, 2020):
Yes, I was mentioning that so that someone didn't go looking for baseapi.cpp without ever finding it.
@jamoore5 commented on GitHub (Feb 21, 2020):
@cfsmp3 I got the following output, then it hung without exiting
@NilsIrl commented on GitHub (Feb 21, 2020):
I can't reproduce on x86 (NixOS) with tesseract 4.1.0 (in addition to tesseract 3). I'd expect valgrind to give a backtrace.
Can you try to compile with linux/builddebug instead of linux/build and rerun with valgrind? (nevermind, linux/build also gives a backtrace so the problem seems to be elsewhere.)
@canihavesomecoffee commented on GitHub (Feb 21, 2020):
@jamoore5 The triggered assert looks similar to https://github.com/tesseract-ocr/tesseract/issues/1670.
Can you try to explicitly set the LANG variable to "C" before running CCExtractor?
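A minimal sketch of that suggestion, assuming a POSIX shell and that the installed binary is called ccextractor. The helper name run_in_c_locale and the file name input.ts are hypothetical, not part of CCExtractor. It sets LANG to "C" for a single command, and also sets LC_ALL, because an LC_ALL value already present in the environment would otherwise take precedence over LANG.

```shell
# Hypothetical helper: run one command with the locale forced to "C".
# `env` applies the variables to that command only, so the rest of
# the shell session keeps its original locale.
run_in_c_locale() {
    env LANG=C LC_ALL=C "$@"
}

# Example invocation (input.ts is a placeholder file name; the flags
# are the ones from the original report):
# run_in_c_locale ccextractor input.ts -out=spupng -quiet
```

Scoping the change to one invocation avoids altering the output language or sorting behaviour of other programs started from the same shell.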
@jamoore5 commented on GitHub (Feb 21, 2020):
I installed my packages for tesseract, and installed the binaries which seemed to have fixed the issue.
however I got the latest version
and ccextractor still sees it as 4.0.0
@NilsIrl commented on GitHub (Feb 22, 2020):
Could it be that you have both versions installed?
@cfsmp3 commented on GitHub (Feb 22, 2020):
Labeling as medium because we can't reproduce.
@shikharmn commented on GitHub (Feb 27, 2020):
Hey!
I would like to contribute to this issue. Can someone give me a direction as to where to start?
@canihavesomecoffee commented on GitHub (Nov 20, 2021):
@cfsmp3 Closing this (reopen if you don't agree).
Original poster fixed the issue by installing tesseract 5.0 (this is also mentioned here: https://github.com/tesseract-ocr/tesseract/issues/1670#issuecomment-515324015).
I encountered this on the SP (Tesseract 4.0.0), and there it's fixed by exporting LC_ALL.
So we can add this to our README/compilation guide somewhere, but I think there's nothing to fix in our code.
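For a README or compilation guide, the LC_ALL workaround might look like the sketch below. It assumes a POSIX shell; force_c_locale and locale_is_c are hypothetical helper names used only for illustration. Unlike a per-command override, the export stays in effect for every program started later from the same shell.

```shell
# Export LC_ALL=C for the rest of this shell session. LC_ALL takes
# precedence over LANG and the individual LC_* variables.
force_c_locale() {
    LC_ALL=C
    export LC_ALL
}

# Succeed only if LC_ALL is currently set to "C".
locale_is_c() {
    [ "${LC_ALL:-}" = "C" ]
}

force_c_locale
if locale_is_c; then
    echo "LC_ALL=C is exported"
fi
```

Programs launched from this shell afterwards, CCExtractor included, inherit LC_ALL=C, which is the condition the assert message `!strcmp(locale, "C")` appears to check.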