mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-03 21:23:48 +00:00
OCR - jumps based on uninitialised values #257
Originally created by @cfsmp3 on GitHub (Jan 24, 2017).
Originally assigned to: @Abhinav95 on GitHub.
==22257== Conditional jump or move depends on uninitialised value(s)
==22257== at 0x428EC5: compare_rect_by_ypos (ocr.c:705)
==22257== by 0x454991: shell_sort (ccx_encoders_helpers.c:459)
==22257== by 0x428FBF: paraof_ocrtext (ocr.c:749)
==22257== by 0x44E7B3: write_cc_bitmap_as_srt (ccx_encoders_srt.c:103)
==22257== by 0x42D83C: encode_sub (ccx_encoders_common.c:1174)
==22257== by 0x43B7F6: dinit_libraries (lib_ccx.c:227)
==22257== by 0x407CB4: main (ccextractor.c:435)
@cfsmp3 commented on GitHub (Jan 26, 2017):
GSoC qualification: This issue gives 2 points.
@siddharthjindal1997 commented on GitHub (Feb 22, 2017):
@cfsmp3 can you please explain the issue? I would love to work on it.
thanks.
@cfsmp3 commented on GitHub (Feb 22, 2017):
@siddharthjindal1997 The issue description gives all the info :-) Running CCExtractor on valgrind we get those messages. If you check the code, it seems to happen here:
int compare_rect_by_ypos(const void *p1, const void *p2, void *arg)
{
const struct cc_bitmap *r1 = p1;
const struct cc_bitmap* r2 = p2;
if(r1->y > r2->y)
We know that r1 and r2 are initialized because that happens right there, but it looks like r1->y and/or r2->y are not. So the job would be to find out in which circumstances compare_rect_by_ypos() can be called with parameters that have not been initialized.
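To illustrate the kind of situation described above, here is a minimal sketch, assuming the rectangles live in a heap-allocated array and that one entry is never filled in before sorting. `struct rect`, `alloc_rects` and `sort_rects` are hypothetical stand-ins, not CCExtractor code; only the comparator mirrors the shape of the one quoted above. Allocating with `calloc` instead of `malloc` gives every `y` a defined value, so the branch in the comparator no longer depends on uninitialised memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct cc_bitmap (assumption: the real struct
 * has more fields; only y matters for this illustration). */
struct rect {
    int x;
    int y;
    int w;
    int h;
};

/* Same shape as the comparator quoted above: it branches on the y field,
 * which is where Valgrind reports the conditional jump. */
int compare_rect_by_y(const void *p1, const void *p2, void *arg)
{
    const struct rect *r1 = p1;
    const struct rect *r2 = p2;
    (void)arg; /* unused in this sketch */
    if (r1->y > r2->y)
        return 1;
    if (r1->y < r2->y)
        return -1;
    return 0;
}

/* Allocate n rectangles with every field zeroed. With malloc() the fields
 * would be indeterminate until explicitly written. */
struct rect *alloc_rects(size_t n)
{
    return calloc(n, sizeof(struct rect));
}

/* Plain insertion sort driven by the comparator (hypothetical helper,
 * standing in for whatever sort routine calls the comparator). */
void sort_rects(struct rect *r, size_t n)
{
    size_t i;
    assert(r != NULL || n == 0);
    for (i = 1; i < n; i++) {
        struct rect key = r[i];
        size_t j = i;
        while (j > 0 && compare_rect_by_y(&r[j - 1], &key, NULL) > 0) {
            r[j] = r[j - 1];
            j--;
        }
        r[j] = key;
    }
}
```

If `alloc_rects` used `malloc` and a caller only set `y` on some entries, running the sort under Valgrind would be expected to produce a report similar to the one in the issue description. Zeroing the memory hides the symptom; finding the code path that skips the assignment is still the real fix.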
@mahalwal commented on GitHub (Oct 17, 2017):
@cfsmp3 hey, can you please tell me how to generate the same memcheck output? I was doing
valgrind --leak-check=yes /linux/./build
Is this the right way?
@saurabhshri commented on GitHub (Oct 20, 2017):
@mahalwal That’s the Valgrind output while processing a file which uses OCR (e.g. a DVB file). So, you'd want to do
valgrind --leak-check=yes ./ccextractor /path/to/file/ <options>. If you're looking for samples, you can find some at https://sampleplatform.ccextractor.org/sample or https://www.ccextractor.org/public:general:tvsamples .
@Sudoxo commented on GitHub (Jan 19, 2020):
I would like to work on it, but I have never run ccextractor with a file that uses OCR. Could someone tell me what commands/options need to be used?
@NilsIrl commented on GitHub (Jan 19, 2020):
Nothing specific has to be done.
You can use the EastEnders file to test.
@Sudoxo commented on GitHub (Jan 19, 2020):
Building with ./build, and running with ./ccextractor /home/BBC1.mp4 gives me an empty .srt file and it says: no captions were found in input
Are you sure that nothing specific has to be done?
link to folder with that BBC1.mp4: https://drive.google.com/drive/folders/0B_61ywKPmI0TYk9vMzhHU2QtdVk
@NilsIrl commented on GitHub (Jan 19, 2020):
try with the first one from https://drive.google.com/drive/folders/0B_61ywKPmI0TUUk5LXJPeG1feFE
@Sudoxo commented on GitHub (Jan 19, 2020):
Running valgrind with that file which @NilsIrl mentioned, gives me a lot of info, but I don't have this specific one: https://github.com/CCExtractor/ccextractor/issues/662#issue-202700471
The only one with uninitialised value(s) I have is:
@NilsIrl commented on GitHub (Jan 19, 2020):
You can fix this one if you want
@kdrag0n commented on GitHub (Jan 21, 2020):
I believe this issue has already been fixed by some commit after the issue was opened, since I can't reproduce it either.
I'm only getting uninitialized jump/move messages from internal Tesseract code (allocated inside Tesseract), which is out of the scope of this project:
@cfsmp3 commented on GitHub (Jan 25, 2020):
@kdrag0n but are you sure it's not because we are passing uninitialized values to tesseract?
@kdrag0n commented on GitHub (Jan 26, 2020):
According to the stack trace, Tesseract allocated the uninitialized memory during initialization, not us. Valgrind doesn't complain when I add code right above the call to Tesseract to dump all of the values being passed to it, so it shouldn't be caused by the inputs.
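The check described above can be sketched roughly as follows. This is not the code that was actually used; `dump_bytes` is a hypothetical helper. The idea is that formatting every byte of a buffer forces the program to read it, and Memcheck should report a dependency on uninitialised values at this point (usually inside the C library's formatting code) if any byte was never written. If the dump runs cleanly, the buffer was fully defined before being handed to the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: print every byte of buf as hex and return the sum
 * of all bytes as a cheap fingerprint. Reading and formatting each byte
 * is what lets Valgrind flag undefined input before the library call. */
unsigned long dump_bytes(FILE *out, const char *label,
                         const void *buf, size_t len)
{
    const unsigned char *p = buf;
    unsigned long sum = 0;
    size_t i;

    assert(out != NULL);
    fprintf(out, "%s (%zu bytes):", label, len);
    for (i = 0; i < len; i++) {
        fprintf(out, " %02x", p[i]); /* formatting depends on the value */
        sum += p[i];
    }
    fputc('\n', out);
    return sum;
}
```

A call such as `dump_bytes(stderr, "pixels", data, size)` placed right above the library call, with `data` and `size` being whatever is about to be passed in, is enough for this kind of test. Valgrind's `--track-origins=yes` option can then help locate where any undefined bytes were allocated.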