mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-14 21:23:42 +00:00
[BUG] OCR works only for first DVB subtitle stream (OCR context is not shared) #475
Originally created by @nikop on GitHub (Jan 20, 2019).
CCExtractor version (using the --version parameter preferably) : 0.87
Necessary information
-out=srt -bom -latin1 -codec dvbsub "test.ts" -datapid 0xCE0 -ocrlang fin
Video links
https://madjoki.com/ts/test.ts
Additional information
The following works, but extracts the first subtitle stream (0xCDF). This is the expected result.
-out=srt -bom -latin1 -codec dvbsub "test.ts" -ocrlang fin
-out=srt -bom -latin1 -codec dvbsub "test.ts" -datapid 0xCDF -ocrlang fin
The cause seems to be:
dac9de4d67/src/lib_ccx/ts_tables.c (L358)
It works if these two lines are commented out. The intent seems to be to share the OCR context between decoders, but this is not done.
Which means:
dac9de4d67/src/lib_ccx/dvb_subtitle_decoder.c (L1664)
will become false and skip OCR.
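To make the reported behaviour concrete, here is a minimal, self-contained model of it. It is only a sketch: apart from `initialized_ocr` and `ocr_ctx`, which are mentioned in this thread, every type and function name below is hypothetical and not taken from the CCExtractor source. It assumes that the OCR context is created only while the "initialized" flag is unset, and that a decoder without a context skips OCR.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real CCExtractor structures. */
struct ocr_context {
    const char *lang;
};

struct program_info {
    int initialized_ocr;          /* set once the first OCR context is created */
};

struct dvb_decoder {
    unsigned pid;
    struct ocr_context *ocr_ctx;  /* NULL means no OCR is available */
};

static struct ocr_context the_ctx = { "fin" };

/* Creates the OCR context on the first call only (assumed behaviour). */
static struct ocr_context *init_ocr_once(struct program_info *pinfo)
{
    if (pinfo->initialized_ocr)
        return NULL;              /* every later stream gets nothing */
    pinfo->initialized_ocr = 1;
    return &the_ctx;
}

/* Sets up a decoder for one subtitle PID. */
static void setup_decoder(struct program_info *pinfo,
                          struct dvb_decoder *dec, unsigned pid)
{
    dec->pid = pid;
    dec->ocr_ctx = init_ocr_once(pinfo);
}

/* Returns 1 if OCR would run for this decoder, 0 if it would be skipped. */
static int would_run_ocr(const struct dvb_decoder *dec)
{
    return dec->ocr_ctx != NULL;
}
```

In this model, setting up a decoder for 0xCDF and then one for 0xCE0 against the same `program_info` leaves the second decoder with a NULL context, so `would_run_ocr` returns 0 for it. That matches the symptom described above.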
@Murmur commented on GitHub (Dec 30, 2019):
@nikop Do we have the same issue? This is my ticket about not being able to extract the second dvbsub text, although PNG image extraction works fine for all tracks. https://github.com/CCExtractor/ccextractor/issues/1163
@nikop commented on GitHub (Jan 3, 2020):
@Murmur Yes, this seems to be the same issue.
@nikop commented on GitHub (Jan 5, 2020):
@Murmur
If you want to try and compile yourself:
ts_tables.diff.txt
@cfsmp3 commented on GitHub (Jan 25, 2020):
@nikop is it still happening in current master? We've done a lot of work in the past weeks and I'm going over all the issues - cleaning up. Thanks.
@mfarberbrodsky commented on GitHub (Jan 26, 2020):
Hi, I just checked it with the current master, still the same result (no captions produced with -datapid 0xCE0). Same problem as #1163.
@mfarberbrodsky commented on GitHub (Jan 26, 2020):
It still works if the two lines nikop suggested are commented out.
What's their purpose? Everything seems to be working without them.
@cfsmp3 commented on GitHub (Jan 26, 2020):
@mfarberbrodsky It declares the OCR "initialized" if it wasn't. I don't think, however, that the problem is there; rather, some other place must be checking that variable and only doing something if the OCR is not initialized.
Once you've gotten that far I'd say it can't be too hard to fix.
@mfarberbrodsky commented on GitHub (Jan 29, 2020):
@cfsmp3 I investigated this problem a bit more, and I think I found the root of the issue. It starts here. On line 357, ocr_ctx is initialized only once, when pinfo->initialized_ocr is still 0. It is then stored in the returned ptr. That ptr is written to ctx in update_capinfo, and then it is actually stored as codec_private_data only for that specific pid (you can see that here), which is the first pid that contains caption data, since ocr_ctx is initialized only once. None of the other pids will have an ocr_ctx, and this is why no captions are produced for them.
I believe this is why the problem occurs; however, I'm not sure yet what solution I can implement.
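One possible direction, in line with the issue title, is to keep a single OCR context at program level and hand the same pointer to every subtitle PID. The sketch below only illustrates that idea and is not a patch: `shared_ocr`, `get_shared_ocr`, `attach_decoder` and the `users` counter are hypothetical names. Whether the real OCR context can safely be used by several decoders is not established in this thread.

```c
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the real CCExtractor structures. */
struct ocr_context {
    const char *lang;
    int users;                        /* number of decoders holding this context */
};

struct program_info {
    struct ocr_context *shared_ocr;   /* created lazily, reused by every PID */
};

struct dvb_decoder {
    unsigned pid;
    struct ocr_context *ocr_ctx;
};

static struct ocr_context ocr_storage;

/* Returns the program-wide OCR context, creating it on first use. */
static struct ocr_context *get_shared_ocr(struct program_info *pinfo,
                                          const char *lang)
{
    if (pinfo->shared_ocr == NULL) {
        ocr_storage.lang = lang;
        ocr_storage.users = 0;
        pinfo->shared_ocr = &ocr_storage;
    }
    pinfo->shared_ocr->users++;
    return pinfo->shared_ocr;
}

/* Every decoder receives the same context, whichever PID comes first. */
static void attach_decoder(struct program_info *pinfo, struct dvb_decoder *dec,
                           unsigned pid, const char *lang)
{
    dec->pid = pid;
    dec->ocr_ctx = get_shared_ocr(pinfo, lang);
}
```

With this layout, attaching decoders for 0xCDF and 0xCE0 gives both a non-NULL `ocr_ctx` pointing at the same object. The `users` counter is there so that a real implementation could tell when the last decoder is done with the context before freeing it.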
@cfsmp3 commented on GitHub (Jan 29, 2020):
@mfarberbrodsky That's a good investigation, good job :-)
@hamelg commented on GitHub (Apr 18, 2020):
Hi, I have the same issue. I tried the workaround:
https://github.com/CCExtractor/ccextractor/issues/1067#issuecomment-578506743
Unfortunately, it causes another issue: the subtitle timestamps are wrong, the first offset is null.
@hamelg commented on GitHub (Apr 29, 2020):
No, the workaround works fine. My timestamp issue was related to my ts file.
@rboy1 commented on GitHub (Jun 18, 2021):
Has there been any workaround or solution for this? I'm seeing the same issue here:
@cfsmp3 commented on GitHub (Jun 18, 2021):
You answered yourself :-)
@cfsmp3 commented on GitHub (Mar 22, 2023):
@nikop the file is not available, do you have it somewhere?
@nikop commented on GitHub (Mar 22, 2023):
@cfsmp3 I reuploaded the file
@cfsmp3 commented on GitHub (Mar 22, 2023):
Thanks. Please leave it there until this is fixed :-)
Your original post points to code, but it uses master instead of a specific commit, so the lines you point at don't seem to match current master. Can you update that?