mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-15 13:35:30 +00:00
[CEA-708] Missing the last subtitle with "Premature end of file" #243
Originally created by @Izaron on GitHub (Jan 14, 2017).
Sometimes a cropped .ts file ends with an incomplete TS packet, for example:
Premature end of file - Transport Stream packet is incomplete (expected 188 bytes, got 168).
After processing, we must flush the subtitles in the current window:
^ These subtitles are written to the file.
But if the .ts file is broken at the end, we can't flush the last subtitle because its show time is missing, even though the CEA-608 output has the last subtitle. Example:
^ These subtitles are written to the CEA-608 file with correct timing, but not to the CEA-708 file.
@cfsmp3 commented on GitHub (Jan 20, 2017):
GSoC qualification: This issue gives 2 points.
@AlexeyBelezeko commented on GitHub (Mar 3, 2017):
@Izaron can you provide me with some samples for the task?
@Izaron commented on GitHub (Mar 3, 2017):
@AlexeyBelezeko https://ccextractor.org/public:general:tvsamples "US TV 10 minutes samples" - "ESPN.ts": the last sub is missing, as far as I remember.
@Izaron commented on GitHub (Mar 5, 2017):
Sample:
https://drive.google.com/drive/folders/0B_61ywKPmI0Ta2diT3l0eTlHc2c - USA.ts
Run as
ccextractor <path_to_USA.ts> -svc all
The first file, USA.srt, is correct.
The second file, USA.p8.svc01.srt, is incorrect.
@thetransformerr commented on GitHub (Jul 9, 2018):
@cfsmp3
Hi all,
I worked on this issue and found that it can be solved with modifications in the decoder output methods, but I would say that this situation is not a bug, as it only happens in streams that are clipped so that the last packet is shorter than the standard TS packet size of 188 bytes,
and the error originates from here:
25a8b53ff5/src/lib_ccx/ts_functions.c (L216)
From what I see below this line, we are using 188 as a hardcoded value for the packet size, as per the standard/recommendation defined in ISO/IEC 13818-1, and some of the following functions depend on it, like
25a8b53ff5/src/lib_ccx/ts_functions.c (L240)
So here we could use a dynamic size if possible. I have no idea about that, but if that is the right approach I can move in that direction.
Thanks.
@T1duS commented on GitHub (Nov 7, 2018):
So, basically we need to add a parameter like --croppedts which, when passed, replaces every 188 with 168. Am I right?
@cfsmp3 commented on GitHub (Nov 7, 2018):
No, this has nothing to do with packet size, that's always 188. It's about
the command to display the last subtitle frame not being there (since the
stream is incomplete).
What we need is a way to dump the contents of the 708 decoder once we reach
the end of the stream.
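As a rough illustration of the point above, here is a minimal sketch in C of a decoder that buffers text until a display command arrives and therefore needs an explicit dump at end of stream. The `toy_decoder` type and every `toy_*` function are hypothetical and invented for this example; they are not CCExtractor's real 708 structures or API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical toy decoder, NOT CCExtractor's real dtvcc structures. */
enum { TOY_TEXT_MAX = 128 };

typedef struct {
    char pending[TOY_TEXT_MAX]; /* text received but not yet written out */
    int has_pending;            /* 1 if pending holds unwritten text */
    int written;                /* number of subtitles emitted so far */
} toy_decoder;

static void toy_init(toy_decoder *d)
{
    memset(d, 0, sizeof *d);
}

/* Caption text arrives first and is only buffered. */
static void toy_feed_text(toy_decoder *d, const char *text)
{
    assert(strlen(text) < TOY_TEXT_MAX);
    strcpy(d->pending, text);
    d->has_pending = 1;
}

/* Write the buffered text, if any. Returns 1 if something was written. */
static int toy_emit(toy_decoder *d, FILE *out)
{
    if (!d->has_pending)
        return 0;
    fprintf(out, "%s\n", d->pending);
    d->has_pending = 0;
    d->written++;
    return 1;
}

/* Normal path: the display command in the stream triggers the output. */
static int toy_display_command(toy_decoder *d, FILE *out)
{
    return toy_emit(d, out);
}

/* End-of-stream path: the display command never arrived because the
   stream was cut, so dump whatever is still buffered. */
static int toy_flush_at_eof(toy_decoder *d, FILE *out)
{
    return toy_emit(d, out);
}
```

In this toy model, a subtitle whose display command was lost to truncation is only written if `toy_flush_at_eof` runs after the last packet has been consumed.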
@siv2r commented on GitHub (Jan 15, 2021):
Hey all,
I would like to take a shot at this issue :)
Edit: If anyone is able to solve this issue faster please go ahead!!
@canihavesomecoffee commented on GitHub (Jan 15, 2021):
Just go ahead 👍
@cfsmp3 commented on GitHub (Mar 2, 2021):
Sorry for the late reply - just found this in a pile of emails.
I'd start by looking at the contents of the decode buffer at the very end
of the program. Does it contain the missing text? (and therefore it's just
that we're not writing it to file)
@PunitLodha commented on GitHub (Mar 19, 2021):
@cfsmp3
I was not sure which decode buffer to look at. At the end of general_loop(), I looked at dec_ctx->dtvcc->decoders[0], and it had cc_count = 245, but inside tv->chars, everything has sym = 0. There was also a window defined, but the rows in it were empty. So I am not sure what to make of it or how to proceed.
Also, I looked at how 708 subtitles are extracted, and I guess store_hdcc() is used for storing data into dec_ctx->cc_data_pkts. This function is not called at the end, when we get the message: Premature end of file - Transport Stream packet is incomplete (expected 188 bytes, got 168). So it might mean that the subtitles were not extracted?
@cfsmp3 commented on GitHub (Mar 21, 2021):
I wrote this so long ago I don't even remember the specifics. What you can do is set a breakpoint in the code where the 708 data starts being processed. That's in this file: ccx_decoders_common.c, function do_cb, look for case 3: //EIA-708
And well, follow the code to see what happens.
@PunitLodha commented on GitHub (Mar 22, 2021):
Tried that. After getting the message "Premature end of file - Transport Stream packet is incomplete (expected 188 bytes, got 168)", we return from this line https://github.com/CCExtractor/ccextractor/blob/master/src/lib_ccx/ccx_decoders_common.c#L103-L106 and never reach case 3.
It says skipping non data, so that could mean the data for the final subtitle is not extracted?
@cfsmp3 commented on GitHub (Mar 22, 2021):
This suggests that the sample does not have CEA-708 subtitles but just 608 subtitles. You can find that out by playing the video with VLC and looking at the media information, or with ffprobe.
If it doesn't have 708 subtitles then the first problem would be in the title of this issue :-) It could be just 608, which would be handled by case 0.
And then the issue is actually the same - we're not processing the buffer at the end of the stream. Which I must say seems strange but it's possible.
@PunitLodha commented on GitHub (Mar 22, 2021):
I didn't frame my comment correctly. It does reach case 3, just not after the message. So here is how it goes:-
@cfsmp3 commented on GitHub (Mar 22, 2021):
Of course, when that message appears there's no more data.
So you'd need to check if there's anything on the 708 buffer and if yes, manually trigger a buffer flush.
In theory that should be happening already, but set a breakpoint on the function flush_cc_decode and see if it's called or not.
That's fine, that just means that the stream was abruptly cut. If you added 20 bytes to the end of the file you would get rid of that specific message (since now the file would end with a complete TS block) but wouldn't get any additional data (I think? Feel free to try). In any case, this message is not important.
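The arithmetic behind the 20 bytes mentioned above can be sketched in C as follows. The 188-byte packet size comes from the thread; `ts_tail_info` and `ts_inspect_length` are hypothetical names used only for this illustration and are not part of CCExtractor.

```c
#include <stddef.h>

#define TS_PACKET_SIZE 188 /* fixed Transport Stream packet size */

/* Hypothetical helper type for this example only. */
typedef struct {
    size_t complete_packets; /* whole 188-byte packets in the file */
    size_t trailing_bytes;   /* leftover bytes of a cut-off packet */
    size_t padding_needed;   /* bytes that would complete the last packet */
} ts_tail_info;

/* Describe how a file of the given size splits into TS packets. */
static ts_tail_info ts_inspect_length(size_t file_size)
{
    ts_tail_info info;
    info.complete_packets = file_size / TS_PACKET_SIZE;
    info.trailing_bytes = file_size % TS_PACKET_SIZE;
    info.padding_needed = info.trailing_bytes
                              ? TS_PACKET_SIZE - info.trailing_bytes
                              : 0;
    return info;
}
```

For a file that ends with 168 trailing bytes, `padding_needed` is 20. Padding would only complete the final packet; it would not restore the caption commands that were cut off.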
@PunitLodha commented on GitHub (Mar 23, 2021):
flush_cc_decode is called, and that in turn calls ccx_dtvcc_decoder_flush (I guess this is for 708). Here, https://github.com/CCExtractor/ccextractor/blob/master/src/lib_ccx/ccx_decoders_708.c#L836, no decoder window is visible (all windows have window->visible==0), so nothing changes.
@cfsmp3 commented on GitHub (Mar 23, 2021):
OK, let's close this as a non-issue then.
@PunitLodha commented on GitHub (Apr 11, 2021):
So I found out what the issue is. None of the windows are visible, but one window is defined, and it has the last subtitle in it. So I changed the condition to check whether the window is defined instead. But now there's the issue that there is no show time for the subtitle. The hide time is updated when flushed, but we are missing the show time. This leads to no subtitles being displayed.
How should we get the show time?
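One possible direction, sketched in C under stated assumptions: flush every window that is defined and has text, whether or not it is visible, and when no display command set a show time, fall back to the time the window's text was first written. The fallback is an assumption made for this sketch, not a solution confirmed in this thread, and `toy_window`, `toy_cue` and `toy_flush_windows` are hypothetical stand-ins rather than the structures in ccx_decoders_708.c.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; the field names are
   assumptions and do not mirror ccx_decoders_708.c. */
typedef struct {
    int is_defined;          /* window has been defined */
    int visible;             /* window was made visible (not checked below) */
    int has_text;            /* window holds at least one character */
    long long show_ms;       /* show time, or -1 if never displayed */
    long long first_text_ms; /* assumed: time the first character arrived */
} toy_window;

typedef struct {
    long long start_ms;
    long long end_ms;
} toy_cue;

/* Flush at end of stream. Emits one cue per defined, non-empty window
   and returns the number of cues written to out. */
static size_t toy_flush_windows(const toy_window *w, size_t n,
                                long long eof_ms, toy_cue *out)
{
    size_t written = 0;
    for (size_t i = 0; i < n; i++) {
        if (!w[i].is_defined || !w[i].has_text)
            continue; /* nothing to flush in this window */
        /* Prefer the real show time; otherwise use the fallback. */
        long long start = w[i].show_ms >= 0 ? w[i].show_ms
                                            : w[i].first_text_ms;
        if (start < 0 || start > eof_ms)
            continue; /* no usable timing, skip rather than guess */
        out[written].start_ms = start;
        out[written].end_ms = eof_ms; /* hide time is set at flush */
        written++;
    }
    return written;
}
```

Whether the time of the first character is an acceptable show time is a design choice; the end of the previous subtitle would be another candidate.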