Anagrammed words #74

Closed
opened 2026-01-29 16:34:29 +00:00 by claunia · 16 comments
Owner

Originally created by @aramacciotti on GitHub (Oct 12, 2015).

When extracting CC from this MPEG2-TS source (https://onedrive.live.com/redir?resid=9AE76B6175D960D3!381261&authkey=!ANmlBJmNWnzFyos&ithint=video%2cts), I get some anagrams instead of the original text.
What could be the cause?

Here is what I currently get:

2
00:00:02,336 --> 00:00:04,545
It's too late for Dre.amamin
We're practillcathy ere.

It should instead be:

2
00:00:02,336 --> 00:00:04,545
It's too late for Dramamine.
We're practically there.
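
A quick way to tell reordered captions from corrupted ones is to check whether a garbled line contains exactly the same characters as the expected line. The sketch below is only an illustration: `is_permutation` is a hypothetical helper and not part of CCExtractor, and it compares raw bytes, so it assumes both lines use the same encoding and padding.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when a and b hold exactly the same
 * bytes in any order, 0 otherwise. */
int is_permutation(const char *a, const char *b)
{
    long counts[256] = {0};

    /* Count every byte of a up, every byte of b down. */
    for (; *a; a++)
        counts[(unsigned char)*a]++;
    for (; *b; b++)
        counts[(unsigned char)*b]--;

    /* Any non-zero bucket means a byte was added or dropped. */
    for (size_t i = 0; i < 256; i++)
        if (counts[i] != 0)
            return 0;
    return 1;
}
```

Both garbled lines in this report pass the check against the correct output posted later in the thread ("Dramamine." and "practically there."), which suggests the characters are being reordered rather than lost.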


@anshul1912 commented on GitHub (Oct 12, 2015):

Thanks for the report. It looks like a bug in the captions; that part should be ignored by the decoder. I will have a look at the video soon.


@canihavesomecoffee commented on GitHub (Oct 12, 2015):

We've had the same issues with other samples before (groups of letters swapped around), as far as I can remember.


@cfsmp3 commented on GitHub (Oct 12, 2015):

Yes, garbling that comes and goes is a constant... there's always some
sample that breaks things...



@aramacciotti commented on GitHub (Oct 12, 2015):

Does that mean it's clear to you how to fix it?


@anshul1912 commented on GitHub (Oct 16, 2015):

No, it's not clear how to fix it, but I have confirmed with the video that this bug


@aramacciotti commented on GitHub (Oct 19, 2015):

Do you think I can help in some way? I can debug the code, but I have not yet identified where the closed caption source characters are read.
If so, you can point me to the specific code fragment and I'll try to debug it.


@anshul1912 commented on GitHub (Oct 19, 2015):

The problem is not in the closed caption decoder but in the avcfunctions or sequencing .c file. The file names in src are not exactly these, but similar.

I am working on it, but if you can find a solution, that would be nice of you 😄


@anshul1912 commented on GitHub (Oct 21, 2015):

When the slice_header function is called twice with the same PTS and a single payload, the data that was the pivot during the previous call is flushed and the sequencing code stops working, since that data should not have been written or decoded until the data with a lower sequence number has been decoded. Hence the output comes out jumbled.
@aramacciotti, if you want to try, you can make this change to slice_header in lib_ccx/avc_functions.c:

        // if slices are buffered - flush
        if (isref)
        {
+               static LLONG last_pts = 0;
                dvprint("\nReference pic! [%s]\n", slice_types[slice_type]);
                dbg_print(CCX_DMT_TIME, "\nReference pic! [%s] maxrefcnt: %3d\n",
                                slice_types[slice_type], maxrefcnt);

                // Flush buffered cc blocks before doing the housekeeping
-               if (ctx->has_ccdata_buffered)
+               if (ctx->has_ccdata_buffered && last_pts != ctx->timing->current_pts)
                {
                        process_hdcc(ctx, sub);
                }
+               last_pts = ctx->timing->current_pts;

Right now I am checking the specs to see whether this scenario, multiple slice headers with the same PTS, is really allowed in a video.
I also have to confirm manually with a hex editor whether the video really has multiple slice headers with the same PTS, or whether a problem in our code calls this function twice.
The change above is just a hack to understand and confirm the problem. I would suggest not taking it as the final solution until the investigation is complete.
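
To make the effect of the repeated call easier to follow, here is a toy model of the flush guard. It is a sketch under stated assumptions, not CCExtractor code: `Block`, `Model`, `store`, `reference_picture` and `replay` are hypothetical names, the event order in `replay` is made up, and the real buffering in avc_functions.c is more involved.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BLOCKS 8

/* Hypothetical stand-ins; these are not CCExtractor types. */
typedef struct { int seq; char text[4]; } Block;

typedef struct {
    Block buf[MAX_BLOCKS]; /* caption blocks waiting to be reordered */
    int count;
    long long last_pts;    /* PTS seen at the previous reference picture */
    int guard;             /* 1 = skip the flush when the PTS repeats */
    char out[64];          /* what the caption decoder finally receives */
} Model;

static int by_seq(const void *a, const void *b)
{
    return ((const Block *)a)->seq - ((const Block *)b)->seq;
}

/* Sort buffered blocks by sequence number and pass them to the "decoder". */
static void flush(Model *m)
{
    qsort(m->buf, (size_t)m->count, sizeof(Block), by_seq);
    for (int i = 0; i < m->count; i++)
        strcat(m->out, m->buf[i].text);
    m->count = 0;
}

static void store(Model *m, int seq, const char *text)
{
    assert(m->count < MAX_BLOCKS && strlen(text) < sizeof m->buf[0].text);
    m->buf[m->count].seq = seq;
    strcpy(m->buf[m->count].text, text);
    m->count++;
}

/* Mirrors the isref branch: flush unless the guard sees a repeated PTS. */
static void reference_picture(Model *m, long long pts)
{
    if (m->count > 0 && (!m->guard || pts != m->last_pts))
        flush(m);
    m->last_pts = pts;
}

/* Replays one made-up sequence of events; the result ends up in m->out. */
void replay(Model *m, int guard)
{
    memset(m, 0, sizeof *m);
    m->guard = guard;
    reference_picture(m, 100); /* first call for the picture */
    store(m, 1, "ll");         /* pivot data, must wait for seq 0 */
    reference_picture(m, 100); /* second call with the same PTS */
    store(m, 0, "ca");         /* lower sequence arrives afterwards */
    reference_picture(m, 200); /* next reference picture */
}
```

With `guard` set to 0 the replay produces "llca". With `guard` set to 1 the second call no longer flushes and the result is "call", the same kind of two-character swap seen in "practillcathy ere."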


@aramacciotti commented on GitHub (Oct 21, 2015):

I downloaded the 0.77 source release, and in slice_header in its avc_functions.c (which I found in the root folder of the Visual Studio project) the code differs from the fragment you posted here.
I find:

    // Flush buffered cc blocks before doing the housekeeping
    if (has_ccdata_buffered)

instead of:

    if (ctx->has_ccdata_buffered)

and I cannot find has_ccdata_buffered or timing in ctx.
Do I have to download a different version?


@anshul1912 commented on GitHub (Oct 21, 2015):

Please git clone the latest code


@aramacciotti commented on GitHub (Oct 21, 2015):

I don't see any significant improvement with that code change.
Just to be sure, this is my current fragment:

    if (isref)
    {
        static LLONG last_pts = 0;
        dvprint("\nReference pic! [%s]\n", slice_types[slice_type]);
        dbg_print(CCX_DMT_TIME, "\nReference pic! [%s] maxrefcnt: %3d\n",
                slice_types[slice_type], maxrefcnt);

        // Flush buffered cc blocks before doing the housekeeping
        //if (ctx->has_ccdata_buffered)
        if (ctx->has_ccdata_buffered && last_pts != ctx->timing->current_pts)
        {
            process_hdcc(ctx, sub);
        }
        last_pts = ctx->timing->current_pts;

@anshul1912 commented on GitHub (Oct 21, 2015):

This was the output with the code change:

00:00:00,835 --> 00:00:02,292
                unless you take 
                your Dramamine. 

2
00:00:02,336 --> 00:00:04,545
 It's too late for Dramamine.   
 We're practically there.    

3
00:00:04,588 --> 00:00:05,796
              Don't roll    
              your eyes at me.  

Do you get different output than mine?


@aramacciotti commented on GitHub (Oct 21, 2015):

Yes, I get this:

1
00:00:00,835 --> 00:00:02,292
                 yunlessouak tue
                yo Dramamine.   

2
00:00:02,336 --> 00:00:04,545
 It's too late for Dramamine.   
 We're practillcathy ere.       

3
00:00:04,588 --> 00:00:05,796
              Don't roll        
             your eyes at me.   

@anshul1912 commented on GitHub (Oct 21, 2015):

Can you try with my clone: https://github.com/anshul1912/ccextractor.git
It works perfectly on my machine (Linux, openSUSE, 64-bit).


@aramacciotti commented on GitHub (Oct 21, 2015):

It's working fine! I'll also test with other content where the issue was not present, to verify that no regressions were introduced.

Thanks!


@anshul1912 commented on GitHub (Oct 21, 2015):

Some regressions were introduced. You can have a look at https://github.com/CCExtractor/ccextractor/issues/240 to see which files that previously worked no longer work after my changes, but it would be good if you tested with your own files too.


Reference: starred/ccextractor#74