mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-14 05:25:44 +00:00
Duplicating lines from iTunes #183
Originally created by @SteverB11 on GitHub (Sep 21, 2016).
I have this problem periodically and none of the settings seem to help. When extracting a subtitle from an iTunes file, the resulting .srt file has duplicated lines as shown in the upload at: https://www.dropbox.com/s/hvx129ih4bu5pxb/03%20He%20Calls%20Himself%20Daxis%20%281080p%20HD%29.m4v?dl=0
Subs: https://www.dropbox.com/s/yios5l1lifxhe67/03%20He%20Calls%20Himself%20Daxis%20%281080p%20HD%29.srt?dl=0
I assumed it had something to do with roll-up captions, but those settings don't seem to change the output.
Settings used: C:\Program Files (x86)\CCExtractor\ccextractorwin.exe --gui_mode_reports -autoprogram -out=srt -ru3 -bom -utf8 -trim -autodash --norollup [+input files]
Thanks for your help!
@cfsmp3 commented on GitHub (Sep 21, 2016):
The repeated lines are even minutes apart. First guess (without actually inspecting the file in depth) is that the data is actually repeated in the file for some reason.
Does the .mp4 file play OK with any player? Which one?
@SteverB11 commented on GitHub (Sep 22, 2016):
It plays OK in iTunes with the CC on, with no repeating, but I only use iTunes for content. I use Kodi (XBMC) for playback, and it doesn't play the CC in the file, only a subtitle format like .srt. I only sent this to you because it happens once in a while, maybe 3 out of 10 shows. It is weird and I thought you might have some insight. Thanks!
@geopapyrus commented on GitHub (Oct 20, 2016):
I ran into this bug as well.
The good news is that it is easy to identify when it happens.
For each subtitle position X, check if X-2 is identical. If it is, skip it.
The bug is pretty consistent.
It happens always this way.
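A minimal sketch of that post-processing workaround, assuming a well-formed .srt file in which every cue is an index line, a timing line, and one or more text lines. The function names (`parse_srt`, `drop_x_minus_2_duplicates`, `format_srt`) are made up for illustration and are not part of CCExtractor. Note that it would also drop a legitimate repeat such as "Yes." / "No." / "Yes.", so it only masks the symptom.

```python
import re


def parse_srt(text):
    """Split SRT text into a list of (timing_line, caption_text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Expect: index line, timing line, then the caption text.
        if len(lines) >= 2 and "-->" in lines[1]:
            cues.append((lines[1], "\n".join(lines[2:])))
    return cues


def drop_x_minus_2_duplicates(cues):
    """Skip cue X when its text is identical to the text of cue X-2."""
    kept = []
    for i, (timing, body) in enumerate(cues):
        if i >= 2 and cues[i - 2][1] == body:
            continue  # same text as two cues earlier: treat as a duplicate
        kept.append((timing, body))
    return kept


def format_srt(cues):
    """Renumber the remaining cues from 1 and join them back into SRT text."""
    blocks = [f"{n}\n{timing}\n{body}" for n, (timing, body) in enumerate(cues, 1)]
    return "\n\n".join(blocks) + "\n"
```

To use it, read the extracted .srt file into a string, pass it through the three functions in order, and write the result to a new file.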
@SteverB11 commented on GitHub (Oct 25, 2016):
Ahh, OK, thanks! Naturally, I haven't seen it in a while. But thanks.
Steve Boyd
@Izaron commented on GitHub (Dec 1, 2016):
"For each subtitle position X, check if X-2 is identical. if it is, skip it."
We can fix it with a bite-sized script after extracting. Should we do this in the CCExtractor code instead?
@cfsmp3 commented on GitHub (Dec 1, 2016):
No, that wouldn't be a fix, that would be hiding the problem under the rug :-)
@ndjamena commented on GitHub (Dec 10, 2016):
Well, unless there's a bug further up the chain, the problem here seems to be the duplication of commands. The repeating subtitle in the output file changes whenever an "Erase Display Memory" command is issued.
I think the only way around this is that whenever EOC is issued twice in succession the second one should be ignored. I'm pretty sure that's not part of the actual specs so I assume it's a bug fix in iTunes to get around a few bad encoders.
CCExtractor still isn't doing it correctly in any case. Two End of Captions in succession really ought to accomplish nothing other than a slight flickering appearance of the subtitle for a single frame. The fact that iTunes doesn't do that anymore really isn't a good thing because it means if that's what you intend to happen it no longer will.
@cfsmp3 commented on GitHub (Dec 10, 2016):
Actually the specs (well, the FCC best practices I think) do say that
repeated commands are OK and that the 2nd one has to be ignored. Of course,
since they do take time to be transmitted they should still be counted for
timing...
I'll work on this on Monday.
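A rough sketch of the rule described above: a control code that immediately repeats the previous one is ignored, but its frame still counts for timing. This is an illustration only, not CCExtractor's 608 parser. It assumes CEA-608 byte pairs with the parity bit in bit 7 and control codes whose first byte falls in 0x10-0x1F once parity is stripped (for example EOC = 0x14 0x2F and EDM = 0x14 0x2C on channel 1). The name `process_pairs` is hypothetical.

```python
def is_control(b1):
    """True if the parity-stripped first byte marks a control code."""
    return 0x10 <= b1 <= 0x1F


def process_pairs(pairs):
    """Return (frame_index, (b1, b2)) for every pair that should be acted on.

    The frame index advances for every pair, including ignored repeats,
    so the redundant copy still takes up its slot in the timeline.
    """
    acted = []
    last_control = None
    for frame, (raw1, raw2) in enumerate(pairs):
        b1, b2 = raw1 & 0x7F, raw2 & 0x7F  # strip the parity bit
        if is_control(b1):
            if last_control == (b1, b2):
                # Redundant copy of the previous command: ignore it once,
                # so that a third identical command would be acted on again.
                last_control = None
                continue
            last_control = (b1, b2)
        else:
            last_control = None
        acted.append((frame, (b1, b2)))
    return acted
```

With this rule, two End of Caption commands in consecutive frames swap the buffers once rather than twice, while the frame numbers of later pairs are unchanged.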
@ndjamena commented on GitHub (Dec 10, 2016):
Oh, I must have missed that, or it wasn't in the PDF I was working from.
I guess that means the problem is you're not ignoring them properly...
It makes sense actually, in case one command is lost in static there's a second one to back it up.
@alexbrt commented on GitHub (Dec 11, 2016):
I think the bug is located in the function encode_sub() of ccx_encoders_common.c.
At line 1059, a for loop starts which checks the subtitle output file type (SRT, SSA etc.).
At line 1089, the program writes the cc buffer and returns a boolean value to indicate if it was successful or not.
At line 1137, the program checks whether it wrote something to the file and, if so, adds the last displayed subs ms.
->The problem is that after adding the last displayed subs ms the program doesn't exit the for loop. This is a problem because the program will go once again through the switch at line 1084 and therefore will write the same buffer twice to the file, which is not right.
Of course, I could've put a break at line 1139 to exit the for loop after having written the subs to the file but that would cause the program to no longer output the buffer to the gui and would only output the progress.
->So, to solve the subtitle duplication problem I added the condition at line 1082 to enter the switch only if no subs were written to the file.
Unfortunately though, I think this fix has an impact on subtitle timing for itunes files (not sure).
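A toy model of the behaviour described in this comment, not the actual encode_sub() code: it assumes a loop that makes several passes, writes the buffer to the file on each pass, and also reports it to the GUI on each pass. The function name, the `passes` count, and the `guard` flag are all hypothetical. It shows why a guard around the file write differs from a break.

```python
def encode_sub_toy(passes, buffer, guard):
    """Simulate a loop that writes `buffer` to a file and reports it to a GUI.

    With guard=False every pass writes to the file again (the duplication).
    With guard=True the file write happens only while nothing has been
    written yet, and the GUI still receives the buffer on every pass.
    """
    file_out, gui_out = [], []
    wrote = False
    for _ in range(passes):
        if not guard or not wrote:
            file_out.append(buffer)  # stands in for the write inside the switch
            wrote = True
        gui_out.append(buffer)  # a break after the write would skip this
    return file_out, gui_out
```

For example, `encode_sub_toy(2, "cue", guard=False)` puts the cue in the file twice, while `guard=True` puts it there once and leaves the GUI output untouched.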
@cfsmp3 commented on GitHub (Dec 13, 2016):
Fixed duplicate lines.
@ndjamena if you still see issues in timing (possible, we'll start digging into that, too) please feel free to comment here or in a separate ticket.
Thanks @AlexBratosin2001 for not giving up and working with me to squash this bug.
@ndjamena commented on GitHub (Dec 14, 2016):
I take it I'm going to have to compile my own exe. I've downloaded the source code, hopefully it won't be hard to get an executable from it.
For the record I extracted the caption track from the example file, used a hex editor to remove all the duplicate commands, used mp4box to remux a new file with the modified caption track then fed it to CCExtractor. CCExtractor converted the captions without adding all the duplicate lines. Hopefully the fix fixed the actual core problem and hasn't affected any other subtitle formats. It does seem kind of odd though. I don't get how that happened, and haven't built up enough of an understanding of the source code to figure it out yet.
@cfsmp3 commented on GitHub (Dec 14, 2016):
There was a condition in which some byte pairs could be processed more than
once - I mean not because they're twice on the stream (that's correctly
handled in the 608 parser) but due to a bug in the reading loop. That's been
corrected now (was introduced a long time ago, but we didn't pay enough
attention apparently).