mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-14 05:25:44 +00:00
Incorrect timing in iTunes MP4 #67
Originally created by @cfsmp3 on GitHub (Aug 2, 2015).
Originally assigned to: @rkuchumov on GitHub.
Looks like we are doing it wrong in iTunes MP4.
There's a track that contains the captions (as opposed to being embedded in the video track), and timing is a bit different from normal because that track can contain a lot of data.
We've received a couple useful links about this:
http://forum.doom9.org/showthread.php?p=1718273#post1718273
(It continues to the bottom of the page.)
https://trac.videolan.org/vlc/ticket/12685
(There's an extra sample here you should be able to download.)
@ndjamena commented on GitHub (Aug 19, 2015):
I'm not sure how it works in other formats, but iTunes just assumes a 30/1.001 fps frame rate for captions regardless of the frame rate of the actual video track.
The first byte pair in an MP4 packet is enacted at the DTS time code from the MP4 headers; each successive byte pair after that is enacted at intervals of 1/(30/1.001) of a second (about 33.4 ms), just as if it were being piped through an NTSC analogue connection via line 21.
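The timing rule described in this comment can be sketched as follows. This is only an illustration of the arithmetic, not CCExtractor's or iTunes's code; the function name `pair_times` and the constant `PAIR_INTERVAL` are hypothetical, and the 30000/1001 rate is taken from the comment above.

```python
from fractions import Fraction

# Duration of one caption byte pair, assuming the fixed 30/1.001 fps
# (30000/1001) caption rate described in the comment above.
PAIR_INTERVAL = Fraction(1001, 30000)


def pair_times(dts_seconds, pair_count):
    """Return the time (in seconds) at which each byte pair of a packet is enacted.

    The first pair is enacted at the packet's DTS; each later pair follows
    one PAIR_INTERVAL after the previous one.
    """
    start = Fraction(dts_seconds)
    return [start + i * PAIR_INTERVAL for i in range(pair_count)]
```

For example, `pair_times(0, 3)` yields 0, 1001/30000 and 2002/30000 seconds, so the third pair lands roughly 66.7 ms after the packet's DTS.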
@ghost commented on GitHub (Nov 29, 2016):
Sample file direct link: http://streams.videolan.org/issues/12685/Full%20Episode%202.mp4
@canihavesomecoffee commented on GitHub (Dec 10, 2016):
File looks to be fine with 0.82... @cfsmp3 can you confirm?
@ndjamena commented on GitHub (Dec 10, 2016):
I haven't looked at mp4 cc in a while so I'm taking a moment to catch up.
This is the email I originally sent:
I found the file and it's episode 2 of Ben 10:
02 Washington B.C. (1080p HD).m4v
I ripped the subtitles with the current version of CCExtractor and the timecodes it's producing are exactly the same as when I wrote the original email.
You can see that last subtitle ends a full second early in the CCExtractor srt.
I don't have my program anymore, or the source code for that matter. The subtitles for the VLC file LOOK fine when ripped with CCExtractor but if you compare them to how they are when played back via iTunes there's a definite difference, and the iTunes timings definitely look better (plus were identical to my program's output which followed the 608 specs to the letter).
This was a while ago so I'm not sure I can get back into this... um...
What will we need?
@cfsmp3 commented on GitHub (Dec 10, 2016):
We also think CCExtractor follows the specs to the letter :-) Of course something must be wrong here.
What did your program do?
@ndjamena commented on GitHub (Dec 10, 2016):
I extracted the subtitles with MP4Box into either NHNT or NHML, whichever one was newer (the links on the GPAC website are broken at the moment). I then pointed my program at it, and it used the extracted timecodes and track data to rebuild the 608 track as an SRT.
Apparently MediaFire has a version of my program still... I'll have to figure out how to get it working... I think the extra text file it outputs will be the most useful thing I can contribute:
That's from the VLC bug tracker.
This is CCExtractor:
Until I can pull something better off from all this...
According to the timecode my program got from the extracted MP4Box track the mp4 frame that contained the 608 packet had a timecode of 00:00:27,227.
You can see the first subtitle I've posted from the CCExtractor file starts at exactly 00:00:27,228.
The problem as I see it is that according to the output of my program between 00:00:27,227 and the "End Caption" there are at least two commands, each of which should take 1/(30000/1001) of a second. There could be other factors at play that could legitimately cause your program to set the timecode where it is but that's a start (and it does put the extracted timecode at odds with iTunes).
I don't know if that actually means anything but at the moment I don't have much in front of me to work with and it can't hurt to look at it.
@ndjamena commented on GitHub (Dec 10, 2016):
Oh, right, the pièce de résistance.
You can see they use [Erase Display Memory] and THEN [End of Caption]. There should be almost exactly 33 milliseconds between one command executing and the next, yet CCExtractor's subtitles are only 2ms apart.
I don't know if that's deliberate but it's wrong via 608 specs.
That blink between EDM and EOC is deliberate, by the way; it's in the specs. If you don't want the blink you're supposed to End of Caption and then Erase Non-Display Memory.
@cfsmp3 commented on GitHub (Dec 10, 2016):
@ndjamena Are you able to provide a few more samples for us to analyze? I want to solve this once and for all.
@ndjamena commented on GitHub (Dec 10, 2016):
I got my program working and I can see why I felt the need to use that particular file as an example for CCExtractor.
When it's a very simple frame to interpret my program and CCExtractor agree on the timecodes so they're both getting them from the same place:
But then things start going wrong as the EOC gets further from the start of the frame:
You can see it's using the timecode for the mp4 frame that contains the packet of 608 data as the start timecode for the second subtitle, even though there's god knows how many characters/commands between the start of the frame and the End of Caption.
CCExtractor output is difficult to interpret because when there are multiple End of Captions in a single frame it still shows each and every available subtitle (unlike VLC), yet the core problem of giving the MP4 frame timecode to the End of Caption is still there. There must be some kind of workaround written into the program, which wouldn't be necessary if it just calculated the timecodes properly in the first place.
According to the NHML file the 14th frame starts at 1463462. 1463462 * (1/30000) = 48.782066..., which is 00:00:48,782, which is when the FRAME starts. There's no reason the subtitle ending in "No Way" should start there that I can see, unless there's something seriously wrong with MP4Box.
<NHNTSample DTS="1567566" dataLength="72" isRAP="yes" />
1567566 * (1/30000) = 52.2522 = 00:00:52,252, which is when the very next frame starts. The previous frame is 46.613233... (00:00:46,613).
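The conversions in this comment can be reproduced with a small helper. This is a sketch only: the name `ticks_to_srt` is hypothetical, the timescale of 30000 ticks per second is inferred from the "* (1/30000)" arithmetic above, and milliseconds are truncated rather than rounded, which is an assumption that happens to match the figures quoted.

```python
def ticks_to_srt(ticks, timescale=30000):
    """Convert a DTS value in timescale ticks to an SRT timecode (HH:MM:SS,mmm)."""
    # Integer arithmetic avoids floating-point drift; fractions of a
    # millisecond are truncated (an assumption, not taken from any spec).
    total_ms = ticks * 1000 // timescale
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{ms:03d}"
```

With the two DTS values quoted above, `ticks_to_srt(1463462)` gives `00:00:48,782` and `ticks_to_srt(1567566)` gives `00:00:52,252`.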
I think I've made my point. There's not much more I can do other than ranting.
@ndjamena commented on GitHub (Dec 10, 2016):
Well, that's difficult because they're iTunes Files... I prepared the Emperor's New Groove file by completely re-encoding it and doing everything I could to remove anything that might identify me.
I don't think I'm set up to do that at the moment and have no idea how long it will take to figure it out, or even if it's a good idea to try.
@xvkdev commented on GitHub (Dec 10, 2016):
There's plenty of useful info here to work with. Thanks :-)
@ndjamena commented on GitHub (Dec 10, 2016):
http://www.mediafire.com/file/ycvc0h8lhxu05kw/Break+Final+Cut.exe
That was my program. Extract the track using MP4Box as NHML then drag/drop the NHML file onto the EXE and it will make an srt file. I wouldn't trust its output absolutely (it was never actually finished) but it does seem to get the actual timecodes close to perfect. Norton decided to delete it on my computer so...
@cfsmp3 commented on GitHub (Dec 13, 2016):
@ndjamena I think it's fixed. Timing in MP4 now takes into account the number of CC pairs since the last time the PTS was set. It looks good for the file I have. Can you confirm?
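The idea of counting CC pairs since the last PTS update might look roughly like the sketch below. It is not CCExtractor's actual implementation; the class `PairClock` and its methods are hypothetical names, and the per-pair interval of 1001/30000 s is the value discussed earlier in the thread.

```python
from fractions import Fraction

# Milliseconds per caption byte pair at 30000/1001 pairs per second (about 33.37 ms).
PAIR_INTERVAL_MS = Fraction(1001 * 1000, 30000)


class PairClock:
    """Tracks how many byte pairs have been seen since the PTS was last set."""

    def __init__(self):
        self.pts_ms = 0
        self.pairs_since_pts = 0

    def set_pts(self, pts_ms):
        # A new timestamp from the container resets the pair counter.
        self.pts_ms = pts_ms
        self.pairs_since_pts = 0

    def next_pair_time_ms(self):
        # Offset the last PTS by the pairs already consumed, then count this one.
        t = self.pts_ms + self.pairs_since_pts * PAIR_INTERVAL_MS
        self.pairs_since_pts += 1
        return int(t)  # truncated to whole milliseconds
```

Applied to the frame at 00:00:27,227 mentioned earlier, a command that follows two other byte pairs would be timed at about 27,293 ms instead of at the frame's own timecode.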
@ndjamena commented on GitHub (Dec 14, 2016):
Comparing the output of the new executable and my program, the output of both programs is exactly the same, apart from the weird character CCExtractor uses in place of '♪', a bug in my program that makes it not output characters that share a byte pair with a null, and a few milliseconds of rounding/formatting differences in the timecodes.
I can't tell you if it's correct, but I can say the bug fixes make CCExtractor's output very similar to the output of the program I wrote, and the program I wrote gave output very similar to what iTunes displayed.
That's about all I can give.