Incorrect timing in iTunes MP4 #67

Closed
opened 2026-01-29 16:34:18 +00:00 by claunia · 14 comments
Owner

Originally created by @cfsmp3 on GitHub (Aug 2, 2015).

Originally assigned to: @rkuchumov on GitHub.

Looks like we are doing it wrong in iTunes MP4.

There's a separate track that contains the captions (as opposed to them being embedded in the video track), and timing works a bit differently than usual because a single sample in that track can contain a lot of data.

We've received a couple useful links about this:

http://forum.doom9.org/showthread.php?p=1718273#post1718273
(It continues to the bottom of the page.)

https://trac.videolan.org/vlc/ticket/12685
(There's an extra sample here you should be able to download.)

Author
Owner

@ndjamena commented on GitHub (Aug 19, 2015):

I'm not sure how it works in other formats, but iTunes just assumes a 30/1.001 fps frame rate for captions regardless of the frame rate of the actual video track.

The first byte pair in an MP4 packet is enacted at the DTS timecode from the MP4 headers; each successive byte pair is then enacted at intervals of 1/(30/1.001) of a second, just as if it were being piped through an NTSC analogue connection via line 21.
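A minimal sketch of that timing rule, assuming the sample's DTS is expressed in ticks of a 30000 Hz timescale (the function and parameter names are illustrative only, not CCExtractor's or iTunes' code):

```python
from fractions import Fraction

# One byte pair per NTSC frame: 1 / (30 / 1.001) s = 1001/30000 s.
PAIR_INTERVAL = Fraction(1001, 30000)

def pair_times(dts_ticks, pair_count, timescale=30000):
    """Return the time in seconds (as exact fractions) at which each
    byte pair of one caption sample would take effect."""
    start = Fraction(dts_ticks, timescale)
    return [start + i * PAIR_INTERVAL for i in range(pair_count)]

# Hypothetical sample: starts at DTS 0 and carries three byte pairs.
for t in pair_times(dts_ticks=0, pair_count=3):
    print(float(t))
```

Exact fractions are used so that rounding only happens once, when the time is finally printed or written to a subtitle file.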

Author
Owner

@ghost commented on GitHub (Nov 29, 2016):

Sample file direct link: http://streams.videolan.org/issues/12685/Full%20Episode%202.mp4

Author
Owner

@canihavesomecoffee commented on GitHub (Dec 10, 2016):

File looks to be fine with 0.82... @cfsmp3 can you confirm?

Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

I haven't looked at mp4 cc in a while so I'm taking a moment to catch up.

This is the email I originally sent:

 I've attached two sample output files, the differences aren't huge, but they do become noticeable in places.

This is how each starts:

CCExtractor                       iTunesesque
1                                 1
00:00:02,236 --> 00:00:03,635     00:00:02,535 --> 00:00:03,636
[siren]                           [siren]

2                                 2
00:00:13,047 --> 00:00:15,414     00:00:13,479 --> 00:00:15,415
>> Who are you?                   >> Who are you?

3                                 3
00:00:16,384 --> 00:00:18,550     00:00:16,916 --> 00:00:18,551
>> I'm here to help.              >> I’m here to help.

4                                 4
00:00:30,798 --> 00:00:33,065     00:00:31,164 --> 00:00:33,733
This way.                         This way.

5                                 5
00:00:33,067 --> 00:00:36,435     00:00:33,733 --> 00:00:36,436
On second thought, that way.      On second thought, that way.

6                                 6
00:00:43,077 --> 00:00:46,078     00:00:43,543 --> 00:00:46,512
>> Ohh!                           >> Ohh!
>> Aah!                           >> Aah!

7                                 7
00:00:46,080 --> 00:00:46,612     00:00:46,512 --> 00:00:47,679
[oohs and ahhs]                   [oohs and ahhs]

8                                 8
00:00:46,614 --> 00:00:48,781     00:00:47,679 --> 00:00:49,782
>> I'm sure you all want to       >> I’m sure you all want to
thank me personally,              thank me personally,

I found the file and it's episode 2 of Ben 10:
02 Washington B.C. (1080p HD).m4v

I ripped the subtitles with the current version of CCExtractor and the timecodes it's producing are exactly the same as when I wrote the original email.

You can see that last subtitle ends a full second early in the CCExtractor srt.

I don't have my program anymore, or the source code for that matter. The subtitles for the VLC file LOOK fine when ripped with CCExtractor but if you compare them to how they are when played back via iTunes there's a definite difference, and the iTunes timings definitely look better (plus were identical to my program's output which followed the 608 specs to the letter).

This was a while ago so I'm not sure I can get back into this... um...

What will we need?

Author
Owner

@cfsmp3 commented on GitHub (Dec 10, 2016):

We also think CCExtractor follows the specs to the letter :-) Of course
something must be wrong here.

What did your program do?


Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

I extracted the subtitles with MP4Box into... either NHNT or NHML, whichever one was newer (the links on the GPAC website are broken at the moment). I then pointed my program at the result, and it used the extracted timecodes and track data to rebuild the 608 track as an SRT.

Apparently MediaFire has a version of my program still... I'll have to figure out how to get it working... I think the extra text file it outputs will be the most useful thing I can contribute:

----------14----------

9420 942C 942F 9420 9452 4558 C143 544C D9A1 947A 4558 C143 544C D9A1 

<<<Start = 00:00:27,227>>>

Resume Caption Loading (Load to Non Displayed Memory)
Erase Display Memory
End Of Caption (Swap Non Displayed Memory with Display Memory)
Resume Caption Loading (Load to Non Displayed Memory)
[Main Line 11]
[Indent: 4]EXACTLY!
[Main Line 12]
[Indent: 20]EXACTLY!

----------15----------

9420 942C 942F 9420 9470 9137 204C 4554 A7D3 20C7 4F20 9137 

<<<Start = 00:00:29,896>>>

Resume Caption Loading (Load to Non Displayed Memory)
Erase Display Memory
End Of Caption (Swap Non Displayed Memory with Display Memory)
Resume Caption Loading (Load to Non Displayed Memory)
[Main Line 12]
[Indent: 0]♪ LET’S GO ♪

----------16----------

9420 942C 942F 9420 9454 9137 20C8 45A7 D320 C74F 49CE C780 94F4 544F 20CB D5DA 434F 20C1 43C1 C445 CDD9 2080 9137 9420 942C 942F 9420 9470 9137 20CB D5DA 434F 20C1 43C1 C445 CDD9 2080 9137 

<<<Start = 00:00:30,864>>>

Resume Caption Loading (Load to Non Displayed Memory)
Erase Display Memory
End Of Caption (Swap Non Displayed Memory with Display Memory)
Resume Caption Loading (Load to Non Displayed Memory)
[Main Line 11]
[Indent: 8]♪ HE’S GOING[NULL]
[Main Line 12]
[Indent: 8]TO KUZCO ACADEMY [NULL]♪
Resume Caption Loading (Load to Non Displayed Memory)
Erase Display Memory
End Of Caption (Swap Non Displayed Memory with Display Memory)
Resume Caption Loading (Load to Non Displayed Memory)
[Main Line 12]
[Indent: 0]♪ KUZCO ACADEMY [NULL]♪

That's from the VLC bug tracker.

This is CCExtractor:

11
00:00:27,228 --> 00:00:29,895
            YOU KNOW,           
            IT'S ALL ABOUT ME.  

12
00:00:29,897 --> 00:00:30,863
    EXACTLY!                    
                    EXACTLY!    

13
00:00:30,865 --> 00:00:32,214
¶ LET'S GO ¶                    

14
00:00:32,215 --> 00:00:33,564
        ¶ HE'S GOING            
        TO KUZCO ACADEMY ¶      

15
00:00:33,567 --> 00:00:34,733
¶ KUZCO ACADEMY ¶               

Until I can pull something better off from all this...

According to the timecode my program got from the extracted MP4Box track the mp4 frame that contained the 608 packet had a timecode of 00:00:27,227.

You can see the first subtitle I've posted from the CCExtractor file starts at exactly 00:00:27,228.

The problem as I see it is that according to the output of my program between 00:00:27,227 and the "End Caption" there are at least two commands, each of which should take 1/(30000/1001) of a second. There could be other factors at play that could legitimately cause your program to set the timecode where it is but that's a start (and it does put the extracted timecode at odds with iTunes).

I don't know if that actually means anything but at the moment I don't have much in front of me to work with and it can't hurt to look at it.
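To make the point about sample 14 concrete, here is a rough sketch that strips the parity bit from each byte pair and names the three commands that appear in the dump. The command table only covers the codes listed above, and treating text bytes as plain ASCII is a simplification (the 608 character set differs from ASCII for a few codes); the helper names are made up for illustration.

```python
# Hex dump of sample 14 from the VLC file, as printed above.
SAMPLE_14 = ("9420 942C 942F 9420 9452 4558 C143 544C D9A1 "
             "947A 4558 C143 544C D9A1")

# Control codes named in the dump, keyed by their parity-stripped bytes.
COMMANDS = {
    (0x14, 0x20): "Resume Caption Loading",
    (0x14, 0x2C): "Erase Display Memory",
    (0x14, 0x2F): "End Of Caption",
}

def strip_parity(pair_hex):
    """Split a 4-digit hex pair into two bytes with bit 7 cleared."""
    value = int(pair_hex, 16)
    return (value >> 8) & 0x7F, value & 0x7F

def describe(sample_hex):
    """Return (pair index, description) for every byte pair in a sample."""
    out = []
    for index, pair_hex in enumerate(sample_hex.split()):
        hi, lo = strip_parity(pair_hex)
        if (hi, lo) in COMMANDS:
            out.append((index, COMMANDS[(hi, lo)]))
        elif hi >= 0x20:
            # Text pair: treated as ASCII here, which is a simplification.
            out.append((index, chr(hi) + (chr(lo) if lo >= 0x20 else "")))
        else:
            out.append((index, "other control code"))
    return out

for index, what in describe(SAMPLE_14):
    print(index, what)
```

End Of Caption is the third pair (index 2), so under the one-pair-per-frame rule it would take effect two intervals of 1001/30000 s, roughly 67 ms, after the sample's own timecode.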

Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

Oh, right, the pièce de résistance.

You can see they use [Erase Display Memory] and THEN [End of Caption]. There should be almost exactly 33 milliseconds between one command executing and the next, yet CCExtractor's subtitles are only 2ms apart.

I don't know if that's deliberate but it's wrong per the 608 specs.

That blink between EDM and EOC is deliberate by the way, it's in the specs. If you don't want the blink you're supposed to End of Caption and then Erase Non-Display Memory.

Author
Owner

@cfsmp3 commented on GitHub (Dec 10, 2016):

@ndjamena Are you able to provide a few more samples for us to analyze? I want to solve this once and for all.

Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

I got my program working and I can see why I felt the need to use that particular file as an example for CCExtractor.

When it's a very simple frame to interpret my program and CCExtractor agree on the timecodes so they're both getting them from the same place:

----------9----------

9420 94AE 94E0 97A2 4F6E 2073 E5E3 EF6E 6420 F468 EF75 6768 F42C 20F4 6861 F420 F761 79AE 8080 8080 942F 

<<<Start = 00:00:33,066>>>

Resume Caption Loading (Load to Non Displayed Memory)
Erase Non Displayed Memory
[Main Line 12]
[Text: White][TAB2]On second thought, that way.[NULL|NULL][NULL|NULL]
End Of Caption (Swap Non Displayed Memory with Display Memory)


----------10----------

942C 

<<<Start = 00:00:36,436>>>

Erase Display Memory

5
00:00:33,067 --> 00:00:36,435
  On second thought, that way.  

But then things start going wrong as the EOC gets further from the start of the frame:


----------14----------

9420 94AE 9452 6275 F420 F2E5 61EC EC79 20E9 F4A7 7320 61EC EC20 E96E 2061 94F2 6461 79A7 7320 F7EF F26B 20E6 EFF2 ADAD 6EEF 20F7 6179 A180 8080 8080 942F 

<<<Start = 00:00:48,782>>>

Resume Caption Loading (Load to Non Displayed Memory)
Erase Non Displayed Memory
[Main Line 11]
[Indent: 4]but really it’s all in a
[Main Line 12]
[Indent: 4]day’s work for--no way![NULL][NULL|NULL][NULL|NULL]
End Of Caption (Swap Non Displayed Memory with Display Memory)

8
00:00:46,614 --> 00:00:48,781
   I'm sure you all want to   
  thank me personally,          

9
00:00:48,783 --> 00:00:52,251
    but really it's all in a    
    day's work for--no way!     

You can see it's using the timecode for the mp4 frame that contains the packet of 608 data as the start timecode for the second subtitle, even though there's god knows how many characters/commands between the start of the frame and the End of Caption.

CCExtractor output is difficult to interpret because when there are multiple End of Captions in a single frame it still shows each and every available subtitle (unlike VLC), yet the core problem of giving the MP4 frame timecode to the End of Caption is still there. There must be some kind of workaround written into the program, which wouldn't be necessary if it just calculated the timecodes properly in the first place.
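As a sketch of what calculating the timecodes per End of Caption could look like, the snippet below takes sample 16 from the earlier VLC dump (which contains two End of Captions) and assigns each one its own time, assuming one byte pair per 1001/30000 s and the dump's start time of 00:00:30,864. The function name is hypothetical, and this is not how CCExtractor currently works.

```python
from fractions import Fraction

PAIR_INTERVAL_MS = Fraction(1001, 30)  # one NTSC frame, in milliseconds
EOC = "942F"                           # End Of Caption as it appears in the dumps

# Sample 16 from the VLC file (see the earlier dump).
SAMPLE_16 = ("9420 942C 942F 9420 9454 9137 20C8 45A7 D320 C74F 49CE C780 "
             "94F4 544F 20CB D5DA 434F 20C1 43C1 C445 CDD9 2080 9137 "
             "9420 942C 942F 9420 9470 9137 20CB D5DA 434F 20C1 43C1 C445 "
             "CDD9 2080 9137")

def eoc_times_ms(sample_hex, start_ms):
    """Time in ms of every End Of Caption in the sample, assuming each
    byte pair takes effect one NTSC frame after the previous one."""
    pairs = sample_hex.split()
    return [start_ms + i * PAIR_INTERVAL_MS
            for i, pair in enumerate(pairs) if pair == EOC]

for t in eoc_times_ms(SAMPLE_16, start_ms=30864):
    print(round(float(t), 1))
```

With this approach the two captions in the sample get distinct times that follow from their position in the data, so no special case is needed for samples that hold more than one End of Caption.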

<?xml version="1.0" encoding="UTF-8" ?>
<NHNTStream version="1.0" timeScale="30000" mediaType="clcp" mediaSubType="c608" trackID="3" baseMediaFile="F:\Videos\iTunes\TV Shows\Ben 10\Season 1\02 Washington B.C. (1080p HD)_track3.media" >
<NHNTSample DTS="0" dataLength="12" isRAP="yes" />
<NHNTSample DTS="67067" dataLength="28" isRAP="yes" />
<NHNTSample DTS="109109" dataLength="10" isRAP="yes" />
<NHNTSample DTS="391391" dataLength="36" isRAP="yes" />
<NHNTSample DTS="462462" dataLength="10" isRAP="yes" />
<NHNTSample DTS="491491" dataLength="42" isRAP="yes" />
<NHNTSample DTS="556556" dataLength="10" isRAP="yes" />
<NHNTSample DTS="923923" dataLength="32" isRAP="yes" />
<NHNTSample DTS="991991" dataLength="50" isRAP="yes" />
<NHNTSample DTS="1093092" dataLength="10" isRAP="yes" />
<NHNTSample DTS="1292291" dataLength="38" isRAP="yes" />
<NHNTSample DTS="1382381" dataLength="36" isRAP="yes" />
<NHNTSample DTS="1398397" dataLength="74" isRAP="yes" />
<NHNTSample DTS="1463462" dataLength="70" isRAP="yes" />

According to the NHML file the 14th frame starts at 1463462. 1463462 * (1/30000) = 48.782066666666666666666666666667. Which is 00:00:48,782. Which is when the FRAME starts. There's no reason the subtitle ending in "No Way" should start there that I can see, unless there's something seriously wrong with MP4Box.

<NHNTSample DTS="1567566" dataLength="72" isRAP="yes" />

1567566 * (1/30000) = 52.2522 = 00:00:52,252, which is when the very next frame starts. The previous frame is 46.613233333333333333333333333333 (00:00:46,613).
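The arithmetic above can be reproduced with a short sketch. It converts NHML DTS ticks to SRT timestamps and then shifts by one 1001-tick frame per byte pair preceding the End of Caption. The pair indices (30 for frame 14, 20 for frame 9) were counted by hand from the hex dumps above, and the function names are illustrative.

```python
TIMESCALE = 30000      # timeScale attribute of the NHML stream
TICKS_PER_PAIR = 1001  # one NTSC frame (1001/30000 s) per byte pair

def ticks_to_srt(ticks, timescale=TIMESCALE):
    """Format a tick count as an SRT timestamp, truncating to whole ms."""
    ms = ticks * 1000 // timescale
    h, rem = divmod(ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def eoc_ticks(dts, eoc_pair_index):
    """Tick at which a pair at the given 0-based index would take effect."""
    return dts + eoc_pair_index * TICKS_PER_PAIR

print(ticks_to_srt(1463462))                 # start of frame 14
print(ticks_to_srt(eoc_ticks(1463462, 30)))  # its End of Caption (31st pair)
print(ticks_to_srt(eoc_ticks(991991, 20)))   # End of Caption of frame 9 (21st pair)
```

Under these assumptions frame 14's End of Caption lands at 00:00:49,783, within a millisecond of where the iTunesesque output ends subtitle 8 (00:00:49,782), and frame 9's lands at 00:00:33,733, the iTunesesque start time for subtitle 5.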

I think I've made my point. There's not much more I can do other than ranting.

Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

Well, that's difficult because they're iTunes Files... I prepared the Emperor's New Groove file by completely re-encoding it and doing everything I could to remove anything that might identify me.

I don't think I'm set up to do that at the moment and have no idea how long it will take to figure it out, or even if it's a good idea to try.

Author
Owner

@xvkdev commented on GitHub (Dec 10, 2016):

There's plenty of useful info here to work with. Thanks :-)


Author
Owner

@ndjamena commented on GitHub (Dec 10, 2016):

http://www.mediafire.com/file/ycvc0h8lhxu05kw/Break+Final+Cut.exe

That was my program. Extract the track as NHML using MP4Box, then drag and drop the NHML file onto the EXE and it will make an SRT file. I wouldn't trust its output absolutely (it was never actually finished), but it does seem to get the actual timecodes close to perfect. Norton decided to delete it on my computer, so...
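For anyone who wants to check the frame start times by hand, here is a rough Python sketch that reads the DTS values out of an NHML file and turns them into SRT-style timecodes. It assumes the NHML layout quoted earlier in this thread (a `timeScale` attribute on the root `NHNTStream` element and `NHNTSample` elements carrying a `DTS` attribute). The function names are made up for illustration, the embedded NHML is a trimmed excerpt, and this is not the code from the EXE above.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction


def srt_timecode(seconds):
    """Format a number of seconds as HH:MM:SS,mmm (truncated to milliseconds)."""
    total_ms = int(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def sample_start_times(nhml_text):
    """Return the start time of every NHNTSample, in seconds, as exact fractions."""
    root = ET.fromstring(nhml_text)
    time_scale = int(root.get("timeScale"))
    return [
        Fraction(int(sample.get("DTS")), time_scale)
        for sample in root.iter("NHNTSample")
    ]


# Trimmed excerpt of the NHML quoted in this thread (two samples only).
NHML = """<NHNTStream version="1.0" timeScale="30000" mediaType="clcp" mediaSubType="c608" trackID="3">
<NHNTSample DTS="1463462" dataLength="70" isRAP="yes" />
<NHNTSample DTS="1567566" dataLength="72" isRAP="yes" />
</NHNTStream>"""

for start in sample_start_times(NHML):
    print(srt_timecode(start))
```

These are only the times at which each MP4 sample starts. They say nothing about where inside the sample the End Of Caption sits.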

Author
Owner

@cfsmp3 commented on GitHub (Dec 13, 2016):

@ndjamena Fixed, I think. Timing in MP4 now takes into account the number of CC pairs since the last time the PTS was set. It looks good for the file I have. Can you confirm?
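To illustrate the idea (a Python sketch only, not the actual CCExtractor change): if each byte pair after the sample's DTS is spaced 1001/30000 of a second apart, as described at the top of this issue, then the time of the Nth pair is the sample start plus N of those intervals. `pair_time` and its parameters are hypothetical names. The DTS and timescale come from the NHML excerpt quoted earlier, and the pair index 30 assumes the End Of Caption is the 31st pair in that sample, as the hex dump suggests.

```python
from fractions import Fraction

# One byte pair per NTSC frame: 1 / (30 / 1.001) seconds.
CC_FRAME_INTERVAL = Fraction(1001, 30000)


def pair_time(dts, time_scale, pair_index):
    """Time in seconds of the byte pair at pair_index (0-based) within a sample.

    dts and time_scale are the sample's decode timestamp and the track
    timescale; pair_index counts byte pairs since that timestamp.
    """
    return Fraction(dts, time_scale) + pair_index * CC_FRAME_INTERVAL


# Sample 14 from the NHML in this thread: DTS=1463462, timeScale=30000.
start = pair_time(1463462, 30000, 0)   # first byte pair, at the sample's DTS
eoc = pair_time(1463462, 30000, 30)    # assumed position of the End Of Caption

print(float(start), float(eoc), float(eoc - start))
```

Exact fractions are used so that rounding happens only once, when the timecode is finally formatted.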

Author
Owner

@ndjamena commented on GitHub (Dec 14, 2016):

Comparing the output of the new executable with that of my program, the two are exactly the same apart from three things: the weird character CCExtractor uses in place of '♪', a bug in my program that makes it not output characters that share a byte pair with a null, and a few milliseconds of rounding/formatting differences in the timecodes.

I can't tell you if it's correct, but I can say the bug fixes make CCExtractor's output very similar to the output of the program I wrote, and the program I wrote gave output very similar to what iTunes displayed.

That's about all I can give.

Reference: starred/ccextractor#67