DVB: Incorrect timing #172

claunia · 2026-01-29T16:36:53Z

claunia commented

2026-01-29 16:36:53 +00:00

Originally created by @cfsmp3 on GitHub (Jul 5, 2016).

Originally assigned to: @anshul1912, @Abhinav95 on GitHub.

Using the usual "01-BBC1.London.News.ts" and comparing with playback via VLC (which is perfect), I can see that our timing is off and we have a number of issues.

First, in .srt
1
00:00:00,001 --> 00:00:00,000
Where am I? Where am I?!
Where am I?!

The first problem is obvious: the end time is before the start time for that line.
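
A quick way to catch this kind of inverted cue in any generated .srt is to parse each timing line and compare the two timestamps. The sketch below is only an illustration in Python, not CCExtractor code; the helper names `parse_timing` and `find_inverted_cues` are made up for this example, and the sample text reuses cues quoted in this report.

```python
import re

# Matches one SRT timing line, e.g. "00:08:40,307 --> 00:08:42,976".
TIMING_RE = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3})\s*-->\s*(\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def parse_timing(line):
    """Return (start_ms, end_ms) for an SRT timing line, or None otherwise."""
    match = TIMING_RE.fullmatch(line.strip())
    if match is None:
        return None
    h1, m1, s1, ms1, h2, m2, s2, ms2 = (int(g) for g in match.groups())
    start = ((h1 * 60 + m1) * 60 + s1) * 1000 + ms1
    end = ((h2 * 60 + m2) * 60 + s2) * 1000 + ms2
    return start, end


def find_inverted_cues(srt_text):
    """Return the timing lines whose end time is earlier than the start time."""
    bad = []
    for line in srt_text.splitlines():
        timing = parse_timing(line)
        if timing is not None and timing[1] < timing[0]:
            bad.append(line.strip())
    return bad


# Two cues copied from this report: the first is inverted, the second is fine.
sample = """1
00:00:00,001 --> 00:00:00,000
Where am I? Where am I?!
Where am I?!

334
00:16:49,197 --> 00:16:51,966
Jane had squeezed my hand.
Yeah. . .was it just that?
"""

print(find_inverted_cues(sample))
```

A check like this only detects the symptom; it says nothing about why the end time was written as zero.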

In spupng that line looks like this:

<spu start="0.001" end="2.720" image="C:\DeletableTempStuff\UK Samples\Official Repository\01-BBC1.EastEnders.d/sub0001.png" xoffset="0" yoffset="454">
</spu>

The other thing is that the "Where am I..." text appears in VLC (in sync with audio, so they have it right) at around 00:04:xx. It doesn't appear immediately, as it does in both our .srt and .xml.

Moving forward, picking a random line:
"Is this all a joke to you?"

Appears in VLC at 08:48, but we have it at
00:08:40,307 --> 00:08:42,976
Is this all a joke to you?

The final subtitles in VLC are
Jane had squeezed my hand.
Yeah. . .was it just that?

which appear at 16:57. We have this at the end:

334
00:16:49,197 --> 00:16:51,966
Jane had squeezed my hand.
Yeah. . .was it just that?

335
00:16:51,967 --> 00:16:55,346
Or was it to do
with the sentencing as well?

The final frame (335) doesn't appear with VLC at all, and that's correct as the audio isn't there.
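
To put numbers on the mismatch, the two mid-file observations above can be turned into offsets. This is just worked arithmetic in Python on the figures quoted in the report; the VLC times were read off the player and are only accurate to the second, and `to_ms` and `offsets_ms` are hypothetical helper names.

```python
def to_ms(hours, minutes, seconds, millis=0):
    """Convert a clock time to milliseconds."""
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


# (text, time seen in VLC, start time in our .srt), both taken from the report.
observations = [
    ("Is this all a joke to you?", to_ms(0, 8, 48), to_ms(0, 8, 40, 307)),
    ("Jane had squeezed my hand.", to_ms(0, 16, 57), to_ms(0, 16, 49, 197)),
]


def offsets_ms(obs):
    """Return how many milliseconds our start time precedes the VLC time."""
    return [vlc - ours for _, vlc, ours in obs]


for (text, _, _), delta in zip(observations, offsets_ms(observations)):
    print(f"{text!r}: our start is {delta / 1000:.3f} s ahead of VLC")
```

Both lines come out a little under 8 seconds early (7.693 s and 7.803 s). That would be consistent with a roughly constant lead rather than a drift that grows over the file, although two approximate readings are not enough to be sure.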

claunia closed this issue

2026-01-29 16:36:53 +00:00

claunia commented

2026-01-29 16:36:54 +00:00

@cfsmp3 commented on GitHub (Aug 1, 2016):

![bbc2_timing](https://cloud.githubusercontent.com/assets/5949913/17305151/029741ca-57de-11e6-817d-43df25ab6ccd.png)

That should make it obvious. You can see that when VLC shows the subtitles (in perfect sync with audio), CCExtractor is a few subtitle frames ahead.

claunia commented

2026-01-29 16:36:54 +00:00

@Vindictor commented on GitHub (Oct 29, 2016):

The other issue I'm finding when editing files that have run through OCR is that the outputted .srt file doesn't provide any gaps on screen when there's no talking.

What I mean is this.
I've just been editing a documentary series.

Firstly, no matter when the talking actually first occurs, the first line of dialogue is shown right at the start of the .srt.

Also, when the title music plays and there's no talking, instead of showing no subtitle, it shows whatever the next line of dialogue will be throughout the title sequence and right up until the line is actually spoken.

Another example is that in the last episode I looked at, there was a 1 minute section with no talking, while some wildlife was doing its thing.

Once again, after the last line of dialogue was spoken and the correct subtitle displayed, my .srt file then displays what will be the next line of dialogue for the entire 1 minute +, until the line is actually spoken.

The outputted .srt doesn't seem to support any gaps between speaking, and always shows the following line of text on screen.

I've been manually editing the .srt in Notepad++, setting the first line to start with wherever the actual dialogue starts. Then also changing the start time of the first line of dialogue after the title sequence, so that this line of text isn't on screen during the entire title sequence, and then I try to manually scroll through and spot where there seem to be any unnaturally long cases of a line of text remaining on screen.
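
The manual scan for lines that stay up too long could be partly automated. The following Python sketch flags cues that are both unusually long and end right where the next cue starts, which is the pattern described above. The cue data is invented for illustration, and the function name and both thresholds are assumptions rather than anything CCExtractor provides.

```python
def suspicious_cues(cues, max_duration_ms=10_000, max_gap_ms=1):
    """Return indexes of cues that look like they were held until the next line.

    cues is a list of (index, start_ms, end_ms) in display order. A cue is
    flagged when it lasts longer than max_duration_ms and ends within
    max_gap_ms of the next cue's start. The last cue has no successor and is
    never flagged.
    """
    flagged = []
    for current, following in zip(cues, cues[1:]):
        index, start, end = current
        duration = end - start
        gap = following[1] - end
        if duration > max_duration_ms and gap <= max_gap_ms:
            flagged.append(index)
    return flagged


# Hypothetical cues: 1 covers a title sequence, 3 covers a long silent section.
example = [
    (1, 1, 95_000),
    (2, 95_001, 98_200),
    (3, 98_201, 161_000),
    (4, 161_001, 163_500),
]

print(suspicious_cues(example))
```

The flagged indexes still need a human to decide where the line should really end, since the correct end time is not recoverable from the .srt alone.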

EDIT: and if you're interested, here is the file I was referencing above.
https://drive.google.com/file/d/0B0DIrRkpdn12MmZBckw0N3cwNnc/view?usp=sharing

claunia commented

2026-01-29 16:36:54 +00:00

@cfsmp3 commented on GitHub (Nov 9, 2016):

Code-in task created.

claunia commented

2026-01-29 16:36:55 +00:00

@Vindictor commented on GitHub (Nov 27, 2016):

I've just tried extracting DVB subs from another file, but am having similar timing issues. This one from a DVB-S source.
Once again, there is no gap in the subtitles on screen, even if minutes go by with no dialogue. On this file the last line of text remains on screen until somebody else speaks. For example, take note of subtitle 9 from my .srt below:

00:00:00,001 --> 00:00:02,230
This programme contains
some strong language.

2
00:00:02,231 --> 00:00:04,480
Good morning. Please come this way.

3
00:00:04,481 --> 00:00:06,719
Thank you.

4
00:00:06,720 --> 00:00:13,289
Good morning, Mr Sodergren.
How are you?

5
00:00:13,290 --> 00:00:17,029
Your bicycle is ready.

6
00:00:17,030 --> 00:00:18,230
How are you doing today, sir?
Good to see you.

7
00:00:18,231 --> 00:00:21,309
Did you go running this morning?
I woke up very early this morning.

8
00:00:21,310 --> 00:00:27,210
Have a good day, sir.
The same to you, my friend.

9
00:00:27,211 --> 00:02:36,649
I'll be with you in just a moment.

10
00:02:36,650 --> 00:02:38,499
It's quite a short interview,

When using OCR to extract dvb-subs it never seems to allow for gaps between dialogue. One line of text will remain onscreen until the next begins.
The line "I'll be with you in just a moment." remains on screen for approx 2 minutes, rather than for a few seconds.
Currently if I want to extract DVB-Subs I need to go through each file with Notepad++ and look for moments like these, and manually change the end time of a line of dialogue.
I've tried CCExtractor 0.82, and 0.83a3 which was given to me to work with Humax PVR files, but using the 0.82 GUI with the 0.83a3.exe file.
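
Until the timing itself is fixed, the manual end-time edits described above could be approximated by capping how long any cue stays on screen. This is a cosmetic workaround sketched in Python, not a fix: the real end times are not in the .srt, so the 6 second cap is an arbitrary assumption, and `clamp_cues` and `fmt` are hypothetical helpers. The cue times are subtitles 8 to 10 from the listing above.

```python
def fmt(total_ms):
    """Format milliseconds as an SRT timestamp, e.g. 33211 -> 00:00:33,211."""
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def clamp_cues(cues, max_duration_ms):
    """Shorten any cue longer than max_duration_ms; start times are untouched."""
    return [
        (index, start, min(end, start + max_duration_ms))
        for index, start, end in cues
    ]


# Subtitles 8, 9 and 10 from the listing above, as (index, start_ms, end_ms).
cues = [
    (8, 21_310, 27_210),
    (9, 27_211, 156_649),   # 00:00:27,211 --> 00:02:36,649
    (10, 156_650, 158_499),
]

for index, start, end in clamp_cues(cues, 6_000):
    print(index, fmt(start), "-->", fmt(end))
```

With a 6 second cap, only subtitle 9 changes: its end moves from 00:02:36,649 to 00:00:33,211, while subtitles 8 and 10 are already shorter than the cap and stay as they are.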

claunia commented

2026-01-29 16:36:55 +00:00

@cfsmp3 commented on GitHub (Nov 29, 2016):

Direct link to 01-BBC1.London.News.ts:
https://drive.google.com/open?id=0B_61ywKPmI0TN3dERnRVazJIZ3c

claunia commented

2026-01-29 16:36:56 +00:00

@Vindictor commented on GitHub (Nov 30, 2016):

I've just been trying to extract subtitles from another programme, from a different source, and a different channel. (TvHeadend, DVB-T, Ch5HD, UK)

https://drive.google.com/file/d/0B03VrTibH96mQkZaZkNWQmR4XzQ/view?usp=sharing

Always the same timing issue regardless of source or channel, but only with DVB-Subs.
If I make a new recording from DVB-S in the UK, which also comes with teletext subtitles, then CCExtractor does a perfect job, even retaining colour information. I love it.

Sadly, when using DVB-T recordings (which only come with DVB-Subs), or DVB-S recordings that have been remuxed and the teletext lost, I always have these timing issues with DVB-Subs.

Really hoping you clever guys can solve it. I'd just love for the DVB-Subs to be as good as the teletext. The OCR part seems really good.

Thanks, and good night.

claunia commented

2026-01-29 16:36:56 +00:00

@JuanPotato commented on GitHub (Dec 7, 2016):

This seems like a fun task to take on. Once my current one is accepted I plan to work on this.

EDIT: anyone going to try this, it is quite fun

claunia commented

2026-01-29 16:36:56 +00:00

@cfsmp3 commented on GitHub (Dec 7, 2016):

Solve this problem and if you come to California I'll personally invite you
and your family to lunch.

On Tue, Dec 6, 2016 at 4:31 PM, Juan Potato notifications@github.com
wrote:

This seems like a fun task to take on. Once my current one is accepted
I plan to work on this

claunia commented

2026-01-29 16:36:56 +00:00

@JuanPotato commented on GitHub (Dec 11, 2016):

@cfsmp3 Maybe another task then, I don't have enough understanding of the code in the project to know how to really get things together.

@JuanPotato commented on GitHub (Dec 11, 2016): @cfsmp3 Maybe another task then, I don't have enough understanding of the code in the project to know how to really get things together.

claunia commented

2026-01-29 16:36:57 +00:00

@cfsmp3 commented on GitHub (Dec 11, 2016):

That's fine, still plenty of time before code-in is over. Maybe in 3 weeks
or so from now you'll feel ready to give it another go.

On Sun, Dec 11, 2016 at 1:58 PM, Juan Potato notifications@github.com
wrote:

@cfsmp3 Maybe another task then, I don't have
enough understanding of the code in the project to know how to really get
things together.

Reference: starred/ccextractor#172