Freeview UK - no subtitles detected #279

claunia commented

2026-01-29 16:39:45 +00:00

Originally created by @akiller on GitHub (Feb 17, 2017).

Hi all,

I'm trying to extract subtitles from Freeview recordings, but no matter which options I choose it never detects anything. I've tried different recordings and channels, using both your latest release and a build I compiled from the latest code, to no avail.

Here's a sample file (~12MB) recorded on a Hauppauge Nova-T
https://www.dropbox.com/s/8ir9ofo03zir90h/8ooTC.zip?dl=1

And here's a log when I tried to process it:
C:\Temp\ccextractor\ccextractor-master\windows\Release-Full\ccextractorwinfull.exe --gui_mode_reports -haup -autoprogram -out=srt -bom -latin1 [+input files]

CCExtractor 0.85, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc

Input: \serv\videos\TV\8 Out of 10 Cats Does Countdown\8ooTC.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: Yes] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]

Opening file: \serv\videos\TV\8 Out of 10 Cats Does Countdown\8ooTC.ts

File seems to be a transport stream, enabling TS mode

Analyzing data in general mode
Creating \serv\videos\TV\8 Out of 10 Cats Does Countdown\8ooTC.srt

Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0

Min PTS: 00:00:00:246
Max PTS: 00:00:44:219
Length: 00:00:43:973

Done, processing time = 0 seconds

No captions were found in input.
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

Any help would be appreciated - cheers :).

claunia closed this issue

2026-01-29 16:39:46 +00:00

claunia commented

2026-01-29 16:39:46 +00:00

@Izaron commented on GitHub (Feb 19, 2017):

Without -haup I get this output (on Linux, admittedly): https://paste.fedoraproject.org/paste/8PDvHFUkAnO-XiyXPVIuIF5M1UNdIGYhyRLivL9gydE=/raw
Can you uncheck the "Hauppage" switch and try once more?

claunia commented

2026-01-29 16:39:47 +00:00

@akiller commented on GitHub (Feb 19, 2017):

Interesting that it works for you.
I wonder if it's a Windows issue? Without -haup I get the same result, with a 0 KB .srt file:

CCExtractor 0.85, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc

Input: C:\Temp\8ooTC.ts
[Extract: 1] [Stream mode: Transport]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]

Opening file: C:\Temp\8ooTC.ts

Analyzing data in general mode
Creating C:\Temp\8ooTC.srt

Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0

Min PTS: 00:00:00:246
Max PTS: 00:00:44:219
Length: 00:00:43:973

Done, processing time = 1 seconds

No captions were found in input.
Issues? Open a ticket here
https://github.com/CCExtractor/ccextractor/issues

claunia commented

2026-01-29 16:39:47 +00:00

@cfsmp3 commented on GitHub (Feb 20, 2017):

Are you using the full version (which includes the OCR), not the compact one?

claunia commented

2026-01-29 16:39:48 +00:00

@akiller commented on GitHub (Feb 20, 2017):

Hi Carlos,

I was using the full version. I was also using the GUI. When I ran ccextractorwinfull.exe manually I noticed some output which wasn't in the output from the GUI:

Opening file: 8ooTC.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Error opening data file \temp\cce\tessdata/eng.traineddata
Please make sure the TESSDATA_PREFIX environment variable is set to the parent directory of your "tessdata" directory.
Failed loading language 'eng'
Tesseract couldn't load any languages!
Creating 8ooTC.srt

Downloading eng.traineddata from https://github.com/tesseract-ocr/tessdata and putting it in tessdata/eng.traineddata now seems to have solved the problem for both command line and the GUI.
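
For anyone hitting the same error, here is a minimal sketch of a check for the language file before running an extraction. It assumes the convention described in the error message above (TESSDATA_PREFIX names the parent of the "tessdata" directory) and falls back to the current working directory when the variable is unset. The function name `find_traineddata` and its parameters are made up for illustration and are not part of CCExtractor or Tesseract.

```python
import os
from pathlib import Path


def find_traineddata(lang="eng", base_dir=None, env=None):
    """Return (expected_path, exists) for a Tesseract language file.

    Assumption: the file lives at <parent>/tessdata/<lang>.traineddata,
    where <parent> is TESSDATA_PREFIX if set, else base_dir, else the
    current working directory. Other Tesseract builds may resolve the
    location differently.
    """
    env = os.environ if env is None else env
    parent = Path(env.get("TESSDATA_PREFIX") or base_dir or Path.cwd())
    candidate = parent / "tessdata" / f"{lang}.traineddata"
    return candidate, candidate.is_file()


# Example use (prints the path that was checked and whether it exists):
#   path, ok = find_traineddata("eng")
#   print(path, "found" if ok else "MISSING")
```

If the file is reported missing, placing eng.traineddata at the printed path corresponds to the fix described above.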

It may be worth updating the GUI to make this clear?
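
As a rough idea of how a front end could surface this, the sketch below runs a command, captures both stdout and stderr, and returns any lines that mention Tesseract or the language data. This is not how the CCExtractor GUI works; `run_and_scan` and the keyword list are hypothetical, chosen only to match the messages quoted above.

```python
import subprocess


def run_and_scan(cmd, needles=("Tesseract", "TESSDATA_PREFIX", "traineddata")):
    """Run cmd and return output lines that contain any of the needles.

    Both stdout and stderr are captured, so messages that are only
    written to stderr are included in the scan.
    """
    proc = subprocess.run(cmd, capture_output=True, text=True, errors="replace")
    lines = (proc.stdout + "\n" + proc.stderr).splitlines()
    return [line for line in lines if any(n in line for n in needles)]


# Example use, with the executable and file names from this thread:
#   for line in run_and_scan(["ccextractorwinfull.exe", "-out=srt", "8ooTC.ts"]):
#       print("OCR warning:", line)
```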

Thanks for your help.

claunia referenced this issue

2026-01-29 17:17:59 +00:00

[PR #857] [MERGED] [Fix] Fix OCR issue caused by separated dvb subtitle regions #1688
