mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-15 05:26:07 +00:00
Humax HDR Fox-T2 UK DVB-T2 can't extract subs #181
Originally created by @Vindictor on GitHub (Sep 15, 2016).
Originally assigned to: @canihavesomecoffee, @Abhinav95 on GitHub.
As per #388
I've been using this PVR for many years, but recently wanted to begin extracting .srt subtitles and then remux the file to make subs more compatible with different players.
I've tried a few different files, but they all come back with the same error.
I just had a look in my local archive and the smallest file I have right now is 2 GB.
I'll upload it to Google Drive, but if it's too large for you to deal with, I'll record something shorter specially and replace it later.
https://drive.google.com/file/d/0B0DIrRkpdn12Y2hOc25rTzBTRG8/view?usp=sharing
This is the output I see from CCExtractor for this untouched recording.
CCExtractor 0.82, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: \Mcp\mcp 2\Media\TV Recorded\Horizon_ Are Video Games Really That____20150916_2000.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: \Mcp\mcp 2\Media\TV Recorded\Horizon_ Are Video Games Really That____20150916_2000.ts
Detected MP4 box with name: free
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Premature end of file (incomplete TS packer header, expected 4 bytes, got 3).
ATTENTION!!!!!!
In switch_to_next_file(): Processing of \Mcp\mcp 2\Media\TV Recorded\Horizon_ Are Video Games Really That____20150916_2000.ts 0 ended prematurely 3 < 2258808832, please send bug report.
Done, processing time = 0 seconds
This is beta software. Report issues to carlos at ccextractor org...
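The "incomplete TS packer header" message together with the "Detected MP4 box" line suggests the autodetection did not find the packet layout it expected. A minimal sketch for inspecting such a file by hand follows. It relies only on the fact that every transport stream packet begins with the sync byte 0x47. The candidate packet sizes (188, 192, 204) and the helper names `guess_ts_layout` and `probe_file` are assumptions made for illustration; nothing here is taken from CCExtractor's own code, and the thread does not confirm which layout the Humax uses.

```python
SYNC_BYTE = 0x47  # every MPEG transport stream packet starts with this byte


def guess_ts_layout(data, sizes=(188, 192, 204), min_packets=5):
    """Return (packet_size, offset) for the first layout whose sync bytes
    line up for min_packets consecutive packets, or None if none does."""
    for size in sizes:
        for offset in range(size):
            positions = range(offset, offset + size * min_packets, size)
            if positions[-1] >= len(data):
                continue  # not enough data to check this layout
            if all(data[p] == SYNC_BYTE for p in positions):
                return size, offset
    return None


def probe_file(path, nbytes=65536):
    """Hypothetical helper: read the start of a recording and guess its layout."""
    with open(path, "rb") as handle:
        return guess_ts_layout(handle.read(nbytes))
```

Calling `probe_file` on a recording returns a pair such as `(188, 0)` for a plain transport stream. Any other packet size or a non-zero offset would mean extra bytes surround each packet, which could be one reason a detector gives up early.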
@cfsmp3 commented on GitHub (Sep 15, 2016):
Fun sample.
@Vindictor commented on GitHub (Sep 15, 2016):
I do apologise. The "sample" is rather on the long side.
If you'd rather I record a random 5 min segment and upload that, I will.
Just to be clear, this is in no way related to my previous VideoRedo issues post.
This post is about raw recordings from my Humax HDR Fox-t2.
The videoredo output was originally from my dvb-s tvheadend setup.
Two separate issues.
Thanks.
@cfsmp3 commented on GitHub (Sep 15, 2016):
OK then I'm mixing things up. My comment was about this sample.
We'll go over both things anyway.
Long samples are OK; we can always split them ourselves.
@Vindictor commented on GitHub (Sep 15, 2016):
It's ok. We're on the same page.
When you put "fun sample", I thought you were making a comment about the length... not using "sample" to mean an example. ;)
Then I was trying to avoid confusion by stating that both posts are separate because the recordings are from totally different sources. I didn't want anyone to think that the .ts I'm discussing here came from the same source as the original videoredo recording.
Instead of making things clear, I've made them less so.
I'll shut up and let you guys work your magic. :)
I'd really appreciate being able to extract subs from recordings, made on this PVR.
Thanks.
@cfsmp3 commented on GitHub (Sep 16, 2016):
Fun sample as we'll have fun with it :-)
Feel free to submit more links in both tickets. The more we have the easier
in general it will be for us to figure out what's going on.
@Vindictor commented on GitHub (Sep 16, 2016):
OK, thank you. I will do.
It also just occurred to me that, because I rarely record SD content, I should try recording some from this PVR to see if CCExtractor can get the subs from that, which still uses MPEG2 in this country.
If you guys have downloaded that large sample file then I should probably consider removing it for potential copyright issues! ;)
Thanks again for your help.
Perhaps I'll set a timer to simultaneously record a programme both from a HD and SD channel for analysis.
@Vindictor commented on GitHub (Sep 16, 2016):
OK, so as I type this a timer should just be completing on my Humax HDR Fox-T2.
As I mentioned above, I have recorded the same show on both an SD and HD channel.
I shall retrieve the recordings from the box shortly, upload to Google Drive, and then edit this post and paste the links below. Actually I shall try both in CCExtractor here first, also.
I hope you have fun with those too!
Thanks once again
That's interesting. I hadn't noticed that when I set these two timers, there was already a previous timer set for another HD channel. So this PVR recorded three channels simultaneously! Two HD channels and one SD channel. I thought that was pretty cool. Now transferring the two recordings for you off the PVR while the other HD channel still continues to record. Hopefully the recording will remain watchable ;)
Here are your Google Drive links.
SD
https://drive.google.com/file/d/0B0DIrRkpdn12Q3preXdGUGxjWkU/view?usp=sharing
HD
https://drive.google.com/file/d/0B0DIrRkpdn12TUJnTjlHVUVtTFU/view?usp=sharing
I ran both SD and HD versions of this broadcast through CCExtractor and got the same error as before. I'll post the SD version output below, but I'm sure you'll see for yourselves when you test.
As usual, dvb subs play fine in players such as MPC-HC.
Many thanks again for your help.
CCExtractor 0.82, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: F:\PVR\SD.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: F:\PVR\SD.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Premature end of file (incomplete TS packer header, expected 4 bytes, got 3).
ATTENTION!!!!!!
In switch_to_next_file(): Processing of F:\PVR\SD.ts 0 ended prematurely 3 < 811925504, please send bug report.
Done, processing time = 0 seconds
This is beta software. Report issues to carlos at ccextractor org...
@Vindictor commented on GitHub (Sep 22, 2016):
Hi there,
Does everybody who requires a copy of the original large sample file have a copy?
I'm going to take it down again soon, so hopefully you have it now.
I recorded a documentary last night, which I'd have liked to have extracted the subtitles from.
I could make that available if requested? The original large sample would need to come down, first, though.
This new documentary is recorded on the same PVR and exhibits the same errors as before, so I'm not sure if any further examples of similarly recorded .ts files would help?
@Abhinav95 commented on GitHub (Sep 22, 2016):
Hi @Vindictor
You can go ahead and delete the large sample file. I have downloaded a copy.
It would be nice if you could upload a smaller file for testing as well.
@Vindictor commented on GitHub (Oct 12, 2016):
Hi there @Abhinav95
I'm very sorry for the long delay. I never received a notification, so I shall have to check my spam folder settings.
Hopefully you also found the two shorter samples above, where I recorded the same show at the same time on both SD and HD channels to try with CCExtractor? (they both failed)
Today I also have a new question.
Does CCExtractor work at all with UK DVB-T / Freeview dvbsub streams?
The reason I ask is this:
For quite a while I've used a Linux TVHeadend system for my satellite recording duties.
CCExtractor can read TVHeadend DVB-S .ts recordings without issue, even after editing.
Because CCExtractor cannot extract subs from DVB-T recordings on my Humax box as described above, today I decided to build a new system here, connecting a DVB-T2 tuner to another TVHeadend system.
I've recorded a couple of test recordings, BUT, CCExtractor is unable to extract subtitles from these .ts files either.
So now I've tried two completely separate systems. A standalone Humax PVR, and a Linux based TVHeadend install, and CCExtractor fails to extract subtitles from either stream.
However, using TVHeadend for DVB-S recording works perfectly. So I hope you can understand why I wonder if CCExtractor can extract from UK DVB-T recordings at all.
I will upload a sample shortly. I'm currently recording another programme to test. I shall delete the large sample from above, and upload a new sample from TVHeadend DVB-T.
Many thanks for your help.
I actually built the DVB-T TVHeadend system today for CCExtractor, assuming that CCExtractor would be compatible with the .ts files it outputs, because it works with my DVB-S TVHeadend system.
I hope a solution can be found!
Kind regards, I shall post a sample file shortly.
Thanks.
EDIT: I've just tried my new DVB-T sample recordings in CCExtractor and sure enough, it doesn't work.
I'm uploading the recording to Google Drive to share shortly, here is the log from that file that I get in CCExtractor.
I had also recorded the exact same show, from the same channel, but on the DVB-S system, and the subtitles extracted perfectly.
CCExtractor 0.82, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: F:\PVR\BBC TWO HD-Coast_ The Great Guide.2016-10-12.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: F:\PVR\BBC TWO HD-Coast_ The Great Guide.2016-10-12.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
DVB subtitles detected, OCR subsystem not present. Use -out=spupng for graphic output
Creating F:\PVR\BBC TWO HD-Coast_ The Great Guide.2016-10-12.srt
Changed fps using NAL to: 25.000000
Found large gap(3221670) in PTS! Trying to recover ...
Found large gap(3221662) in PTS! Trying to recover ...
Found large gap(3221658) in PTS! Trying to recover ...
Found large gap(3221656) in PTS! Trying to recover ...
Found large gap(3221660) in PTS! Trying to recover ...
Found large gap(3221666) in PTS! Trying to recover ...
Found large gap(3221664) in PTS! Trying to recover ...
Found large gap(3221668) in PTS! Trying to recover ...
Changed fps using NAL to: 25.000000
[The line "Changed fps using NAL to: 25.000000" repeats 144 more times.]
Number of NAL_type_7: 4652
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 126
Number of num_unexpected_sei_length: 0
Total frames time: 01:13:28:600 (110215 frames at 25.00fps)
Min PTS: 17:53:52:158
Max PTS: 19:07:21:878
Length: 01:13:29:720
Done, processing time = 181 seconds
This is beta software. Report issues to carlos at ccextractor org...
@Vindictor commented on GitHub (Oct 12, 2016):
@Abhinav95 Here is the new TvHeadend DVB-T recording as described in my previous post, with which I am also unable to use CCExtractor.
Thank you for your continued help
https://drive.google.com/file/d/0B0DIrRkpdn12NkdmTW9FR3NrMlk/view?usp=sharing
@Abhinav95 commented on GitHub (Oct 13, 2016):
Hi @Vindictor
With the new recording, I am able to extract the subtitles just fine.
I see that in your log, you have a message that says "DVB subtitles detected, OCR subsystem not present. Use -out=spupng for graphic output". This tells me that you do not have Tesseract OCR properly set up to work with CCExtractor. If you are on Linux, make sure that
`pkg-config --libs tesseract` and `pkg-config --libs lept` run without error on your shell. Also, be sure that you are compiling CCExtractor with `make ENABLE_OCR=yes`, for the OCR subsystem to work. If you are on Windows, with the GUI version, all of this should be taken care of automatically.
For the other three files, the error about ending prematurely persists, and I'll look into it, but the new file works fine. Please let me know if you are able to get it running.
Attached is the complete output I get:-
BBC-greatguide.txt
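For anyone building on Linux, the two pkg-config checks mentioned above can be run in one go. This is only a sketch: the function names `pkg_config_ok` and `missing_ocr_packages` are made up for illustration, and the package names `tesseract` and `lept` are the ones quoted in this comment, which may differ between distributions.

```python
import shutil
import subprocess


def pkg_config_ok(package, run=subprocess.run):
    """Return True if `pkg-config --libs <package>` exits with status 0."""
    # When using the real runner, make sure pkg-config itself is installed.
    if run is subprocess.run and shutil.which("pkg-config") is None:
        return False
    result = run(["pkg-config", "--libs", package],
                 capture_output=True, text=True)
    return result.returncode == 0


def missing_ocr_packages(packages=("tesseract", "lept"), run=subprocess.run):
    """Return the packages that pkg-config could not resolve."""
    return [name for name in packages if not pkg_config_ok(name, run)]


if __name__ == "__main__":
    missing = missing_ocr_packages()
    if missing:
        print("pkg-config cannot find:", ", ".join(missing))
    else:
        print("Both OCR libraries are visible to pkg-config.")
```

The `run` parameter exists only so the check can be exercised without pkg-config installed. An empty list means both lookups succeeded, which is the precondition described above for building with OCR enabled.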
@Vindictor commented on GitHub (Oct 13, 2016):
Hi there @Abhinav95
Thanks for that.
Yes, I'm running the Windows GUI version with auto / defaults on.
I shall attempt to find the tesseract OCR option in the menu and enable it.
As I said before, though, this does work with the exact same programmes recorded from my DVB-S tuner, which uses DVBsubs, and it correctly converts the DVBsubs to .srt without issue.
I set an identical timer for the same channel on my DVB-S TvHeadend setup, and the subs extracted immediately.
Well, I can't get it to work!
I even downloaded the file again from the link I provided to see if it got altered in any way by google drive.
I get the same error as I originally posted.
This is using the Windows 0.82 binaries. Maybe an older version would work better? Are you using 0.82?
I do see a folder called tessdata in my CCExtractor folder, and as I said, this works fine with dvb-s recorded .ts files.
That's really weird how it works for you.
Just out of interest, in the GUI I selected to output SPUPNG.
This appeared to work. It created a subfolder and filled it up with .png files, and made an .xml file.
I tried a few different text outputs, but it won't decode the text.
It did still give an error message while outputting spupng, which I shall post below.
So like I said, I'm running the Windows 0.82 binaries version, not Linux and I didn't compile it myself. Why do you think I can't decode the subs?
Thanks again for the assist.
Here is the log when outputting as spupng. It does appear to get to 100% and work correctly.
CCExtractor 0.82, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: G:\Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .xml] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: G:\Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Creating G:\Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.xml
dvbsub_decode: incomplete, broken or empty packet
Number of NAL_type_7: 0
Number of VCL_HRD: 0
Number of NAL HRD: 0
Number of jump-in-frames: 0
Number of num_unexpected_sei_length: 0
Min PTS: 17:53:56:349
Max PTS: 19:07:23:259
Length: 01:13:26:910
Done, processing time = 10 seconds
This is beta software. Report issues to carlos at ccextractor org...
@Vindictor commented on GitHub (Oct 13, 2016):
Hi there again @Abhinav95
I already have CCExtractor 0.82 Windows GUI installed on another PC here, so I just booted that and tried the same file from Google Drive, and saw the exact same error messages.
I was also trying some different input options to see if it would help.
I thought perhaps I'd try the large gops option, but I just get an Error: Parameter -largegops not understood.
Perhaps you could tell me which options are being detected by your CCExtractor so I can try them on the file here?
Thank you.
EDIT: The command line being used by my GUI version is
ccextractorwin.exe --gui_mode_reports -autoprogram -out=srt -bom -latin1 [+input files]
@Vindictor commented on GitHub (Oct 13, 2016):
OK, I downloaded CCExtractor 0.81 to try that instead.
I gave it the same file.
It still didn't work, but instead of a long log file about changing the FPS to 25, I just got this, and an empty .srt file
CCExtractor 0.81, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
Input: G:\K-Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: G:\K-Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.ts
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Creating G:\K-Temp\BBC.TWO.HD-Coast_The.Great.Guide.2016-10-12-TvHeadend.DVB-T.srt
That's it, then it stops!
Perhaps I'll now try CCExtractor 0.80 instead.....
same output as 0.81.
@Vindictor commented on GitHub (Oct 13, 2016):
Sorry to keep posting, I just realised something.
I kept the DVB-S TvHeadend recording of the same programme.
I loaded this into CCExtractor and of course, it works.
Then it occurred to me, perhaps the DVB-S stream also carries teletext subtitles.
IT DOES!
That's why it works.
In the decoders menu I selected "prefer DVB over teletext subtitles if both are present" and I got exactly the same "Changed fps using NAL to: 25.000000" error messages as above with the DVB-T recording.
So that explains why the DVB-S recording works.
As you say, it appears for some reason that the OCR decoder isn't working correctly for me.
All I've done is download the CCExtractor zip file, unpack it to a folder and run from there.
As I mentioned previously, I do see a tessdata subfolder in my CCExtractor folder.
It would appear I'm missing the OCR subsystem that you mentioned for the Linux version.
@Abhinav95 commented on GitHub (Oct 13, 2016):
I just tried downloading the 0.82 Windows binaries, and I get the same error that you do, not just for this but other DVB samples too. There seems to be an issue with the OCR library on the Windows platform, but it works fine on Linux. We'll try and fix this asap.
@cfsmp3 commented on GitHub (Oct 13, 2016):
Hopefully fixed. @Vindictor try overwriting your ccextractorwin.exe with this one.
ccextractorwin.0.83.alpha1-windows.binary.zip
@Vindictor commented on GitHub (Oct 14, 2016):
@cfsmp3 Thanks for that
I excitedly tried it this morning, however I was presented with two error popups.
I loaded the .ts file as provided to @Abhinav95 above, here
https://drive.google.com/file/d/0B0DIrRkpdn12NkdmTW9FR3NrMlk/view?usp=sharing
I loaded the file into CCExtractorGUI, but after clicking on start I got....
Firstly this

and then after pressing OK
Then CCExtractor closed and I was back at the desktop.
EDIT: I looked back into my CCExtractor 0.82 subfolder and noticed there is a libtesseract304.dll file.
I've renamed it to libtesseract304d.dll to match the error message.
I also have a file in the folder called msvcr120.dll, so I shall rename that to msvcr120D.dll
OK, after having renamed those files, when I now launch CCExtractor, load the .ts file and press start, I see this new error
Also, if I just run the file ccextractorwin.exe I immediately get this error message. I even tried to run as admin.
Do you have a copy of the proper libtesseract304d.dll and msvcr120D.dll files with which to overwrite my renamed copies?
@cfsmp3 commented on GitHub (Oct 14, 2016):
OK, let's give it another go. This .zip includes the external DLLs for the OCR system.
The msvcr... is a microsoft library which you can download from here: https://www.microsoft.com/en-us/download/details.aspx?id=40784
ccextractorwin.0.83.alpha2-windows.binary.zip
@Vindictor commented on GitHub (Oct 15, 2016):
@cfsmp3 Thanks for those.
I downloaded the microsoft link, but it's already installed.
I suppose I could try a "repair"
Also, the file that you provided in the zip is still called libtesseract304.dll, whereas the error message was complaining that it couldn't find libtesseract304d.dll, which is what I thought was the issue.
(If you look at the filename in the error message screen caps above you'll see there's an extra "d" in the filename, 304d.dll not 304.dll.)
It's the same with msvcr120.dll. The error message I'm getting is that it cannot find msvcr120D.dll.
I have msvcr120.dll in the folder, but the error is for msvcr120D.dll.
I did try renaming the msvcr120.dll to msvcr120d.dll but then I got the other error above "The procedure entry point...."
@cfsmp3 commented on GitHub (Oct 15, 2016):
Make sure you use the new ccextractorwin.exe (in the .zip file) as well.
@Vindictor commented on GitHub (Oct 16, 2016):
@cfsmp3
Hi there,
Yes, I am using the new ccextractorwin.exe (0.83 alpha 2) in the zip file.
In fact, I just created a new test folder, unzipped the contents of the entire "alpha2" archive directly into this folder, ran ccextractorwin.exe and got the cannot find msvcr120d.dll error.
So
(1) I copied the 0.82 archive msvcr120.dll file into the new folder and ran it. As you'd expect, I got the usual cannot find msvcr120d.dll error
(2) I searched my entire drive C for a file called "msvcr120d.dll". I found one in my Nvidia installer folder (where the Nvidia driver setup program unzips its contents before running). I copied this to my new folder, I ran ccextractorwin.exe again, and this time got
The version of the msvcr120d.dll file in the Nvidia folder is

(3) Just in case it was an issue with running ccextractorwin.exe directly, I copied the CCExtractorGUI.exe from the 0.82 archive, along with the tessdata folder.
(4) I tried to run CCExtractorGUI.exe and gave it the .ts file to extract, but as soon as I pressed start, I got the error message as shown in point 2.
(5) Just out of interest, while writing this message I went back to this new folder, renamed the old msvcr120.dll to msvcr120d.dll to see what would happen with the new alpha, but I got the "Entry point not found" error that I got previously.
I had a look in my Windows\SysWOW64 folder to see if there was an msvcr120d.dll file in there, but I could only find an msvcr120.dll file, along with one called msvcr120_clr0400.dll.
EDIT
I FIXED IT!
OK, after a bit more searching around, I ended up at this website
https://www.dll-files.com/msvcr120d.dll.html
You'll notice there are a few different versions of the file available. Both a 32bit and 64bit version that matches the version number of my Nvidia dll. Then there's a newer version, 32bit only.
I downloaded this 12.0.20827.3 32bit version, copied the file to my new 0.83alpha 2 folder, ran it.... and it worked!
I'm not sure if my currently installed dll files are 64bit, and CCExtractor is 32bit and this caused the issue, but whatever.... this combination of your 0.83Alpha 2, with 0.82 GUI+ tessdata folder + msvcr120d.dll 12.0.20827.3 32bit seems to work!
This is the version of the 32bit dll that works

EDIT: I have made a zip file of the folder with the working combination of files. I was going to upload them here, but I'm not sure if you'd want me to because of the msvcr120d.dll? The 0.82 binaries archive does come with the older dll, but I'll leave it for now.
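On the 32-bit versus 64-bit question: a DLL's bitness can be read from its PE header without loading it. The sketch below uses only documented PE layout facts (the `MZ` magic, the PE header offset stored at byte 0x3C, and the Machine field that follows the `PE\0\0` signature). The function names and the two-entry lookup table are choices made for this illustration, so other machine types are simply returned as hex.

```python
import struct

# Machine field values from the PE/COFF format.
MACHINE_NAMES = {0x014C: "32-bit (x86)", 0x8664: "64-bit (x64)"}


def pe_machine(data):
    """Return a readable machine type for the bytes of a PE file (.exe or .dll)."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not a PE file (missing MZ header)")
    # The 4-byte little-endian offset of the PE header is stored at 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # The 2-byte Machine field follows the signature directly.
    (machine,) = struct.unpack_from("<H", data, pe_offset + 4)
    return MACHINE_NAMES.get(machine, hex(machine))


def pe_machine_of_file(path):
    """Read the first 4 KiB of a file, enough for typical headers, and classify it."""
    with open(path, "rb") as handle:
        return pe_machine(handle.read(4096))
```

Running `pe_machine_of_file` on both ccextractorwin.exe and a candidate msvcr120d.dll shows whether the two match. A 32-bit executable cannot load a 64-bit DLL, so a mismatch there would fit the behaviour described above, though the thread does not confirm that this was the cause.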
@Vindictor commented on GitHub (Oct 16, 2016):
OK, the original reason for this thread was that CCExtractor cannot read .ts files recorded on my Humax HDR-Fox T2.
Obviously it wouldn't have worked anyway, without the OCR part working.
I thought I'd try the "large sample" Humax HDR-Fox T2 .ts file from the top of this thread in this new CCExtractor 0.83Alpha 2.
IT NEARLY WORKS!!
Firstly, I got an error (different to the previous error with Humax files)
Opening file: \Mcp\mcp 2\Media\TV Recorded\Horizon_ Are Video Games Really That____20150916_2000.ts
Detected MP4 box with name: free
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Premature end of file (incomplete TS packer header, expected 4 bytes, got 3).
ATTENTION!!!!!!
In switch_to_next_file(): Processing of \Mcp\mcp 2\Media\TV Recorded\Horizon_ Are Video Games Really That____20150916_2000.ts 0 ended prematurely 3 < 2258808832, please send bug report.
Done, processing time = 0 seconds
This is beta software. Report issues to carlos at ccextractor org...
I noticed that it mentioned detecting an .mp4 box, so I went back to the CCExtractorGUI interface and specified the input type as .ts
I ran the file again.
The log spewed out a load of errors, too many to paste here, so I'll attach it as a .txt file. But it did generate an .srt file!
The OCR has worked, the subtitles are there, the timing information is simply off.
This is still a big step forwards... at least it decoded the text.
I'll have another go later and see if I can manually specify a framerate, or try some other settings to see if I can get the subtitles in sync.
Please find attached the log file from forcing input as .ts
I can also attach the generated .srt file.
Horizon_ Are Video Games Really That____20150916_2000.zip
and by the way, THANK YOU guys VERY MUCH for your help with this. I really appreciate it.
CCExtractor 0.83Alpha 2 Humax log.txt
@Vindictor commented on GitHub (Oct 16, 2016):
OK, a further update.
I tried remuxing a couple of Humax recorded .ts files, into new .ts files.
I then gave these newly remuxed .ts files to CCExtractor and it processed them without any errors at all; however, the timing info is still wrong.
What appears to be happening is that the subtitles in the .srt file start at 00:00:00, whereas the recording is still showing the channel ident at that point.
On the recording I just tried, the actual talking begins approx 6 seconds in, after a short intro, but the .srt file starts the subs right at the beginning. It looks like the relative timing is correct; it's just that all the subs are 6 seconds early.
I'll try again with the "big sample" from above, as you already have that one.
I'll remux it with my video editor, then run it through CCExtractor.
That's interesting, the subs are, once again, almost exactly 6 seconds early.
About 6 seconds into the programme the narrator says "But few predicted they heralded a revolution in entertainment"; however, in the .srt file I see:
00:00:00,001 --> 00:00:02,079
But few predicted they heralded
a revolution in entertainment.
So that's two for two, both running 6 seconds early.
If I use an .srt editor and set it to run all subs 6 seconds late I may end up with something usable here.
YES!
I used "Subtitle Adjuster" to delay the subtitles by 6 seconds, and they remained in sync for the rest of the programme.
So at least I can now generate usable subs using your software.
I just need to remux the file first (no big deal, I generally edit them anyway) and then manually delay the subs by 6 to 6.5 seconds. I'll experiment more with time.
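For anyone who would rather script the adjustment than use a GUI tool, a constant delay like this can be applied to every timestamp in the .srt file. This is only a minimal sketch: the function names `shift_timestamp` and `shift_srt` are made up for illustration, and it assumes the standard `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing lines shown above.

```python
import re

# Matches an SRT timestamp such as 00:00:02,079
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def shift_timestamp(match, offset_ms):
    """Return the matched timestamp moved by offset_ms, clamped at zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift every timing line in SRT text; other lines are left untouched."""
    out = []
    for line in text.splitlines(keepends=True):
        if "-->" in line:
            line = TIMESTAMP.sub(lambda mt: shift_timestamp(mt, offset_ms), line)
        out.append(line)
    return "".join(out)
```

With an offset of 6000 ms, the first cue above would move from `00:00:00,001 --> 00:00:02,079` to `00:00:06,001 --> 00:00:08,079`. The log reports the output encoding as Latin-1, so the file would presumably need to be read and written with that encoding.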
@Vindictor commented on GitHub (Oct 19, 2016):
@cfsmp3 OK, as requested, another sample file from my Humax HDR Fox T2.
Same error as previous samples, SD or HD. I also tried this in the 0.83 alpha 2.
I hope this helps.
Thanks for all your help.
https://drive.google.com/open?id=0B0DIrRkpdn12WDRzY3JsS2piTHM
Input: F:\PVR\Paxman on Trump v Clinton_ Divided____20161017_2100[Humax_HDR_Fox-T2].ts
[Extract: 1] [Stream mode: Autodetect]
[Program : Auto ] [Hauppage mode: No] [Use MythTV code: Auto]
[Timing mode: Auto] [Debug: No] [Buffer input: Yes]
[Use pic_order_cnt_lsb for H.264: No] [Print CC decoder traces: No]
[Target format: .srt] [Encoding: Latin-1] [Delay: 0] [Trim lines: No]
[Add font color data: Yes] [Add font typesetting: Yes]
[Convert case: No] [Video-edit join: No]
[Extraction start time: not set (from start)]
[Extraction end time: not set (to end)]
[Live stream: No] [Clock frequency: 90000]
[Teletext page: Autodetect]
[Start credits text: None]
Opening file: F:\PVR\Paxman on Trump v Clinton_ Divided____20161017_2100[Humax_HDR_Fox-T2].ts
Detected MP4 box with name: free
File seems to be a transport stream, enabling TS mode
Analyzing data in general mode
Premature end of file (incomplete TS packer header, expected 4 bytes, got 3).
ATTENTION!!!!!!
In switch_to_next_file(): Processing of F:\PVR\Paxman on Trump v Clinton_ Divided____20161017_2100[Humax_HDR_Fox-T2].ts 0 ended prematurely 3 < 2308812800, please send bug report.
Done, processing time = 0 seconds
This is beta software. Report issues to carlos at ccextractor org...
EDIT:
But as before, if I remux that raw recording from my Humax HDR Fox-T2 and then load the new .ts file into CCExtractor 0.83 alpha 2, it does extract subs, using OCR on the DVB-Subs. :)
Wow, that's interesting. The subtitles are updated one word at a time. That's not an issue with CCExtractor; the DVBSUBS themselves are the same. Anyway, it worked.
The subtitles do begin at 00:00 again, however, because of the different "per word" timing formatting in this file, the first word remains on the screen from the start of the recording until the first words are spoken (about 21s), then it clears and goes pretty much in time. Cool. I can remove that with Subtitle Adjuster anyway.
So this isn't a normal example of how DVBSUBS are broadcast here (I didn't realise until after I'd uploaded the file), but CCExtractor did manage to decode them, AFTER I'd had to remux the file.
Having to remux isn't such a big deal, though :)
Please find attached my resulting .srt file for this recording.
Paxman on Trump v Clinton_ Divided____20161017_2100[Humax_HDR_Fox-T2].zip
@cfsmp3 commented on GitHub (Oct 19, 2016):
ccextractorwin.0.83.alpha3-windows.binary.zip
That file should solve the problem processing the file as is.
If it works for you please close this ticket. Of course the timing issue remains but that should go into a different ticket (there's already one about timing though so check it out and feel free to comment on it, add a link to this or other files, etc).
@Vindictor commented on GitHub (Oct 20, 2016):
I've just tried the new binary on the Paxman on Trump file above.
No more error! It's running the OCR now while I type this.
It hasn't finished, but the fact it's running at all is enough.
So I'd like to thank you again for your time and patience.
I've really appreciated it.
Keep up the great work.