mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-15 13:35:30 +00:00
Cannot extract DVB subtitles from the TVEHD channel #112
Originally created by @ruralman on GitHub (Feb 10, 2016).
Hello everyone,
Even though MediaInfo reports:
Video:
PID: 0x12D(H.264)
Audio:
PID: 0x12E(spa)[PD]
PID: 0x12F(qaa)[PD]
PID: 0x130(spa)[PD]
Teletext:
Empty
DVB subtitles:
PID: 0x131(spa_0x14_p1_a2 )
I cannot extract them with ccextractor (with Subtitle Edit I can view them).
I am attaching sample files:
https://drive.google.com/file/d/0B_gQLghxiDPxakJOUkNLSkl2Z1U/view?usp=drive_web
https://drive.google.com/file/d/0B_gQLghxiDPxTDRseU5SMUpKVVk/view?usp=drive_web
@anshul1912 commented on GitHub (Feb 14, 2016):
Are you able to see the subtitles with any other video player?
@ruralman commented on GitHub (Feb 29, 2016):
Yes, the subtitles are visible (mplayer), but they cannot be extracted.
@vinayakathavale commented on GitHub (Mar 18, 2016):
@ruralman you need to use -out=spupng as one of the arguments.
Refer to the link for details.
@cfsmp3 commented on GitHub (Jul 5, 2016):
This file: 000prueba brujula.ts
Causes a crash which is definitely related to the OCR (with no tesseract the spupng files are generated just fine, even though of course we don't have the text in the XML file).
With tesseract there's a quick crash. VS doesn't say where though, so I assume it's inside one of the libraries.
If exporting to .srt the generated file before the crash contains garbage:
1
00:00:00,001 --> 00:00:00,000
I-Ian: rnllnhnc Ilniunr-can
I I“, IIIIpI\lII\I§ IpIIIIVGI§\I§
r'7-""*1-~1
Assigning to Anshul (sorry) as it's DVB-OCR related.
@Abhinav95 commented on GitHub (Aug 10, 2016):
The problem with these two files is that the image which we are doing OCR on is not created properly. I'm taking a deeper look into it.
@cfsmp3 commented on GitHub (Nov 7, 2016):
@ruralman do you still have this problem with the current version?
@Izaron commented on GitHub (Jan 2, 2017):
I looked into this issue.
Firstly, we do have subtitles, but the display time starts at 1590 minutes. Something is broken at this point anyway.
Secondly, we get errors because 'start_y' or 'end_y' was uninitialized. I initialized them to 0.
In any case, there were other bugs:
After patching, I got this output:
And the .srt file, hercules.srt, has a start time of 26:30:43.
I would venture to suggest that the error is in the .ts file recording itself.
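As a quick consistency check on the two figures above, 1590 minutes and 26:30:43 describe the same offset. The sketch below is only illustrative: `format_srt_time` and `whole_minutes` are hypothetical helpers, not CCExtractor functions, and they assume a non-negative offset in milliseconds.

```c
#include <stdio.h>

/* Hypothetical helper (not part of CCExtractor): format a non-negative
 * millisecond offset as an SRT-style timestamp, HH:MM:SS,mmm. */
int format_srt_time(long long ms, char *buf, size_t size)
{
    long long hours   = ms / 3600000;
    long long minutes = (ms / 60000) % 60;
    long long seconds = (ms / 1000) % 60;
    long long millis  = ms % 1000;
    return snprintf(buf, size, "%02lld:%02lld:%02lld,%03lld",
                    hours, minutes, seconds, millis);
}

/* Hypothetical helper: the same offset expressed in whole minutes. */
long long whole_minutes(long long ms)
{
    return ms / 60000;
}
```

Passing 95443000 ms (26 h 30 min 43 s) to both helpers should give a timestamp starting with 26:30:43 and a minute count of 1590, which matches the numbers reported in this comment.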
@cfsmp3 commented on GitHub (Jan 20, 2017):
prueba_hercules.ts crashes with current version.
prueba_brujula.ts doesn't crash but output is garbage.
GSoC qualification: This issue gives 3 points.
@ruralman commented on GitHub (Jan 23, 2017):
thanks for your answer
@canihavesomecoffee commented on GitHub (Nov 18, 2017):
@ruralman I modified your title to English so that it's a bit clearer :)
Hope you don't mind.
@harrynull commented on GitHub (Dec 26, 2017):
000prueba hercules.ts works perfectly. Subtitles are extracted correctly.
000prueba brujula.ts's output is quite messed up. The cause is that the subtitle is divided into multiple regions, as shown below:
The reason -spupng works fine is that it merges the regions before saving to file.
My suggestion is that we should merge the regions before passing them to OCR.
Possible flaw: The positioning of the subtitle could be hard.
Possible solution: Only merge the subtitles that are nearby.
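The "only merge nearby regions" idea above could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not CCExtractor's actual code: `struct region_box`, `merge_nearby_regions` and the `max_gap` threshold are all hypothetical names, and "nearby" is assumed to mean a small vertical gap in pixels.

```c
#include <stddef.h>

/* Hypothetical region rectangle; not CCExtractor's actual data structure. */
struct region_box {
    int x, y;   /* top-left corner, in pixels */
    int w, h;   /* width and height, in pixels */
};

static int imin(int a, int b) { return a < b ? a : b; }
static int imax(int a, int b) { return a > b ? a : b; }

/* Vertical gap between two boxes; 0 if they overlap or touch vertically. */
static int vertical_gap(const struct region_box *a, const struct region_box *b)
{
    int a_bottom = a->y + a->h;
    int b_bottom = b->y + b->h;
    if (a_bottom < b->y) return b->y - a_bottom;
    if (b_bottom < a->y) return a->y - b_bottom;
    return 0;
}

/* Merge, in place, boxes whose vertical gap is at most max_gap pixels.
 * Each merge replaces one box with the bounding box of the pair.
 * Returns the number of boxes left at the front of the array. */
size_t merge_nearby_regions(struct region_box *boxes, size_t count, int max_gap)
{
    int merged = 1;
    while (merged) {
        merged = 0;
        for (size_t i = 0; i < count && !merged; i++) {
            for (size_t j = i + 1; j < count; j++) {
                if (vertical_gap(&boxes[i], &boxes[j]) > max_gap)
                    continue;
                /* Grow box i to cover both boxes. */
                int x0 = imin(boxes[i].x, boxes[j].x);
                int y0 = imin(boxes[i].y, boxes[j].y);
                int x1 = imax(boxes[i].x + boxes[i].w, boxes[j].x + boxes[j].w);
                int y1 = imax(boxes[i].y + boxes[i].h, boxes[j].y + boxes[j].h);
                boxes[i].x = x0;
                boxes[i].y = y0;
                boxes[i].w = x1 - x0;
                boxes[i].h = y1 - y0;
                /* Drop box j by moving the last box into its slot. */
                boxes[j] = boxes[count - 1];
                count--;
                merged = 1;
                break;
            }
        }
    }
    return count;
}
```

With this approach, two lines of a subtitle that sit a few pixels apart would be handed to OCR as one image, while a region at the top of the screen would stay separate, so its position is not lost. The right threshold would have to be found by experiment.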
@cfsmp3 commented on GitHub (Dec 26, 2017):
This is quite interesting.
If spupng works well then it's clearly a solvable problem.
Do you want to give it a go?
@harrynull commented on GitHub (Dec 26, 2017):
@cfsmp3 It doesn't seem to be easy to fix, but I will give it a try.