No subtitles with ccextractor - but with TS-Doctor #483

Closed
opened 2026-01-29 16:45:06 +00:00 by claunia · 3 comments

Originally created by @poeeast on GitHub (Feb 8, 2019).

[QUESTION].

CCExtractor version : 0.87

In raising this issue, I confirm the following (please check boxes, eg [X] - and delete unchecked ones):

  • [x] I have read and understood the contributors guide.
  • [x] I have checked that the bug-fix I am reporting can be replicated, or that the feature I am suggesting isn't already present.
  • [x] I have checked that the issue I'm posting isn't already reported.
  • [x] I have checked that the issue I'm posting isn't already solved and no duplicates exist in closed issues and in opened issues.
  • [x] I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
  • [x] I have used the latest available version of CCExtractor to verify this issue exists.

My familiarity with the project is as follows (check one, eg [X] - and delete unchecked ones):

  • I have never used CCExtractor.
  • [x] I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

Necessary information

  • Is this a regression (did it work before)? [x] NO | [ ] YES - please specify the last known working version
  • What platform did you use? [x] Windows - [ ] Linux - [ ] Mac
  • What arguments were used? -autoprogram

Video links
https://mirus.dk/cce/GP.mts

Additional information
https://mirus.dk/cce/GP-LOG.txt
https://mirus.dk/cce/GP-TSD.srt

When I run the video file GP.mts (4 min, 96 MB) through ccextractor, I get no subtitles (see GP-LOG.txt). I used the standard setup, but have also tried other combinations of options. When I run the same file through TS-Doctor, I get a file with subtitles (GP-TSD.srt).

PS: Make sure you set an alert in GitHub so you get notifications about your ticket. We may need to ask questions and we do everything inside GitHub's system.


@thelastpolaris commented on GitHub (Apr 11, 2019):

First of all, do you want to extract Teletext subtitles or DVB ones? If you want DVB subtitles, there are two things to check:

  1. According to your logs, Tesseract has no Danish language data. You can download dan.traineddata from https://github.com/tesseract-ocr/tessdata_best. On Ubuntu I put this file in /usr/share/tesseract-ocr/4.00/tessdata/. Tesseract needs it to recognize Danish characters. Also make sure CCExtractor is compiled with OCR support, if it isn't already.

  2. Have you tried the -codec dvbsub option? By default CCExtractor gives Teletext higher priority: if it finds a Teletext track before a DVB one, it returns the Teletext output (if any). Specifying -codec dvbsub explicitly tells CCExtractor to look only for DVB subtitles.
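The two checks above can be sketched as a small Python helper. This is only an illustration, not part of CCExtractor: the function names are made up, the Ubuntu directory is the one mentioned above, and checking the TESSDATA_PREFIX environment variable and writing the result with -o are my own assumptions about a typical setup. The script only reports what it finds and prints the command; it does not run ccextractor.

```python
import os
from pathlib import Path


def candidate_tessdata_dirs():
    """Directories to search for Tesseract language data (assumed locations)."""
    dirs = []
    prefix = os.environ.get("TESSDATA_PREFIX")  # assumption: set on some installs
    if prefix:
        dirs.append(prefix)
        dirs.append(os.path.join(prefix, "tessdata"))
    # Ubuntu location mentioned in this thread.
    dirs.append("/usr/share/tesseract-ocr/4.00/tessdata/")
    return dirs


def find_traineddata(lang, search_dirs):
    """Return the first existing <lang>.traineddata file, or None if absent."""
    for directory in search_dirs:
        candidate = Path(directory) / f"{lang}.traineddata"
        if candidate.is_file():
            return candidate
    return None


def build_command(input_file, output_file):
    """Build a ccextractor call that looks only for DVB subtitles."""
    return [
        "ccextractor", str(input_file),
        "-autoprogram",       # argument used in the original report
        "-codec", "dvbsub",   # skip Teletext, look only for DVB subtitles
        "-o", str(output_file),
    ]


if __name__ == "__main__":
    found = find_traineddata("dan", candidate_tessdata_dirs())
    if found is None:
        print("dan.traineddata not found; OCR of Danish DVB subtitles will likely fail")
    else:
        print(f"Danish language data found at {found}")
    print("Command to try:", " ".join(build_command("GP.mts", "GP.srt")))
```

On Windows the language data will live somewhere else, so the search directories would need adjusting.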


@poeeast commented on GitHub (Apr 27, 2019):

Thank you for the answer.
The fault is probably mine: I used ccextractor without having installed Tesseract. I have no previous experience with manipulating video files, so thank you for the help.


@MatejMecka commented on GitHub (Jun 2, 2019):

Can this issue be closed, since it seems to be resolved?

Reference: starred/ccextractor#483