mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-03 21:23:48 +00:00
[PROPOSAL] Extract subtitles in a Chinese newscast #379
Originally created by @Liontooth on GitHub (Jan 24, 2018).
Originally assigned to: @Abhinav95 on GitHub.
The following video file was recorded in mainland China, using Joker-tv with the DTMB television standard. When I watch it, I see what look like subtitles or captions. Can CCExtractor see them? I was not able to get it to work. I did not try OCR, which may be what is required.
http://vrnewsscape.ucla.edu/dropbox/2018-01-09_2033_CN_CCTV1_%e6%96%b0%e9%97%bb1+1.mpg
Cheers,
David
@jimboH commented on GitHub (Feb 20, 2018):
I would like to work on this issue.
@saurabhshri commented on GitHub (Feb 20, 2018):
Sure, just go ahead.
@thealphadollar commented on GitHub (Apr 11, 2018):
@Liontooth @cfsmp3
The subtitles in the given video file are burned into the picture and can be extracted using the hardsubx parameter.
Displaying them in a video player requires compatible fonts, but they are indeed being extracted, and below is an image showing this.
There are errors in the output due to inaccuracies in the OCR, but those are outside the scope of this issue and are the subject of a separate GSoC project.
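For anyone wanting to reproduce this, here is a minimal sketch of how such a command line might be assembled. It is an illustration under assumptions, not the exact invocation used above: the flag spellings `-hardsubx`, `-ocrlang` and `-o` follow the single-dash style of 2018-era releases (newer releases may expect different spellings, so check the help output of your build), `chi_sim` is Tesseract's Simplified Chinese language data name, and the file names and the helper `build_hardsubx_command` are hypothetical. CCExtractor must also have been built with hardsubx support.

```python
import shlex


def build_hardsubx_command(video_path, output_path,
                           ocr_lang="chi_sim", binary="ccextractor"):
    """Assemble (but do not run) a CCExtractor command for burned-in subtitles.

    Assumptions: single-dash flag spellings from 2018-era releases and a
    Tesseract language pack named `ocr_lang` installed on the machine.
    """
    return [
        binary,
        video_path,
        "-hardsubx",           # OCR the subtitles burned into the video frames
        "-ocrlang", ocr_lang,  # Tesseract traineddata to use (assumed: chi_sim)
        "-o", output_path,     # where to write the extracted subtitles
    ]


if __name__ == "__main__":
    # Hypothetical file names; print the command so it can be pasted into a shell.
    cmd = build_hardsubx_command("newscast.mpg", "newscast.srt")
    print(shlex.join(cmd))
```

The helper only builds the argument list, so the result can be inspected first and then passed to `subprocess.run` once the flags have been confirmed against the installed version.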
@cfsmp3 commented on GitHub (Apr 11, 2018):
Assigned to @Abhinav95, who in turn will assign it to the GSoC student(s) he sees fit.
@fewwwww commented on GitHub (Mar 8, 2021):
I was just browsing GSoC issues, looking to work on the Flutter project, but I think I can help out a little with this issue.
Here is the document for the Chinese national standard for DTMB (GB20600-2006), 130 pages, all in Chinese:
https://www.doc88.com/p-810688531386.html
Here is a patent for hardware that can separate audio and video signals and display subtitles on DTMB equipment:
https://nxgp.cnki.net/kcms/detail?v=kxaUMs6x7-4I2jr5WTdXti3zQ9F92xu0N5Lim4gHJeVFMNAZBuVUfzvmz2LuJgb7bn8rlgaJH4AQ98pqdK9FqNLQT3L2E_Cs&uniplatform=NZKPT
Here are many recordings of DTMB television broadcasts on Bilibili (they can be downloaded with tools like you-get):
https://search.bilibili.com/all?keyword=dtmb&from_source=nav_search_new
Feel free to take this further, or to write proposals using these links. I would really like to work on this issue, but the solver definitely needs to know two languages: C and Chinese. I happen to know Chinese but not much C.
@cfsmp3 commented on GitHub (Mar 29, 2021):
@fewwwww This is great, thanks!
Let's hope there's a brave student who knows C and feels like doing this with your help :-)
@cfsmp3 commented on GitHub (Mar 22, 2023):
Closing to keep track of our Chinese wishlist here: https://github.com/CCExtractor/ccextractor/issues/224