mirror of
https://github.com/CCExtractor/ccextractor.git
synced 2026-02-04 05:44:53 +00:00
[PROPOSAL] Add SCC support to CEA-708 decoder #697
Originally created by @PunitLodha on GitHub (Mar 23, 2022).
Originally assigned to: @PunitLodha on GitHub.
Add support for SCC format to CEA-708 decoder.
Currently, only the SRT, SAMI and Transcript formats are supported: https://github.com/CCExtractor/ccextractor/blob/master/src/rust/src/decoder/tv_screen.rs#L126-L134
SCC format details: http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/SCC_FORMAT.HTML
#1423
@voidash commented on GitHub (Mar 26, 2022):
Just to be clear, I looked up a similar function, write_sami(). Basically, it writes to a file, and the contents should look like the image I have embedded. So if I want to add support for the SCC format, the extracted subtitles should look like this, right?
I am working on this problem, and I will be sure to read the contributor guidelines and contact you if I get stuck.
@shazbot666 commented on GitHub (Mar 26, 2022):
Here's a sample SCC extract from the sample WhackedOutVideos_short.mov using a commercial tool
sample video:
https://drive.google.com/file/d/13p6HBxGXlm0BGpaS15JwCJjfnBdm_Qbm/view?usp=sharing
Scenarist_SCC V1.0
00:58:56:14 e96e 2043 616e 6164 61ae
00:58:58:19 9426 94ad 9470 4ff2 20e9 7320 f468 e973 2073 796e e368 f2ef 6ee9 7ae5 6480
00:58:59:23 9426 94ad 9470 73f4 e9e3 6b20 70ef 6be9 6e67 bf80
00:59:02:03 9426 94ad 9470 c1e3 f475 61ec ec79 2c20 f468 ef73 e520 61f2 e520 f468 e520 f2ef 6473
00:59:03:09 9426 94ad 9470 f468 e579 2075 73e5 20f4 ef20 ecef e361 f4e5 20ec ef73 f420 70e5 ef70 ece5
00:59:04:29 9426 94ad 9470 eff2 20ef 62ea e5e3 f473 20e9 6e20 7570 20f4 ef20 3132 20e6 e5e5 f480
00:59:06:17 9426 94ad 9470 efe6 2070 eff7 64e5 f2ae
00:59:08:18 9426 94ad 9470 496e 20f4 68e9 7320 e361 73e5 2c20 f468 e579 20e6 e96e 6420 f468 e973
00:59:09:19 9426 94ad 9470 6475 64e5 a773 2076 e964 e5ef 20e3 616d e5f2 61ae
00:59:12:03 9426 94ad 9470 c16e 6420 f468 e520 73f7 e561 f220 62ec e97a 7a61 f264
00:59:13:14 9426 94ad 9470 73f4 61f2 f473 2061 6761 e96e ae80
00:59:14:27 9426 94ad 9470 a862 ece5 e570 e96e 6729
00:59:18:26 9426 94ad 9470 54f7 efad f468 e9f2 6473 20ef e620 f468 e520 f7ef f2ec 6480
00:59:20:22 9426 94ad 9470 e973 20e3 ef76 e5f2 e564 2062 7920 f761 f4e5 f280
00:59:22:16 9426 94ad 9470 616e 6420 f468 e520 f2e5 73f4 20e9 7320 e3ef 76e5 f2e5 6420 6279 2075 73ae
00:59:24:23 9426 94ad 9470 54e9 6de5 20f4 ef20 6761 f468 e5f2 2075 7020 61ec ec20 f468 e520 67ef efe6 7980
00:59:26:04 9426 94ad 9470 67ef e96e 6773 adef 6e20 e6f2
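For readers who want to check what the hex words in a sample like this say, here is a minimal sketch in Rust. It assumes each word is two bytes carrying odd parity in bit 7, that a pair whose first byte falls in 0x10..=0x1f after parity stripping is a control code, and that the remaining printable bytes can be read as plain ASCII (an approximation, since the CEA-608 character set differs from ASCII in a few positions). The function names are hypothetical and not part of ccextractor.

```rust
/// Strip the parity bit (bit 7) from one SCC byte.
fn strip_parity(byte: u8) -> u8 {
    byte & 0x7f
}

/// Recover the printable text from the hex words of one SCC line
/// (everything after the timecode). Control-code pairs and padding
/// are skipped; basic characters are treated as ASCII (assumption).
fn scc_words_to_text(words: &str) -> String {
    let mut text = String::new();
    for word in words.split_whitespace() {
        // Ignore anything that is not a hex word.
        let pair = match u16::from_str_radix(word, 16) {
            Ok(value) => value,
            Err(_) => continue,
        };
        let hi = strip_parity((pair >> 8) as u8);
        let lo = strip_parity((pair & 0xff) as u8);
        // A first byte in 0x10..=0x1f marks a control-code pair.
        if (0x10..=0x1f).contains(&hi) {
            continue;
        }
        for b in [hi, lo] {
            // 0x00 (written as 0x80 with parity) is padding, so only
            // printable bytes are kept.
            if (0x20..=0x7e).contains(&b) {
                text.push(b as char);
            }
        }
    }
    text
}
```

Calling scc_words_to_text("e96e 2043 616e 6164 61ae") on the first data line of the sample gives "in Canada.".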
@voidash commented on GitHub (Apr 7, 2022):
I took a shot at adding SCC support for the 708 decoder. I tried adding a function write_scc in tv_screen.rs, and here is the commit on my fork: https://github.com/CCExtractor/ccextractor/compare/master...voidash:master
I ran ccextractor in debug mode with these flags for the video https://drive.google.com/file/d/13p6HBxGXlm0BGpaS15JwCJjfnBdm_Qbm/view.
Here is the complete output: https://pastebin.com/58ieUtfY
Without the -708 flag, the output is a little different from #1423: https://pastebin.com/PygNqWRh
My major concern is that a Writer object is only being created for the last three lines, and for those three lines the start and end times are the same. The output file main.scc contains only Scenarist_SCC V1.0. However, the file main.p0.svc01.scc has those last three lines.
Note: I wrote the write_scc function by looking at how write_srt and write_transcript work. If there is something I need to understand, please let me know.
@cfsmp3 commented on GitHub (Apr 8, 2022):
@PunitLodha can you take a look at @voidash 's work?
@PunitLodha commented on GitHub (Apr 9, 2022):
Yes, I will in some time
@PunitLodha commented on GitHub (Apr 12, 2022):
So, for some reason, mp4 still uses the C decoder. And changing it to rust is not as straightforward. I am working on it.
Meanwhile, @voidash could you replicate the changes in C here, https://github.com/CCExtractor/ccextractor/blob/master/src/lib_ccx/ccx_decoders_708_output.c#L370-L392
@voidash commented on GitHub (Apr 12, 2022):
Ok, i will take a look at it.
@voidash commented on GitHub (Apr 12, 2022):
I tried replicating the changes in C. Here is the diff: fb5dbe2959
Here is the output when I passed the following parameters:
-in=mp4 -out=scc -nofc -dru /home/cdjk/Downloads/WhackedOutVideos_short.mov -o /home/cdjk/Downloads/main.scc
https://pastebin.com/VeY4BmbK
The temp file main.p0.svc01.scc is being written, and its contents look like this: https://pastebin.com/xq6Jwfuv
But main.scc is still unwritten. Looking at the console output, it looks as if the caption type is roll-up.
Any suggestions on what I should do next?
@PunitLodha commented on GitHub (Apr 12, 2022):
main.scc will be empty because it is supposed to contain subs for 608, which are not present here. main.p0.svc01.scc is the file that is supposed to have the 708 subs, so that is correct.
But I can see some issues with the output. One is that there are multiple timestamps on the same line. Other than that, I think the clear caption command is missing; it should be present at the end time of each subtitle.
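To illustrate the two points above (one timestamp per line, and a clear command at the end time), here is a rough sketch in Rust. It is not the ccextractor implementation. The helper names are hypothetical, the text is assumed to be plain ASCII, and the control words are taken from the sample earlier in this thread (9426 roll-up with 3 rows, 94ad carriage return, 9470 row 15 column 0) plus 942c for erase displayed memory. The tab after the timecode and the blank line between entries follow common SCC files and are assumptions here.

```rust
/// Set bit 7 when needed so the byte has an odd number of 1 bits.
fn with_odd_parity(byte: u8) -> u8 {
    let b = byte & 0x7f;
    if b.count_ones() % 2 == 0 {
        b | 0x80
    } else {
        b
    }
}

/// Encode ASCII text as space-separated SCC hex words, padding an
/// odd-length text with 0x00 (which becomes 0x80 with parity).
fn text_to_scc_words(text: &str) -> String {
    let mut bytes: Vec<u8> = text.bytes().map(with_odd_parity).collect();
    if bytes.len() % 2 != 0 {
        bytes.push(with_odd_parity(0x00));
    }
    bytes
        .chunks(2)
        .map(|pair| format!("{:02x}{:02x}", pair[0], pair[1]))
        .collect::<Vec<String>>()
        .join(" ")
}

/// Hypothetical helper: one line for the caption at its start time and
/// a separate line carrying the clear command at its end time.
fn scc_caption(start: &str, end: &str, text: &str) -> String {
    format!(
        "{}\t9426 94ad 9470 {}\n\n{}\t942c\n\n",
        start,
        text_to_scc_words(text),
        end
    )
}
```

With these helpers, text_to_scc_words("Or is this synchronized") reproduces the text words of the 00:58:58:19 line in the sample. Many SCC files repeat each control word twice, which this sketch does not do.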
@cfsmp3 commented on GitHub (Apr 12, 2022):
The mp4 code has a different flow. We use libgpac to actually open the mp4 file and the entry point into the decoders is different than the usual general loop.
It should be easy to change though and call the rust code.
@voidash commented on GitHub (Apr 13, 2022):
@PunitLodha, main.p0.svc01.scc now looks like this. You can take a look at my approach here: e449557c8c
Here is the pastebin for main.p0.svc01.scc: https://pastebin.com/aMiaEStY
So the 708 decoder found SCC subs, which means the Scenarist_SCC V1.0 header should be added at the top of main.p0.svc01.scc. Also, I guess I should remove the Rust code that is just appending the last three caption texts.
@cfsmp3 commented on GitHub (Apr 13, 2022):
I'd recommend looking into this - @PunitLodha
6efa41a7e6/src/lib_ccx/mp4.c (L398)
If you can just call rust from there you're good to go. After that everything is the same thing.
@PunitLodha commented on GitHub (Apr 13, 2022):
I did look at that. But due to how the code is structured, it's not as easy as just calling the rust function from there. I'll have to change some stuff from the rust side first
@PunitLodha commented on GitHub (Apr 13, 2022):
@voidash
Check out how sami header is added, and do it the same way
The last captions are added by the code which you added in rust. It is called by the flush function. So you should correct the rust code too, and send a PR
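The idea of writing the header once, before the first caption, can be sketched as follows in Rust. This is only an illustration under assumptions: SccWriter and its methods are hypothetical and are not the Writer type used in ccextractor, and the blank line after the header follows common SCC files.

```rust
use std::io::{self, Write};

/// Hypothetical writer that emits the SCC header exactly once,
/// just before the first entry is written.
struct SccWriter<W: Write> {
    out: W,
    header_written: bool,
}

impl<W: Write> SccWriter<W> {
    fn new(out: W) -> Self {
        SccWriter { out, header_written: false }
    }

    /// Write one timestamped entry, preceded by the header when this
    /// is the first entry.
    fn write_entry(&mut self, timecode: &str, words: &str) -> io::Result<()> {
        if !self.header_written {
            self.out.write_all(b"Scenarist_SCC V1.0\n\n")?;
            self.header_written = true;
        }
        write!(self.out, "{}\t{}\n\n", timecode, words)
    }

    /// Give back the underlying output, for example to inspect a buffer.
    fn into_inner(self) -> W {
        self.out
    }
}
```

Wrapping a Vec<u8> instead of a file makes it easy to check that the header appears once no matter how many entries are written.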
@ArchitBhonsle commented on GitHub (Mar 8, 2023):
If this issue has been abandoned, I could start working on this.
Is there a video with 708 captions which is not an MP4? This might help me avoid implementing this in C and/or changing the current MP4 flow.
@cfsmp3 commented on GitHub (Mar 8, 2023):
Sure, go for it.
Yes, almost any US Transport Stream.
You can find plenty on our website.
@PunitLodha commented on GitHub (Mar 14, 2023):
#1499 details the issue with mp4 code flow and how to fix it
@IshanGrover2004 commented on GitHub (Dec 17, 2023):
Hi,
I would like to work on this issue and continue to work on where @voidash left it.
I just wanted to know what the current progress is and what is needed to complete the feature, plus a little bit about how I could resolve it.
If there is any necessary information I should know, please tell me that as well.
@PunitLodha @cfsmp3