[PROPOSAL] - Structured data JSON output of ccextractor -out=report #670

Open
opened 2026-01-29 16:50:37 +00:00 by claunia · 8 comments

Originally created by @bbgdzxng1 on GitHub (Dec 7, 2021).

Would you be kind enough to consider extending ccextractor -out=report to emit JSON, so that the output of the ccextractor analyser is machine-readable without a custom parser? An external structured-data tool such as jq could then be used to parse and filter the output.

Similar to:

$ ffprobe -show_streams -print_format json -i INFILE | jq .
$ mediainfo -full --Output=JSON INFILE | jq .

e.g. something like:

$ ccextractor --output-format json_templatev1 -out=report INFILE | jq .

Obviously, it would require the definition of a schema, and the data structure of all the potential outputs of ccextractor may need to be extended in the future (hence proposal of versioning the schema), but it would seem like useful functionality for anyone who uses ccextractor in an automated environment. I suspect there is already a defined data structure internal to ccextractor anyway.

Something along the lines of...


{
  "stream_mode": "Transport Stream",
  "program_count": 1,
  "programs": [
    {
      "program_number": 2003,
      "pids": [
        { "pid": 3549, "stream_type": "video", "codec": "MPEG-2" },
        { "pid": 3550, "stream_type": "audio", "codec": "AC3" },
        { "pid": 3551, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3552, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3553, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3554, "stream_type": "user_private", "codec": "MPEG-2 User Private" }
      ],
      "services": {
        "dvb_subtitles": false,
        "teletext": false,
        "atsc_closed_caption": true,
        "caption_systems": {
          "eia_608": true,
          "xds": false,
          "cea_708": false
        },
        "cc_channels": {
          "cc1": true,
          "cc2": false,
          "cc3": false,
          "cc4": false
        }
      }
    }
  ]
}
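
A minimal sketch of how a consumer might use a report in this shape, assuming ccextractor emitted exactly the structure proposed above. The schema is only the proposal from this issue, not an existing ccextractor format, and the helper names `programs_with_captions` and `active_cc_channels` are hypothetical:

```python
import json

# Sample report in the shape proposed in this issue (hypothetical schema,
# trimmed to the fields this sketch reads).
SAMPLE_REPORT = """
{
  "stream_mode": "Transport Stream",
  "program_count": 1,
  "programs": [
    {
      "program_number": 2003,
      "services": {
        "dvb_subtitles": false,
        "teletext": false,
        "atsc_closed_caption": true,
        "caption_systems": {"eia_608": true, "xds": false, "cea_708": false},
        "cc_channels": {"cc1": true, "cc2": false, "cc3": false, "cc4": false}
      }
    }
  ]
}
"""


def programs_with_captions(report):
    """Return program numbers that report any caption or subtitle service."""
    found = []
    for program in report.get("programs", []):
        services = program.get("services", {})
        flags = (
            services.get("dvb_subtitles"),
            services.get("teletext"),
            services.get("atsc_closed_caption"),
        )
        if any(flags):
            found.append(program["program_number"])
    return found


def active_cc_channels(report):
    """Map each program number to the sorted list of CC channels marked true."""
    channels = {}
    for program in report.get("programs", []):
        cc = program.get("services", {}).get("cc_channels", {})
        channels[program["program_number"]] = sorted(
            name for name, present in cc.items() if present
        )
    return channels


if __name__ == "__main__":
    report = json.loads(SAMPLE_REPORT)
    print(programs_with_captions(report))
    print(active_cc_channels(report))
```

With a versioned schema, a script like this would only need to check the version field before reading the rest of the document.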

Thanks!


@cfsmp3 commented on GitHub (Dec 7, 2021):

We'll be happy to accept pull requests or sponsorship for this to happen :-)


@IshanGrover2004 commented on GitHub (Dec 29, 2023):

Hey, I'd like to work on this issue, but I'll need a little direction.
Currently, ccextractor ~/Downloads/cc/COMEDY.ts -out=report gives this output: https://pastebin.com/KFSxQCpc

Compared with the output of mediainfo -full --Output=JSON INFILE | jq (https://pastebin.com/st6tSJS8), the CCExtractor report contains much less information.

So I'd like to know: is there any part of the ccextractor code that already collects this kind of information but doesn't use it here, so that I can use it to produce a more informative report?
Or should I just wrap the current ccextractor output (https://pastebin.com/KFSxQCpc) in JSON and print it to STDOUT or write it to another file? (That would be quick, if you say so.)
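
The "wrap the current output" option could be sketched roughly as follows. This assumes the text report consists of "Key: Value" lines with Yes/No flags. That layout is an assumption, the sample text is illustrative rather than captured ccextractor output, and `report_text_to_dict` is a hypothetical helper:

```python
import json


def report_text_to_dict(text):
    """Convert 'Key: Value' lines into a dict with normalised keys.

    Assumes one 'Key: Value' pair per line; lines without a colon are skipped.
    'Yes'/'No' become booleans and all-digit values become integers.
    """
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a key/value line
        key = key.strip().lower().replace(" ", "_").replace("-", "_")
        value = value.strip()
        if value in ("Yes", "No"):
            result[key] = value == "Yes"
        elif value.isdigit():
            result[key] = int(value)
        else:
            result[key] = value
    return result


# Illustrative input only; the real report layout may differ.
SAMPLE_TEXT = """Stream Mode: Transport Stream
Program Count: 1
EIA-608: Yes
CEA-708: No
"""

if __name__ == "__main__":
    print(json.dumps(report_text_to_dict(SAMPLE_TEXT), indent=2))
```

A flat mapping like this is quick to produce but loses the per-program nesting from the original proposal, so it would suit a first step more than a final schema.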

@Atharva-Kanherkar commented on GitHub (Dec 30, 2024):

Hello! @cfsmp3 Is this issue open to everyone? Can I contribute, in any particular way? Would love to!


@cfsmp3 commented on GitHub (Dec 30, 2024):

> Hello! @cfsmp3 Is this issue open to everyone? Can I contribute, in any particular way? Would love to!

All issues are open to whoever wants to work on them

@Atharva-Kanherkar commented on GitHub (Dec 31, 2024):

> > Hello! @cfsmp3 Is this issue open to everyone? Can I contribute, in any particular way? Would love to!
>
> All issues are open to whoever wants to work on them

Okay! I'd start working on this after the new year if you don't mind :) Thank you!

@ptr727 commented on GitHub (Apr 17, 2025):

Hi, following, my interest is in knowing if a stream contains closed captions so that I can remux them out. I used to use ffprobe to determine if the stream contains CC's, but support for CC detection was dropped around Jan 2025. I am looking for alternatives, and a CLI tool that reports on CC presence in e.g. JSON format would be great.

Refer to https://github.com/ptr727/PlexCleaner/issues/497


@Rahul-2k4 commented on GitHub (Dec 6, 2025):

Hi! @ptr727 I’m also trying to understand how to reliably detect closed captions in streams, especially since ffprobe dropped CC detection recently. I’m still a beginner, so I’ve been exploring different tools, and a simple CLI that outputs CC info in JSON would also be really helpful for me.

By the way, I’m interested in contributing to this project. Are there any beginner-friendly issues or small tasks I could start with? I’d love to get involved and learn by helping out.


@ruturaj2829 commented on GitHub (Jan 17, 2026):

@canihavesomecoffee I would like to work on this issue.
Reference: starred/ccextractor#670