[Proposal]: Web-Based Visualization for -out=report Output #825

Open
opened 2026-01-29 16:54:21 +00:00 by claunia · 2 comments

Originally created by @VarshaSree07 on GitHub (Apr 11, 2025).

Issue:
The current -out=report output is useful but difficult to read, especially for large or complex video files with multiple subtitle pages (like Teletext pages 150, 888, etc.). It lacks a visual interface, making it harder to understand at a glance.

Proposed Solution:
Build a web-based visualization tool that:
* Accepts a report.txt file (output of -out=report)
* Parses the file in-browser
* Displays each subtitle page (e.g., 150, 888) as a section/card
* Shows details like type, timestamps, and subtitle activity
* Optional: add a timeline view or graph

Benefits:
* Helps users quickly understand which subtitle pages exist and when they appear.
* Makes the report accessible to non-technical users.
* Can be hosted as a standalone tool or integrated into the GUI repo in the future.

Implementation Plan:
* Parse uploaded report.txt file
* Extract subtitle pages and types
* Display each page as a visual section
* Add optional timeline or filters
* Host on GitHub Pages

I plan to use HTML/JS or React for the frontend, possibly with Chart.js for visualization.
I would be happy to implement this feature myself. I’m sharing this proposal first to gather feedback and suggestions before starting development.


@cfsmp3 commented on GitHub (Dec 30, 2025):

I think this needs more details - where/how is it going to be deployed, for example... is that something for users to run locally?


@bbgdzxng1 commented on GitHub (Jan 6, 2026):

@VarshaSree07 Would you really want web/HTTP output, with the web presentation format dictated by the CCExtractor team? Structured data output as JSON would be far more flexible. A website could then parse the JSON (jQuery etc.), which would leave presentation flexibility open to the web developer. Command-line people could use jq to parse it.

There were various ideas bounced around on https://github.com/CCExtractor/ccextractor/issues/1399.

Of course, a JSON schema would need to be defined, but take a report like the following...

Stream Mode: Transport Stream
Program Count: 1
Program Numbers: 2003 
PID: 3549, Program: 2003, MPEG-2 video
PID: 3550, Program: 2003, AC3 audio
PID: 3551, Program: 2003, MPEG-2 User Private
PID: 3552, Program: 2003, MPEG-2 User Private
PID: 3553, Program: 2003, MPEG-2 User Private
PID: 3554, Program: 2003, MPEG-2 User Private
//////// Program #2003: ////////
DVB Subtitles: No
Teletext: No
ATSC Closed Caption: Yes
EIA-608: Yes
XDS: No
CC1: Yes
CC2: No
CC3: No
CC4: No
CEA-708: No
Width: 528
Height: 480
Aspect Ratio: 02 - 4:3
Frame Rate: 04 - 29.97

Would become something like...


{
  "stream_mode": "Transport Stream",
  "program_count": 1,
  "programs": [
    {
      "program_number": 2003,
      "pids": [
        { "pid": 3549, "stream_type": "video", "codec": "MPEG-2" },
        { "pid": 3550, "stream_type": "audio", "codec": "AC3" },
        { "pid": 3551, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3552, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3553, "stream_type": "user_private", "codec": "MPEG-2 User Private" },
        { "pid": 3554, "stream_type": "user_private", "codec": "MPEG-2 User Private" }
      ],
      "services": {
        "dvb_subtitles": false,
        "teletext": false,
        "atsc_closed_caption": true,
        "caption_systems": {
          "eia_608": true,
          "xds": false,
          "cea_708": false
        },
        "cc_channels": {
          "cc1": true,
          "cc2": false,
          "cc3": false,
          "cc4": false
        }
      }
    }
  ]
}
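
For illustration only, the text-to-JSON mapping above could be sketched roughly as follows. This is not CCExtractor code: `parse_report`, `to_key`, and the `description` field are hypothetical names, the sample is a shortened copy of the report above, and the result is flatter than the nested schema proposed here (every Yes/No line after a program header simply lands in that program's `services` object).

```python
import json
import re

# Shortened copy of the report quoted above.
REPORT = """\
Stream Mode: Transport Stream
Program Count: 1
PID: 3549, Program: 2003, MPEG-2 video
PID: 3550, Program: 2003, AC3 audio
PID: 3551, Program: 2003, MPEG-2 User Private
//////// Program #2003: ////////
DVB Subtitles: No
Teletext: No
ATSC Closed Caption: Yes
EIA-608: Yes
XDS: No
CC1: Yes
CC2: No
CEA-708: No
"""

PID_LINE = re.compile(r"PID: (\d+), Program: (\d+), (.+)")
PROGRAM_HEADER = re.compile(r"/+ Program #(\d+): /+")


def to_key(label):
    """Turn a report label into a JSON-style key, e.g. 'EIA-608' -> 'eia_608'."""
    return re.sub(r"[^a-z0-9]+", "_", label.lower()).strip("_")


def parse_report(text):
    """Parse report text into a dict (hypothetical layout, not an agreed schema)."""
    top = {}
    programs = {}  # program number -> program dict, in order of first appearance

    def program(number):
        return programs.setdefault(
            number, {"program_number": number, "pids": [], "services": {}}
        )

    current = None  # program whose section we are inside, if any
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        match = PID_LINE.fullmatch(line)
        if match:
            pid, number, description = match.groups()
            program(int(number))["pids"].append(
                {"pid": int(pid), "description": description}
            )
            continue
        match = PROGRAM_HEADER.fullmatch(line)
        if match:
            current = program(int(match.group(1)))
            continue
        label, sep, value = line.partition(": ")
        if not sep:
            continue  # not a "Label: value" line; ignored in this sketch
        if value in ("Yes", "No"):
            parsed = value == "Yes"
        elif value.isdigit():
            parsed = int(value)
        else:
            parsed = value
        target = current["services"] if current else top
        target[to_key(label)] = parsed

    return {**top, "programs": list(programs.values())}


data = parse_report(REPORT)
print(json.dumps(data, indent=2))
```

Grouping the flags into `caption_systems` and `cc_channels`, as in the JSON above, would need an explicit mapping table, which is exactly the part a defined schema would have to settle.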

Web developers deal with JSON data every day and could then parse and present it in HTML and JS.

Browsers, IDEs, jq and prettifiers will typically color-code JSON, making even complex outputs easy to read at a glance.
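
As a rough sketch of what a consumer of such JSON might do, the snippet below lists every flag that is set to true for each program. It assumes the example schema from this comment (which is only a suggestion, not something CCExtractor emits today); `enabled_flags` and `summarize` are hypothetical helper names.

```python
import json

# Trimmed copy of the example JSON from this comment (assumed schema).
DOC = """
{
  "stream_mode": "Transport Stream",
  "program_count": 1,
  "programs": [
    {
      "program_number": 2003,
      "services": {
        "dvb_subtitles": false,
        "teletext": false,
        "atsc_closed_caption": true,
        "caption_systems": {"eia_608": true, "xds": false, "cea_708": false},
        "cc_channels": {"cc1": true, "cc2": false, "cc3": false, "cc4": false}
      }
    }
  ]
}
"""


def enabled_flags(node, prefix=""):
    """Yield dotted names of every flag set to true, walking nested objects."""
    for key, value in node.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            yield from enabled_flags(value, name + ".")
        elif value is True:
            yield name


def summarize(doc):
    """Map each program number to the list of flags enabled for it."""
    return {
        p["program_number"]: list(enabled_flags(p["services"]))
        for p in doc["programs"]
    }


summary = summarize(json.loads(DOC))
for number, flags in summary.items():
    print(f"Program {number}: {', '.join(flags)}")
```

A web page could do the same walk in JS and render each program as a card, which keeps the presentation entirely on the consumer's side.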

Image: https://github.com/user-attachments/assets/7c9fb085-3c3e-4dab-ae45-946533a27106

Reference: starred/ccextractor#825