[BUG] Incorrect placement of X-TIMESTAMP-MAP in WebVTT #726

Closed
opened 2026-01-29 16:52:06 +00:00 by claunia · 1 comment

Originally created by @bbgdzxng1 on GitHub (Dec 13, 2022).

Summary

When ccextractor generates WebVTT output, the X-TIMESTAMP-MAP line does not immediately follow the WEBVTT header.

tl;dr: I propose that commit 12b9f939fe be reverted.

Reference

According to Roger Pantos, the author of the HLS RFC...

"To be clear, what HLS expects (and what the VTT spec defined prior to that 2016 change) is for the X-TIMESTAMP-MAP line to be among a set of non-blank lines immediately after the WEBVTT header line, followed by two or more line terminators, followed by the rest of the body."

Here is Roger's full statement, clarifying the expected behavior and its background: https://mailarchive.ietf.org/arch/msg/hls-interest/4vmLpEsV-EnmkEwMQZkzbGQai_4/ Roger is the authoritative reference on HLS RFC8216.

See also:
https://github.com/w3c/webvtt.js/issues/38
https://github.com/w3c/webvtt/issues/485
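
The layout described in the quote above can be checked mechanically. Below is a minimal Python sketch, not part of ccextractor, that reports whether X-TIMESTAMP-MAP appears among the non-blank lines directly after the WEBVTT line. The function name `timestamp_map_in_header` is hypothetical, and the check only covers placement; it does not validate the rest of the file.

```python
def timestamp_map_in_header(vtt_text: str) -> bool:
    """Return True if X-TIMESTAMP-MAP sits in the header block.

    The header block is taken to be the WEBVTT line plus the non-blank
    lines that directly follow it, ending at the first blank line.
    """
    # Drop an optional byte order mark, then split on any line terminator.
    lines = vtt_text.lstrip("\ufeff").splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        return False
    for line in lines[1:]:
        if line.strip() == "":
            return False  # header block ended without the mapping line
        if line.startswith("X-TIMESTAMP-MAP="):
            return True
    return False


# Shortened versions of the two outputs discussed in this report.
expected = (
    "WEBVTT\n"
    "X-TIMESTAMP-MAP=MPEGTS:5785169281,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:06.640 --> 00:00:08.307 line:79.33%\n"
    "GEICO has a long history\n"
)
current = (
    "WEBVTT\n"
    "\n"
    "X-TIMESTAMP-MAP=MPEGTS:5785169281,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:06.640 --> 00:00:08.307 line:79.33%\n"
    "GEICO has a long history\n"
)

print(timestamp_map_in_header(expected))  # True
print(timestamp_map_in_header(current))   # False
```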

Expected Behavior

Note that there is no blank line after the WEBVTT statement.

WEBVTT
X-TIMESTAMP-MAP=MPEGTS:5785169281,LOCAL:00:00:00.000

00:00:06.640 --> 00:00:08.307 line:79.33%
    GEICO has a long history    

00:00:06.640 --> 00:00:08.307 line:84.66%
        of great savings        

00:00:08.342 --> 00:00:09.208 line:84.66%
       and great service.

Current Behavior

WEBVTT

X-TIMESTAMP-MAP=MPEGTS:5785169281,LOCAL:00:00:00.000

00:00:06.640 --> 00:00:08.307 line:79.33%
    GEICO has a long history    

00:00:06.640 --> 00:00:08.307 line:84.66%
        of great savings        

00:00:08.342 --> 00:00:09.208 line:84.66%
       and great service.

Command to replicate...

$ ccextractor "./CNN.ts" -in='ts' -1 -out='vtt' -stdout | head -n10

Where CNN.ts is taken from "US TV recordings, 10 minutes samples, HDHomeRun" located at https://ccextractor.org/public/general/tvsamples/.

% ccextractor --version
CCExtractor 0.94, Carlos Fernandez Sanz, Volker Quetschke.
Teletext portions taken from Petr Kutalek's telxcc
--------------------------------------------------------------------------
CCExtractor detailed version info
        Version: 0.94
        Git commit: Unknown
        Compilation date: 2021-12-15
        CEA-708 decoder: C
        File SHA256: Could not open file
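
As a stopgap for affected builds, the output can be repaired in post-processing. The following Python sketch is an assumption-laden workaround, not a ccextractor feature: the function name `move_timestamp_map_into_header` is hypothetical, it only handles the case shown above (blank lines between WEBVTT and X-TIMESTAMP-MAP), and it normalizes line endings to `\n` whenever the input starts with WEBVTT.

```python
def move_timestamp_map_into_header(vtt_text: str) -> str:
    """Remove blank lines between WEBVTT and a following X-TIMESTAMP-MAP line."""
    lines = vtt_text.splitlines()
    if not lines or not lines[0].startswith("WEBVTT"):
        return vtt_text  # not a WebVTT file; leave it untouched

    # Skip over any blank lines that directly follow the WEBVTT line.
    i = 1
    while i < len(lines) and lines[i].strip() == "":
        i += 1

    # Only rewrite when blank lines were skipped and the mapping line is next.
    if i > 1 and i < len(lines) and lines[i].startswith("X-TIMESTAMP-MAP="):
        lines = [lines[0]] + lines[i:]

    return "\n".join(lines) + "\n"


broken = (
    "WEBVTT\n"
    "\n"
    "X-TIMESTAMP-MAP=MPEGTS:5785169281,LOCAL:00:00:00.000\n"
    "\n"
    "00:00:06.640 --> 00:00:08.307 line:79.33%\n"
    "GEICO has a long history\n"
)
fixed = move_timestamp_map_into_header(broken)
print(fixed.splitlines()[:2])
```

The blank line after X-TIMESTAMP-MAP is kept, so the header block is still separated from the first cue.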

Details

Here is the pull request where the regression occurred.
https://github.com/CCExtractor/ccextractor/pull/1332/files
https://github.com/emkman99/ccextractor/commit/12b9f939fe55b34d88b54ec83a09013fad6ee62c

I'm confident that @emkman99's PR was well-intended. However, the link above from Pantos confirms the expected behavior, with absolute authority. @emkman99 - I hope that the snippet from Pantos is helpful.

Conclusion

Commit 12b9f939fe should be reverted to ensure that WebVTT output aligns with the HLS RFC.

[ My personal view is that ccextractor should not generate X-TIMESTAMP-MAP by default and that it should instead be enabled through a --timestamp-map option, but that is a subjective opinion and would be a change of functionality. I have tried to limit the bug report to an objective clarification of the standards, quoting the author of the HLS RFC. ]

Thanks - I hope this is not a contentious topic.


@bbgdzxng1 commented on GitHub (Dec 20, 2022):

Closing the end-user-facing ticket because of the awesome work included in https://github.com/CCExtractor/ccextractor/pull/1464, which will now track it. You guys don't want open tickets hanging around.

As ever, many thanks @emkman99 for the very sensible enhancement and @cfsmp3 for the project.
