Unknown four-byte data inserted in WEBVTT files before the timestamp #197

Closed
opened 2026-01-29 16:37:38 +00:00 by claunia · 1 comment

Originally created by @atrottmann on GitHub (Nov 1, 2016).

When generating WEBVTT data from an MPEG-TS stream, I get what appears to be four bytes of binary data immediately before the timestamp on every line.

out-vtt.zip

The attached file (zipped, because GitHub didn't let me upload the raw .vtt) shows this: after the WEBVTT<0x0d><0x0a> header, the bytes 0x50 0xf9 0xd9 0x01 appear before the text-form timestamp 00:00:15.120.

I do not understand the purpose of those bytes and suspect a bug.

The source file ./lib_ccx/ccx_encoders_webvtt.c contains the following code at the beginning of write_stringz_as_webvtt:

    used = encode_line(context, context->buffer, (unsigned char *)timeline);
    written = write(context->out->fh, context->buffer, used);
    if (written != used)
            return -1;

This appears to be a duplicate of the code that runs right afterwards, once the timestamp has been written into timeline with sprintf, and I cannot find a purpose for it. If I understand the code correctly, this first call encodes and writes timeline before anything has been stored in it, so it outputs uninitialized data. That would explain the four bytes of apparent garbage that I saw in the generated WEBVTT file.

If I comment this out, it appears to create correct WEBVTT files.
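
To illustrate the intended order (format the timing line first, write it afterwards), here is a minimal sketch. The helper format_cue_timing is hypothetical and not part of CCExtractor, and the CRLF line ending is an assumption based on the WEBVTT<0x0d><0x0a> header seen in the attached file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not a CCExtractor function: formats one WebVTT cue
 * timing line into buf. Returns the number of bytes to write, or -1 if the
 * buffer is too small or formatting fails. */
static int format_cue_timing(char *buf, size_t size,
                             unsigned h1, unsigned m1, unsigned s1, unsigned ms1,
                             unsigned h2, unsigned m2, unsigned s2, unsigned ms2)
{
    assert(buf != NULL);

    /* snprintf always NUL-terminates when size > 0, so buf never holds
     * indeterminate bytes in the range that a caller would write out. */
    int n = snprintf(buf, size,
                     "%02u:%02u:%02u.%03u --> %02u:%02u:%02u.%03u\r\n",
                     h1, m1, s1, ms1, h2, m2, s2, ms2);

    /* Treat truncation as an error instead of writing a partial line. */
    if (n < 0 || (size_t)n >= size)
        return -1;
    return n;
}
```

A caller would pass the returned length to write() only after this function succeeds, so the buffer is never written before it has been filled.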

Kind regards,

Andreas Trottmann


@december-soul commented on GitHub (Nov 3, 2016):

I found the same issue.
Here is my fix:

From 901f6364e1ca94e9d510ae7ca4b0fae0a786816a Mon Sep 17 00:00:00 2001
From: Patrick Fischer 
Date: Thu, 3 Nov 2016 11:58:32 +0100
Subject: [PATCH] memset

---
 src/lib_ccx/ccx_encoders_webvtt.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/src/lib_ccx/ccx_encoders_webvtt.c b/src/lib_ccx/ccx_encoders_webvtt.c
index a0ada30..6b37705 100644
--- a/src/lib_ccx/ccx_encoders_webvtt.c
+++ b/src/lib_ccx/ccx_encoders_webvtt.c
@@ -15,6 +15,7 @@ int write_stringz_as_webvtt(char *string, struct encoder_ctx *context, LLONG ms_
        unsigned h2, m2, s2, ms2;
        int written;
        char timeline[128];
+       memset(timeline, 0, sizeof(timeline));

        mstotime(ms_start, &h1, &m1, &s1, &ms1);
        mstotime(ms_end - 1, &h2, &m2, &s2, &ms2); // -1 To prevent overlapping with next line.
-- 
2.7.4
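
A small sketch of why zeroing the buffer would hide the symptom, assuming the encoder stops at the first NUL byte. That behaviour of encode_line is an assumption here, not something shown in this thread, and copy_until_nul is a hypothetical stand-in for it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an encoder that copies bytes up to the first
 * NUL (or until dst is full) and returns the number of bytes produced. */
static size_t copy_until_nul(unsigned char *dst, size_t dst_size, const char *src)
{
    size_t n = 0;
    while (n < dst_size && src[n] != '\0') {
        dst[n] = (unsigned char)src[n];
        n++;
    }
    return n;
}

/* Mimics the patched function's start: timeline is zeroed before any use,
 * so encoding it early yields zero bytes instead of stack garbage. */
static size_t bytes_from_zeroed_timeline(void)
{
    char timeline[128];
    unsigned char out[128];

    memset(timeline, 0, sizeof(timeline));
    return copy_until_nul(out, sizeof(out), timeline);
}
```

Under that assumption the early write becomes a zero-length write. The call itself is still redundant, so removing it, as suggested in the original report, would address the cause and not just the symptom.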

Reference: starred/ccextractor#197