[BUG] SMPTE Timed Text contains unclosed p-sections with teletext input #600

Closed
opened 2026-01-29 16:48:46 +00:00 by claunia · 7 comments

Originally created by @akkermansadriaan on GitHub (Aug 24, 2020).

CCExtractor version: CCExtractor 0.88

Necessary information

  • What platform did you use? Mac
  • What were the used arguments? -out=smptett

Video links

I've sent a private invitation containing the original transport stream that I used. It includes the result .srt and .ttml files too.

Additional information

Converting teletext to smptett results in an additional unclosed p-section for every valid p-section.

<p begin="00:00:21.000" end="00:00:24.320">
text block 1
</p>
<p begin="00:00:24.320">

<p begin="00:00:25.000" end="00:00:27.480">
text block 2
</p>
<p begin="00:00:27.480">
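
A quick way to spot this pattern in an output file is to compare the number of opening and closing p tags. The sketch below is only illustrative: `count_occurrences` and `unclosed_p_count` are hypothetical helpers, not CCExtractor functions, and the check assumes every opening tag carries attributes (so it starts with `<p `) and that no self-closing `<p .../>` elements appear.

```c
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in haystack. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t count = 0;
    size_t len = strlen(needle);
    const char *p;

    if (len == 0)
        return 0;
    for (p = strstr(haystack, needle); p != NULL; p = strstr(p + len, needle))
        count++;
    return count;
}

/* Hypothetical check: number of "<p ...>" opening tags that have no
 * matching "</p>". Assumes opening tags always have attributes and that
 * the document contains no self-closing <p .../> elements. */
long unclosed_p_count(const char *ttml)
{
    return (long)count_occurrences(ttml, "<p ") -
           (long)count_occurrences(ttml, "</p>");
}
```

For the fragment above, which has four opening tags and two closing tags, this would report 2.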
claunia added the difficulty: easy and Hacktoberfest labels 2026-01-29 16:48:46 +00:00

@utkarsh147-del commented on GitHub (Nov 2, 2020):

Please assign this issue to me. I want to work on it.


@canihavesomecoffee commented on GitHub (Nov 2, 2020):

We don't assign issues to someone, you can just start working on the issue if you want :)


@utkarsh147-del commented on GitHub (Nov 2, 2020):

OK, thank you for the reply, sir.


@SuvigyaJain1 commented on GitHub (Apr 30, 2021):

I'd like to work on this.
I had a quick look through the files and have just one question.
In the ccx_encoders_smptett.c file, I can follow the code up to the closing p tag. But I cannot understand why the code that follows opens another tag, using the end time of the first tag as the begin time of the new one. Is this required by the SMPTE-TT specification, or is it probably just an accidental copy-paste? :P

The code I'm referring to:

	write_wrapped(context->out->fh, context->buffer, used); // (prints the closing p to the file)
        
// CODE AFTER THIS IS WHERE THE ISSUE LIES
	sprintf((char *)str, "<p begin=\"%02u:%02u:%02u.%03u\">\n\n", h2, m2, s2, ms2); // (???)
	
	if (context->encoding != CCX_ENC_UNICODE)
	{
		dbg_print(CCX_DMT_DECODER_608, "\r%s\n", str);
	}
	used = encode_line(context, context->buffer, (unsigned char *)str);
	write_wrapped(context->out->fh, context->buffer, used);
	sprintf((char *)str, "</p>\n");

So should I just remove these redundant lines and open a PR?
P.S. VLC was able to read the .ttml file after I removed those lines.
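
For illustration, the intended output per subtitle is a single element with both begin and end times, followed by its closing tag and nothing else. The sketch below is not the CCExtractor encoder: `format_ttml_paragraph` is a hypothetical helper, the millisecond inputs are an assumption about how timestamps are available, and the text is assumed to be already XML-escaped.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of CCExtractor: formats one complete
 * <p begin=... end=...> element into buf and returns snprintf's result.
 * Assumes text is already XML-escaped. No trailing open <p> is emitted. */
int format_ttml_paragraph(char *buf, size_t size,
                          unsigned long long start_ms,
                          unsigned long long end_ms,
                          const char *text)
{
    /* Split each millisecond timestamp into hh:mm:ss.mmm fields. */
    unsigned h1 = (unsigned)(start_ms / 3600000ULL);
    unsigned m1 = (unsigned)((start_ms / 60000ULL) % 60ULL);
    unsigned s1 = (unsigned)((start_ms / 1000ULL) % 60ULL);
    unsigned ms1 = (unsigned)(start_ms % 1000ULL);
    unsigned h2 = (unsigned)(end_ms / 3600000ULL);
    unsigned m2 = (unsigned)((end_ms / 60000ULL) % 60ULL);
    unsigned s2 = (unsigned)((end_ms / 1000ULL) % 60ULL);
    unsigned ms2 = (unsigned)(end_ms % 1000ULL);

    return snprintf(buf, size,
                    "<p begin=\"%02u:%02u:%02u.%03u\" end=\"%02u:%02u:%02u.%03u\">\n"
                    "%s\n"
                    "</p>\n",
                    h1, m1, s1, ms1, h2, m2, s2, ms2, text);
}
```

Called with 21000 and 24320 milliseconds and the text "text block 1", this produces the first valid element from the report and stops after its closing tag.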


@SuvigyaJain1 commented on GitHub (Apr 30, 2021):

P.P.S. Or should I close the newly opened <p> tag and leave its contents empty?


@cfsmp3 commented on GitHub (Apr 30, 2021):

@SuvigyaJain1

> in the ccx_encoders_smptett.c file I follow the code perfectly until the closing p tag. But I cannot understand why the following code

OK so you understand the code that works but don't understand the code that doesn't work :-)

I'd say: just fix the problem, and show no mercy to buggy or unreadable code.

Take a look at the official specs:

https://www.w3.org/TR/ttml1/

(By the way, the current version of the spec is newer than our code, so it's worth a read anyway.)

Producing compliant output should be reasonably straightforward, since it's mostly boilerplate with the subtitles embedded (and that part is already in our code).

So by all means revise that code and send a PR :-) Feel free to rewrite anything that looks dodgy.
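
As a rough idea of what that boilerplate amounts to, here is a minimal sketch that wraps already-formatted p elements in a bare TTML document. `format_ttml_document` is a hypothetical helper and this is not the header CCExtractor actually writes; it assumes the TTML namespace `http://www.w3.org/ns/ttml`, an `xml:lang` attribute on the root element, and no head or styling section.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of CCExtractor: wraps a string of
 * complete <p>...</p> elements in a minimal tt/body/div skeleton.
 * Returns snprintf's result (characters needed, excluding the NUL). */
int format_ttml_document(char *buf, size_t size, const char *paragraphs)
{
    return snprintf(buf, size,
                    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                    "<tt xmlns=\"http://www.w3.org/ns/ttml\" xml:lang=\"en\">\n"
                    "<body>\n"
                    "<div>\n"
                    "%s"
                    "</div>\n"
                    "</body>\n"
                    "</tt>\n",
                    paragraphs);
}
```

Every element opened by the skeleton is closed in reverse order, so the result stays well-formed as long as each embedded p element is itself closed.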


@yashsinghcodes commented on GitHub (Jan 12, 2024):

Hi @cfsmp3, it looks like this issue has been solved: I just went through the code and it seems to have been updated. I think it's best to close it to avoid any future confusion.

Reference: starred/ccextractor#600