[PROPOSAL] When converting roll-up captions using "don't duplicate lines…", add an option to start a new caption when the speaker changes. #562

Closed
opened 2026-01-29 16:47:47 +00:00 by claunia · 2 comments

Originally created by @The-Bart-The on GitHub (Mar 1, 2020).

CCExtractor version: this would affect versions after 0.88

Necessary information

  • Is this a regression (i.e. did it work before)? NO
  • What platform did you use? Windows
  • What were the used arguments? --norollup

Basically the title.

Simple (?): Look for signs that the speaker has changed, such as ">", ">>>", "Narrator:", or "Oprah:" at the start of a new line, and start the next caption there.
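As a rough illustration of the "simple" rule, here is a minimal Python sketch that flags lines opening with a speaker marker and groups caption lines accordingly. The pattern and the names `SPEAKER_MARKER`, `starts_new_speaker`, and `group_by_speaker` are hypothetical and not part of CCExtractor; the name-plus-colon rule is a guess that will also match non-speaker text such as "Note:".

```python
import re

# Hypothetical marker pattern (an assumption, not CCExtractor behavior):
# one or more ">" characters, or up to three capitalized words followed
# by a colon, at the start of a line.
SPEAKER_MARKER = re.compile(
    r"^\s*(?:>+|[A-Z][\w.'-]*(?: [A-Z][\w.'-]*){0,2}:)"
)


def starts_new_speaker(line: str) -> bool:
    """Return True if the line appears to open with a speaker marker."""
    return SPEAKER_MARKER.match(line) is not None


def group_by_speaker(lines):
    """Group consecutive lines, starting a new group at each marker."""
    groups = []
    for line in lines:
        if starts_new_speaker(line) or not groups:
            groups.append([line])
        else:
            groups[-1].append(line)
    return groups
```

Each returned group would become one caption; lines before the first marker are kept together as their own group.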

Less Simple: Check whether the captions are probably synchronized and, if so, also start a new caption after a long pause at the end of a sentence. (Maybe detect uneven character timing, as if typed by a stenographer, to distinguish live captioning from scripted. Scripted roll-up captions have accurate timing and are sometimes used in news magazine programs and weekly documentary series like Frontline.)
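The "less simple" idea could be sketched along these lines, assuming each roll-up line is available as a `(start, end, text)` tuple in seconds and that per-character arrival times can be obtained. Both input shapes, the 2-second `min_gap` default, and the function names are assumptions made for illustration; nothing here reflects what CCExtractor actually exposes.

```python
from statistics import pstdev

SENTENCE_END = (".", "?", "!")


def split_on_pauses(lines, min_gap=2.0):
    """Split timed lines into captions at long pauses after a sentence.

    lines: list of (start, end, text) tuples in seconds, sorted by start.
    min_gap: hypothetical pause threshold in seconds.
    """
    captions = []
    current = []
    for i, (start, end, text) in enumerate(lines):
        current.append((start, end, text))
        if i == len(lines) - 1:
            break
        next_start = lines[i + 1][0]
        ends_sentence = text.rstrip().endswith(SENTENCE_END)
        if ends_sentence and next_start - end >= min_gap:
            captions.append(current)
            current = []
    if current:
        captions.append(current)
    return captions


def timing_jitter(char_times):
    """Standard deviation of the gaps between character arrival times.

    A larger value suggests uneven, possibly live-typed captions; where
    to draw the line between live and scripted is left open.
    """
    gaps = [b - a for a, b in zip(char_times, char_times[1:])]
    if len(gaps) < 2:
        return 0.0
    return pstdev(gaps)
```

A caller might run `timing_jitter` first and apply `split_on_pauses` only when the jitter is low, on the theory that pauses are meaningful only when the captions are well synchronized.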

Thanks again, everyone, for all your hard work.


@cfsmp3 commented on GitHub (Mar 2, 2020):

That's probably best done by a post-processing program, to be honest. There's no need to add that kind of feature directly into the main program.

After all, C is not really a convenient language for string manipulation.

In Python, however, that would take just a few lines of code.
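For illustration only, a post-processing pass of this kind might look like the sketch below. It assumes the output is an SRT file (the thread does not name a format), splits each cue at lines that begin with a speaker marker, and divides the cue's time span in proportion to character count, which is a guess rather than a measured timing. All names here (`MARKER`, `to_ms`, `to_stamp`, `split_srt`) are hypothetical.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+)[,.](\d+)")
# Hypothetical speaker marker: ">" characters or a capitalized word + colon.
MARKER = re.compile(r"^\s*(?:>+|[A-Z][\w.'-]*:)")


def to_ms(stamp):
    """Convert an SRT timestamp such as 00:00:01,000 to milliseconds."""
    h, m, s, ms = map(int, TIME.match(stamp.strip()).groups())
    return ((h * 60 + m) * 60 + s) * 1000 + ms


def to_stamp(ms):
    """Convert milliseconds back to an SRT timestamp."""
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02}:{m:02}:{s:02},{ms:03}"


def split_srt(srt_text):
    """Split each cue at speaker markers and renumber the result."""
    out = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        rows = block.splitlines()
        if len(rows) < 3 or "-->" not in rows[1]:
            continue  # skip malformed or empty cues
        start, end = (to_ms(t) for t in rows[1].split("-->"))
        groups = []
        for line in rows[2:]:
            if MARKER.match(line) or not groups:
                groups.append([line])
            else:
                groups[-1].append(line)
        total = sum(len(l) for g in groups for l in g) or 1
        t = start
        for n, g in enumerate(groups):
            share = sum(len(l) for l in g)
            # The last group always ends exactly at the original cue end.
            if n == len(groups) - 1:
                stop = end
            else:
                stop = t + (end - start) * share // total
            out.append((t, stop, g))
            t = stop
    cues = (
        f"{i}\n{to_stamp(a)} --> {to_stamp(b)}\n" + "\n".join(g)
        for i, (a, b, g) in enumerate(out, 1)
    )
    return "\n\n".join(cues) + "\n"
```

Cues without any marker pass through unchanged apart from renumbering, so the script could be run over a whole file produced with `--norollup`.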


@cfsmp3 commented on GitHub (Apr 7, 2020):

Closing as won't-fix: I just feel that CCExtractor itself is not the right place for this.


Reference: starred/ccextractor#562