[PR #1521] [MERGED] [FIX] #1520 keep webvtt-full formatting in sync #2244

Closed
opened 2026-01-29 17:21:06 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/1521
Author: @dhouck
Created: 3/27/2023
Status: Merged
Merged: 3/28/2023
Merged by: @cfsmp3

Base: master ← Head: webvtt-full-unicode


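📝 Commits (1)

  • b622b18 [FIX] #1520 keep webvtt-full formatting in sync (https://github.com/CCExtractor/ccextractor/commit/b622b18f3cfbf2d3d2019a8bef7a2c8fa70271bd)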

📊 Changes

2 files changed (+14 additions, -41 deletions)

📝 docs/CHANGES.TXT (+1 -0)
📝 src/lib_ccx/ccx_encoders_webvtt.c (+13 -41)

📄 Description

In raising this pull request, I confirm the following (please check boxes):

  • [x] I have read and understood the contributors guide.
  • [x] I have checked that another pull request for this purpose does not exist.
  • [x] I have considered, and confirmed that this submission will be valuable to others.
  • [x] I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • [x] I give this submission freely, and claim no ownership to its content.
  • [x] I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • [ ] I have never used CCExtractor.
  • [x] I have used CCExtractor just a couple of times.
  • [ ] I absolutely love CCExtractor, but have not contributed previously.
  • [ ] I am an active contributor to CCExtractor.

Fixes #1520 so that style and text donʼt get out of sync.

I can see a few other issues (listed below) for which I donʼt have sample files, so I can neither submit full issue reports nor fix them myself. Fixing them might mean rewriting the entire loop or even the whole function:

  • It can still have mis-nested HTML tags (it can output normal<i>italics<u>both</i>underlined, or even worse with colors)
  • Even without --nohtmlescape it does not escape unsafe characters (which SRT also doesnʼt).
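
One way the mis-nesting described in the first bullet could be avoided is to close every open tag, innermost first, whenever the style changes, and then reopen whatever the next run needs. The following is only a minimal sketch of that idea, not the code in ccx_encoders_webvtt.c: the `styled_run` struct, the `STYLE_*` flags, and the `append` and `render_runs` helpers are all hypothetical names invented for illustration, and colors are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical style flags for this sketch; not CCExtractor's own types. */
enum { STYLE_ITALIC = 1, STYLE_UNDERLINE = 2 };

/* Hypothetical run of text that shares a single style. */
struct styled_run {
	const char *text;
	unsigned style;
};

/* Append src to the NUL-terminated string in dst (cap bytes in total). */
static void append(char *dst, size_t cap, const char *src)
{
	size_t used = strlen(dst);
	assert(used + strlen(src) < cap); /* sketch only: no graceful overflow handling */
	memcpy(dst + used, src, strlen(src) + 1);
}

/*
 * Emit the runs with well-nested tags. On every style change, all open
 * tags are closed in reverse order of opening (<u> is always innermost),
 * then the tags for the new style are opened in a fixed order.
 */
static void render_runs(const struct styled_run *runs, size_t n, char *out, size_t cap)
{
	unsigned open = 0;

	assert(cap > 0);
	out[0] = '\0';
	/* The extra iteration (i == n) closes whatever is still open at the end. */
	for (size_t i = 0; i <= n; i++) {
		unsigned want = (i < n) ? runs[i].style : 0;
		if (want != open) {
			if (open & STYLE_UNDERLINE) append(out, cap, "</u>");
			if (open & STYLE_ITALIC) append(out, cap, "</i>");
			if (want & STYLE_ITALIC) append(out, cap, "<i>");
			if (want & STYLE_UNDERLINE) append(out, cap, "<u>");
			open = want;
		}
		if (i < n) append(out, cap, runs[i].text);
	}
}
```

For the four runs in the example above (normal, italics, both, underlined) this produces `normal<i>italics</i><i><u>both</u></i><u>underlined</u>`. That is more verbose than strictly necessary, but every tag is closed before its parent.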

Fixing those probably requires a much bigger refactor than I am able to do at the moment. In most cases the first probably wonʼt be an issue (I expect most parsers to handle it properly, but I havenʼt checked), and unless the subtitles contain HTML the second probably isnʼt an issue either.

The reason I made buf five bytes is in case Iʼm wrong about it needing a full rewrite; this way it can fit all of &amp;, although it will never contain it as things are now.
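
To illustrate what per-character escaping into a small scratch buffer might look like, here is a rough sketch. It is an assumption-laden illustration, not the code in this PR: `escape_char`, `escape_text`, and `ESCAPED_MAX` are hypothetical names, the choice of `&`, `<`, and `>` as the characters to escape is an assumption, and the scratch buffer here is NUL-terminated, so it is six bytes rather than five.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Longest replacement is "&amp;" (five characters) plus a terminating NUL. */
#define ESCAPED_MAX 6

/* Hypothetical helper: write the escaped form of c into buf, return its length. */
static size_t escape_char(char c, char buf[ESCAPED_MAX])
{
	const char *rep;

	switch (c) {
	case '&': rep = "&amp;"; break;
	case '<': rep = "&lt;"; break;
	case '>': rep = "&gt;"; break;
	default:
		/* Safe character: copy it through unchanged. */
		buf[0] = c;
		buf[1] = '\0';
		return 1;
	}
	size_t len = strlen(rep);
	memcpy(buf, rep, len + 1);
	return len;
}

/*
 * Hypothetical helper: escape src into dst, which holds cap bytes including
 * the terminating NUL. Returns 0 on success, -1 if dst is too small.
 */
static int escape_text(const char *src, char *dst, size_t cap)
{
	size_t used = 0;
	char buf[ESCAPED_MAX];

	assert(src != NULL && dst != NULL);
	if (cap == 0)
		return -1;
	for (; *src; src++) {
		size_t len = escape_char(*src, buf);
		if (used + len >= cap)
			return -1; /* no room left for this piece plus the NUL */
		memcpy(dst + used, buf, len);
		used += len;
	}
	dst[used] = '\0';
	return 0;
}
```

With this sketch, `escape_text("a<b & c>d", out, sizeof out)` fills `out` with `a&lt;b &amp; c&gt;d`. Escaping has to happen on the caption text only, before any formatting tags are added, or the tags themselves would be escaped too.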


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:21:06 +00:00

Reference: starred/ccextractor#2244