3307 Commits

Author SHA1 Message Date
Dhanush
395f9b3213 fix: MKV subtitle track .(null) extension for KATE and unknown codec IDs (#2250)
* fix: MKV subtitle track .(null) extension for KATE and unknown codec IDs

The matroska_track_text_subtitle_id_extensions array had 7 entries for
an 8-value enum, leaving MATROSKA_TRACK_SUBTITLE_CODEC_ID_KATE (index 7)
out of bounds. On most platforms this read NULL, which then caused
strlen(NULL) UB and snprintf to emit .(null) in the output filename.

Two fixes:
- Add "kate" at index 7 in the extensions array so KATE tracks
  produce correct .kate output filenames
- Add a NULL guard in generate_filename_from_track() so any future
  unknown codec ID safely falls back to .bin instead of crashing or
  producing .(null)

Fixes #972
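
The NULL guard described above can be sketched as a bounds-checked
lookup with a fallback. This is an illustration in Rust, not the C code
of the actual fix; extension_for and the table contents used with it
are hypothetical, and only the "kate" entry at index 7 and the .bin
fallback come from the commit message.

```rust
/// Looks up the file extension for a codec index.
/// Falls back to "bin" when the index is past the end of the table,
/// instead of reading out of bounds as the 7-entry C array did.
fn extension_for<'a>(table: &[&'a str], codec_index: usize) -> &'a str {
    table.get(codec_index).copied().unwrap_or("bin")
}
```

With a 7-entry table, index 7 yields "bin"; once "kate" is added as the
eighth entry, the same index yields "kate".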

---------

Co-authored-by: Dhanush Varma <your@email.com>
2026-04-04 14:28:48 -07:00
Abhijeet Kumar
65df24e6bc Fix integer overflow in ccxr_process_cc_data(): cast cc_count to usize before multiply (#2241)
cc_count * 3 used i32 arithmetic with no upper-bound check. For cc_count
> i32::MAX / 3 (~715 million), debug builds panic on overflow detection
and release builds silently wrap around to a negative range, discarding
all CC data for the frame. Both are triggerable from malformed media files.

Fix:
1. Cast cc_count to usize immediately after the existing <= 0 guard,
   before any arithmetic — eliminates the overflow entirely
2. Add MAX_CC_COUNT = 31 upper-bound guard — CEA-708/ATSC A/53 encodes
   cc_count in a 5-bit bitstream field (0x1F mask in avc_functions.c:514),
   making 31 the spec-defined per-frame maximum; this value is also
   independently documented in es/userdata.rs ("Maximum cc_count is 31").
   Returns -1 with a warn!() log for out-of-range values, consistent with
   existing error-handling style in the function.
3. Remove the now-redundant `x as usize` cast in the map closure since
   the range is already usize..usize

Fixes #2234
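
A minimal sketch of the guard order described above, assuming a helper
named cc_data_len (hypothetical). It returns Option instead of the -1
plus warn!() used in ccxr_process_cc_data(), so it only illustrates the
cast-then-bound-check sequence.

```rust
/// Per-frame maximum from the commit message: cc_count is a 5-bit field.
const MAX_CC_COUNT: usize = 31;

/// Returns the byte length of `cc_count` three-byte CC triplets,
/// or None when the count is non-positive or above MAX_CC_COUNT.
fn cc_data_len(cc_count: i32) -> Option<usize> {
    if cc_count <= 0 {
        return None;
    }
    // Cast before any arithmetic; a positive i32 fits in usize on
    // 32-bit and 64-bit targets.
    let count = cc_count as usize;
    if count > MAX_CC_COUNT {
        return None;
    }
    // At most 31 * 3 = 93, so this multiply cannot overflow.
    Some(count * 3)
}
```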
2026-04-03 20:04:24 -07:00
Abhijeet Kumar
92dc785435 Fix panic in process_page() on negative teletext PTS timestamps (#2240)
show_timestamp.to_srt_time().expect() and hide_timestamp.to_srt_time().expect()
in TeletextContext::process_page() panicked for any negative Timestamp value.
Negative timestamps are common in broadcast captures with wrap-around or
uninitialized PTS — crashing after potentially processing an entire file.

to_srt_time() → as_hms_millis() → i64::try_into::<u64>() returns
OutOfRangeError for negative values; .expect() made this fatal.

Fix: process_page() already returns Option<Subtitle>, so replace both
.expect() calls with .ok()? — silently skipping the subtitle when the
timestamp is out of range, matching the function's existing None-on-empty
contract.

Fixes #2233
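
The .expect() to .ok()? change can be sketched as follows. Both
functions are simplified stand-ins (hypothetical signatures, plain i64
milliseconds instead of the Timestamp type), meant only to show how a
conversion error turns into None rather than a panic.

```rust
use std::num::TryFromIntError;

/// Stand-in for to_srt_time(): fails for negative input because the
/// i64 -> u64 conversion is out of range.
fn to_srt_time(millis: i64) -> Result<String, TryFromIntError> {
    let ms = u64::try_from(millis)?;
    let (h, m, s, frac) = (ms / 3_600_000, ms / 60_000 % 60, ms / 1_000 % 60, ms % 1_000);
    Ok(format!("{h:02}:{m:02}:{s:02},{frac:03}"))
}

/// Stand-in for process_page(): skips the subtitle (returns None) when
/// either timestamp cannot be formatted.
fn process_page(show_ms: i64, hide_ms: i64, text: &str) -> Option<String> {
    if text.is_empty() {
        return None;
    }
    let show = to_srt_time(show_ms).ok()?;
    let hide = to_srt_time(hide_ms).ok()?;
    Some(format!("{show} --> {hide}\n{text}"))
}
```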
2026-04-03 19:50:14 -07:00
ahmedbektic
d56a6be9e4 [FIX] clean up rust TODO fix bad AVC SEI payload run (#2235)
* fix avc sei payload

* fix failed formatting test
2026-03-28 13:03:31 -07:00
Rizky Mirzaviandy Priambodo
47ad8388b1 [FIX] Route --hardsubx --tickertext through subtitle encoder (#2230)
* fix(hardsubx): route --tickertext through subtitle encoder (#2229)

* chore(pr): drop changelog entry for bug fix
2026-03-28 13:00:20 -07:00
Abhijeet Kumar
1b4123b302 Fix heap OOB read in switch_to_next_file(): sync inputfile array with num_input_files (#2218)
The Rust FFI function copy_from_rust() computed num_input_files by filtering
empty strings from the inputfiles Vec, but passed an unfiltered clone to
string_to_c_chars() to build the C inputfile[] array. This mismatch made the
C array length and num_input_files disagree: switch_to_next_file() could index
inputfile[current_file] where current_file < num_input_files but >= array size,
reading one slot past the end of the allocated array — confirmed by
AddressSanitizer (heap-buffer-overflow at file_functions.c:183).

The same count/size mismatch also caused free_rust_c_string_array() to
reconstruct the Vec with an incorrect capacity, producing heap corruption on
every clean shutdown.

Fix: filter empty strings into a single Vec<String> first, then derive both
num_input_files (filtered.len()) and the C array (string_to_c_chars(filtered))
from that same source, eliminating the mismatch entirely.

Fixes #2182
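
A sketch of the single-source idea, assuming a helper named
build_input_files (hypothetical; the real code goes through
string_to_c_chars() and raw pointers). The count is taken from the same
vector that becomes the C array, so the two cannot disagree.

```rust
use std::ffi::CString;

/// Filters out empty names, converts the rest to C strings, and returns
/// the count together with the array it was derived from.
fn build_input_files(inputfiles: &[String]) -> (usize, Vec<CString>) {
    let c_array: Vec<CString> = inputfiles
        .iter()
        .filter(|s| !s.is_empty())
        // Names with an interior NUL cannot be C strings; drop them too.
        .filter_map(|s| CString::new(s.as_str()).ok())
        .collect();
    (c_array.len(), c_array)
}
```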

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 12:32:41 -07:00
Carlos Fernandez Sanz
c4573bdced style: add missing braces to strdup NULL checks (#2213 follow-up) (#2237) 2026-03-28 12:08:31 -07:00
Abhijeet Kumar
8e4bcfc0cb Fix unchecked strdup() return values in lib_ccx (#2213)
Two related strdup bugs across multiple lib_ccx files:

1. strdup(variable) return value not checked for NULL: on OOM the NULL
   result is dereferenced later, which is undefined behavior (typically
   a segfault). Fixed by adding NULL check + fatal(EXIT_NOT_ENOUGH_MEMORY, ...).

2. strdup("literal") in get_buffer_type_str returned directly as
   function result — unchecked and leaks memory on every call since
   the function has no callers that free it.  Fixed by removing strdup
   and returning string literals directly; return type changed from
   char * to const char * (no callers exist, no header declaration).

Files changed:
  src/lib_ccx/ccx_common_common.c
  src/lib_ccx/ccx_encoders_common.c
  src/lib_ccx/ccx_encoders_helpers.c
  src/lib_ccx/configuration.c
  src/lib_ccx/hardsubx.c
  src/lib_ccx/hardsubx_decoder.c
  src/lib_ccx/ocr.c
  src/lib_ccx/output.c
  src/lib_ccx/ts_functions.c

Fixes #2194

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-28 11:59:35 -07:00
Carlos Fernandez Sanz
159b2193f4 style: add missing braces in general_loop.c (#2210 follow-up) (#2236)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-28 11:42:02 -07:00
Dhanush
ab109cf316 fix: crash when reading from stdin with no data (#2210)
Two bugs: (1) general_loop.c dereferences NULL dec_ctx in the
live_stream progress code when no data was processed. (2) Rust
switch_to_next_file calls demux_ctx.open() for stdin, triggering
null pointer dereference via ptr::null().

Fix: add NULL check for dec_ctx in general_loop.c, and return
early from switch_to_next_file for stdin/network/tcp sources
instead of calling demux_ctx.open().

Co-authored-by: Dhanush Varma <your@email.com>
2026-03-28 11:40:48 -07:00
Navdeep Kaur
c20b408527 [FIX] Add NULL checks for fopen() and alloc_demuxer_data() in process_hex() (#2202)
* fix(general_loop): Add NULL checks for fopen() and alloc_demuxer_data() in process_hex()

* docs: update changelog for process_hex NULL checks fix

* fix: use EXIT_READ_ERROR for fopen failure and remove CHANGES.TXT entry
2026-03-28 11:25:40 -07:00
dependabot[bot]
74e3842ed0 chore(deps): bump microsoft/setup-msbuild from 2.0.0 to 3.0.0 (#2224)
Bumps [microsoft/setup-msbuild](https://github.com/microsoft/setup-msbuild) from 2.0.0 to 3.0.0.
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v2.0.0...v3.0.0)

---
updated-dependencies:
- dependency-name: microsoft/setup-msbuild
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-24 17:16:41 -07:00
pszemus
03ad9e8e02 [FEATURE] Allow output \0 terminated frames (for WebSocket streaming support) (#2105)
* [feat] Allow output \0 terminated frames

* Fix rust `FromCType`

* use encoded_end_frame for text-based captions

* add changelog entry

* fix CEA-708 Rust decoder

* fix Rust formatting

* remove unused `crlf` field - satisfy clippy function argument limit

* silence clippy function argument limit in `Writer`

* Fix writing frame end with multiline captions

* fix formatting errors
2026-03-18 18:16:43 -07:00
Atul Chahar
9f250b144d fix(cea708): use dynamic current_fps instead of hardcoded 29.97 in SCC frame delays (#2173)
Replace all 6 hardcoded 1000/29.97 frame delay calculations in
dtvcc_write_scc() with 1000/current_fps so that CEA-708 SCC output
uses the actual stream framerate instead of assuming NTSC 29.97.

Fixes #2172
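
The arithmetic behind the change, as a small sketch. frame_delay_ms is
a hypothetical name and the snippet is Rust, whereas dtvcc_write_scc()
itself is not shown in this log; only the 1000/current_fps formula
comes from the commit message.

```rust
/// Duration of one frame in milliseconds for the given frame rate.
fn frame_delay_ms(current_fps: f64) -> f64 {
    1000.0 / current_fps
}
```

For a 25 fps stream this gives 40 ms per frame, compared with roughly
33.37 ms under the old hardcoded 29.97 assumption.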
2026-03-17 20:20:17 -07:00
Carlos Fernandez Sanz
0b1a967b73 fix: prevent stream detection from corrupting current_file index (#2209)
detect_stream_type() reads up to 1MB (STARTBYTESLENGTH) via
buffered_read_opt() for format detection. For input files smaller
than 1MB, the read hits EOF and—because binary_concat defaults to
enabled—buffered_read_opt() calls switch_to_next_file(). This
increments current_file past the valid range and closes the file
descriptor, leaving format-specific handlers (matroska_loop, MP4,
etc.) to crash when they access inputfile[current_file].

Fix: temporarily disable binary_concat around detect_stream_type()
so that hitting EOF during detection never triggers file switching.

Fixes the root cause of the crash reported in PR #2206 (which
proposed a band-aid of using current_file-1).
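
The save/disable/restore pattern described above, sketched in Rust
under stated assumptions: Options and detect_without_switching are
hypothetical names, and the actual change wraps detect_stream_type() in
the C code.

```rust
/// Hypothetical reduction of the options struct to the one flag involved.
struct Options {
    binary_concat: bool,
}

/// Runs `detect` with binary_concat forced off, then restores the
/// caller's setting so later reads can still switch files.
fn detect_without_switching<T>(opts: &mut Options, detect: impl FnOnce(&Options) -> T) -> T {
    let saved = opts.binary_concat;
    opts.binary_concat = false;
    let result = detect(&*opts);
    opts.binary_concat = saved;
    result
}
```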

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 20:12:20 -07:00
Carlos Fernandez Sanz
52b5385c2a ci: add dual build artifacts for Windows (min-rust vs migrations) (#2208)
Add $(ExtraDefines) to PreprocessorDefinitions in all 4 configurations
of the vcxproj. This allows passing /p:ExtraDefines=DISABLE_RUST from
the MSBuild command line to use C code paths for switchable modules.

The Windows CI now produces two Release artifacts per architecture:
- "CCExtractor Windows x64 Release build" — min Rust (DISABLE_RUST)
- "CCExtractor Windows x64 Release build (with migrations)" — max Rust

The migrations build uses /t:Rebuild to do a clean rebuild without
DISABLE_RUST after the min-rust build completes.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-17 08:48:02 -07:00
Carlos Fernandez Sanz
92389cff84 ci: add dual build artifacts to compare C vs Rust code paths (#2207)
* fix: flush pending EIT sections in EPG_free() before freeing buffers

* ci: add dual build artifacts to compare C vs Rust code paths

Add -min-rust flag to linux/build that passes -DDISABLE_RUST to gcc,
causing switchable modules (DTVCC, demuxer, AVC, networking, hex utils)
to use their C implementations instead of Rust. The Rust library still
compiles since many modules are Rust-only.

The Linux CI now produces two artifacts:
- "CCExtractor Linux build" — min Rust (C paths where available)
- "CCExtractor Linux build (with migrations)" — max Rust

Both should produce identical output on the sample platform. If they
diverge, it means a Rust port introduced a behavioral difference.

The sample platform will need a corresponding update to recognize and
test the new "with migrations" artifact.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Varadraj75 <agrawalvaradraj2007@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 21:47:59 -07:00
Abhijith
578abcaf3b [FIX] Fix ownership of default Matroska track language (#2193)
* Fix ownership of default Matroska track language

* matroska: restore lang_ietf preference in filename gen and lang matching

Restore the working IETF language preference that was accidentally
removed in the previous commit:

- generate_filename_from_track(): use lang_ietf ? lang_ietf : lang
  for buffer sizing and both snprintf calls
- save_vobsub_track(): same lang_tag pattern for base filename
- matroska_save_all(): check lang_ietf first (BCP-47), fall back to
  lang (ISO-639-2) when --mkvlang filter is active; remove now-unused
  char *match variable

The strdup NULL check and broken switch-case removal from the prior
commit are unchanged.

* matroska: fix clang-format style

* matroska: fix filename underscores, LLD macro, braces, param name
2026-03-15 20:09:52 -07:00
Dhanush
5c87a33a8a fix: add link_directories for tesseract/leptonica on macOS (#2186)
pkg_check_modules provides library names without paths.
Without link_directories, the linker cannot find tesseract
and leptonica on systems where they are not in default
search paths (e.g. Homebrew on macOS arm64).

Co-authored-by: Dhanush Varma <your@email.com>
2026-03-15 11:05:57 -07:00
Carlos Fernandez Sanz
6281a481d0 chore: update PR template to set clearer expectations (#2203)
Require repro instructions and samples, discourage AI-generated PRs
for theoretical issues, clarify changelog and C code policies.
2026-03-14 12:58:41 -07:00
Chandragupt Singh
ed7f544e10 [FEATURE] Add guarded ASS/SSA \pos positioning for CEA-608 captions (#1885)
* feat(ssa): add guarded ASS \pos positioning for CEA-608 captions

* fix(ssa): correct ASS positioning anchor, validate row adjacency, and clean up variable placement

* fix(ssa): adjust top margin to prevent clipping of top-positioned CEA-608 captions

* ssa: map CEA-608 row+col to ASS coords using FFmpeg safe-area formula and fix \an2→\an7 anchor
2026-03-14 10:13:40 -07:00
Apoorv Darshan
538e39db67 Fix null pointer dereference in Matroska parser on file open failure (#2171)
create_file() returns the result of fopen() which can be NULL if the
file cannot be opened. matroska_loop() never checked this, passing
the NULL pointer into matroska_parse() where it is immediately used
in feof(), causing a crash.

Add a NULL check that calls fatal(EXIT_READ_ERROR, ...) on failure,
consistent with other file-open error handling in the codebase.
2026-03-14 10:03:39 -07:00
Arun
af53968611 Fix: Heap OOB buffer over-read in MP4 atom parsing (#2179) (#2180)
* Fix: Heap OOB buffer over-read in MP4 atom parsing (#2179)

* formatting

---------

Co-authored-by: Arun kumar <arunkumar@Aruns-MacBook-Air.local>
2026-03-12 20:54:28 -07:00
Carlos Fernandez Sanz
0f41b70a6e style: add missing braces to if blocks in lib_ccx.c (#2147 follow-up) (#2197)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-12 18:56:52 -07:00
Atul Chahar
58a8ded621 Fix MSVC cross-CRT invalid free on output_filename (#2147) 2026-03-12 18:54:47 -07:00
Carlos Fernandez Sanz
ee57fb46f3 chore: clean up #2168 merge — drop internal CHANGES.TXT entry, fix whitespace (#2196)
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-12 18:49:44 -07:00
Varad Raj Agrawal
dc1d8d9592 fix: move VBI_DEBUG to CMake opt-in, fix MSVC empty struct error (#2168)
- Remove unconditional #define VBI_DEBUG from ccx_decoders_vbi.h
- Add CMake option VBI_DEBUG (OFF by default) in src/CMakeLists.txt
- Use #ifdef VBI_DEBUG / #else for debug_file_name vs reserved member,
  preventing MSVC C2016 empty struct error in non-debug builds
- Add changelog entry in docs/CHANGES.TXT under 0.96.7 unreleased

Fixes #2167
2026-03-12 18:47:43 -07:00
dependabot[bot]
d0c73362ed chore(deps): bump docker/build-push-action from 6 to 7 (#2181)
Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 6 to 7.
- [Release notes](https://github.com/docker/build-push-action/releases)
- [Commits](https://github.com/docker/build-push-action/compare/v6...v7)

---
updated-dependencies:
- dependency-name: docker/build-push-action
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-08 13:28:04 -07:00
dependabot[bot]
80ed678f98 chore(deps): bump docker/setup-buildx-action from 3 to 4 (#2178)
Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 3 to 4.
- [Release notes](https://github.com/docker/setup-buildx-action/releases)
- [Commits](https://github.com/docker/setup-buildx-action/compare/v3...v4)

---
updated-dependencies:
- dependency-name: docker/setup-buildx-action
  dependency-version: '4'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-07 23:16:15 -08:00
Varad Raj Agrawal
90128d8c28 fix: memory leaks and invalid CSS in WebVTT encoder (#2164)
* fix: memory leaks and invalid CSS in WebVTT encoder

- Remove 6 unnecessary strdup() calls on string literals in
  write_cc_buffer_as_webvtt() — literals are passed directly to
  write_wrapped() which takes void*, no heap allocation needed.
  This runs in a per-character inner loop and leaked on every
  styled subtitle in a broadcast.
- Fix invalid CSS: rgba(0, 256, 0, 0.5) -> rgba(0, 255, 0, 0.5)
  CSS color channels are 0-255; 256 is out of range.
- Fix missing free(unescaped) on write-error path in
  write_stringz_as_webvtt() — matched the existing pattern on
  the adjacent error path which correctly freed both el and unescaped.

Fixes #2154

* fix: move WebVTT changelog entry to unreleased 0.96.7 section
2026-03-07 00:37:12 -08:00
Pranav Sharma
b2c1babf90 Feat(rust): Implement WebVTT-specific timestamp format and layout anchor (#2135)
* Feat(rust): Implement WebVTT-specific timestamp format and layout anchor

* style: apply rustfmt to g608.rs

* fix(rust): use WebVTT-spec dot separator for milliseconds in timestamp line
2026-03-07 00:26:55 -08:00
Anayo Anyafulu
a44db9f617 Port hex_string_to_int from C to Rust and support uppercase hex (#2141) 2026-03-06 23:55:33 -08:00
rhythmcache
e4bcade799 Fix potential out-of-bounds access in write_stringz_as_srt_to_output (#2128)
* Fix loop condition for reading unescaped string

* Fix condition to check for newline escape sequence

* Fix formatting
2026-03-06 23:44:40 -08:00
cheron2000
f377be9578 cmake: guard Unix-only linker flags on non-Windows platforms (#2156) 2026-03-01 12:05:37 -08:00
Carlos Fernandez Sanz
8de778af32 fix(report): NULL guard and deduplicate call in teletext JSON report (#2155)
Follow-up to #2137:
- Add NULL check on private_data in tlt_print_seen_pages_json
- Remove duplicate get_sib_stream_by_type call in print_file_report_json

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 15:56:36 -08:00
Carlos Fernandez Sanz
c3c5d9c0a0 Merge pull request #2045 from pranavshar223/docs-userdata-debug-log
docs/debug: add verbose log when user data is skipped
2026-02-28 15:48:43 -08:00
Gaurav karmakar
02c524f693 [FIX] Fix panic when using --mp4/--mkv without explicit input format (#2107)
* Fix panic when using --mp4/--mkv without explicit input format

* Restored autodetection logic & fixed TCP input regression

* style: apply cargo fmt

* fix(parser): add --mp4/--mkv handling and remove unwrap panic

Properly handle mp4/mkv flags in set_input_format()
and replace args.input.unwrap() with unwrap_or().
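
A rough sketch of the unwrap() to unwrap_or() idea. The enum, the
function name, and the choice of Autodetect as the default are all
assumptions for illustration; the commit message does not state which
default set_input_format() uses.

```rust
/// Hypothetical subset of input formats.
#[derive(Debug, Clone, Copy, PartialEq)]
enum InputFormat {
    Autodetect,
    Mp4,
    Mkv,
}

/// Explicit --mp4/--mkv flags win; otherwise use the requested format,
/// falling back to a default instead of panicking when none was given.
fn resolve_input_format(input: Option<InputFormat>, mp4: bool, mkv: bool) -> InputFormat {
    if mp4 {
        return InputFormat::Mp4;
    }
    if mkv {
        return InputFormat::Mkv;
    }
    input.unwrap_or(InputFormat::Autodetect)
}
```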

* resolve the formatting issue

---------

Co-authored-by: GAURAV KARMAKAR <gaurav.k@graeon.ai>
2026-02-28 15:35:56 -08:00
Carlos Fernandez Sanz
9614f58187 Merge pull request #2137 from ananyaaa66/master
Add teletext pages and PID type tagging to JSON report output (#1399)
2026-02-28 14:25:22 -08:00
Carlos Fernandez Sanz
5de265d64f Merge pull request #2152 from Varadraj75/feat/mkv-mpeg2-cc-extraction
FEATURE: Add V_MPEG2 track support in MKV demuxer for CC extraction
2026-02-28 14:02:09 -08:00
Carlos Fernandez Sanz
d80bf92820 test(dtvcc): add lazy decoder allocation lifecycle test 2026-02-28 13:52:00 -08:00
Carlos Fernandez
6ee370cafe style: cargo fmt
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:18:40 -08:00
Carlos Fernandez
e9a84ac2aa test(dtvcc): add lazy decoder allocation lifecycle test
Verify that CEA-708 service decoders are not allocated at startup
and are only created on first use when data arrives for that service.
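
The lifecycle this test checks can be sketched like this. Dtvcc,
ServiceDecoder, and feed are simplified, hypothetical stand-ins for the
real decoder types; the point is only that slots start empty and are
filled on first use.

```rust
/// Hypothetical per-service decoder state.
#[derive(Default)]
struct ServiceDecoder {
    bytes_seen: usize,
}

/// Holds one optional, lazily allocated decoder per service.
struct Dtvcc {
    decoders: Vec<Option<Box<ServiceDecoder>>>,
}

impl Dtvcc {
    /// Creates the slots without allocating any decoder.
    fn new(service_count: usize) -> Self {
        Self { decoders: (0..service_count).map(|_| None).collect() }
    }

    /// Number of decoders that have actually been allocated.
    fn allocated(&self) -> usize {
        self.decoders.iter().filter(|d| d.is_some()).count()
    }

    /// Allocates the decoder for `service` on first use, then records
    /// the data length. Out-of-range services are ignored.
    fn feed(&mut self, service: usize, data: &[u8]) {
        if let Some(slot) = self.decoders.get_mut(service) {
            let decoder = slot.get_or_insert_with(Box::default);
            decoder.bytes_seen += data.len();
        }
    }
}
```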

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:14:25 -08:00
Carlos Fernandez Sanz
ffb380601a Merge pull request #2151 from x15sr71/fix/x86-decoder-alloc-panic
Fix: Lazy CEA-708 service decoder allocation to prevent OOM panic on x86 (32-bit Windows)
2026-02-28 13:08:11 -08:00
Carlos Fernandez
36711b9d3b Merge origin/master into fix/x86-decoder-alloc-panic
Resolve CHANGES.TXT conflict: keep both entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-28 13:07:54 -08:00
Carlos Fernandez Sanz
c919dafd4a Merge pull request #2150 from Varadraj75/fix/scc-framerate-help-and-23.98-alias
[FEATURE] Add 23.98fps alias for --scc-framerate and clarify help text
2026-02-28 12:59:17 -08:00
Carlos Fernandez Sanz
af57a0c425 Merge pull request #2148 from CCExtractor/dependabot/github_actions/actions/upload-artifact-7
chore(deps): bump actions/upload-artifact from 6 to 7
2026-02-28 12:55:15 -08:00
Carlos Fernandez Sanz
f457348a43 Merge pull request #2138 from x15sr71/fix/rust-timing-unwrap-panic
FIX(rust): prevent panic when formatting out-of-range timestamps in timing.rs and c_functions.rs
2026-02-28 10:30:05 -08:00
Varadraj75
a87ad2bec7 style: remove BOM from matroska.c 2026-02-28 22:57:58 +05:30
Varadraj75
0cf5abfa9c style: apply clang-format to matroska.c 2026-02-28 22:51:39 +05:30
Varadraj75
934398fc86 feat: support V_MPEG2 tracks in MKV demuxer for CC extraction
MKV files with MPEG-2 video (common in DVD sources) were silently skipped.
Add V_MPEG2 track detection and processing using the existing process_m2v()
infrastructure, matching how mp4.c handles MPEG-2 streams.

Fixes #2149
2026-02-28 21:43:42 +05:30