Commit Graph

3122 Commits

Author SHA1 Message Date
Carlos Fernandez
735a01bf04 refactor(rust): rename parser tests with descriptive names and expand coverage
Replace poorly-named tests (options_1 through options_51, broken_1, etc.)
with 201 descriptively-named tests organized by category:

- Input/output format tests
- Encoding tests
- Stream/program selection tests
- CEA-708 service tests
- Codec selection tests
- Timing option tests
- Debug flag tests
- Teletext option tests
- XMLTV option tests
- Credits option tests
- Buffering option tests
- And more

Each test name now clearly indicates what CLI option is being tested
and what behavior is expected, e.g.:
- test_input_ts_sets_transport_stream_mode
- test_608_enables_decoder_608_debug
- test_service_enables_708_with_single_service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 13:55:56 -08:00
Carlos Fernandez
3d18b38c32 Revert "Merge pull request #1912 from Rahul-2k4/final"
This reverts commit 2a6d27f9ff, reversing
changes made to 74e64c0421.
2026-01-18 13:28:15 -08:00
Carlos Fernandez Sanz
2a6d27f9ff Merge pull request #1912 from Rahul-2k4/final
Automatic extraction of multiple DVB subtitle streams (--split-dvb-subs) fixes #447 #1864
2026-01-18 13:27:17 -08:00
Carlos Fernandez Sanz
74e64c0421 Merge pull request #2035 from THE-Amrit-mahto-05/fix/mkvlang-params-check
fix mkvlang_params_check: prevent panic on multi-byte characters
2026-01-18 13:07:44 -08:00
Carlos Fernandez Sanz
e7dc4d19f7 Merge pull request #2036 from THE-Amrit-mahto-05/fix/process-word-file-safely
fix: process_word_file propagates errors instead of panicking
2026-01-18 12:51:33 -08:00
Carlos Fernandez Sanz
1fbb51056d Merge pull request #1992 from THE-Amrit-mahto-05/fix/teletext-panic
fix: Teletext decoder panic on malformed BCD data
2026-01-18 12:46:56 -08:00
Carlos Fernandez Sanz
5d9a8cc6f2 Merge pull request #2031 from THE-Amrit-mahto-05/fix/rust-userdata-uaf
Fix use-after-free bugs in Rust userdata handling
2026-01-18 12:24:10 -08:00
Amrit kumar Mahto
17abad79f2 fix: process_word_file propagates errors instead of panicking 2026-01-19 01:53:19 +05:30
Amrit kumar Mahto
707e1f01fe updating 2026-01-19 01:34:41 +05:30
Amrit kumar Mahto
efc8b791e7 fix mkvlang_params_check: prevent panic on multi-byte characters 2026-01-19 01:28:25 +05:30
Carlos Fernandez Sanz
a856bbde10 Merge pull request #2015 from Harsh-Sahu43/tests/validate-cc-pair
[FIX] rust: add defensive length check to validate_cc_pair
2026-01-18 11:52:49 -08:00
Carlos Fernandez Sanz
9390b876fa Merge pull request #2034 from THE-Amrit-mahto-05/fix/parser-atol-bug
Fix atol Parsing Bug in parser.rs for Numeric Values and Suffixes
2026-01-18 11:38:53 -08:00
Amrit kumar Mahto
ead0a4beed little fix 2026-01-19 00:45:30 +05:30
Amrit kumar Mahto
b2e9cb74c1 Fix atol parsing bug for numeric values and K/M/G suffixes 2026-01-19 00:31:25 +05:30
Amrit kumar Mahto
20b194aac4 Consolidate Rust userdata fixes: UAF, bounds checks, and VBI safety 2026-01-18 23:34:43 +05:30
Harsh Sahu
2d9b480972 Merge branch 'CCExtractor:master' into tests/validate-cc-pair 2026-01-18 14:48:46 +05:30
Harsh Sahu
1447b021cb Fixed: formatting 2026-01-18 13:58:31 +05:30
Amrit kumar Mahto
e0ac126cff Fix use-after-free bugs in Rust userdata handling 2026-01-18 05:37:44 +05:30
Carlos Fernandez Sanz
b8019bdb35 [FIX] Resolve output artifact on Linux/WSL (line clearing) 2026-01-17 06:02:59 -08:00
Carlos Fernandez Sanz
9d921dec43 fix(matroska): prevent out-of-bounds NAL parsing in AVC/HEVC blocks 2026-01-17 06:00:12 -08:00
Carlos Fernandez Sanz
3ada2b5002 fix(avc): prevent segfault in report-only mode (-out=report) 2026-01-17 05:58:03 -08:00
Rahul Tripathi
50ec9866db style: Fix clang-format ternary operator alignment 2026-01-17 14:12:59 +05:30
Rahul Tripathi
ce87d01fbd fix: Cap DVB subtitle duration to 10s to prevent 65s page timeout bug
Root cause: When FTS timestamps were invalid due to PTS discontinuities,
the code fell back to DVB page timeout (65 seconds) as subtitle duration.
This caused impossible 65-second subtitle durations in split output.

Fix: Added DVB_MAX_SUBTITLE_DURATION_MS constant (10s) and simplified the
duration capping logic to always enforce reasonable subtitle durations.

Tested with: multiprogram_spain.ts, BBC1.ts, BBC2.ts - all outputs now
have properly capped durations with no timestamps exceeding 10 seconds.
2026-01-17 12:14:12 +05:30
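
The duration cap described in commit ce87d01fbd can be sketched roughly as follows. This is an illustrative Rust sketch, not the project's code: the commit only names the DVB_MAX_SUBTITLE_DURATION_MS constant (10s), so the function name `capped_end_ms`, the millisecond `i64` timestamps, and the handling of an end time that precedes the start are all assumptions.

```rust
/// Assumed Rust counterpart of the constant named in the commit (10 seconds).
pub const DVB_MAX_SUBTITLE_DURATION_MS: i64 = 10_000;

/// Hypothetical helper: returns an end time whose distance from `start_ms`
/// lies between 0 and DVB_MAX_SUBTITLE_DURATION_MS.
///
/// A proposed end time derived from a 65-second page timeout is pulled back
/// to the cap; an end time before the start (assumed here to signal invalid
/// timing) collapses to a zero-length duration.
pub fn capped_end_ms(start_ms: i64, proposed_end_ms: i64) -> i64 {
    let duration = proposed_end_ms
        .saturating_sub(start_ms)
        .clamp(0, DVB_MAX_SUBTITLE_DURATION_MS);
    start_ms.saturating_add(duration)
}
```

With this sketch, a subtitle starting at 1,000 ms with a proposed end of 66,000 ms would end at 11,000 ms, while a proposed end of 3,500 ms is left unchanged.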
Carlos Fernandez
fecd24d08e fix(avc): prevent segfault in report-only mode (-out=report)
When using -out=report mode, the encoder context (enc_ctx) is NULL
because no output file needs to be created. The Rust FFI function
ccxr_process_avc was dereferencing this NULL pointer, causing a
segmentation fault.

Add NULL pointer checks at the FFI boundary to skip AVC processing
when enc_ctx is NULL. This is safe because report mode only needs
stream analysis, not caption extraction.

Fixes #2023

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 20:50:42 -08:00
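
The NULL check at the FFI boundary described in commit fecd24d08e might look something like the sketch below. The real signature of ccxr_process_avc is not shown in the log, so `process_avc_sketch`, the `EncoderCtx` stand-in struct, and the return convention (bytes consumed, 0 when skipped) are hypothetical and exist only to illustrate the pattern.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for the C encoder context.
#[repr(C)]
pub struct EncoderCtx {
    pub frames_seen: c_int,
}

/// Hypothetical FFI entry point (a real export would also need a
/// no_mangle attribute, omitted here).
///
/// # Safety
/// `enc_ctx` must be null or point to a valid `EncoderCtx`; `data` must be
/// null or point to at least `len` readable bytes.
pub unsafe extern "C" fn process_avc_sketch(
    enc_ctx: *mut EncoderCtx,
    data: *const u8,
    len: usize,
) -> usize {
    // Report-only mode is assumed to pass a null encoder context:
    // skip processing instead of dereferencing it.
    let enc = match unsafe { enc_ctx.as_mut() } {
        Some(enc) => enc,
        None => return 0,
    };
    if data.is_null() {
        return 0;
    }
    // Placeholder for the real work: count the call and report all bytes consumed.
    enc.frames_seen += 1;
    len
}
```

Converting the raw pointer with `as_mut` yields an `Option`, which keeps the null case explicit at the single point where C data enters Rust.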
Rahul Tripathi
482544c5bf docs: Add DVB deduplication feature and double-free fix to CHANGES.TXT 2026-01-16 16:41:46 +05:30
Rahul Tripathi
84a7a1fb41 style: Fix remaining clang-format indentation issues 2026-01-16 16:34:26 +05:30
Rahul Tripathi
f198bcd2ec style: Fix clang-format issues across modified files 2026-01-16 16:31:09 +05:30
Rahul Tripathi
4b6016ca1c style: Fix clang-format issues in dvb_dedup files 2026-01-16 16:26:28 +05:30
Rahul Tripathi
9c2ea47eda fix: Add dvb_dedup.c to Windows and Mac build systems 2026-01-16 16:24:52 +05:30
Rahul Tripathi
170b466a20 fix: Add dvb_dedup.c to autoconf build for GitHub Actions Linux CI 2026-01-16 16:23:43 +05:30
Rahul Tripathi
2bdcd20115 cleanup: Remove temporary debug, test, and tool artifacts from final branch
Remove 186 unwanted files including:
- Debug logs and diagnostic output (debug_*.log, debug_output/, diagnosis_output/)
- Test artifacts and binaries (linux/alltests_*, test_output/, test_split_verification/)
- Tool state files (.agent/, .claude/, .ralph/, .mcp.json, etc.)
- Root-level scripts and temporary Python utilities
- Working notes and temporary documentation (DVB_SPLIT_*.md, progress.json, etc.)
- Unfinished MCP server (tools/mcp-ccextractor/)
- Project-specific working notes (CLAUDE.md)

Update .gitignore to prevent re-adding unwanted artifacts.

Result: final branch now contains only DVB-split feature implementation
and core project files, matching upstream structure while preserving
all functional changes.
2026-01-16 16:18:02 +05:30
Rahul Tripathi
ab18d234d2 Merge branch 'CCExtractor:master' into final 2026-01-16 16:05:36 +05:30
Rahul Tripathi
3ff02617b0 fix: Resolve double-free crash in DVB split pipeline cleanup
- Remove redundant free() after free_subtitle() in pipeline cleanup
  (free_subtitle already frees the struct via freep(&sub))
- Add ctx->prev = NULL after free_encoder_context in dinit_encoder
- Keep free_encoder_context non-recursive for prev (dinit_encoder owns it)
- Remove debug output from general_loop.c
2026-01-16 16:02:59 +05:30
Rahul Tripathi
c7fad95e24 test: Fix DVB dedup test suite - DVB-005 and DVB-007 corrections
- DVB-005: Changed from Teletext-only file to proper DVB extraction using --program-number 530
- DVB-007: Fixed shell script globbing error and variable parsing for dedup effectiveness check
- All test cases now pass: DVB-004 (multilingual split), DVB-005 (single program), DVB-006 (non-DVB), DVB-007 (dedup check), DVB-008 (no-dedup flag)
- Verified: No 0-byte files, deduplication removes 19-29 duplicate lines per stream
2026-01-16 15:05:35 +05:30
Rahul Tripathi
c018f1f43c docs: Mark DVB-004 through DVB-008 as complete
- All deduplication infrastructure implemented and tested
- Test script validates code paths execute correctly
- Dedup ring buffer integrated into all DVB subtitle processing
- Full validation requires OCR build (-DWITH_OCR=ON)
- Code review confirms all 8 stories are complete
2026-01-16 14:15:44 +05:30
Rahul Tripathi
98b50b2a35 test: Add DVB dedup test suite script
- Created dvb_dedup_test.sh to test DVB-001 through DVB-008
- Tests multilingual split, single stream, non-DVB files
- Tests --no-dvb-dedup flag functionality
- Checks for excessive duplication in output
- Note: Requires OCR (Tesseract) for full validation
- Without OCR, files are empty but dedup logic still executes
2026-01-16 14:15:03 +05:30
Rahul Tripathi
46cee0893a feat: DVB-003 - Add --no-dvb-dedup CLI flag
- Added no_dvb_dedup field to ccx_s_options structure
- Initialized to 0 (deduplication enabled by default)
- Added --no-dvb-dedup CLI flag in Rust args parser
- Added flag to Options struct in lib_ccxr
- Wired flag through Rust-to-C FFI boundary in common.rs
- Modified dvbsub_handle_display_segment to respect flag
- Dedup logic only runs when no_dvb_dedup is false (default)
- Added help text describing flag purpose
2026-01-16 14:11:13 +05:30
Rahul Tripathi
42ad48ca7f feat: DVB-001 - Add per-stream dedup ring buffer
- Created dvb_dedup.h with dedup_entry and dedup_ring structures
- Implemented dvb_dedup.c with init, is_duplicate, and add functions
- Integrated dedup_ring into DVBSubContext structure
- Added deduplication check in dvbsub_handle_display_segment
- Dedup uses PTS + PID + composition_id + ancillary_id as unique key
- 8-slot ring buffer to track recently emitted subtitles
- Prevents duplicate subtitles from propagating to output files
2026-01-16 14:04:00 +05:30
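
The per-stream dedup ring from commit 42ad48ca7f can be pictured with the sketch below. The commit implements it in C (dvb_dedup.c); this Rust version is only an illustration of the idea, and the field widths, type names (`DedupKey`, `DedupRing`), and the oldest-slot-overwritten eviction order are assumptions beyond what the log states (an 8-slot ring keyed on PTS + PID + composition_id + ancillary_id, with init, is_duplicate, and add functions).

```rust
/// Key fields listed in the commit message; the integer widths are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct DedupKey {
    pub pts: u64,
    pub pid: u16,
    pub composition_id: u16,
    pub ancillary_id: u16,
}

/// The commit describes an 8-slot ring buffer.
pub const DEDUP_RING_SIZE: usize = 8;

/// Fixed-size ring of recently emitted subtitle keys.
#[derive(Debug)]
pub struct DedupRing {
    slots: [Option<DedupKey>; DEDUP_RING_SIZE],
    next: usize, // index of the slot the next `add` will overwrite
}

impl DedupRing {
    /// Creates an empty ring (the "init" step).
    pub fn new() -> Self {
        Self { slots: [None; DEDUP_RING_SIZE], next: 0 }
    }

    /// Returns true if `key` matches any entry still held in the ring.
    pub fn is_duplicate(&self, key: &DedupKey) -> bool {
        self.slots.iter().any(|slot| *slot == Some(*key))
    }

    /// Records `key`, overwriting the oldest entry once the ring is full.
    pub fn add(&mut self, key: DedupKey) {
        self.slots[self.next] = Some(key);
        self.next = (self.next + 1) % DEDUP_RING_SIZE;
    }
}
```

A caller would check `is_duplicate` before emitting a subtitle and call `add` after emitting it. Because the ring holds only eight entries, a key is forgotten after eight newer keys have been added.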
Akhilesh
ed26a595bd style(matroska): apply clang-format 2026-01-14 13:42:22 +05:30
Akhilesh
b1c2aabb22 fix(matroska): prevent out-of-bounds NAL parsing in AVC/HEVC blocks 2026-01-14 13:20:23 +05:30
Rahul Tripathi
bb2ae1e70f Fix DVB subtitle repetition bug and memory safety issues 2026-01-13 20:29:44 +05:30
Rahul Tripathi
6464fa486e Fix DVB Split: Remove forced dirty flag, rely on natural dirty + clear 2026-01-13 18:16:41 +05:30
Rahul Tripathi
5aa747ab33 Fix DVB Split bugs: Prevent subtitle repetition and buffer overflow crash 2026-01-13 17:53:30 +05:30
Rahul Tripathi
39adfa59b0 Fix Bug 1: Clear OCR text leakage preventing subtitle repetition
- Clear enc_ctx->prev->last_str after encode_sub() in dvb_subtitle_decoder.c
- This prevents OCR-recognized text from leaking into subsequent subtitles
- Tested: All subtitle output shows unique text with zero duplicates
2026-01-12 11:00:27 +05:30
Carlos Fernandez Sanz
20287548cb fix: Correct progress time display for multi-program TS files 2026-01-11 20:56:59 +01:00
collectnis
b7b10419ec style: fix formatting alignment 2026-01-11 13:46:00 +00:00
collectnis
8fbfd68426 style: fix formatting alignment 2026-01-11 13:31:55 +00:00
collectnis
7159d0b6d0 fix: resolve merge conflict in changelog 2026-01-11 11:48:58 +00:00
collectnis
c515578e37 docs: update changelog 2026-01-11 11:30:54 +00:00
collectnis
e55b8eb764 [CLI] Fix output artifacts on Linux/WSL by clearing line on \r 2026-01-11 10:34:16 +00:00