Compare commits

46 Commits

Author SHA1 Message Date
Carlos Fernandez
1b0e66bc67 feat: Add --input scc option for SCC input format
Add support for `--input scc` command line option to explicitly specify
SCC (Scenarist Closed Caption) input format, for consistency with other
input format options.

Changes:
- Add `Scc` variant to `InFormat` enum in args.rs
- Handle `InFormat::Scc` in parser.rs to set StreamMode::Scc
- Add `StreamMode::Scc` case in print_cfg() in both Rust and C code

Fixes #1972

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 21:45:08 +01:00
Carlos Fernandez Sanz
f5dc1cf467 fix: Make --quiet flag work again 2026-01-02 21:35:42 +01:00
Carlos Fernandez
d31ea87c03 fix: Make --quiet flag work again
The --quiet flag was broken due to two issues:

1. Inverted mapping in Rust FFI: The C→Rust constant mapping was wrong.
   CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1, CCX_MESSAGES_STDERR=2
   but the Rust code mapped 0→Stdout, 1→Stderr, 2→Quiet.

2. Logger initialization timing: The Rust logger was initialized BEFORE
   command-line arguments were parsed, so --quiet had no effect.

Changes:
- Fix the OutputTarget mapping in ccxr_init_basic_logger()
- Add set_target() method to CCExtractorLogger
- Add ccxr_update_logger_target() to update logger after arg parsing
- Call ccxr_update_logger_target() after ccxr_parse_parameters()

Fixes #1956

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 19:49:06 +01:00
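The corrected constant mapping from this commit can be illustrated with a minimal C sketch. The three `CCX_MESSAGES_*` values are taken from the commit message; the `output_target` enum and `map_message_target()` are hypothetical names standing in for the Rust-side `OutputTarget` logic, and the fallback for unknown values is an assumption of the sketch.

```c
/* Values as stated in the commit message. */
#define CCX_MESSAGES_QUIET 0
#define CCX_MESSAGES_STDOUT 1
#define CCX_MESSAGES_STDERR 2

/* Hypothetical stand-in for the Rust-side OutputTarget enum. */
enum output_target
{
    TARGET_QUIET,
    TARGET_STDOUT,
    TARGET_STDERR
};

/* Map the C constant to a logger target. Unknown values fall back to
 * stdout; that fallback is an assumption made for this sketch. */
enum output_target map_message_target(int messages_target)
{
    switch (messages_target)
    {
        case CCX_MESSAGES_QUIET:
            return TARGET_QUIET;
        case CCX_MESSAGES_STDOUT:
            return TARGET_STDOUT;
        case CCX_MESSAGES_STDERR:
            return TARGET_STDERR;
        default:
            return TARGET_STDOUT;
    }
}
```

With the mapping keyed on the named constants rather than on literal positions, an inverted table like the one described above (0→Stdout, 1→Stderr, 2→Quiet) is harder to reintroduce.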
Carlos Fernandez Sanz
a5b8bc8bf6 fix(rust): Update palette crate to 0.7 for Fedora compatibility 2026-01-02 10:00:00 +01:00
Carlos Fernandez
ad2ee70743 fix(rust): Update palette crate to 0.7 for Fedora compatibility
The palette crate renamed `to_positive_degrees()` to `into_positive_degrees()`
in version 0.7.0. This was causing build failures on Fedora which uses
system-packaged Rust crates with newer versions.

Changes:
- Update palette dependency from 0.6.1 to 0.7
- Change method call from to_positive_degrees() to into_positive_degrees()

Fixes build failure reported in #1954.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 08:11:47 +01:00
Carlos Fernandez Sanz
562de8893b Merge pull request #1953 from THE-Amrit-mahto-05/fix/ts-heap-overflow
Fix/ts heap overflow
2026-01-02 08:09:39 +01:00
Carlos Fernandez Sanz
12adb5e92b fix(ci): Fix Windows CI cargo build cache path 2026-01-02 08:06:22 +01:00
Carlos Fernandez Sanz
203eb23030 fix(build): Support FFMPEG_INCLUDE_DIR on Linux for hardsubx 2026-01-02 08:02:46 +01:00
Amrit Kumar Mahto
774c3a0d3a Update CHANGES.TXT 2026-01-02 04:31:39 +05:30
Amrit Kumar Mahto
07f1ddc3fe Fix capbufsize and capbuflen assignments to use size_t 2026-01-02 04:26:23 +05:30
Carlos Fernandez
303bec8d5d fix(build): Support FFMPEG_INCLUDE_DIR on Linux for hardsubx
The FFMPEG_INCLUDE_DIR environment variable was only checked inside
the macOS-specific block, so it had no effect on Linux builds.

Changes:
- Move FFMPEG_INCLUDE_DIR check outside platform-specific blocks so
  it works on all platforms
- Add pkg-config fallback on Linux to automatically find FFmpeg
  include paths

This fixes compilation on systems like Fedora where FFmpeg headers
are installed in non-standard locations (e.g., /usr/include/ffmpeg).

Fixes #1954

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 23:24:44 +01:00
Amrit kumar Mahto
e43a6b5ced Fix TS Heap Buffer Overflow in copy_payload_to_capbuf (ts_functions.c) 2026-01-02 00:59:31 +05:30
Amrit kumar Mahto
64484af49e [FIX] Prevent stack buffer overflow in ISDB-CC decoder parse_csi 2026-01-02 00:40:07 +05:30
Amrit kumar Mahto
7526da884c Prevent integer overflow in EIA-608 screen buffer reallocation 2026-01-01 23:20:25 +05:30
Carlos Fernandez Sanz
3529bb29b4 fix(avc): Remove unnecessary TODO for idr_pic_id 2026-01-01 13:02:25 +01:00
Carlos Fernandez
925560f773 fix(avc): Remove unnecessary TODO for idr_pic_id
The idr_pic_id is read to advance the bitstream position (required for
correct parsing of subsequent fields), but the value itself is not
needed for caption extraction. CCExtractor uses pic_order_cnt_lsb for
frame ordering and PTS for timing - idr_pic_id serves no purpose here.

Closes #1895

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 12:58:55 +01:00
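Why the read must stay even though the value is discarded can be seen in a small sketch of unsigned Exp-Golomb decoding. The `bit_reader` struct and `read_ue()` below are hypothetical helpers written for illustration, not CCExtractor's `read_exp_golomb_unsigned()`; the point is only that decoding a field moves the read position, so skipping the call would misalign every field after it.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader (hypothetical; not CCExtractor's type). */
struct bit_reader
{
    const uint8_t *data;
    size_t size_bits; /* total number of readable bits */
    size_t pos;       /* index of the next bit to read */
};

/* Read one bit; returns 0 once the end of the buffer is reached. */
static unsigned read_bit(struct bit_reader *br)
{
    if (br->pos >= br->size_bits)
        return 0;
    unsigned bit = (br->data[br->pos / 8] >> (7 - br->pos % 8)) & 1u;
    br->pos++;
    return bit;
}

/* Decode an unsigned Exp-Golomb value: count n leading zero bits, consume
 * the 1 marker, then read n suffix bits. Value = 2^n - 1 + suffix.
 * The zero count is capped at 32 to keep the shift well defined. */
uint64_t read_ue(struct bit_reader *br)
{
    unsigned zeros = 0;
    while (zeros < 32 && br->pos < br->size_bits && read_bit(br) == 0)
        zeros++;
    uint64_t suffix = 0;
    for (unsigned i = 0; i < zeros; i++)
        suffix = (suffix << 1) | read_bit(br);
    return ((uint64_t)1 << zeros) - 1 + suffix;
}
```

A caller that does not need the decoded value still has to call `read_ue()` so that `pos` points at the start of the next field.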
Carlos Fernandez
200eb1750a fix(ci): Fix Windows CI cargo build cache path
- Fix cargo build cache path: rust.bat sets CARGO_TARGET_DIR to the
  windows/ directory, which results in artifacts at
  windows/x86_64-pc-windows-msvc/, not windows/target/
- Remove redundant CARGO_TARGET_DIR from build steps since rust.bat
  overrides it anyway

Note: vcpkg.json builtin-baseline intentionally not changed to avoid
breaking transitive dependencies (libxml2 etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 12:44:18 +01:00
Carlos Fernandez Sanz
6dcdb4b2d8 chore: Bump version to 0.96.4 2026-01-01 10:52:36 +01:00
Carlos Fernandez Sanz
a2d2c4f063 Merge branch 'master' into release/0.96.4 2026-01-01 10:39:12 +01:00
Carlos Fernandez
4ab6c83c27 chore: Bump version to 0.96.4
Update version numbers across all packaging and build files for the
0.96.4 release.

Changes in 0.96.4:
- New: Persistent CEA-708 decoder context
- New: OCR character blacklist options
- New: OCR line-split option
- Fix: 32-bit build failures (i686, armv7l)
- Fix: Legacy argument compatibility (-1, -2, -12, --sc, --svc)
- Fix: Prevent heap buffer overflow in Teletext (security)
- Fix: Lazy OCR initialization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 10:17:56 +01:00
Carlos Fernandez Sanz
e66a0183c3 Merge pull request #1941 from Harshdhall01/cleanup-rust-todos
[RUST] Document EIA-708 buffer size and remove debug logging
2026-01-01 09:59:22 +01:00
Carlos Fernandez Sanz
a8ec28630a Merge pull request #1934 from THE-Amrit-mahto-05/fix/teletext-overflow
prevent heap buffer overflow in Teletext demux path
2026-01-01 09:53:01 +01:00
Carlos Fernandez Sanz
432d4237ec ci(windows): Optimize Windows build workflow for faster CI 2026-01-01 09:42:19 +01:00
Carlos Fernandez
e9519c4a67 fix(ci): Remove broken Chocolatey caching for GPAC
The Chocolatey cache only stored package metadata, not the actual
installed SDK files at C:\Program Files\GPAC\sdk\include. This caused
build failures when the cache hit but GPAC headers weren't available.

GPAC install is fast (~30s) so caching isn't worth the complexity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 09:31:11 +01:00
Carlos Fernandez Sanz
fef005ddaf perf(dvb): Lazy OCR initialization for DVB subtitle decoder 2026-01-01 02:48:22 +01:00
Carlos Fernandez
546c776e57 ci(windows): Optimize Windows build workflow for faster CI
Major optimizations to reduce Windows build time from ~45 min to ~10 min:

1. **Single consolidated job** - Previously two parallel jobs (Release/Debug)
   duplicated the entire 34-minute vcpkg install. Now builds both
   configurations sequentially in one job, sharing all cached dependencies.

2. **lukka/run-vcpkg action** - Replaces manual git clone + bootstrap with
   the official vcpkg action that has built-in caching and better handling.

3. **Cache vcpkg installed packages** - Separately cache the installed/
   directory with hash-based keys for faster cache hits.

4. **Cargo caching** - Add caching for Rust registry and build artifacts,
   similar to the Linux build workflow.

5. **Chocolatey caching** - Cache gpac package to skip download on hits.

6. **Conditional installs** - Skip vcpkg install and choco install when
   cache is available.

7. **Updated Rust toolchain action** - Replace deprecated actions-rs/toolchain
   with dtolnay/rust-toolchain.

Expected improvements:
- Cold build: ~20 minutes (down from ~45 min)
- Warm build (cache hit): ~5-10 minutes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 02:03:35 +01:00
Carlos Fernandez Sanz
daeed5df71 fix(args): Add legacy aliases for backwards compatibility 2026-01-01 01:49:59 +01:00
Carlos Fernandez
b56ab005a8 perf(dvb): Lazy OCR initialization for DVB subtitle decoder
Previously, Tesseract OCR was initialized eagerly when a DVB subtitle
stream was detected in the transport stream. This caused ~10 second
startup overhead even for files that:
- Have DVB streams but no actual bitmap subtitles
- Have DVB streams alongside CEA-608 text captions (which don't need OCR)
- Have DVB streams but the user only wants raw bitmap output

The initialization also created OpenMP worker threads that generated
hundreds of thousands of futex syscalls, causing valgrind tests to
take 15+ minutes instead of seconds.

This change defers OCR initialization until a DVB bitmap region actually
needs to be processed with OCR. Benefits:

- Files with DVB streams but no bitmap content: 10s → 0.1s
- Files with DVB + CEA-608 captions: 10s → 1-3s
- Valgrind test performance: 15+ min → seconds (no thread pool overhead
  when OCR isn't used)

The ocr_initialized flag ensures init_ocr() is called only once, on
first bitmap encounter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 01:26:27 +01:00
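The deferred-initialization pattern described in this commit can be sketched in isolation as follows. The struct, the `expensive_init()` stand-in, and `get_ocr()` are hypothetical names for illustration; only the `ocr_ctx` / `ocr_initialized` pair mirrors what the commit message describes.

```c
#include <stddef.h>

/* Hypothetical decoder context; the real DVBSubContext has many more fields. */
struct lazy_ctx
{
    void *ocr_ctx;
    int ocr_initialized;
    int init_calls; /* illustration only: how often the expensive init ran */
};

/* Stand-in for an expensive initializer such as init_ocr(). */
static void *expensive_init(struct lazy_ctx *ctx)
{
    ctx->init_calls++;
    return ctx; /* any non-NULL handle will do for the sketch */
}

/* Return the OCR handle, creating it on first use only. */
void *get_ocr(struct lazy_ctx *ctx)
{
    if (!ctx->ocr_initialized)
    {
        ctx->ocr_ctx = expensive_init(ctx);
        ctx->ocr_initialized = 1; /* set even on failure so init is not retried */
    }
    return ctx->ocr_ctx;
}
```

Tracking a separate flag instead of testing `ocr_ctx != NULL` means a failed initialization is attempted once rather than on every bitmap.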
Carlos Fernandez
f1681ee929 fix(args): Add support for legacy -1, -2, -12 numeric options
Map legacy CEA-608 field extraction options to their modern equivalent:
- -1  → --output-field=1 (extract field 1 only)
- -2  → --output-field=2 (extract field 2 only)
- -12 → --output-field=12 (extract both fields)

These options are documented in the help text and were commonly used
but stopped working after the Rust argument parser migration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 01:02:54 +01:00
Carlos Fernandez
031f463b5c fix(args): Add legacy aliases for backwards compatibility
Add aliases for options that were commonly used with single-dash
or without hyphens in older versions of ccextractor:

- --parsePAT: add alias "pat" (for -pat)
- --parsePMT: add alias "pmt" (for -pmt)
- --no-teletext: add alias "noteletext" (for -noteletext)
- --no-rollup: add alias "noru" (for -noru)
- --no-bom: add alias "nobom" (for -nobom)
- --no-autotimeref: add alias "noautotimeref" (for -noautotimeref)
- --no-scte20: add alias "noscte20" (for -noscte20)

These aliases, combined with normalize_legacy_option() which converts
single-dash to double-dash (e.g., -noteletext -> --noteletext), allow
old scripts using legacy syntax to continue working.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 00:42:23 +01:00
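The single-dash to double-dash conversion mentioned above can be sketched in C. This is an illustration of the idea only: the project's `normalize_legacy_option()` is Rust code and may apply additional rules (for example for the numeric options `-1`, `-2`, `-12`), which this hypothetical `normalize_legacy_option_sketch()` does not model.

```c
#include <stdio.h>

/* Rewrite "-name" as "--name" when name is longer than one character.
 * Options that already start with "--", single-letter flags, and plain
 * arguments are copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small. */
int normalize_legacy_option_sketch(const char *arg, char *out, size_t out_size)
{
    int is_legacy = arg[0] == '-' && arg[1] != '-' &&
                    arg[1] != '\0' && arg[2] != '\0';
    int written = snprintf(out, out_size, "%s%s", is_legacy ? "-" : "", arg);
    return (written < 0 || (size_t)written >= out_size) ? -1 : 0;
}
```

Once `-noteletext` has become `--noteletext`, the alias registered on `--no-teletext` is what lets the argument parser accept it.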
Carlos Fernandez Sanz
b23866f5a8 feat(rust): Add persistent DtvccRust context for CEA-708 decoder 2026-01-01 00:21:40 +01:00
Carlos Fernandez
2ec93c3d3d fix(rust): Check dtvcc_rust instead of dtvcc in ccxr_process_cc_data
When Rust CEA-708 decoder is enabled, dec_ctx.dtvcc is set to NULL
and dec_ctx.dtvcc_rust holds the actual DtvccRust context. The null
check was incorrectly checking dtvcc, causing the function to return
early and skip all CEA-708 data processing.

This fixes tests 21, 31, 32, 105, 137, 141-149 which were failing
with exit code 10 (EXIT_NO_CAPTIONS) because no captions were being
extracted from CEA-708 streams.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 19:47:24 +01:00
Harshdhall01
5564aa8a54 Merge upstream/master and resolve CHANGES.TXT conflict 2025-12-31 23:51:24 +05:30
Harshdhall01
868fac5423 Update CHANGES.TXT with Rust documentation improvements 2025-12-31 23:33:49 +05:30
Harshdhall01
9ca26171d6 Document EIA-708 buffer size and remove debug logging
- Added documentation for EIA_708_BUFFER_LENGTH explaining that 2048 bytes
  is 16x the CEA-708 specification minimum of 128 bytes per service
- Removed debug logging of target address from target.rs as per TODO
- References CEA-708-E Section 8.4.3 for buffer specifications

Addresses two TODO items in the Rust codebase cleanup effort.
2025-12-31 23:24:39 +05:30
Carlos
ead4cbb278 fix(rust): remove double-increment of cb_708 counter
The cb_708 counter was being incremented twice for each CEA-708 data block:
1. In do_cb_dtvcc_rust() in Rust (src/rust/src/lib.rs)
2. In do_cb() in C (src/lib_ccx/ccx_decoders_common.c)

Since FTS calculation uses cb_708 (fts = fts_now + fts_global + cb_708 * 1001 / 30),
the double-increment caused timestamps to advance ~2x as fast as expected,
resulting in incorrect milliseconds in start timestamps.

This fix removes the increment from the Rust code since the C code already
handles it in do_cb().

Fixes timestamp issues reported in PR #1782 tests where start times like
00:00:20,688 were incorrectly output as 00:00:20,737.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
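The effect of the counter on the timestamp follows directly from the formula quoted above. The helper below is a sketch with assumed names and types (`fts_from_counter()`, `int64_t`), not the project's function: each counted block adds 1001/30 ≈ 33.37 ms, so counting every block twice doubles that contribution.

```c
#include <stdint.h>

/* Timestamp formula from the commit message, in integer milliseconds:
 * fts = fts_now + fts_global + cb_708 * 1001 / 30 */
int64_t fts_from_counter(int64_t fts_now, int64_t fts_global, int64_t cb_708)
{
    return fts_now + fts_global + cb_708 * 1001 / 30;
}
```

For three blocks the offset is 3 * 1001 / 30 = 100 ms; if the same three blocks are counted twice the offset becomes 6 * 1001 / 30 = 200 ms.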
Carlos
dfd7101f54 chore: Remove plan file from repo and add plans/ to .gitignore
- Move PLAN_PR1618_REIMPLEMENTATION.md to local plans/ folder
- Add plans/ to .gitignore to keep plans local

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
Carlos
9659d3cf4c fix(rust): Use persistent DtvccRust context in ccxr_process_cc_data
The ccxr_process_cc_data function was still accessing dec_ctx.dtvcc
(which is NULL when Rust is enabled), causing a null pointer panic.

Changed to use dec_ctx.dtvcc_rust (the persistent DtvccRust context)
instead, which fixes the crash when processing CEA-708 data.

Added do_cb_dtvcc_rust() function that works with DtvccRust instead
of the old Dtvcc struct.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
Carlos
34c7cd6d2e style(c): Fix clang-format issues in Phase 3 code
- Remove extra space before comment in ccx_decoders_common.c
- Fix comment indentation in mp4.c

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos
7448a260c7 feat(c): Use Rust CEA-708 decoder in C code (Phase 3)
- init_cc_decode(): Initialize dtvcc_rust via ccxr_dtvcc_init()
- dinit_cc_decode(): Free dtvcc_rust via ccxr_dtvcc_free()
- flush_cc_decode(): Flush via ccxr_flush_active_decoders()
- general_loop.c: Set encoder via ccxr_dtvcc_set_encoder() (3 locations)
- mp4.c: Use ccxr_dtvcc_set_encoder() and ccxr_dtvcc_process_data()
- Add ccxr_dtvcc_is_active() declaration to ccx_dtvcc.h
- Fix clippy warnings in tv_screen.rs (unused assignments)
- All changes guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 3 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos
54236f840c feat(c): Add C header declarations for Rust CEA-708 FFI (Phase 2)
- Add void *dtvcc_rust field to lib_cc_decode struct
- Declare ccxr_dtvcc_init, ccxr_dtvcc_free, ccxr_dtvcc_process_data in ccx_dtvcc.h
- Declare ccxr_dtvcc_set_encoder in lib_ccx.h
- Declare ccxr_flush_active_decoders in ccx_decoders_common.h
- All declarations guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 2 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos
6a4a1c97ec fix(rust): Address PR review - use existing DTVCC_MAX_SERVICES constant
- Remove duplicate CCX_DTVCC_MAX_SERVICES constant from decoder/mod.rs
- Import existing DTVCC_MAX_SERVICES from lib_ccxr::common
- Fix clippy uninlined_format_args warnings in avc/core.rs and decoder/mod.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
Carlos
f369959096 style(rust): Apply cargo fmt formatting
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
Carlos
1c2bcb5088 feat(rust): Add persistent DtvccRust context for CEA-708 decoder (Phase 1)
This is Phase 1 of the fix for issue #1499. It adds the Rust-side
infrastructure for a persistent CEA-708 decoder context without
modifying any C code, ensuring backward compatibility.

Problem:
The current Rust CEA-708 decoder creates a new Dtvcc struct on every
call to ccxr_process_cc_data(), causing all state to be reset. This
breaks stateful caption processing.

Solution:
Add a new DtvccRust struct that:
- Owns its decoder state (rather than borrowing from C)
- Persists across processing calls
- Is managed via FFI functions callable from C

Changes:
- Add DtvccRust struct in decoder/mod.rs with owned decoders
- Add CCX_DTVCC_MAX_SERVICES constant (63)
- Add FFI functions in lib.rs:
  - ccxr_dtvcc_init(): Create persistent context
  - ccxr_dtvcc_free(): Free context and all owned memory
  - ccxr_dtvcc_set_encoder(): Set encoder (not available at init)
  - ccxr_dtvcc_process_data(): Process CC data
  - ccxr_flush_active_decoders(): Flush all active decoders
  - ccxr_dtvcc_is_active(): Check if context is active
- Add unit tests for DtvccRust
- Use heap allocation for large structs to avoid stack overflow

The existing Dtvcc struct and ccxr_process_cc_data() remain unchanged
for backward compatibility. Phase 2-3 will add C header declarations
and modify C code to use the new functions.

Fixes: #1499 (partial)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
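The init / process / free lifecycle described in this commit can be pictured with a small C sketch. Everything here is hypothetical (`demo_ctx` and its functions are not the `ccxr_dtvcc_*` API); it only shows why a context that is allocated once and reused keeps state that a per-call struct would lose.

```c
#include <stdlib.h>

/* Hypothetical persistent decoder context; the real DtvccRust lives in Rust. */
struct demo_ctx
{
    unsigned long packets_seen; /* state that must survive between calls */
};

/* Allocate once at decoder start-up (on the heap, so it outlives the caller). */
struct demo_ctx *demo_ctx_init(void)
{
    return calloc(1, sizeof(struct demo_ctx));
}

/* Process one chunk; state accumulates because the same context is reused. */
unsigned long demo_ctx_process(struct demo_ctx *ctx)
{
    if (!ctx)
        return 0;
    return ++ctx->packets_seen;
}

/* Release the context; accepts NULL like free(). */
void demo_ctx_free(struct demo_ctx *ctx)
{
    free(ctx);
}
```

If `demo_ctx_process()` built a fresh struct on every call, `packets_seen` would always be 1, which is the kind of state reset the commit message describes.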
Amrit Kumar Mahto
125c5e8821 Update ts_functions.c 2025-12-30 15:13:19 +05:30
Amrit kumar Mahto
0ba941e8c0 ts: prevent heap buffer overflow in Teletext demux path 2025-12-30 07:13:04 +05:30
33 changed files with 282 additions and 482 deletions

@@ -3,7 +3,6 @@ name: Build CCExtractor on Windows
env:
RUSTFLAGS: -Ctarget-feature=+crt-static
VCPKG_DEFAULT_TRIPLET: x64-windows-static
VCPKG_DEFAULT_BINARY_CACHE: C:\vcpkg\.cache
VCPKG_COMMIT: ab2977be50c702126336e5088f4836060733c899
on:
@@ -25,104 +24,112 @@ on:
- "src/rust/**"
jobs:
build_release:
build:
runs-on: windows-2022
steps:
- name: Check out repository
uses: actions/checkout@v6
- name: Setup MSBuild.exe
uses: microsoft/setup-msbuild@v2.0.0
with:
msbuild-architecture: x64
# Install GPAC (fast, ~30s, not worth caching complexity)
- name: Install gpac
run: choco install gpac --version 2.4.0
run: choco install gpac --version 2.4.0 --no-progress
# Use lukka/run-vcpkg for better caching
- name: Setup vcpkg
run: mkdir C:\vcpkg\.cache
- name: Cache vcpkg
id: cache
uses: lukka/run-vcpkg@v11
id: runvcpkg
with:
vcpkgGitCommitId: ${{ env.VCPKG_COMMIT }}
vcpkgDirectory: ${{ github.workspace }}/vcpkg
vcpkgJsonGlob: 'windows/vcpkg.json'
# Cache vcpkg installed packages separately for faster restores
- name: Cache vcpkg installed packages
id: vcpkg-installed-cache
uses: actions/cache@v5
with:
path: ${{ github.workspace }}/vcpkg/installed
key: vcpkg-installed-${{ runner.os }}-${{ env.VCPKG_COMMIT }}-${{ hashFiles('windows/vcpkg.json') }}
restore-keys: |
vcpkg-installed-${{ runner.os }}-${{ env.VCPKG_COMMIT }}-
- name: Install vcpkg dependencies
if: steps.vcpkg-installed-cache.outputs.cache-hit != 'true'
run: ${{ github.workspace }}/vcpkg/vcpkg.exe install --x-install-root ${{ github.workspace }}/vcpkg/installed/
working-directory: windows
# Cache Rust/Cargo artifacts
- name: Cache Cargo registry
uses: actions/cache@v5
with:
path: |
C:\vcpkg\.cache
key: vcpkg-${{ runner.os }}-${{ env.VCPKG_COMMIT }}
- name: Build vcpkg
run: |
git clone https://github.com/microsoft/vcpkg
./vcpkg/bootstrap-vcpkg.bat
- name: Install dependencies
run: ${{ github.workspace }}/vcpkg/vcpkg.exe install --x-install-root ${{ github.workspace }}/vcpkg/installed/
working-directory: windows
- uses: actions-rs/toolchain@v1
~/.cargo/registry
~/.cargo/git
key: ${{ runner.os }}-cargo-registry-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-registry-
# Cache Cargo build artifacts - rust.bat sets CARGO_TARGET_DIR to windows/
# which results in artifacts at windows/x86_64-pc-windows-msvc/
- name: Cache Cargo build artifacts
uses: actions/cache@v5
with:
toolchain: stable
override: true
path: ${{ github.workspace }}/windows/x86_64-pc-windows-msvc
key: ${{ runner.os }}-cargo-build-${{ hashFiles('**/Cargo.lock') }}-${{ hashFiles('src/rust/**/*.rs') }}
restore-keys: |
${{ runner.os }}-cargo-build-${{ hashFiles('**/Cargo.lock') }}-
${{ runner.os }}-cargo-build-
- name: Setup Rust toolchain
uses: dtolnay/rust-toolchain@stable
- name: Install Win 10 SDK
uses: ilammy/msvc-dev-cmd@v1
- name: build Release-Full
# Build Release-Full
- name: Build Release-Full
env:
LIBCLANG_PATH: "C:\\Program Files\\LLVM\\lib"
LLVM_CONFIG_PATH: "C:\\Program Files\\LLVM\\bin\\llvm-config"
CARGO_TARGET_DIR: "..\\..\\windows"
BINDGEN_EXTRA_CLANG_ARGS: -fmsc-version=0
VCPKG_ROOT: ${{ github.workspace }}/vcpkg
run: msbuild ccextractor.sln /p:Configuration=Release-Full /p:Platform=x64
working-directory: ./windows
- name: Display version information
- name: Display Release version information
run: ./ccextractorwinfull.exe --version
working-directory: ./windows/x64/Release-Full
- uses: actions/upload-artifact@v6
- name: Upload Release artifact
uses: actions/upload-artifact@v6
with:
name: CCExtractor Windows Release build
path: |
./windows/x64/Release-Full/ccextractorwinfull.exe
./windows/x64/Release-Full/*.dll
build_debug:
runs-on: windows-2022
steps:
- name: Check out repository
uses: actions/checkout@v6
- name: Setup MSBuild.exe
uses: microsoft/setup-msbuild@v2.0.0
with:
msbuild-architecture: x64
- name: Install gpac
run: choco install gpac --version 2.4.0
- name: Setup vcpkg
run: mkdir C:\vcpkg\.cache
- name: Cache vcpkg
id: cache
uses: actions/cache@v5
with:
path: |
C:\vcpkg\.cache
key: vcpkg-${{ runner.os }}-${{ env.VCPKG_COMMIT }}
- name: Build vcpkg
run: |
git clone https://github.com/microsoft/vcpkg
./vcpkg/bootstrap-vcpkg.bat
- name: Install dependencies
run: ${{ github.workspace }}/vcpkg/vcpkg.exe install --x-install-root ${{ github.workspace }}/vcpkg/installed/
working-directory: windows
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: Install Win 10 SDK
uses: ilammy/msvc-dev-cmd@v1
- name: build Debug-Full
# Build Debug-Full (reuses cached Cargo artifacts)
- name: Build Debug-Full
env:
LIBCLANG_PATH: "C:\\Program Files\\LLVM\\lib"
LLVM_CONFIG_PATH: "C:\\Program Files\\LLVM\\bin\\llvm-config"
CARGO_TARGET_DIR: "..\\..\\windows"
BINDGEN_EXTRA_CLANG_ARGS: -fmsc-version=0
VCPKG_ROOT: ${{ github.workspace }}/vcpkg
run: msbuild ccextractor.sln /p:Configuration=Debug-Full /p:Platform=x64
working-directory: ./windows
- name: Display version information
- name: Display Debug version information
continue-on-error: true
run: ./ccextractorwinfull.exe --version
working-directory: ./windows/x64/Debug-Full
- uses: actions/upload-artifact@v6
- name: Upload Debug artifact
uses: actions/upload-artifact@v6
with:
name: CCExtractor Windows Debug build
path: |

@@ -4,7 +4,7 @@ MAINTAINER = Marc Espie <espie@openbsd.org>
CATEGORIES = multimedia
COMMENT = closed caption subtitles extractor
HOMEPAGE = https://ccextractor.org
V = 0.96.3
V = 0.96.4
DISTFILES = ccextractor.${V:S/.//}-src.zip
MASTER_SITES = ${MASTER_SITE_SOURCEFORGE:=ccextractor/}
DISTNAME = ccextractor-$V

@@ -1,3 +1,16 @@
0.96.4 (2026-01-01)
-------------------
- New: Persistent CEA-708 decoder context - maintains state across multiple calls for proper subtitle continuity
- New: OCR character blacklist options (--ocr-blacklist, --ocr-blacklist-file) for improved accuracy
- New: OCR line-split option (--ocr-splitontimechange) for better subtitle segmentation
- Fix: 32-bit build failures on i686 and armv7l architectures
- Fix: Legacy command-line argument compatibility (-1, -2, -12, --sc, --svc)
- Fix: Prevent heap buffer overflow in Teletext processing (security fix)
- Fix: Prevent integer overflow leading to heap buffer overflow in Transport Stream handling (security fix)
- Fix: Lazy OCR initialization - only initialize when first DVB subtitle is encountered
- Build: Optimized Windows CI workflow for faster builds
- Fix: Updated GUI with version 0.7.1. A blind attempt to fix a hang on start on some Windows.
0.96.3 (2025-12-29)
-------------------
- New: VOBSUB subtitle extraction with OCR support for MP4 files

@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.71])
AC_INIT([CCExtractor], [0.96.3], [carlos@ccextractor.org])
AC_INIT([CCExtractor], [0.96.4], [carlos@ccextractor.org])
AC_CONFIG_AUX_DIR([build-conf])
AC_CONFIG_SRCDIR([../src/ccextractor.c])
AM_INIT_AUTOMAKE([foreign subdir-objects])

@@ -2,7 +2,7 @@
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.71])
AC_INIT([CCExtractor],[0.96.3],[carlos@ccextractor.org])
AC_INIT([CCExtractor],[0.96.4],[carlos@ccextractor.org])
AC_CONFIG_AUX_DIR([build-conf])
AC_CONFIG_SRCDIR([../src/ccextractor.c])
AM_INIT_AUTOMAKE([foreign subdir-objects])

@@ -1,5 +1,5 @@
pkgname=ccextractor
pkgver=0.96.3
pkgver=0.96.4
pkgrel=1
pkgdesc="A closed captions and teletext subtitles extractor for video streams."
arch=('i686' 'x86_64')

@@ -1,5 +1,5 @@
Name: ccextractor
Version: 0.96.3
Version: 0.96.4
Release: 1
Summary: A closed captions and teletext subtitles extractor for video streams.
Group: Applications/Internet

@@ -1,7 +1,7 @@
#!/bin/bash
TYPE="debian" # can be one of 'slackware', 'debian', 'rpm'
PROGRAM_NAME="ccextractor"
VERSION="0.96.3"
VERSION="0.96.4"
RELEASE="1"
LICENSE="GPL-2.0"
MAINTAINER="carlos@ccextractor.org"

@@ -2,7 +2,7 @@
<package xmlns="http://schemas.microsoft.com/packaging/2015/06/nuspec.xsd">
<metadata>
<id>ccextractor</id>
<version>0.96.3</version>
<version>0.96.4</version>
<title>CCExtractor</title>
<authors>CCExtractor Development Team</authors>
<owners>CCExtractor</owners>

@@ -7,7 +7,7 @@ $toolsDir = "$(Split-Path -parent $MyInvocation.MyCommand.Definition)"
$packageArgs = @{
packageName = $packageName
fileType = 'MSI'
url64bit = 'https://github.com/CCExtractor/ccextractor/releases/download/v0.96.3/CCExtractor.0.96.3.msi'
url64bit = 'https://github.com/CCExtractor/ccextractor/releases/download/v0.96.4/CCExtractor.0.96.4.msi'
checksum64 = 'FFCAB0D766180AFC2832277397CDEC885D15270DECE33A9A51947B790F1F095B'
checksumType64 = 'sha256'
silentArgs = '/quiet /norestart'

@@ -1,6 +1,6 @@
# yaml-language-server: $schema=https://aka.ms/winget-manifest.installer.1.9.0.schema.json
PackageIdentifier: CCExtractor.CCExtractor
PackageVersion: 0.96.3
PackageVersion: 0.96.4
Platform:
- Windows.Desktop
MinimumOSVersion: 10.0.0.0
@@ -15,7 +15,7 @@ UpgradeBehavior: install
Installers:
- Architecture: x64
InstallerType: msi
InstallerUrl: https://github.com/CCExtractor/ccextractor/releases/download/v0.96.3/CCExtractor.0.96.3.msi
InstallerUrl: https://github.com/CCExtractor/ccextractor/releases/download/v0.96.4/CCExtractor.0.96.4.msi
InstallerSha256: FFCAB0D766180AFC2832277397CDEC885D15270DECE33A9A51947B790F1F095B
ManifestType: installer
ManifestVersion: 1.9.0

@@ -1,6 +1,6 @@
# yaml-language-server: $schema=https://aka.ms/winget-manifest.defaultLocale.1.9.0.schema.json
PackageIdentifier: CCExtractor.CCExtractor
PackageVersion: 0.96.3
PackageVersion: 0.96.4
PackageLocale: en-US
Publisher: CCExtractor Development
PublisherUrl: https://ccextractor.org

@@ -1,6 +1,6 @@
# yaml-language-server: $schema=https://aka.ms/winget-manifest.version.1.9.0.schema.json
PackageIdentifier: CCExtractor.CCExtractor
PackageVersion: 0.96.3
PackageVersion: 0.96.4
DefaultLocale: en-US
ManifestType: version
ManifestVersion: 1.9.0

@@ -435,6 +435,9 @@ int main(int argc, char *argv[])
 int compile_ret = ccxr_parse_parameters(argc, argv);
+// Update the Rust logger target after parsing so --quiet is respected
+ccxr_update_logger_target();
 if (compile_ret == EXIT_NO_INPUT_FILES)
 {
 print_usage();

@@ -992,9 +992,9 @@ void slice_header(struct encoder_ctx *enc_ctx, struct lib_cc_decode *dec_ctx, un
 if (nal_unit_type == 5)
 {
+// idr_pic_id: Read to advance bitstream position; value not needed for caption extraction
 tmp = read_exp_golomb_unsigned(&q1);
 dvprint("idr_pic_id= % 4lld (%#llX)\n", tmp, tmp);
-// TODO
 }
if (dec_ctx->avc_ctx->pic_order_cnt_type == 0)
{

@@ -316,10 +316,20 @@ int write_cc_buffer(ccx_decoder_608_context *context, struct cc_subtitle *sub)
 if (!data->empty && context->output_format != CCX_OF_NULL)
 {
-struct eia608_screen *new_data = (struct eia608_screen *)realloc(sub->data, (sub->nb_data + 1) * sizeof(*data));
+size_t new_size;
+if (sub->nb_data + 1 > SIZE_MAX / sizeof(struct eia608_screen))
+{
+ccx_common_logging.log_ftn("Too many screens, cannot allocate more memory.\n");
+return 0;
+}
+new_size = (sub->nb_data + 1) * sizeof(struct eia608_screen);
+struct eia608_screen *new_data = (struct eia608_screen *)realloc(sub->data, new_size);
 if (!new_data)
 {
-ccx_common_logging.log_ftn("No Memory left");
+ccx_common_logging.log_ftn("Out of memory while reallocating screen buffer\n");
 return 0;
 }
 sub->data = new_data;
@@ -386,10 +396,20 @@ int write_cc_line(ccx_decoder_608_context *context, struct cc_subtitle *sub)
 if (!data->empty)
 {
-struct eia608_screen *new_data = (struct eia608_screen *)realloc(sub->data, (sub->nb_data + 1) * sizeof(*data));
+size_t new_size;
+if (sub->nb_data + 1 > SIZE_MAX / sizeof(struct eia608_screen))
+{
+ccx_common_logging.log_ftn("Too many screens, cannot allocate more memory.\n");
+return 0;
+}
+new_size = (sub->nb_data + 1) * sizeof(struct eia608_screen);
+struct eia608_screen *new_data = (struct eia608_screen *)realloc(sub->data, new_size);
 if (!new_data)
 {
-ccx_common_logging.log_ftn("No Memory left");
+ccx_common_logging.log_ftn("Out of memory while reallocating screen buffer\n");
 return 0;
 }
 sub->data = new_data;
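
The overflow guard added in both hunks above can be factored into a small helper, sketched here under assumed names (`grow_by_one()` is hypothetical and not part of CCExtractor). It compares `count` against `SIZE_MAX / elem_size` before multiplying, which also avoids computing `count + 1` when that addition itself could wrap.

```c
#include <stdint.h>
#include <stdlib.h>

/* Grow an array so it can hold count + 1 elements of elem_size bytes.
 * Returns the new pointer, or NULL if the byte size would overflow size_t
 * or realloc fails; the original block is left untouched in both cases. */
void *grow_by_one(void *data, size_t count, size_t elem_size)
{
    if (elem_size == 0 || count >= SIZE_MAX / elem_size)
        return NULL; /* (count + 1) * elem_size would not fit in size_t */
    return realloc(data, (count + 1) * elem_size);
}
```

As in the patched code, the caller keeps using the old pointer when `NULL` comes back, so nothing leaks on failure.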

@@ -724,16 +724,17 @@ static int parse_csi(ISDBSubContext *ctx, const uint8_t *buf, int len)
 // Copy buf in arg
 for (i = 0; *buf != 0x20; i++)
 {
-if (i >= (sizeof(arg)) + 1)
+if (i >= sizeof(arg) - 1)
 {
-isdb_log("UnExpected CSI %d >= %d", sizeof(arg) + 1, i);
+isdb_log("UnExpected CSI: too long");
 break;
 }
 arg[i] = *buf;
 buf++;
 }
 /* ignore terminating 0x20 character */
-arg[i] = *buf++;
+if (i < sizeof(arg))
+arg[i] = *buf++;
 switch (*buf)
 {
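
The bounded copy that this hunk moves toward can be sketched as a standalone function. This is an illustration under assumptions, not the project's code: `copy_csi_arg()` is a hypothetical name, it additionally checks the input length (the hunk above does not show a `len` check), and it NUL-terminates the destination instead of storing the 0x20 terminator.

```c
#include <stddef.h>
#include <stdint.h>

/* Copy bytes from buf into arg until the 0x20 terminator, never writing
 * past arg_size - 1 and never reading past len. Returns the number of
 * bytes copied; arg is always NUL-terminated when arg_size > 0. */
size_t copy_csi_arg(const uint8_t *buf, size_t len, uint8_t *arg, size_t arg_size)
{
    size_t i = 0;
    if (arg_size == 0)
        return 0;
    while (i < len && buf[i] != 0x20 && i < arg_size - 1)
    {
        arg[i] = buf[i];
        i++;
    }
    arg[i] = 0;
    return i;
}
```

The loop condition `i < arg_size - 1` plays the same role as the corrected `i >= sizeof(arg) - 1` test: the last slot stays reserved for the terminator.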

@@ -285,6 +285,9 @@ static void ccx_demuxer_print_cfg(struct ccx_demuxer *ctx)
 case CCX_SM_MXF:
 mprint("MXF");
 break;
+case CCX_SM_SCC:
+mprint("SCC");
+break;
 #ifdef WTV_DEBUG
 case CCX_SM_HEX_DUMP:
 mprint("Hex");

@@ -182,6 +182,7 @@ typedef struct DVBSubContext
 LLONG time_out;
 #ifdef ENABLE_OCR
 void *ocr_ctx;
+int ocr_initialized; // Flag to track if OCR has been lazily initialized
 #endif
 DVBSubRegion *region_list;
 DVBSubCLUT *clut_list;
@@ -442,7 +443,11 @@ void *dvbsub_init_decoder(struct dvb_config *cfg)
 }
 #ifdef ENABLE_OCR
-ctx->ocr_ctx = init_ocr(ctx->lang_index);
+// Lazy OCR initialization: don't init here, wait until a bitmap actually needs OCR
+// This avoids ~10 second Tesseract startup overhead for files that have DVB streams
+// but don't actually produce any bitmap subtitles (e.g., files with CEA-608 captions)
+ctx->ocr_ctx = NULL;
+ctx->ocr_initialized = 0;
 #endif
 ctx->version = -1;
@@ -1701,6 +1706,12 @@ static int write_dvb_sub(struct lib_cc_decode *dec_ctx, struct cc_subtitle *sub)
// Perform OCR
#ifdef ENABLE_OCR
char *ocr_str = NULL;
// Lazy OCR initialization: only init when we actually have a bitmap to process
if (!ctx->ocr_initialized)
{
ctx->ocr_ctx = init_ocr(ctx->lang_index);
ctx->ocr_initialized = 1; // Mark as initialized even if init_ocr returns NULL
}
if (ctx->ocr_ctx)
{
int ret = ocr_rect(ctx->ocr_ctx, rect, &ocr_str, region->bgcolor, dec_ctx->ocr_quantmode);
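
The lazy-initialization pattern in this hunk defers `init_ocr` until the first bitmap arrives and records the attempt even when it fails, so a failing init is not retried for every subtitle. A minimal Rust sketch of that pattern, with hypothetical types (`OcrEngine`, `DecoderCtx`) and an `ocr_available` switch that exists only to simulate success or failure:

```rust
/// Hypothetical OCR handle standing in for the pointer returned by `init_ocr`.
#[derive(Debug)]
struct OcrEngine {
    lang_index: usize,
}

/// Hypothetical decoder context holding only the fields relevant to lazy init.
struct DecoderCtx {
    lang_index: usize,
    ocr_available: bool, // simulates whether init succeeds; not in the original
    ocr: Option<OcrEngine>,
    ocr_initialized: bool,
    init_calls: u32, // counts init attempts, for illustration only
}

impl DecoderCtx {
    fn new(lang_index: usize, ocr_available: bool) -> Self {
        DecoderCtx {
            lang_index,
            ocr_available,
            ocr: None,
            ocr_initialized: false,
            init_calls: 0,
        }
    }

    /// Simulated expensive initializer that may fail.
    fn init_ocr(&mut self) -> Option<OcrEngine> {
        self.init_calls += 1;
        if self.ocr_available {
            Some(OcrEngine { lang_index: self.lang_index })
        } else {
            None
        }
    }

    /// Returns the engine, initializing it on first use only.
    /// The flag is set even when init fails, so the cost is paid at most once.
    fn ocr(&mut self) -> Option<&OcrEngine> {
        if !self.ocr_initialized {
            self.ocr = self.init_ocr();
            self.ocr_initialized = true;
        }
        self.ocr.as_ref()
    }
}
```

Keeping a separate `ocr_initialized` flag, rather than testing the handle for null, is what prevents repeated init attempts after a failure.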


@@ -1,7 +1,7 @@
#ifndef CCX_CCEXTRACTOR_H
#define CCX_CCEXTRACTOR_H
#define VERSION "0.96.3"
#define VERSION "0.96.4"
// Load common includes and constants for library usage
#include "ccx_common_platform.h"
@@ -160,6 +160,7 @@ struct lib_ccx_ctx *init_libraries(struct ccx_s_options *opt);
void dinit_libraries(struct lib_ccx_ctx **ctx);
extern void ccxr_init_basic_logger();
extern void ccxr_update_logger_target();
// ccextractor.c
void print_end_msg(void);


@@ -6,6 +6,7 @@
#include "dvb_subtitle_decoder.h"
#include "ccx_decoders_isdb.h"
#include "file_buffer.h"
#include <inttypes.h>
#ifdef DEBUG_SAVE_TS_PACKETS
#include <sys/types.h>
@@ -568,6 +569,13 @@ int copy_capbuf_demux_data(struct ccx_demuxer *ctx, struct demuxer_data **data,
if (cinfo->codec == CCX_CODEC_TELETEXT)
{
if (cinfo->capbuflen > BUFSIZE - ptr->len)
{
fatal(CCX_COMMON_EXIT_BUG_BUG,
"Teletext packet (%" PRId64 ") larger than remaining buffer (%" PRId64 ").\n",
cinfo->capbuflen, (int64_t)(BUFSIZE - ptr->len));
}
memcpy(ptr->buffer + ptr->len, cinfo->capbuf, cinfo->capbuflen);
ptr->len += cinfo->capbuflen;
return CCX_OK;
@@ -662,7 +670,6 @@ void cinfo_cremation(struct ccx_demuxer *ctx, struct demuxer_data **data)
int copy_payload_to_capbuf(struct cap_info *cinfo, struct ts_payload *payload)
{
int newcapbuflen;
if (cinfo->ignore == CCX_TRUE &&
((cinfo->stream != CCX_STREAM_TYPE_VIDEO_MPEG2 &&
@@ -688,17 +695,22 @@ int copy_payload_to_capbuf(struct cap_info *cinfo, struct ts_payload *payload)
}
// copy payload to capbuf
newcapbuflen = cinfo->capbuflen + payload->length;
if (newcapbuflen > cinfo->capbufsize)
if (payload->length > INT64_MAX - cinfo->capbuflen)
{
unsigned char *new_capbuf = (unsigned char *)realloc(cinfo->capbuf, newcapbuflen);
mprint("Error: capbuf size overflow\n");
return -1;
}
int64_t newcapbuflen = (int64_t)cinfo->capbuflen + payload->length;
if (newcapbuflen > (int64_t)cinfo->capbufsize)
{
unsigned char *new_capbuf = (unsigned char *)realloc(cinfo->capbuf, (size_t)newcapbuflen);
if (!new_capbuf)
return -1;
cinfo->capbuf = new_capbuf;
cinfo->capbufsize = newcapbuflen;
cinfo->capbufsize = newcapbuflen; // Note: capbufsize is int in struct cap_info
}
memcpy(cinfo->capbuf + cinfo->capbuflen, payload->start, payload->length);
cinfo->capbuflen = newcapbuflen;
cinfo->capbuflen = newcapbuflen; // Note: capbuflen is int in struct cap_info
return CCX_OK;
}
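
The rewritten `copy_payload_to_capbuf` rejects a payload whose length would overflow the running total, then grows the buffer only when the new total exceeds the current capacity. A small Rust sketch of those two decisions follows. The function names are hypothetical, and `i64` is used to match the `int64_t` arithmetic in the hunk:

```rust
/// Returns the buffer length after appending `payload_len` bytes,
/// or `None` if the sum does not fit in an `i64`.
fn checked_new_len(capbuflen: i64, payload_len: i64) -> Option<i64> {
    capbuflen.checked_add(payload_len)
}

/// Returns the capacity to use after the append: unchanged when the data
/// still fits, otherwise grown to exactly `new_len`.
fn required_capacity(capbufsize: i64, new_len: i64) -> i64 {
    capbufsize.max(new_len)
}
```

Here `checked_add` plays the role of the explicit `payload->length > INT64_MAX - cinfo->capbuflen` comparison in the C code.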

66
src/rust/Cargo.lock generated

@@ -129,6 +129,26 @@ dependencies = [
"syn 2.0.111",
]
[[package]]
name = "bindgen"
version = "0.72.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "993776b509cfb49c750f11b8f07a46fa23e0a1386ffc01fb1e7d343efc387895"
dependencies = [
"bitflags 2.10.0",
"cexpr",
"clang-sys",
"itertools",
"log",
"prettyplease",
"proc-macro2",
"quote",
"regex",
"rustc-hash 2.1.1",
"shlex",
"syn 2.0.111",
]
[[package]]
name = "bitflags"
version = "1.3.2"
@@ -141,6 +161,12 @@ version = "2.10.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "812e12b5285cc515a9c72a5c1d3b6d46a19dac5acfef5265968c166106e31dd3"
[[package]]
name = "by_address"
version = "1.2.1"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "64fa3c856b712db6612c019f14756e64e4bcea13337a6b33b696333a9eaa2d06"
[[package]]
name = "camino"
version = "1.2.1"
@@ -151,7 +177,7 @@ checksum = "276a59bf2b2c967788139340c9f0c5b12d7fd6630315c15c217e559de85d2609"
name = "ccx_rust"
version = "0.1.0"
dependencies = [
"bindgen 0.64.0",
"bindgen 0.72.1",
"cfg-if",
"clap",
"encoding_rs",
@@ -335,21 +361,18 @@ dependencies = [
"windows-sys 0.61.2",
]
[[package]]
name = "fast-srgb8"
version = "1.0.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "dd2e7510819d6fbf51a5545c8f922716ecfb14df168a3242f7d33e0239efe6a1"
[[package]]
name = "fastrand"
version = "2.3.0"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "37909eebbb50d72f9059c3b6d82c0463f2ff062c9e95845c43a6c9c0355411be"
[[package]]
name = "find-crate"
version = "0.6.3"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "59a98bbaacea1c0eb6a0876280051b892eb73594fd90cf3b20e9c817029c57d2"
dependencies = [
"toml",
]
[[package]]
name = "form_urlencoded"
version = "1.2.2"
@@ -799,26 +822,26 @@ checksum = "384b8ab6d37215f3c5301a95a4accb5d64aa607f1fcb26a11b5303878451b4fe"
[[package]]
name = "palette"
version = "0.6.1"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "8f9cd68f7112581033f157e56c77ac4a5538ec5836a2e39284e65bd7d7275e49"
checksum = "4cbf71184cc5ecc2e4e1baccdb21026c20e5fc3dcf63028a086131b3ab00b6e6"
dependencies = [
"approx",
"num-traits",
"fast-srgb8",
"palette_derive",
"phf",
]
[[package]]
name = "palette_derive"
version = "0.6.1"
version = "0.7.6"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "05eedf46a8e7c27f74af0c9cfcdb004ceca158cb1b918c6f68f8d7a549b3e427"
checksum = "f5030daf005bface118c096f510ffb781fc28f9ab6a32ab224d8631be6851d30"
dependencies = [
"find-crate",
"by_address",
"proc-macro2",
"quote",
"syn 1.0.109",
"syn 2.0.111",
]
[[package]]
@@ -1416,15 +1439,6 @@ dependencies = [
"zerovec",
]
[[package]]
name = "toml"
version = "0.5.11"
source = "registry+https://github.com/rust-lang/crates.io-index"
checksum = "f4f7f0dd8d50a853a531c426359045b1998f04219d88799810762cd4ad314234"
dependencies = [
"serde",
]
[[package]]
name = "toml_datetime"
version = "0.7.3"


@@ -13,7 +13,7 @@ crate-type = ["staticlib"]
[dependencies]
log = "0.4.26"
env_logger = "0.8.4"
palette = "0.6.1"
palette = "0.7"
tesseract-sys = { version = "0.5.15", optional = true, default-features = false }
leptonica-sys = { version = "= 0.4.6", optional = true, default-features = false }
clap = { version = "4.5.31", features = ["derive"] }


@@ -84,7 +84,12 @@ fn main() {
{
builder = builder.clang_arg("-DENABLE_HARDSUBX");
// Add FFmpeg include paths for Mac
// Check FFMPEG_INCLUDE_DIR environment variable (works on all platforms)
if let Ok(ffmpeg_include) = env::var("FFMPEG_INCLUDE_DIR") {
builder = builder.clang_arg(format!("-I{}", ffmpeg_include));
}
// Add FFmpeg include paths for Mac (Homebrew)
if cfg!(target_os = "macos") {
// Try common Homebrew paths
if std::path::Path::new("/opt/homebrew/include").exists() {
@@ -110,10 +115,14 @@ fn main() {
}
}
}
}
// Also check environment variable
if let Ok(ffmpeg_include) = env::var("FFMPEG_INCLUDE_DIR") {
builder = builder.clang_arg(format!("-I{}", ffmpeg_include));
// On Linux, try pkg-config to find FFmpeg include paths
if cfg!(target_os = "linux") {
if let Ok(lib) = pkg_config::Config::new().probe("libavcodec") {
for path in lib.include_paths {
builder = builder.clang_arg(format!("-I{}", path.display()));
}
}
}
}
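
The build script now applies the `FFMPEG_INCLUDE_DIR` override first on every platform and then adds any platform-discovered include directories. A minimal sketch of how such `-I` arguments could be assembled, using only the standard library. The `include_args` helper is hypothetical, and the discovered paths are passed in rather than probed with `pkg_config`:

```rust
use std::path::PathBuf;

/// Builds `-I<dir>` clang arguments: the explicit override first (if any),
/// followed by each discovered include directory in order.
fn include_args(env_override: Option<&str>, discovered: &[PathBuf]) -> Vec<String> {
    let mut args = Vec::new();
    if let Some(dir) = env_override {
        args.push(format!("-I{}", dir));
    }
    for path in discovered {
        args.push(format!("-I{}", path.display()));
    }
    args
}
```

In a real build script the first argument would typically come from `std::env::var("FFMPEG_INCLUDE_DIR").ok()`, passed as `.as_deref()`.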


@@ -147,7 +147,11 @@ pub const CCX_DECODER_608_SCREEN_WIDTH: usize = 32;
pub const ONEPASS: usize = 120; // Bytes we can always look ahead without going out of limits
pub const BUFSIZE: usize = 2048 * 1024 + ONEPASS; // 2 Mb plus the safety pass
pub const MAX_CLOSED_CAPTION_DATA_PER_PICTURE: usize = 32;
pub const EIA_708_BUFFER_LENGTH: usize = 2048; // TODO: Find out what the real limit is
/// CEA-708 Service Input Buffer size.
/// Specification minimum is 128 bytes per service, but we use 2048 bytes
/// (16x the minimum) to provide a safety margin for buffer management.
/// Reference: CEA-708-E Section 8.4.3 - Service Input Buffers
pub const EIA_708_BUFFER_LENGTH: usize = 2048;
pub const TS_PACKET_PAYLOAD_LENGTH: usize = 184; // From specs
pub const SUBLINESIZE: usize = 2048; // Max. length of a .srt line - TODO: Get rid of this
pub const STARTBYTESLENGTH: usize = 1024 * 1024;


@@ -82,7 +82,6 @@ impl<'a> SendTarget<'a> {
"Unable to connect, address passed is null\n"
);
}
info!("Target address: {}\n", config.target_addr); // TODO remove this
info!("Target port: {}\n", config.port.unwrap_or(DEFAULT_TCP_PORT));
let tcp_stream = TcpStream::connect((
config.target_addr,


@@ -269,6 +269,11 @@ impl<'a> CCExtractorLogger {
self.target
}
/// Sets the target for logging messages.
pub fn set_target(&mut self, target: OutputTarget) {
self.target = target;
}
/// Check if the messages are intercepted by GUI.
pub fn is_gui_mode(&self) -> bool {
self.gui_mode


@@ -395,10 +395,10 @@ pub struct Args {
/// reference to the received data. Use this parameter if
/// you prefer your own reference. Note: Current this only
/// affects Teletext in timed transcript with --datets.
#[arg(long, verbatim_doc_comment, help_heading=OPTIONS_AFFECTING_INPUT_FILES)]
#[arg(long, alias="noautotimeref", verbatim_doc_comment, help_heading=OPTIONS_AFFECTING_INPUT_FILES)]
pub no_autotimeref: bool,
/// Ignore SCTE-20 data if present.
#[arg(long, verbatim_doc_comment, help_heading=OPTIONS_AFFECTING_INPUT_FILES)]
#[arg(long, alias="noscte20", verbatim_doc_comment, help_heading=OPTIONS_AFFECTING_INPUT_FILES)]
pub no_scte20: bool,
/// Create a separate file for CSS instead of inline.
#[arg(long, verbatim_doc_comment, help_heading=OPTIONS_AFFECTING_INPUT_FILES)]
@@ -453,7 +453,7 @@ pub struct Args {
/// Do not append a BOM (Byte Order Mark) to output
/// files. Note that this may break files when using
/// Windows. This is the default in non-Windows builds.
#[arg(long, verbatim_doc_comment, conflicts_with="bom", help_heading=OUTPUT_AFFECTING_OUTPUT_FILES)]
#[arg(long, alias="nobom", verbatim_doc_comment, conflicts_with="bom", help_heading=OUTPUT_AFFECTING_OUTPUT_FILES)]
pub no_bom: bool,
/// Encode subtitles in Unicode instead of Latin-1.
#[arg(long, verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_OUTPUT_FILES)]
@@ -694,7 +694,7 @@ pub struct Args {
/// If you hate the repeated lines caused by the roll-up
/// emulation, you can have ccextractor write only one
/// line at a time, getting rid of these repeated lines.
#[arg(long, verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_BUFFERING)]
#[arg(long, alias="noru", verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_BUFFERING)]
pub no_rollup: bool,
/// roll-up captions can consist of 2, 3 or 4 visible
/// lines at any time (the number of lines is part of
@@ -823,10 +823,10 @@ pub struct Args {
#[arg(long, verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
pub parsedebug: bool,
/// Print Program Association Table dump.
#[arg(long="parsePAT", verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
#[arg(long="parsePAT", alias="pat", verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
pub parse_pat: bool,
/// Print Program Map Table dump.
#[arg(long="parsePMT", verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
#[arg(long="parsePMT", alias="pmt", verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
pub parse_pmt: bool,
/// Hex-dump defective TS packets.
#[arg(long, verbatim_doc_comment, help_heading=OUTPUT_AFFECTING_DEBUG_DATA)]
@@ -861,7 +861,7 @@ pub struct Args {
/// for video streams that have both teletext packets
/// and CEA-608/708 packets (if teletext is processed
/// then CEA-608/708 processing is disabled).
#[arg(long, verbatim_doc_comment, conflicts_with="teletext", help_heading=TELETEXT_OPTIONS)]
#[arg(long, alias="noteletext", verbatim_doc_comment, conflicts_with="teletext", help_heading=TELETEXT_OPTIONS)]
pub no_teletext: bool,
/// Use the passed format to customize the (Timed) Transcript
/// output. The format must be like this: 1100100 (7 digits).
@@ -990,6 +990,8 @@ pub enum InFormat {
Mkv,
/// Material Exchange Format (MXF).
Mxf,
/// Scenarist Closed Caption (SCC).
Scc,
#[cfg(feature = "wtv_debug")]
// For WTV Debug mode only
Hex,


@@ -327,6 +327,9 @@ impl CcxDemuxer<'_> {
StreamMode::Mxf => {
info!("MXF");
}
StreamMode::Scc => {
info!("SCC");
}
#[cfg(feature = "wtv_debug")]
StreamMode::HexDump => {
info!("Hex");


@@ -12,7 +12,7 @@ pub extern "C" fn rgb_to_hsv(R: f32, G: f32, B: f32, H: &mut f32, S: &mut f32, V
let hsv_rep = Hsv::from_color(rgb);
*H = hsv_rep.hue.to_positive_degrees();
*H = hsv_rep.hue.into_positive_degrees();
*S = hsv_rep.saturation;
*V = hsv_rep.value;
}


@@ -618,12 +618,20 @@ extern "C" fn ccxr_close_handle(handle: RawHandle) {
/// - Double-dash options (e.g., `--quiet`) are left unchanged
/// - Single-letter short options (e.g., `-o`) are left unchanged
/// - Non-option arguments (e.g., `file.ts`) are left unchanged
/// - Numeric options (e.g., `-1`, `-12`) are left unchanged (these are valid short options)
/// - Numeric options `-1`, `-2`, `-12` are converted to `--output-field=N` for CEA-608 field selection
fn normalize_legacy_option(arg: String) -> String {
// Handle legacy numeric options for CEA-608 field extraction
// These map to --output-field which is the modern equivalent
match arg.as_str() {
"-1" => return "--output-field=1".to_string(),
"-2" => return "--output-field=2".to_string(),
"-12" => return "--output-field=12".to_string(),
_ => {}
}
// Check if it's a single-dash option with multiple characters (e.g., -quiet)
// but not a short option with a value (e.g., -o filename)
// Single-letter options like -o, -s should be left unchanged
// Numeric options like -1, -12 should also be left unchanged
if arg.starts_with('-')
&& !arg.starts_with("--")
&& arg.len() > 2
@@ -843,12 +851,18 @@ mod test {
#[test]
fn test_normalize_legacy_option_numeric_options() {
// Numeric options should remain unchanged (these are valid ccextractor options)
assert_eq!(normalize_legacy_option("-1".to_string()), "-1".to_string());
assert_eq!(normalize_legacy_option("-2".to_string()), "-2".to_string());
// Legacy numeric options for CEA-608 field selection are converted to --output-field
assert_eq!(
normalize_legacy_option("-1".to_string()),
"--output-field=1".to_string()
);
assert_eq!(
normalize_legacy_option("-2".to_string()),
"--output-field=2".to_string()
);
assert_eq!(
normalize_legacy_option("-12".to_string()),
"-12".to_string()
"--output-field=12".to_string()
);
}
@@ -878,349 +892,4 @@ mod test {
// Double dash alone (end of options marker)
assert_eq!(normalize_legacy_option("--".to_string()), "--".to_string());
}
// =========================================================================
// FFI Integration Tests
// =========================================================================
//
// These tests verify the FFI boundary - the extern "C" functions that are
// called from C code. They test the actual C→Rust call path with realistic
// struct states.
mod ffi_integration_tests {
use super::*;
use crate::decoder::test::create_test_dtvcc_settings;
use crate::utils::get_zero_allocated_obj;
/// Helper to create a lib_cc_decode struct configured for Rust-enabled path
/// (dtvcc=NULL, dtvcc_rust=valid pointer)
fn create_rust_enabled_decode_ctx() -> (Box<lib_cc_decode>, *mut std::ffi::c_void) {
let mut ctx = get_zero_allocated_obj::<lib_cc_decode>();
// Create the DtvccRust context via the FFI function
let settings = create_test_dtvcc_settings();
let dtvcc_rust = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// Set up the timing context (required for processing)
let timing = get_zero_allocated_obj::<ccx_common_timing_ctx>();
ctx.timing = Box::into_raw(timing);
// Simulate Rust-enabled mode: dtvcc is NULL, dtvcc_rust is set
ctx.dtvcc = std::ptr::null_mut();
ctx.dtvcc_rust = dtvcc_rust;
(ctx, dtvcc_rust)
}
// -----------------------------------------------------------------
// ccxr_dtvcc_init / ccxr_dtvcc_free lifecycle tests
// -----------------------------------------------------------------
#[test]
fn test_ffi_dtvcc_init_creates_valid_context() {
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// Should return a valid (non-null) pointer
assert!(!dtvcc_ptr.is_null());
// Verify we can check if it's active
let is_active = unsafe { ccxr_dtvcc_is_active(dtvcc_ptr) };
assert_eq!(is_active, 1);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
#[test]
fn test_ffi_dtvcc_init_with_null_returns_null() {
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(std::ptr::null()) };
assert!(dtvcc_ptr.is_null());
}
#[test]
fn test_ffi_dtvcc_free_with_null_is_safe() {
// Should not crash when called with null
ccxr_dtvcc_free(std::ptr::null_mut());
}
#[test]
fn test_ffi_dtvcc_is_active_with_null_returns_zero() {
let result = unsafe { ccxr_dtvcc_is_active(std::ptr::null_mut()) };
assert_eq!(result, 0);
}
// -----------------------------------------------------------------
// ccxr_dtvcc_set_encoder tests
// -----------------------------------------------------------------
#[test]
fn test_ffi_set_encoder_with_valid_context() {
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// Create an encoder
let encoder = Box::new(encoder_ctx::default());
let encoder_ptr = Box::into_raw(encoder);
// Set the encoder
unsafe { ccxr_dtvcc_set_encoder(dtvcc_ptr, encoder_ptr) };
// Verify by checking the internal state
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert_eq!(dtvcc.encoder, encoder_ptr);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
unsafe { drop(Box::from_raw(encoder_ptr)) };
}
#[test]
fn test_ffi_set_encoder_with_null_context_is_safe() {
let encoder = Box::new(encoder_ctx::default());
let encoder_ptr = Box::into_raw(encoder);
// Should not crash
unsafe { ccxr_dtvcc_set_encoder(std::ptr::null_mut(), encoder_ptr) };
// Clean up
unsafe { drop(Box::from_raw(encoder_ptr)) };
}
// -----------------------------------------------------------------
// ccxr_dtvcc_process_data tests
// -----------------------------------------------------------------
#[test]
fn test_ffi_process_data_packet_start() {
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// Process a packet start (cc_type = 3)
unsafe { ccxr_dtvcc_process_data(dtvcc_ptr, 1, 3, 0xC2, 0x00) };
// Verify state changed
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert!(dtvcc.is_header_parsed);
assert_eq!(dtvcc.packet_length, 2);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
#[test]
fn test_ffi_process_data_with_null_is_safe() {
// Should not crash
unsafe { ccxr_dtvcc_process_data(std::ptr::null_mut(), 1, 3, 0xC2, 0x00) };
}
#[test]
fn test_ffi_process_data_state_persists_across_calls() {
// This is THE key test - verifying the fix for issue #1499
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// First call: start a packet (packet length = 8 bytes)
unsafe { ccxr_dtvcc_process_data(dtvcc_ptr, 1, 3, 0xC4, 0x00) };
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert!(dtvcc.is_header_parsed);
assert_eq!(dtvcc.packet_length, 2);
// Second call: add more data (cc_type = 2)
unsafe { ccxr_dtvcc_process_data(dtvcc_ptr, 1, 2, 0x21, 0x00) };
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert_eq!(dtvcc.packet_length, 4);
// Third call: add more data
unsafe { ccxr_dtvcc_process_data(dtvcc_ptr, 1, 2, 0x00, 0x00) };
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert_eq!(dtvcc.packet_length, 6);
// State persisted across all calls!
assert!(dtvcc.is_header_parsed);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
// -----------------------------------------------------------------
// ccxr_process_cc_data tests (the main FFI entry point from C)
// -----------------------------------------------------------------
#[test]
fn test_ffi_ccxr_process_cc_data_with_null_ctx_returns_error() {
let data: [u8; 3] = [0x97, 0x1F, 0x3C];
let result = unsafe { ccxr_process_cc_data(std::ptr::null_mut(), data.as_ptr(), 1) };
assert_eq!(result, -1);
}
#[test]
fn test_ffi_ccxr_process_cc_data_with_null_data_returns_error() {
let (ctx, dtvcc_ptr) = create_rust_enabled_decode_ctx();
let result =
unsafe { ccxr_process_cc_data(Box::into_raw(ctx), std::ptr::null(), 1) };
assert_eq!(result, -1);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
#[test]
fn test_ffi_ccxr_process_cc_data_with_zero_count_returns_error() {
let (ctx, dtvcc_ptr) = create_rust_enabled_decode_ctx();
let data: [u8; 3] = [0x97, 0x1F, 0x3C];
let result =
unsafe { ccxr_process_cc_data(Box::into_raw(ctx), data.as_ptr(), 0) };
assert_eq!(result, -1);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
#[test]
fn test_ffi_ccxr_process_cc_data_with_null_dtvcc_rust_returns_error() {
let mut ctx = get_zero_allocated_obj::<lib_cc_decode>();
// Set up timing but leave dtvcc_rust as null
let timing = get_zero_allocated_obj::<ccx_common_timing_ctx>();
ctx.timing = Box::into_raw(timing);
ctx.dtvcc_rust = std::ptr::null_mut();
let data: [u8; 3] = [0x97, 0x1F, 0x3C];
let result =
unsafe { ccxr_process_cc_data(Box::into_raw(ctx), data.as_ptr(), 1) };
assert_eq!(result, -1);
}
#[test]
fn test_ffi_ccxr_process_cc_data_processes_708_data() {
let (ctx, dtvcc_ptr) = create_rust_enabled_decode_ctx();
// Set an encoder so processing can complete
let encoder = get_zero_allocated_obj::<encoder_ctx>();
ccxr_dtvcc_set_encoder(dtvcc_ptr, Box::into_raw(encoder));
// CEA-708 packet start (cc_type=3, cc_valid=1)
// Header byte breakdown: cc_valid is bit 2, cc_type is bits 0-1
// For cc_type=3 (bits 0-1 = 11) and cc_valid=1 (bit 2 = 1):
// 0xFF = 11111111 -> cc_valid=1, cc_type=3
let data: [u8; 3] = [0xFF, 0xC2, 0x00];
let ctx_ptr = Box::into_raw(ctx);
let result = ccxr_process_cc_data(ctx_ptr, data.as_ptr(), 1);
// Should return 0 (success) for valid 708 data
assert_eq!(result, 0);
// Verify the context was updated
let ctx = unsafe { &*ctx_ptr };
assert_eq!(ctx.cc_stats[3], 1); // cc_type 3 was processed
assert_eq!(ctx.current_field, 3); // Field set to 3 for 708 data
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
#[test]
fn test_ffi_ccxr_process_cc_data_skips_invalid_pairs() {
let (ctx, dtvcc_ptr) = create_rust_enabled_decode_ctx();
// Invalid pair (cc_valid = 0)
// Header byte: 0xF9 = 11111001 -> cc_valid=0, cc_type=1
let data: [u8; 3] = [0xF9, 0x00, 0x00];
let ctx_ptr = Box::into_raw(ctx);
let result = unsafe { ccxr_process_cc_data(ctx_ptr, data.as_ptr(), 1) };
// Should return -1 (no valid data processed)
assert_eq!(result, -1);
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
// -----------------------------------------------------------------
// ccxr_flush_active_decoders tests
// -----------------------------------------------------------------
#[test]
fn test_ffi_flush_with_null_is_safe() {
// Should not crash
unsafe { ccxr_flush_active_decoders(std::ptr::null_mut()) };
}
#[test]
fn test_ffi_flush_with_valid_context() {
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
// Set an encoder (required for flushing)
let encoder = Box::new(encoder_ctx::default());
unsafe { ccxr_dtvcc_set_encoder(dtvcc_ptr, Box::into_raw(encoder)) };
// Should not crash
unsafe { ccxr_flush_active_decoders(dtvcc_ptr) };
// Clean up
ccxr_dtvcc_free(dtvcc_ptr);
}
// -----------------------------------------------------------------
// Full lifecycle integration test
// -----------------------------------------------------------------
#[test]
fn test_ffi_complete_lifecycle() {
// This test simulates the complete lifecycle as it would be used from C code:
// 1. Initialize context
// 2. Set encoder
// 3. Process multiple CC data packets
// 4. Flush
// 5. Free
// 1. Initialize
let settings = create_test_dtvcc_settings();
let dtvcc_ptr = unsafe { ccxr_dtvcc_init(settings.as_ref()) };
assert!(!dtvcc_ptr.is_null());
// Verify initial state
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert!(dtvcc.is_active, "DtvccRust should be active after init");
assert_eq!(dtvcc.packet_length, 0);
assert!(!dtvcc.is_header_parsed);
// 2. Set encoder
let encoder = get_zero_allocated_obj::<encoder_ctx>();
let encoder_ptr = Box::into_raw(encoder);
ccxr_dtvcc_set_encoder(dtvcc_ptr, encoder_ptr);
// 3. Process a packet start (cc_type=3) - this should set is_header_parsed
// Use 0xC4 for packet header: 0xC4 & 0x3F = 4, so max_len = 4*2 = 8 bytes
// This way the packet won't be completed until we've added enough data
ccxr_dtvcc_process_data(dtvcc_ptr, 1, 3, 0xC4, 0x00);
// Verify packet start was processed
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert!(
dtvcc.is_header_parsed,
"is_header_parsed should be true after packet start"
);
assert_eq!(dtvcc.packet_length, 2, "packet_length should be 2");
// Process packet data (cc_type=2) - packet is not complete yet (need 8 bytes)
ccxr_dtvcc_process_data(dtvcc_ptr, 1, 2, 0x21, 0x00);
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert_eq!(dtvcc.packet_length, 4, "packet_length should be 4");
// Add more data
ccxr_dtvcc_process_data(dtvcc_ptr, 1, 2, 0x00, 0x00);
let dtvcc = unsafe { &*(dtvcc_ptr as *mut DtvccRust) };
assert_eq!(dtvcc.packet_length, 6, "packet_length should be 6");
// 4. Flush
ccxr_flush_active_decoders(dtvcc_ptr);
// 5. Free
ccxr_dtvcc_free(dtvcc_ptr);
unsafe { drop(Box::from_raw(encoder_ptr)) };
}
}
}
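
The change to `normalize_legacy_option` maps the legacy numeric flags `-1`, `-2` and `-12` onto `--output-field=N` before any other normalization runs. A self-contained sketch of just that mapping, applied across an argument list. The helper names are hypothetical, and the single-dash to double-dash rewriting done by the real function is deliberately left out:

```rust
/// Maps a legacy numeric CEA-608 field option to its modern equivalent.
/// Returns `None` for anything that is not one of the three legacy forms.
fn normalize_numeric_option(arg: &str) -> Option<&'static str> {
    match arg {
        "-1" => Some("--output-field=1"),
        "-2" => Some("--output-field=2"),
        "-12" => Some("--output-field=12"),
        _ => None,
    }
}

/// Applies the numeric mapping to every argument, leaving the rest untouched.
fn normalize_args<I: IntoIterator<Item = String>>(args: I) -> Vec<String> {
    args.into_iter()
        .map(|arg| {
            let mapped = normalize_numeric_option(&arg);
            match mapped {
                Some(modern) => modern.to_string(),
                None => arg,
            }
        })
        .collect()
}
```

Matching on exact strings keeps other numeric-looking arguments, such as `-3` or a negative value passed to another option, out of the mapping.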


@@ -33,10 +33,11 @@ pub unsafe extern "C" fn ccxr_init_basic_logger() {
.unwrap_or(DebugMessageFlag::VERBOSE);
let mask = DebugMessageMask::new(debug_mask, debug_mask_on_debug);
let gui_mode_reports = ccx_options.gui_mode_reports != 0;
// CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1, CCX_MESSAGES_STDERR=2
let messages_target = match ccx_options.messages_target {
0 => OutputTarget::Stdout,
1 => OutputTarget::Stderr,
2 => OutputTarget::Quiet,
0 => OutputTarget::Quiet,
1 => OutputTarget::Stdout,
2 => OutputTarget::Stderr,
_ => OutputTarget::Stderr, // Default to stderr for invalid values
};
let _ = set_logger(CCExtractorLogger::new(
@@ -46,6 +47,28 @@ pub unsafe extern "C" fn ccxr_init_basic_logger() {
));
}
/// Updates the logger target after command-line arguments have been parsed.
/// This is needed because the logger is initialized before argument parsing,
/// and options like --quiet need to be applied afterwards.
///
/// # Safety
///
/// `ccx_options` in C must be properly initialized and the logger must have
/// been initialized via `ccxr_init_basic_logger` before calling this function.
#[no_mangle]
pub unsafe extern "C" fn ccxr_update_logger_target() {
// CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1, CCX_MESSAGES_STDERR=2
let messages_target = match ccx_options.messages_target {
0 => OutputTarget::Quiet,
1 => OutputTarget::Stdout,
2 => OutputTarget::Stderr,
_ => OutputTarget::Stderr,
};
if let Some(mut logger) = logger_mut() {
logger.set_target(messages_target);
}
}
/// Rust equivalent for `verify_crc32` function in C. Uses C-native types as input and output.
///
/// # Safety
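
Both `ccxr_init_basic_logger` and `ccxr_update_logger_target` in this hunk repeat the same integer-to-`OutputTarget` match, which is the mapping that was previously inverted. One possible way to keep the two call sites in sync is a single conversion helper. The sketch below uses a local stand-in enum and a hypothetical function name, and is not code from the repository:

```rust
/// Local stand-in for the crate's `OutputTarget` enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputTarget {
    Quiet,
    Stdout,
    Stderr,
}

/// Converts the C-side constant to an `OutputTarget`.
/// CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1, CCX_MESSAGES_STDERR=2;
/// any other value falls back to stderr, as in the diff.
fn target_from_c(value: i32) -> OutputTarget {
    match value {
        0 => OutputTarget::Quiet,
        1 => OutputTarget::Stdout,
        _ => OutputTarget::Stderr,
    }
}
```

With a helper like this, a unit test on the three constants would catch a future inversion without needing the FFI layer.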


@@ -311,6 +311,7 @@ impl OptionsExt for Options {
InFormat::Mp4 => self.demux_cfg.auto_stream = StreamMode::Mp4,
InFormat::Mkv => self.demux_cfg.auto_stream = StreamMode::Mkv,
InFormat::Mxf => self.demux_cfg.auto_stream = StreamMode::Mxf,
InFormat::Scc => self.demux_cfg.auto_stream = StreamMode::Scc,
}
}