2084 Commits

Author SHA1 Message Date
Carlos Fernandez Sanz
270c89b7f8 [FEATURE]: Add Snap packaging support with Github workflow 2026-01-31 17:52:06 -08:00
Carlos Fernandez Sanz
032cd1c6b1 Merge pull request #2040 from THE-Amrit-mahto-05/fix/avc-sei-payload-size
Fix SEI payload type handling: changes payload_type and payload_size from i32 to u32 for type safety, keeping as usize casts only where needed for indexing.
2026-01-31 17:35:40 -08:00
Carlos Fernandez Sanz
42e4e9a657 Merge pull request #2049 from THE-Amrit-mahto-05/fix-null-len-guard
Adds defensive null pointer and negative length checks to ccxr_verify_crc32 FFI function to prevent undefined behavior.
2026-01-31 17:18:31 -08:00
Carlos Fernandez Sanz
821e307333 Merge pull request #2076 from THE-Amrit-mahto-05/fix-miri-null-deref
Verified with Miri - fixes undefined behavior when calling dealloc() on null pointer in window row deallocation.
2026-01-31 13:58:48 -08:00
Amrit kumar Mahto
ae81f3ba3d Fix Miri-reported UB in window row deallocation and tests 2026-01-31 00:49:50 +05:30
Carlos Fernandez Sanz
b190751b2c [FIX]macOS: Fix hardsub pipeline failing due to arm64/x86_64 build mismatch 2026-01-28 18:30:38 -08:00
GAURAV KARMAKAR
f1bb0f4dce macOS: Fix hardsub pipeline failing due to arm64/x86_64 build mismatch 2026-01-29 00:12:09 +05:30
Amrit kumar Mahto
f147ac27f8 re running for CI to pass checks 2026-01-27 21:03:19 +05:30
Amrit kumar Mahto
2dfb44d7d4 re running CI 2026-01-27 20:42:53 +05:30
Carlos Fernandez Sanz
580e721dfe fix: prevent heap overflow in parse_PAT/parse_PMT and null deref in processmp4 2026-01-23 23:06:35 -08:00
Carlos Fernandez
d0a82447ff fix(rust): resolve clippy unnecessary_unwrap warnings for Rust 1.93
Use if-let patterns instead of is_some() + unwrap() to satisfy
the stricter clippy::unnecessary_unwrap lint in Rust 1.93.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 20:58:03 -08:00
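The if-let rewrite described in this commit can be illustrated with a minimal sketch. The helper name `label_len` is hypothetical and does not come from the CCExtractor code; it only shows the general shape of the change.

```rust
/// Hypothetical helper: returns the length of an optional label, or 0.
fn label_len(label: Option<&str>) -> usize {
    // Pattern flagged by clippy::unnecessary_unwrap:
    //     if label.is_some() { label.unwrap().len() } else { 0 }
    // Binding the value with if-let removes the separate unwrap call.
    if let Some(text) = label {
        text.len()
    } else {
        0
    }
}
```

Both forms behave the same; the if-let form simply cannot panic if the check and the unwrap later drift apart.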
Carlos Fernandez
5c19c7b932 style: fix Rust formatting in parser.rs test
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 20:14:26 -08:00
Carlos Fernandez
fd7271bae2 fix: prevent heap overflow in parse_PAT/parse_PMT and null deref in processmp4
- parse_PAT: Add bounds check for payload_length >= 8 before accessing
  header fields (fixes #2053)
- parse_PMT: Add ES_info_length validation and 2-byte minimum check
  before reading descriptor_tag and desc_len in PRIVATE_USER_MPEG2
  and teletext parsing loops (fixes #2054)
- processmp4: Add NULL check for file parameter before passing to
  mprint (fixes #2055)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-23 20:12:09 -08:00
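The parse_PAT fix itself is in C, but the idea of checking for the 8 header bytes before reading any of them can be sketched in Rust. The struct and function names below are hypothetical; the field layout follows the standard PAT section header.

```rust
/// Hypothetical subset of the PAT section header fields.
#[derive(Debug, PartialEq)]
struct PatHeader {
    table_id: u8,
    section_length: u16,
    transport_stream_id: u16,
}

/// Returns None instead of reading past the end when fewer than
/// 8 bytes are available (the minimum checked in the commit above).
fn parse_pat_header(payload: &[u8]) -> Option<PatHeader> {
    if payload.len() < 8 {
        return None;
    }
    Some(PatHeader {
        table_id: payload[0],
        // section_length is the low 12 bits of bytes 1-2.
        section_length: (((payload[1] & 0x0F) as u16) << 8) | payload[2] as u16,
        transport_stream_id: ((payload[3] as u16) << 8) | payload[4] as u16,
    })
}
```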
Chandragupt Singh
05c68349d5 Merge branch 'master' into feat/snap-distribution-support 2026-01-23 15:26:59 +05:30
Chandragupt Singh
09f21f64e4 fix(snap): resolve GPAC dependency and runtime issues in core22 snap 2026-01-23 15:23:33 +05:30
Carlos Fernandez Sanz
c65fb0874e fix(rust): correct mkvlang test to use MkvLangFilter type 2026-01-19 07:43:15 -08:00
Carlos Fernandez
9db727d593 fix(rust): correct mkvlang test to use MkvLangFilter type
The test_mkvlang_sets_mkv_language test was comparing against
Language::Eng, but the mkvlang field type was changed to MkvLangFilter
when BCP 47 language tag support was added in PR #2038.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-19 07:41:36 -08:00
Amrit kumar Mahto
fe6dad83b7 use u32 for SEI payload type and size 2026-01-19 14:16:50 +05:30
Carlos Fernandez Sanz
d494286082 ci: add workflow to build .deb packages 2026-01-18 20:37:22 -08:00
Carlos Fernandez
259e881483 fix(ci): add missing FFmpeg dependencies to hardsubx .deb packages
Add libavdevice, libswresample, and libavfilter dependencies for
the hardsubx variant on both Ubuntu 24.04 and Debian 13 workflows.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 20:11:10 -08:00
Carlos Fernandez
197069d3b8 ci: add Debian 13 (Trixie) .deb build workflow
Creates .deb packages for Debian 13 using a Docker container.
- Builds GPAC from source (abi-16.4 tag)
- Creates basic and hardsubx variants
- Uses Debian 13's library versions:
  - libtesseract5, libleptonica6
  - libavcodec61, libavformat61, libavutil59, libswscale8

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 20:02:16 -08:00
Carlos Fernandez
7a810d736d fix(ci): add libcurl3t64-gnutls dependency to .deb package
CCExtractor is linked against libcurl-gnutls which requires this
runtime dependency on Ubuntu 24.04.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:55:09 -08:00
Carlos Fernandez
1413c948c4 fix(ci): correct leptonica package name for Ubuntu 24.04
Ubuntu 24.04 uses liblept5, not libleptonica6 (which is Ubuntu 25.04).

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:39:42 -08:00
Carlos Fernandez
bb5385913b fix(ci): use apt install to handle .deb dependencies in test step
apt install automatically resolves and installs dependencies,
unlike dpkg -i which fails if dependencies are missing.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:13:41 -08:00
Carlos Fernandez Sanz
f8981e8e1e refactor(rust): Rename parser tests with descriptive names and expand coverage 2026-01-18 19:12:34 -08:00
Carlos Fernandez
a1871abf04 fix(ci): switch .deb build to Ubuntu 24.04
- Use ubuntu-24.04 runner instead of ubuntu-22.04
- Update dependencies to match Ubuntu 24.04 library versions
  (libtesseract5, libleptonica6, libavcodec60, etc.)
- Update GPAC cache key for new Ubuntu version

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:08:09 -08:00
Carlos Fernandez
20b3773bb9 fix(ci): correct version and add missing dependencies in .deb workflow
- Update CMakeLists.txt version from 0.89 to 0.96 to match lib_ccx.h
- Extract version from lib_ccx.h instead of CMakeLists.txt for accuracy
- Add missing runtime dependencies: libtesseract, libleptonica
- Add FFmpeg dependencies for hardsubx variant

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 19:02:48 -08:00
Carlos Fernandez
8786b4cf75 fix(ci): correct LICENSE filename to LICENSE.txt 2026-01-18 18:08:04 -08:00
Carlos Fernandez
8632ecda5b ci: add workflow to build .deb packages
Add GitHub Actions workflow to build Debian packages (.deb) for Linux.

Features:
- Builds GPAC from source (abi-16.4 tag) since libgpac-dev is not
  available in newer Debian/Ubuntu releases
- Creates two variants: basic (with OCR) and hardsubx (with FFmpeg)
- Bundles GPAC library with the package using patchelf for rpath
- Includes proper Debian package structure with control, postinst, postrm
- Runs on releases, manual trigger, or workflow file changes
- Uploads packages as artifacts and attaches to releases

This provides an unofficial .deb package for users who prefer that
format over AppImage or snap.

Relates to #1610

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 18:00:45 -08:00
Carlos Fernandez Sanz
475153a9dd fix(build): resolve Rust-to-C linking issues on Linux 2026-01-18 17:39:27 -08:00
Carlos Fernandez
df90009f73 ci: add CMakeLists.txt to workflow path filters
Build workflows were not triggering on CMakeLists.txt changes.
Added **CMakeLists.txt and **.cmake patterns to path filters for:
- build_linux.yml
- build_mac.yml
- build_windows.yml
- build_docker.yml

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:23:34 -08:00
Carlos Fernandez
2352ea21e3 fix(build): resolve Rust-to-C linking issues on Linux
Two fixes for static library linking:

1. Preserve CMAKE_C_FLAGS in lib_ccx/CMakeLists.txt instead of
   overwriting them. This allows passing include paths via
   -DCMAKE_C_FLAGS which is needed for some build configurations.

2. Add target_link_options with --undefined flags for C functions
   called from Rust (decode_vbi, do_cb, store_hdcc). With static
   libraries, the linker processes them in order and only pulls
   symbols that are currently unresolved. Since ccx is processed
   before ccx_rust, these symbols weren't being pulled in.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 17:11:44 -08:00
Carlos Fernandez Sanz
dc041a35e8 fix(rust): Support BCP 47 language tags in --mkvlang option 2026-01-18 16:33:39 -08:00
Carlos Fernandez Sanz
e99ba1d177 fix(rust): Remove dead code returning pointer to stack variable 2026-01-18 14:11:39 -08:00
Carlos Fernandez
298665faa4 chore: fix cargo fmt formatting
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 13:57:53 -08:00
Carlos Fernandez
735a01bf04 refactor(rust): rename parser tests with descriptive names and expand coverage
Replace poorly-named tests (options_1 through options_51, broken_1, etc.)
with 201 descriptively-named tests organized by category:

- Input/output format tests
- Encoding tests
- Stream/program selection tests
- CEA-708 service tests
- Codec selection tests
- Timing option tests
- Debug flag tests
- Teletext option tests
- XMLTV option tests
- Credits option tests
- Buffering option tests
- And more

Each test name now clearly indicates what CLI option is being tested
and what behavior is expected, e.g.:
- test_input_ts_sets_transport_stream_mode
- test_608_enables_decoder_608_debug
- test_service_enables_708_with_single_service

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 13:55:56 -08:00
Amrit kumar Mahto
3618c23b5a rust/avc: fix SEI payload size handling and type correctness 2026-01-19 03:09:07 +05:30
Carlos Fernandez Sanz
b7c9da75dd Revert "Automatic extraction of multiple DVB subtitle streams (--split-dvb-subs) fixes#447 #1864"
Was incorrectly merged
2026-01-18 13:37:53 -08:00
Carlos Fernandez Sanz
449d55d5e5 Revert "Automatic extraction of multiple DVB subtitle streams (--split-dvb-subs) fixes#447 #1864" 2026-01-18 13:37:26 -08:00
Carlos Fernandez Sanz
60aa370899 fix(rust): Correct version number in CLI parser 2026-01-18 13:35:25 -08:00
Carlos Fernandez
3d18b38c32 Revert "Merge pull request #1912 from Rahul-2k4/final"
This reverts commit 2a6d27f9ff, reversing
changes made to 74e64c0421.
2026-01-18 13:28:15 -08:00
Carlos Fernandez Sanz
2a6d27f9ff Merge pull request #1912 from Rahul-2k4/final
Automatic extraction of multiple DVB subtitle streams (--split-dvb-subs) fixes#447 #1864
2026-01-18 13:27:17 -08:00
Carlos Fernandez
91d3512bcc fix(rust): Support BCP 47 language tags in --mkvlang option
The --mkvlang option previously only supported single ISO 639-2 codes
due to using a Language enum with a fixed list of variants. Extended
codes (like "fre-ca") and multiple codes (like "eng,chi") would panic.

This change introduces MkvLangFilter, a proper type for language
filtering that:

- Validates language codes per BCP 47 specification
- Supports ISO 639-2 (3-letter codes like "eng")
- Supports BCP 47 tags (like "en-US", "zh-Hans-CN")
- Supports comma-separated multiple codes
- Provides clean error messages for invalid input
- Includes comprehensive unit tests

The C code continues to receive the raw string for strstr() matching,
maintaining backward compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 13:23:39 -08:00
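A rough sketch of the kind of validation this commit describes follows. It is not the actual MkvLangFilter implementation: the function names are hypothetical, and the grammar is a deliberately simplified approximation of BCP 47 (a 2-3 letter primary subtag followed by 1-8 character alphanumeric subtags).

```rust
/// Simplified check, assumed for illustration only.
fn is_plausible_tag(tag: &str) -> bool {
    let mut subtags = tag.split('-');
    let primary = match subtags.next() {
        Some(p) => p,
        None => return false,
    };
    let primary_ok = (2..=3).contains(&primary.len())
        && primary.chars().all(|c| c.is_ascii_alphabetic());
    primary_ok
        && subtags.all(|s| {
            (1..=8).contains(&s.len()) && s.chars().all(|c| c.is_ascii_alphanumeric())
        })
}

/// Splits a comma-separated list and returns an error, rather than
/// panicking, on the first code that does not look valid.
fn parse_lang_filter(input: &str) -> Result<Vec<String>, String> {
    input
        .split(',')
        .map(|raw| {
            let tag = raw.trim();
            if is_plausible_tag(tag) {
                Ok(tag.to_string())
            } else {
                Err(format!("invalid language code: {:?}", tag))
            }
        })
        .collect()
}
```

Because the check works on `chars()` and never slices the string by byte index, multi-byte input is rejected cleanly instead of causing a panic.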
Carlos Fernandez Sanz
74e64c0421 Merge pull request #2035 from THE-Amrit-mahto-05/fix/mkvlang-params-check
fix mkvlang_params_check: prevent panic on multi-byte characters
2026-01-18 13:07:44 -08:00
Carlos Fernandez
c175750ebe fix(rust): Correct version number in CLI parser
The Rust CLI parser was showing "CCExtractor 1.0" instead of the
actual version (0.96.5). This was a placeholder value from when
the parser was first ported to Rust in August 2024 that was never
updated.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-18 12:55:21 -08:00
Carlos Fernandez Sanz
e7dc4d19f7 Merge pull request #2036 from THE-Amrit-mahto-05/fix/process-word-file-safely
fix: process_word_file propagates errors instead of panicking
2026-01-18 12:51:33 -08:00
Carlos Fernandez Sanz
1fbb51056d Merge pull request #1992 from THE-Amrit-mahto-05/fix/teletext-panic
fix: Teletext decoder panic on malformed BCD data
2026-01-18 12:46:56 -08:00
Carlos Fernandez Sanz
5d9a8cc6f2 Merge pull request #2031 from THE-Amrit-mahto-05/fix/rust-userdata-uaf
Fix use-after-free bugs in Rust userdata handling
2026-01-18 12:24:10 -08:00
Amrit kumar Mahto
17abad79f2 fix: process_word_file propagates errors instead of panicking 2026-01-19 01:53:19 +05:30
Amrit kumar Mahto
707e1f01fe updating 2026-01-19 01:34:41 +05:30
Amrit kumar Mahto
efc8b791e7 fix mkvlang_params_check: prevent panic on multi-byte characters 2026-01-19 01:28:25 +05:30
Carlos Fernandez Sanz
a856bbde10 Merge pull request #2015 from Harsh-Sahu43/tests/validate-cc-pair
[FIX] rust: add defensive length check to validate_cc_pair
2026-01-18 11:52:49 -08:00
Carlos Fernandez Sanz
9390b876fa Merge pull request #2034 from THE-Amrit-mahto-05/fix/parser-atol-bug
Fix atol Parsing Bug in parser.rs for Numeric Values and Suffixes
2026-01-18 11:38:53 -08:00
Amrit kumar Mahto
ead0a4beed little fix 2026-01-19 00:45:30 +05:30
Amrit kumar Mahto
b2e9cb74c1 Fix atol parsing bug for numeric values and K/M/G suffixes 2026-01-19 00:31:25 +05:30
Amrit kumar Mahto
20b194aac4 Consolidate Rust userdata fixes: UAF, bounds checks, and VBI safety 2026-01-18 23:34:43 +05:30
Harsh Sahu
2d9b480972 Merge branch 'CCExtractor:master' into tests/validate-cc-pair 2026-01-18 14:48:46 +05:30
Harsh Sahu
1447b021cb Fixed : formatting 2026-01-18 13:58:31 +05:30
Amrit kumar Mahto
e0ac126cff Fix use-after-free bugs in Rust userdata handling 2026-01-18 05:37:44 +05:30
Carlos Fernandez Sanz
b8019bdb35 [FIX] Resolve output artifact on Linux/WSL (line clearing) 2026-01-17 06:02:59 -08:00
Carlos Fernandez Sanz
9d921dec43 fix(matroska): prevent out-of-bounds NAL parsing in AVC/HEVC blocks 2026-01-17 06:00:12 -08:00
Carlos Fernandez Sanz
3ada2b5002 fix(avc): prevent segfault in report-only mode (-out=report) 2026-01-17 05:58:03 -08:00
Rahul Tripathi
50ec9866db style: Fix clang-format ternary operator alignment 2026-01-17 14:12:59 +05:30
Rahul Tripathi
ce87d01fbd fix: Cap DVB subtitle duration to 10s to prevent 65s page timeout bug
Root cause: When FTS timestamps were invalid due to PTS discontinuities,
the code fell back to DVB page timeout (65 seconds) as subtitle duration.
This caused impossible 65-second subtitle durations in split output.

Fix: Added DVB_MAX_SUBTITLE_DURATION_MS constant (10s) and simplified the
duration capping logic to always enforce reasonable subtitle durations.

Tested with: multiprogram_spain.ts, BBC1.ts, BBC2.ts - all outputs now
have properly capped durations with no timestamps exceeding 10 seconds.
2026-01-17 12:14:12 +05:30
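The duration cap can be sketched as below. The constant name and the 10-second value come from the commit message; the function name, the millisecond `i64` timestamps, and the treatment of negative durations are assumptions made for this illustration (the real change is in the C decoder).

```rust
/// Cap named in the commit message: 10 seconds.
const DVB_MAX_SUBTITLE_DURATION_MS: i64 = 10_000;

/// Hypothetical helper: returns an end time no more than the cap
/// after the start time. Clamping negative durations to zero is an
/// assumption, not something the commit states.
fn cap_end_time(start_ms: i64, end_ms: i64) -> i64 {
    let duration = (end_ms - start_ms).clamp(0, DVB_MAX_SUBTITLE_DURATION_MS);
    start_ms + duration
}
```

With this shape, a subtitle that would otherwise inherit the 65-second page timeout ends 10 seconds after it starts.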
Carlos Fernandez
fecd24d08e fix(avc): prevent segfault in report-only mode (-out=report)
When using -out=report mode, the encoder context (enc_ctx) is NULL
because no output file needs to be created. The Rust FFI function
ccxr_process_avc was dereferencing this NULL pointer, causing a
segmentation fault.

Add NULL pointer checks at the FFI boundary to skip AVC processing
when enc_ctx is NULL. This is safe because report mode only needs
stream analysis, not caption extraction.

Fixes #2023

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-16 20:50:42 -08:00
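A NULL check at the FFI boundary, as described above, might look roughly like this. `EncoderCtx` and `process_avc_guarded` are hypothetical stand-ins, not the real `ccxr_process_avc` signature or the real encoder struct.

```rust
use std::os::raw::c_int;

/// Hypothetical stand-in for the C encoder context.
#[repr(C)]
pub struct EncoderCtx {
    pub frames_seen: c_int,
}

/// Hypothetical entry point: returns the number of bytes processed,
/// or 0 when there is nothing safe to do.
///
/// # Safety
/// Non-null pointers must be valid, and `data` must point to `len`
/// readable bytes.
pub unsafe extern "C" fn process_avc_guarded(
    enc_ctx: *mut EncoderCtx,
    data: *const u8,
    len: usize,
) -> usize {
    // Report-only mode passes no encoder context, so bail out early
    // instead of dereferencing a NULL pointer.
    if enc_ctx.is_null() || data.is_null() || len == 0 {
        return 0;
    }
    let bytes = unsafe { std::slice::from_raw_parts(data, len) };
    unsafe { (*enc_ctx).frames_seen += 1 };
    bytes.len()
}
```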
Rahul Tripathi
482544c5bf docs: Add DVB deduplication feature and double-free fix to CHANGES.TXT 2026-01-16 16:41:46 +05:30
Rahul Tripathi
84a7a1fb41 style: Fix remaining clang-format indentation issues 2026-01-16 16:34:26 +05:30
Rahul Tripathi
f198bcd2ec style: Fix clang-format issues across modified files 2026-01-16 16:31:09 +05:30
Rahul Tripathi
4b6016ca1c style: Fix clang-format issues in dvb_dedup files 2026-01-16 16:26:28 +05:30
Rahul Tripathi
9c2ea47eda fix: Add dvb_dedup.c to Windows and Mac build systems 2026-01-16 16:24:52 +05:30
Rahul Tripathi
170b466a20 fix: Add dvb_dedup.c to autoconf build for GitHub Actions Linux CI 2026-01-16 16:23:43 +05:30
Rahul Tripathi
2bdcd20115 cleanup: Remove temporary debug, test, and tool artifacts from final branch
Remove 186 unwanted files including:
- Debug logs and diagnostic output (debug_*.log, debug_output/, diagnosis_output/)
- Test artifacts and binaries (linux/alltests_*, test_output/, test_split_verification/)
- Tool state files (.agent/, .claude/, .ralph/, .mcp.json, etc.)
- Root-level scripts and temporary Python utilities
- Working notes and temporary documentation (DVB_SPLIT_*.md, progress.json, etc.)
- Unfinished MCP server (tools/mcp-ccextractor/)
- Project-specific working notes (CLAUDE.md)

Update .gitignore to prevent re-adding unwanted artifacts.

Result: final branch now contains only DVB-split feature implementation
and core project files, matching upstream structure while preserving
all functional changes.
2026-01-16 16:18:02 +05:30
Rahul Tripathi
ab18d234d2 Merge branch 'CCExtractor:master' into final 2026-01-16 16:05:36 +05:30
Rahul Tripathi
3ff02617b0 fix: Resolve double-free crash in DVB split pipeline cleanup
- Remove redundant free() after free_subtitle() in pipeline cleanup
  (free_subtitle already frees the struct via freep(&sub))
- Add ctx->prev = NULL after free_encoder_context in dinit_encoder
- Keep free_encoder_context non-recursive for prev (dinit_encoder owns it)
- Remove debug output from general_loop.c
2026-01-16 16:02:59 +05:30
Rahul Tripathi
c7fad95e24 test: Fix DVB dedup test suite - DVB-005 and DVB-007 corrections
- DVB-005: Changed from Teletext-only file to proper DVB extraction using --program-number 530
- DVB-007: Fixed shell script globbing error and variable parsing for dedup effectiveness check
- All test cases now pass: DVB-004 (multilingual split), DVB-005 (single program), DVB-006 (non-DVB), DVB-007 (dedup check), DVB-008 (no-dedup flag)
- Verified: No 0-byte files, deduplication removes 19-29 duplicate lines per stream
2026-01-16 15:05:35 +05:30
Rahul Tripathi
c018f1f43c docs: Mark DVB-004 through DVB-008 as complete
- All deduplication infrastructure implemented and tested
- Test script validates code paths execute correctly
- Dedup ring buffer integrated into all DVB subtitle processing
- Full validation requires OCR build (-DWITH_OCR=ON)
- Code review confirms all 8 stories are complete
2026-01-16 14:15:44 +05:30
Rahul Tripathi
98b50b2a35 test: Add DVB dedup test suite script
- Created dvb_dedup_test.sh to test DVB-001 through DVB-008
- Tests multilingual split, single stream, non-DVB files
- Tests --no-dvb-dedup flag functionality
- Checks for excessive duplication in output
- Note: Requires OCR (Tesseract) for full validation
- Without OCR, files are empty but dedup logic still executes
2026-01-16 14:15:03 +05:30
Rahul Tripathi
46cee0893a feat: DVB-003 - Add --no-dvb-dedup CLI flag
- Added no_dvb_dedup field to ccx_s_options structure
- Initialized to 0 (deduplication enabled by default)
- Added --no-dvb-dedup CLI flag in Rust args parser
- Added flag to Options struct in lib_ccxr
- Wired flag through Rust-to-C FFI boundary in common.rs
- Modified dvbsub_handle_display_segment to respect flag
- Dedup logic only runs when no_dvb_dedup is false (default)
- Added help text describing flag purpose
2026-01-16 14:11:13 +05:30
Rahul Tripathi
42ad48ca7f feat: DVB-001 - Add per-stream dedup ring buffer
- Created dvb_dedup.h with dedup_entry and dedup_ring structures
- Implemented dvb_dedup.c with init, is_duplicate, and add functions
- Integrated dedup_ring into DVBSubContext structure
- Added deduplication check in dvbsub_handle_display_segment
- Dedup uses PTS + PID + composition_id + ancillary_id as unique key
- 8-slot ring buffer to track recently emitted subtitles
- Prevents duplicate subtitles from propagating to output files
2026-01-16 14:04:00 +05:30
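The ring buffer in this commit is implemented in C (dvb_dedup.c). The Rust sketch below only illustrates the idea: the key fields and the 8-slot size follow the commit message, while the type names, field widths, and method signatures are assumptions.

```rust
/// Key fields listed in the commit message; the integer widths are assumed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct DedupKey {
    pts: u64,
    pid: u32,
    composition_id: u16,
    ancillary_id: u16,
}

const DEDUP_SLOTS: usize = 8;

/// Fixed-size ring that remembers the most recently emitted keys.
struct DedupRing {
    entries: [Option<DedupKey>; DEDUP_SLOTS],
    next: usize,
}

impl DedupRing {
    fn new() -> Self {
        DedupRing { entries: [None; DEDUP_SLOTS], next: 0 }
    }

    /// True if an identical key is still held in the ring.
    fn is_duplicate(&self, key: &DedupKey) -> bool {
        self.entries.iter().any(|e| e.as_ref() == Some(key))
    }

    /// Stores a key, overwriting the oldest slot once the ring is full.
    fn add(&mut self, key: DedupKey) {
        self.entries[self.next] = Some(key);
        self.next = (self.next + 1) % DEDUP_SLOTS;
    }
}
```

A small fixed ring keeps the check cheap, at the cost of forgetting a subtitle once eight newer ones have been recorded.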
Akhilesh
ed26a595bd style(matroska): apply clang-format 2026-01-14 13:42:22 +05:30
Akhilesh
b1c2aabb22 fix(matroska): prevent out-of-bounds NAL parsing in AVC/HEVC blocks 2026-01-14 13:20:23 +05:30
Rahul Tripathi
bb2ae1e70f Fix DVB subtitle repetition bug and memory safety issues 2026-01-13 20:29:44 +05:30
Rahul Tripathi
6464fa486e Fix DVB Split: Remove forced dirty flag, rely on natural dirty + clear 2026-01-13 18:16:41 +05:30
Rahul Tripathi
5aa747ab33 Fix DVB Split bugs: Prevent subtitle repetition and buffer overflow crash 2026-01-13 17:53:30 +05:30
Rahul Tripathi
39adfa59b0 Fix Bug 1: Clear OCR text leakage preventing subtitle repetition
- Clear enc_ctx->prev->last_str after encode_sub() in dvb_subtitle_decoder.c
- This prevents OCR-recognized text from leaking into subsequent subtitles
- Tested: All subtitle output shows unique text with zero duplicates
2026-01-12 11:00:27 +05:30
Carlos Fernandez Sanz
20287548cb fix: Correct progress time display for multi-program TS files 2026-01-11 20:56:59 +01:00
collectnis
b7b10419ec style: fix formatting alignment 2026-01-11 13:46:00 +00:00
collectnis
8fbfd68426 style: fix formatting alignment 2026-01-11 13:31:55 +00:00
collectnis
7159d0b6d0 fix: resolve merge conflict in changelog 2026-01-11 11:48:58 +00:00
collectnis
c515578e37 docs: update changelog 2026-01-11 11:30:54 +00:00
collectnis
e55b8eb764 [CLI] Fix output artifacts on Linux/WSL by clearing line on \r 2026-01-11 10:34:16 +00:00
Carlos Fernandez Sanz
0228fbcbfa fix: Skip moov box if buffer too small to verify mvhd 2026-01-11 10:30:32 +01:00
Carlos Fernandez Sanz
0e190e0962 docs: Add changelog for 0.96.6 2026-01-11 10:29:57 +01:00
Carlos Fernandez
13f1b5ab53 docs: Add changelog for 0.96.6
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 10:28:56 +01:00
Carlos Fernandez Sanz
b39f923c46 docs: Clarify PS probe limit calculation (explain magic number) 2026-01-11 08:55:17 +01:00
Harsh Sahu
7e32d6a553 Merge branch 'CCExtractor:master' into tests/validate-cc-pair 2026-01-11 04:51:33 +05:30
Carlos Fernandez
3bde3dceec fix: Skip moov box if buffer too small to verify mvhd
The previous fix (#1996) prevented a panic when the buffer was too small
to verify if a "moov" box contains "mvhd", but it incorrectly accepted
the box without verification.

The original intent was: "moov without mvhd is invalid, skip it."

This fix maintains that intent:
- If buffer too small to verify mvhd → skip the box
- If moov has mvhd → accept (valid)
- If moov lacks mvhd → skip (invalid)

This is safe for format detection since:
1. The probe reads up to 1MB of start bytes
2. The scoring system requires multiple valid boxes
3. Skipping an unverifiable box is safer than accepting it

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 00:13:11 +01:00
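The three-way rule above (unverifiable, has mvhd, lacks mvhd) can be sketched as a single lookup. The function name is hypothetical, and the sketch assumes the first child box begins immediately after the 8-byte moov header, so its type field sits at offset 12 from the start of the moov box.

```rust
/// Returns true only when the bytes needed to verify "mvhd" are
/// present and match. A buffer that is too short yields false,
/// which corresponds to skipping the box.
fn accept_moov(buf: &[u8], box_start: usize) -> bool {
    match box_start
        .checked_add(16)
        .and_then(|end| buf.get(box_start + 12..end))
    {
        Some(child_type) => child_type == b"mvhd",
        None => false,
    }
}
```

Using `get` with a range means a short buffer produces `None` rather than an out-of-bounds panic, so the short-buffer and missing-mvhd cases both end in a skip.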
Carlos Fernandez
d5201b1129 docs: Clarify PS probe limit calculation with inline comment
Replace magic number 49997 with `50000 - 3` and add a comment explaining:
- Why we subtract 3 (the loop accesses i+3, so we stop 3 bytes early)
- Why we cap at 50000 (don't scan huge buffers entirely)
- Why we use saturating_sub (handle tiny buffers safely)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-11 00:07:35 +01:00
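The limit calculation explained in this commit can be sketched as follows. The function names are hypothetical, and scanning for the pack start code 0x000001BA is an assumption about what the probe loop looks for; only the `50000 - 3` bound and the use of `saturating_sub` come from the commit message.

```rust
const PS_PROBE_CAP: usize = 50_000;

/// Number of starting positions to scan: capped at 50000 bytes, minus 3
/// because the loop reads index i + 3. saturating_sub keeps buffers
/// shorter than 3 bytes from underflowing.
fn ps_probe_limit(buf_len: usize) -> usize {
    buf_len.min(PS_PROBE_CAP).saturating_sub(3)
}

/// Counts 4-byte pack start codes within the probe window.
fn count_pack_headers(buf: &[u8]) -> usize {
    let limit = ps_probe_limit(buf.len());
    (0..limit)
        .filter(|&i| {
            buf[i] == 0x00 && buf[i + 1] == 0x00 && buf[i + 2] == 0x01 && buf[i + 3] == 0xBA
        })
        .count()
}
```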
Carlos Fernandez Sanz
a199f4f8af Merge pull request #1996 from THE-Amrit-mahto-05/fix/demuxer-panics
fix prevent MP4 & PS demuxer panics due to out-of-bounds/underflow
2026-01-11 00:06:35 +01:00
Harsh Sahu
eea049923d add defensive length check to validate_cc_pair 2026-01-11 04:21:00 +05:30
Carlos Fernandez Sanz
d999c3e0e0 Merge pull request #1985 from x15sr71/docs/homebrew-install
docs: Add Homebrew installation instructions to COMPILATION.MD
2026-01-10 23:43:42 +01:00
Carlos Fernandez
aac90d5a5f fix(rust): Remove dead code returning pointer to stack variable
Delete the unused `impl FromCType<*mut PMT_entry> for *mut PMTEntry`
implementation which had a critical bug: it returned a pointer to a
stack-allocated PMTEntry, causing undefined behavior (dangling pointer).

This code was never called anywhere in the codebase. The actual usage
in demuxer.rs uses the value-returning variant `FromCType<PMT_entry>
for PMTEntry` with explicit `Box::into_raw(Box::new(...))` wrapping,
which is the correct pattern.

Rather than fixing dead buggy code, just remove it.

Supersedes #1988

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 23:41:32 +01:00
Carlos Fernandez Sanz
618df184c6 Merge pull request #2011 from THE-Amrit-mahto-05/fix/demuxer-allocator-mismatch
Fix allocator mismatch in Rust demuxer (use malloc/free instead of Box)
2026-01-10 23:21:16 +01:00
Chandragupt
5e6aab8972 fix(snap): drop snap-injected command argument in runtime wrapper 2026-01-11 01:10:29 +05:30
Amrit kumar Mahto
a77c21c06c fix: allocator mismatch in demuxer (use malloc/free instead of Box) 2026-01-11 00:49:17 +05:30
Carlos Fernandez Sanz
4252703431 fix(matroska): Prevent infinite loop on truncated MKV files 2026-01-10 13:16:12 +01:00
Carlos Fernandez Sanz
1af2a29a3c fix: Prevent NULL pointer dereference in DVB subtitle decoder 2026-01-10 11:18:56 +01:00
Carlos Fernandez Sanz
8ab474c593 fix: Remove debug println that printed spurious numbers during processing 2026-01-10 11:18:20 +01:00
Carlos Fernandez
1c781c2a38 fix: Correct progress time display for multi-program TS files
Multi-program transport stream files can have different PCR (Program
Clock Reference) bases for each program. For example, one program might
have timestamps starting at 23 hours, another at 25 hours. This caused
the progress time display to show wildly incorrect values like "265:45"
for a 6-second file.

The fix tracks the minimum timestamp offset seen across all programs and
uses that as the baseline. When timestamps from programs with higher PCR
bases are encountered (offset > 60 seconds from minimum), the display
falls back to showing time relative to the minimum baseline.

Changes:
- Add min_global_timestamp_offset field to lib_ccx_ctx to track the
  minimum PCR-based offset seen
- Update progress display logic in general_loop.c to normalize times
  relative to the minimum offset
- Apply same fix to both live stream and file processing modes

Test results with multi-program DVB teletext sample (dvbt.ts):
- Before: 1% | 265:45, 2% | 00:00, 3% | 263:11, ... (jumping wildly)
- After:  1% | 00:00, 2% | 00:00, ... 87% | 00:05, 100% | 00:00 (stable)

Single-program files continue to work correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:57:57 +01:00
Carlos Fernandez
4d718378d5 fix: Remove debug println that printed spurious numbers during processing
Removes a debug println statement in the Rust timestamp conversion code
that was printing the hours value when it exceeded 24. This caused
spurious numbers (like "25") to appear in the output when processing
files with PTS timestamps that exceeded 24 hours.

The debug code was likely left over from development/debugging and
should not be present in production code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:50:33 +01:00
Carlos Fernandez
1bd4cd5c0a fix: Prevent NULL pointer dereference in DVB subtitle decoder
Add NULL check for `region` before accessing `region->bgcolor` in
the OCR processing block of `write_dvb_sub()`.

The bug occurs when processing DVB subtitles where `get_region()`
returns NULL for all display items in the list. After the display
processing loop, `region` may be NULL, but the code attempted to
access `region->bgcolor` unconditionally, causing a segfault.

The crash manifested as:
- Valgrind: "Invalid read of size 4 at address 0x18"
- The 0x18 offset corresponds to the `bgcolor` field in DVBSubRegion

Testing with bbc_small.ts:
- Before: SIGSEGV crash at 0% processing
- After: 100% processing, 50+ subtitles extracted successfully

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 10:20:50 +01:00
Carlos Fernandez
067045ce92 fix(matroska): Prevent infinite loop on truncated MKV files
When parsing truncated MKV files, the Matroska parser would enter an
infinite loop. This happened because:

1. At EOF, fgetc() returns -1 which becomes 0xFF when cast to UBYTE
2. Reading 4 EOF bytes creates element code 0xFFFFFFFF (unknown element)
3. The "skip unknown element" logic reads another 0xFF as vint length (127)
4. FSEEK past EOF clears the EOF flag without error
5. The while loop condition (pos + len > get_current_byte) never becomes
   false because the recorded segment length is larger than the file

The fix adds feof() checks after each mkv_read_byte() call in all
parsing loops. This detects EOF immediately after reading and breaks
out of the loop cleanly.

Tested with truncated MKV samples (ticket1398-orig.mkv, azumi.mkv)
that previously caused timeouts - now complete in under a second.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-10 09:50:21 +01:00
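The actual fix adds feof() checks in the C parser. As an illustration of the same idea, the Rust sketch below makes end of file an explicit `None` instead of a 0xFF byte, so a truncated read cannot be mistaken for element code 0xFFFFFFFF. The function names are hypothetical, and the fixed 4-byte ID follows the description in the commit message rather than the full EBML variable-length rules.

```rust
use std::io::Read;

/// Reads one byte, returning None at end of file or on any read error.
fn read_byte<R: Read>(reader: &mut R) -> Option<u8> {
    let mut byte = [0u8; 1];
    match reader.read_exact(&mut byte) {
        Ok(()) => Some(byte[0]),
        Err(_) => None,
    }
}

/// Reads a 4-byte element code and stops as soon as the input ends.
fn read_element_id<R: Read>(reader: &mut R) -> Option<u32> {
    let mut id = 0u32;
    for _ in 0..4 {
        id = (id << 8) | u32::from(read_byte(reader)?);
    }
    Some(id)
}
```

A parsing loop built on this would break out on the first `None` instead of seeking past the end of the file.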
Carlos Fernandez Sanz
2f2904041c prevent unsafe Vec::set_len causing heap corruption 2026-01-09 23:45:34 +01:00
Carlos Fernandez Sanz
d837c369e5 fix prevent FFI memory leaks in demuxer sync 2026-01-09 23:44:52 +01:00
Carlos Fernandez Sanz
686ff69fdc Docs: clarify Linux autotools build and Rust dependency 2026-01-09 23:43:10 +01:00
Carlos Fernandez Sanz
126835d998 Merge pull request #1850 from gaurav02081/gaurav-v1
[FIX] -out=spupng with EIA608/teletext: offset values in XML may be not correct #893
2026-01-09 23:25:58 +01:00
Akhilesh
6e170cd812 Docs: clarify Linux autotools build and Rust dependency 2026-01-09 21:02:18 +05:30
Rahul Tripathi
fe921626e1 Fix: Off-by-one bounds check and encoding corruption
- telxcc.c: Use array_length macro for G0_LATIN_NATIONAL_SUBSETS
  bounds check instead of hardcoded value. Prevents potential
  access to uninitialized memory when index equals array size.
- misc.h: Fix UTF-8 encoding of author name (Iñaki García Etxebarria)
2026-01-09 16:02:10 +05:30
Amrit kumar Mahto
6578f0ff34 fix(avc): prevent unsafe Vec::set_len causing heap corruption 2026-01-09 05:15:57 +05:30
Amrit kumar Mahto
1911068e92 fix(rust): prevent FFI memory leaks in demuxer sync 2026-01-08 14:46:56 +05:30
Chandragupt
493495361d ci(snap): use stable GitHub Actions v6 and make runtime library resolution robust 2026-01-08 09:24:25 +05:30
Chandragupt
643857e98f docs: add changelog entry for Snap packaging 2026-01-08 06:09:33 +05:30
Chandragupt
05adb5f47e snap: add website and source-code metadata 2026-01-08 06:08:29 +05:30
Chandragupt
504877b928 ci(snap): remove temporary push trigger 2026-01-08 06:08:29 +05:30
Chandragupt
64ee63a560 ci(snap): enable push trigger for snap workflow (temporary) 2026-01-08 06:08:00 +05:30
Chandragupt
270c603bd2 ci(snap): add GitHub Actions workflow for Snapcraft-based builds 2026-01-08 06:06:13 +05:30
dependabot[bot]
6d356b4458 chore(deps): bump dawidd6/action-homebrew-bump-formula from 4 to 7 (#1989)
Bumps [dawidd6/action-homebrew-bump-formula](https://github.com/dawidd6/action-homebrew-bump-formula) from 4 to 7.
- [Release notes](https://github.com/dawidd6/action-homebrew-bump-formula/releases)
- [Commits](https://github.com/dawidd6/action-homebrew-bump-formula/compare/v4...v7)

---
updated-dependencies:
- dependency-name: dawidd6/action-homebrew-bump-formula
  dependency-version: '7'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-08 01:24:47 +01:00
Carlos Fernandez Sanz
cfb10d4b91 fix: Delete empty output files instead of leaving 0-byte files (#1282) (#1877)
When using --output-field both (formerly -12), CCExtractor creates
separate output files for each field. If one field has no captions,
a 0-byte file was left behind, which is confusing for users.

This fix checks the file size in dinit_write() before closing.
If the file is empty (0 bytes), it deletes the file and prints
an informational message.

This is a simpler approach than deferred file creation - files are
still created at initialization but cleaned up if they remain empty.

Fixes #1282

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-08 01:23:28 +01:00
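A minimal sketch of the cleanup described in cfb10d4b91, written in Rust rather than the C of `dinit_write()`. The helper name `remove_if_empty` and the message text are hypothetical and are not taken from CCExtractor.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Deletes `path` when it is 0 bytes long.
/// Returns Ok(true) if the file was removed, Ok(false) if it was kept.
/// Hypothetical helper: the real check lives in C, inside dinit_write().
fn remove_if_empty(path: &Path) -> io::Result<bool> {
    let len = fs::metadata(path)?.len();
    if len == 0 {
        fs::remove_file(path)?;
        // Informational message, wording invented for this sketch.
        println!("Deleted empty output file: {}", path.display());
        return Ok(true);
    }
    Ok(false)
}
```

The file is still created up front and only inspected at close time, which matches the "clean up afterwards" approach the commit prefers over deferred creation.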
Amrit kumar Mahto
ca2b708023 fix: prevent MP4 & PS demuxer panics due to out-of-bounds/underflow (#1995) 2026-01-08 02:36:30 +05:30
Amrit kumar Mahto
10ac5ca6ce add safety checks and comments in Teletext decoder 2026-01-08 01:42:09 +05:30
Amrit kumar Mahto
333cfb3726 fix: Teletext decoder panic on malformed BCD data (#1990) 2026-01-08 01:26:17 +05:30
GAURAV KARMAKAR
c609f66c02 Removed Build Artifact 2026-01-08 01:03:54 +05:30
Gaurav karmakar
91f254017b Merge branch 'master' into gaurav-v1 2026-01-08 00:47:22 +05:30
GAURAV KARMAKAR
1f5d3df0ae Merge branch 'master' of https://github.com/gaurav02081/ccextractor into gaurav-v1 2026-01-08 00:35:33 +05:30
Rahul Tripathi
e36d81c237 Git Cleanup: Update .gitignore and untrack build artifacts 2026-01-07 21:38:36 +05:30
Rahul Tripathi
8d338dc362 Fix DVB subtitle repeating bug: initialize nb_data 2026-01-07 21:37:23 +05:30
Rahul Tripathi
c78e01d186 Merge branch 'CCExtractor:master' into final 2026-01-06 12:31:17 +05:30
Chandragupt Singh
401ff6c105 docs: note Homebrew availability in changelog 2026-01-06 06:04:57 +05:30
Chandragupt Singh
83eb51ed6f docs: add Homebrew installation instructions 2026-01-06 06:01:56 +05:30
Carlos Fernandez
bce0c92fdd ci: Add Homebrew formula auto-bump workflow
Automatically creates a PR to homebrew-core when a new release
is published, updating the ccextractor formula to the new version.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-06 00:08:40 +01:00
Rahul Tripathi
ea4859fd54 Fix: Add split_dvb_subs to Options default 2026-01-05 21:39:54 +05:30
Rahul Tripathi
8d7890c743 Merge branch 'master' into final 2026-01-05 21:10:51 +05:30
Carlos Fernandez Sanz
477307e438 chore: Bump version to 0.96.5 2026-01-05 16:02:39 +01:00
Carlos Fernandez
4a4911bcec chore: Bump version to 0.96.5
Update version number across all packaging and build files for the
0.96.5 release.

Files updated:
- docs/CHANGES.TXT - Added changelog entry
- src/lib_ccx/lib_ccx.h - VERSION define
- linux/configure.ac - AC_INIT version
- mac/configure.ac - AC_INIT version
- OpenBSD/Makefile - V variable
- package_creators/PKGBUILD - pkgver
- package_creators/ccextractor.spec - Version
- package_creators/debian.sh - VERSION
- packaging/chocolatey/ccextractor.nuspec - version
- packaging/chocolatey/tools/chocolateyInstall.ps1 - URL
- packaging/winget/*.yaml - PackageVersion and URLs

Note: SHA256 checksums in chocolatey and winget files will need to be
updated after the MSI is built.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 12:44:06 +01:00
Carlos Fernandez Sanz
dc946168e7 Fix OOB read/write and length handling in CEA-608/708 decoders 2026-01-05 12:36:31 +01:00
Carlos Fernandez Sanz
3a60b1268b Merge pull request #1981 from CCExtractor/fix/epg-snprintf-buffer-warning
fix(epg): Silence snprintf buffer truncation warnings
2026-01-05 12:33:15 +01:00
Carlos Fernandez
e3d1c56ad0 fix(epg): Silence snprintf buffer truncation warnings
Extend EPG time string buffers from 21 to 74 bytes to silence
compiler warnings about potential buffer truncation.

The actual output is always 20 chars ("YYYYMMDDHHMMSS +0000") plus
null terminator, but the compiler warns because %02d with int
arguments could theoretically produce larger output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-05 11:29:48 +01:00
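The length argument in e3d1c56ad0 can be illustrated with a small Rust sketch. The function `epg_time_string` is hypothetical; it only mirrors the "YYYYMMDDHHMMSS +0000" layout quoted in the commit and is not the C code from the EPG module.

```rust
/// Formats an EPG timestamp in the "YYYYMMDDHHMMSS +0000" layout.
/// Hypothetical stand-in for the C snprintf call discussed in the commit.
fn epg_time_string(year: i32, month: i32, day: i32, hour: i32, min: i32, sec: i32) -> String {
    // Zero-padded widths are minimums, not maximums: an out-of-range
    // integer still prints in full, which is what the compiler warns about.
    format!(
        "{:04}{:02}{:02}{:02}{:02}{:02} +0000",
        year, month, day, hour, min, sec
    )
}
```

With in-range calendar fields the result is always 20 characters, while an arbitrary `i32` in any field makes it longer; a fixed 21-byte C buffer is therefore safe in practice but not provably so to the compiler.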
Carlos Fernandez Sanz
b5bc0e2616 Fix OOB read/write in Teletext G0 charset remapping 2026-01-05 11:28:11 +01:00
Carlos Fernandez Sanz
600a9a0e75 Add support for raw CDP (Caption Distribution Packet) files 2026-01-05 10:55:59 +01:00
Amrit kumar Mahto
694b61f862 Fix OOB read/write in Teletext G0 charset remapping 2026-01-04 23:47:08 +05:30
Carlos Fernandez
86925727e0 Merge remote-tracking branch 'origin/master' into feat/issue-1406-raw-cdp-support 2026-01-04 17:20:04 +01:00
Carlos Fernandez Sanz
1c7515681e Fix MXF files containing CEA-708 captions not being detected/extracted 2026-01-04 17:17:33 +01:00
Carlos Fernandez Sanz
2bcac83761 Docs: Add Windows WSL build instructions 2026-01-04 14:33:34 +01:00
Carlos Fernandez
efc28d87d5 Trigger CI 2026-01-04 14:08:41 +01:00
Carlos Fernandez
b4d8e0ffaf Trigger CI 2026-01-04 14:08:26 +01:00
Carlos Fernandez
0b7b7fd031 Trigger CI 2026-01-04 12:56:44 +01:00
Carlos Fernandez
90041554a3 Fix Rust formatting and clippy issues
- Apply cargo fmt to decoder/mod.rs
- Fix clippy manual_flatten warning in build.rs by using .flatten()

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 12:55:35 +01:00
Carlos Fernandez
6950a7661e Fix MXF files containing CEA-708 captions not being detected/extracted
Root cause: CCX_RAW_TYPE data from MXF demuxer was not being passed to
the DTVCC decoder, only to the legacy 608 decoder via process_raw_with_field.

Changes:
- general_loop.c: Changed CCX_RAW_TYPE handling to use process_cc_data
  instead of process_raw_with_field to properly invoke DTVCC decoder
- general_loop.c: Added DTVCC activation for MXF/GXF sources since they
  may contain 708 captions
- general_loop.c: Initialize timing from caption PTS when not set
- ccx_dtvcc.h: Added ccxr_dtvcc_set_active FFI declaration
- lib.rs: Added ccxr_dtvcc_set_active function to enable DTVCC decoder
- decoder/mod.rs: Fixed flush logic to always process visible windows
- ccx_demuxer_mxf.c: Fixed PTS calculation to use 90kHz units based on
  edit_rate, and changed verbose logging to debug()

Fixes #1647

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 11:17:54 +01:00
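The commit mentions computing PTS in 90 kHz units from the MXF `edit_rate`. The sketch below is an assumption about what such a conversion can look like, not the code from `ccx_demuxer_mxf.c`; the function name and the choice to treat `edit_rate` as a numerator/denominator pair are hypothetical.

```rust
/// Converts an edit-unit (frame) index to 90 kHz clock ticks.
/// `rate_num / rate_den` is the edit rate in units per second,
/// e.g. 30000/1001 or 25/1. Returns None on a zero rate or overflow.
/// Hypothetical helper for illustration only.
fn edit_unit_to_pts_90khz(index: u64, rate_num: u32, rate_den: u32) -> Option<u64> {
    if rate_num == 0 {
        return None;
    }
    // ticks = index * 90000 * den / num, multiplying before dividing
    // so that fractional rates such as 30000/1001 stay exact.
    index
        .checked_mul(90_000)?
        .checked_mul(rate_den as u64)
        .map(|ticks| ticks / rate_num as u64)
}
```

For example, 30 edit units at 30000/1001 come out as 90090 ticks, slightly more than one second, as expected for 29.97 fps material.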
Carlos Fernandez
41fb966f6f Add support for raw CDP (Caption Distribution Packet) files
Adds support for processing raw CDP files captured from SDI VANC
(e.g., from Blackmagic Decklink capture cards). CDP packets are
automatically detected by their 0x9669 identifier when using -in=raw.

Changes:
- Added process_raw_cdp() function to parse concatenated CDP packets
- Added CDP format detection in raw_loop() (checks for 0x9669 header)
- Extracts cc_data triplets from CDP packets and processes them
  through process_cc_data() for both CEA-608 and CEA-708 support
- Calculates timing based on CDP frame rate and packet count

Usage:
  ccextractor -in=raw captured_vanc.bin -o output.srt

Fixes #1406

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 09:54:37 +01:00
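A rough Rust sketch of the detection and splitting steps listed in 41fb966f6f. It assumes the usual CDP header layout (a two-byte `0x9669` identifier followed by a one-byte length that covers the whole packet); the function names are hypothetical and this is not the C implementation of `process_raw_cdp()`.

```rust
/// Two-byte CDP identifier named in the commit message.
const CDP_IDENTIFIER: [u8; 2] = [0x96, 0x69];

/// Returns true if the buffer starts with the CDP identifier.
fn looks_like_cdp(data: &[u8]) -> bool {
    data.starts_with(&CDP_IDENTIFIER)
}

/// Splits a buffer of concatenated CDP packets into individual packets.
/// Assumption: byte 2 of each packet holds the total packet length.
/// Stops at the first malformed or truncated packet.
fn split_cdp_packets(data: &[u8]) -> Vec<&[u8]> {
    let mut packets = Vec::new();
    let mut pos = 0;
    while pos + 3 <= data.len() && looks_like_cdp(&data[pos..]) {
        let len = data[pos + 2] as usize;
        // A packet must at least contain its own 3-byte header
        // and must not run past the end of the buffer.
        if len < 3 || pos + len > data.len() {
            break;
        }
        packets.push(&data[pos..pos + len]);
        pos += len;
    }
    packets
}
```

Each returned slice would then be scanned for its cc_data section, and the resulting triplets handed to the regular caption path, as the commit describes.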
Carlos Fernandez
04ed95f8b5 Fix MXF files containing CEA-708 captions not being detected/extracted
Root cause: CCX_RAW_TYPE data from MXF demuxer was not being passed to
the DTVCC decoder, only to the legacy 608 decoder via process_raw_with_field.

Changes:
- general_loop.c: Changed CCX_RAW_TYPE handling to use process_cc_data
  instead of process_raw_with_field to properly invoke DTVCC decoder
- general_loop.c: Added DTVCC activation for MXF/GXF sources since they
  may contain 708 captions
- general_loop.c: Initialize timing from caption PTS when not set
- ccx_dtvcc.h: Added ccxr_dtvcc_set_active FFI declaration
- lib.rs: Added ccxr_dtvcc_set_active function to enable DTVCC decoder
- decoder/mod.rs: Fixed flush logic to always process visible windows
- ccx_demuxer_mxf.c: Fixed PTS calculation to use 90kHz units based on
  edit_rate, and changed verbose logging to debug()

Fixes #1647

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 09:54:20 +01:00
Carlos Fernandez
ddf29672fd Fix MXF files containing CEA-708 captions not being detected/extracted
Root cause: CCX_RAW_TYPE data from MXF demuxer was not being passed to
the DTVCC decoder, only to the legacy 608 decoder via process_raw_with_field.

Changes:
- general_loop.c: Changed CCX_RAW_TYPE handling to use process_cc_data
  instead of process_raw_with_field to properly invoke DTVCC decoder
- general_loop.c: Added DTVCC activation for MXF/GXF sources since they
  may contain 708 captions
- general_loop.c: Initialize timing from caption PTS when not set
- ccx_dtvcc.h: Added ccxr_dtvcc_set_active FFI declaration
- lib.rs: Added ccxr_dtvcc_set_active function to enable DTVCC decoder
- decoder/mod.rs: Fixed flush logic to always process visible windows
- ccx_demuxer_mxf.c: Fixed PTS calculation to use 90kHz units based on
  edit_rate, and changed verbose logging to debug()

Fixes #1647

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-04 09:53:30 +01:00
Kurma Ritish
0890e06d84 docs: add Windows WSL build instructions 2026-01-04 08:47:48 +00:00
Carlos Fernandez Sanz
8c33412888 Merge pull request #1971 from ujjwalr27/scc-accurate-timing
Tested with the Broadcast Source sample from issue #1120. The pre-roll timing calculation works correctly, and the output structure matches broadcast reference patterns.
2026-01-03 21:50:43 +01:00
ujjwalr27
f40294cc5c minor fix 2026-01-03 23:38:16 +05:30
ujjwalr27
22d5d35158 Fix SCC accurate timing: separate load/display timestamps, skip clear commands, pass YouTube validation 2026-01-03 22:38:16 +05:30
Amrit kumar Mahto
51cae1c2f0 Fix OOB read/write and length handling in CEA-608/708 decoders 2026-01-03 17:42:38 +05:30
Carlos Fernandez Sanz
dfaebd5db8 Merge pull request #1968 from THE-Amrit-mahto-05/fix/dtvcc-critical-bugs
fix DTVCC: Heap Buffer Overflow & Out-of-Bounds Read
2026-01-03 11:54:19 +01:00
Carlos Fernandez Sanz
cfa7d912ca fix(rust): Flush stdout after print to fix stream mode display 2026-01-03 11:38:25 +01:00
Carlos Fernandez
ad971f0e72 fix(rust): Flush stdout after print to fix stream mode display
When using --input <format>, the startup output showed [Stream mode: ]
(empty) instead of showing the format name like [Stream mode: SCC].

Root cause: The Rust logger's print() function uses print!() which
doesn't automatically flush stdout. When mixing C and Rust code that
both write to stdout, the Rust output was getting buffered and not
appearing before the C code continued writing.

The fix adds explicit std::io::stdout().flush() after each print!()
call to ensure output appears immediately and interleaves correctly
with C code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 23:22:46 +01:00
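The fix in ad971f0e72 boils down to "write, then flush". A minimal sketch, generic over any writer so it can be exercised without touching the real stdout; the helper name `print_and_flush` is hypothetical and is not the logger's actual method.

```rust
use std::io::{self, Write};

/// Writes `msg` and flushes immediately so the text cannot sit in a
/// buffer while other code (for example C code using stdio) keeps writing.
/// Hypothetical helper; the real logger wraps print!() instead.
fn print_and_flush<W: Write>(out: &mut W, msg: &str) -> io::Result<()> {
    out.write_all(msg.as_bytes())?;
    out.flush()
}
```

Against the process stdout this would be called as `print_and_flush(&mut io::stdout(), "[Stream mode: SCC]\n")`. With a buffered writer the difference is observable: without the `flush()` call the bytes stay in the buffer until it fills or is dropped.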
Carlos Fernandez Sanz
8aadbfb5f2 feat: Add --input scc option for SCC input format 2026-01-02 23:09:56 +01:00
Amrit kumar Mahto
44eb665cd8 chore: apply clang-format fixes 2026-01-03 03:12:19 +05:30
Amrit kumar Mahto
1255b318ae [FIX] Remove dead safety checks per reviewer feedback 2026-01-03 03:06:23 +05:30
Carlos Fernandez
1b0e66bc67 feat: Add --input scc option for SCC input format
Add support for `--input scc` command line option to explicitly specify
SCC (Scenarist Closed Caption) input format, for consistency with other
input format options.

Changes:
- Add `Scc` variant to `InFormat` enum in args.rs
- Handle `InFormat::Scc` in parser.rs to set StreamMode::Scc
- Add `StreamMode::Scc` case in print_cfg() in both Rust and C code

Fixes #1972

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 21:45:08 +01:00
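The three changes listed in 1b0e66bc67 follow a common pattern: a CLI enum gains a variant, the parser maps it to a stream mode, and the config printer gains a label. The sketch below uses heavily trimmed, hypothetical versions of `InFormat` and `StreamMode`; only the `Scc` variants and the "SCC" label come from the commit history.

```rust
/// Trimmed stand-in for the CLI input-format enum (args.rs in the commit).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum InFormat {
    Ts,
    Mp4,
    Scc, // the newly added variant
}

/// Trimmed stand-in for the internal stream mode.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum StreamMode {
    Transport,
    Mp4,
    Scc,
}

/// Parser step: map the user's choice to a stream mode.
fn stream_mode_for(format: InFormat) -> StreamMode {
    match format {
        InFormat::Ts => StreamMode::Transport,
        InFormat::Mp4 => StreamMode::Mp4,
        InFormat::Scc => StreamMode::Scc,
    }
}

/// Config-printing step: label shown as "[Stream mode: ...]".
/// Labels other than "SCC" are made up for this sketch.
fn stream_mode_label(mode: StreamMode) -> &'static str {
    match mode {
        StreamMode::Transport => "Transport",
        StreamMode::Mp4 => "MP4",
        StreamMode::Scc => "SCC",
    }
}
```

Because both `match` expressions are exhaustive, forgetting one of the three places after adding a variant becomes a compile error rather than a silent gap.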
Carlos Fernandez Sanz
f5dc1cf467 fix: Make --quiet flag work again 2026-01-02 21:35:42 +01:00
ujjwalr27
aaf937a135 Fix rustfmt style issues in lib_ccxr 2026-01-03 01:05:59 +05:30
ujjwalr27
317c66f14e Fix clang-format style issues 2026-01-03 01:02:19 +05:30
ujjwalr27
946c5859d4 Add --scc-accurate-timing option for bandwidth-aware SCC output (fixes #1120) 2026-01-03 00:28:16 +05:30
ujjwalr27
7166e48698 Add --scc-accurate-timing option for bandwidth-aware SCC output (fixes #1120) 2026-01-03 00:27:17 +05:30
Carlos Fernandez
d31ea87c03 fix: Make --quiet flag work again
The --quiet flag was broken due to two issues:

1. Inverted mapping in Rust FFI: The C→Rust constant mapping was wrong.
   CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1, CCX_MESSAGES_STDERR=2
   but the Rust code mapped 0→Stdout, 1→Stderr, 2→Quiet.

2. Logger initialization timing: The Rust logger was initialized BEFORE
   command-line arguments were parsed, so --quiet had no effect.

Changes:
- Fix the OutputTarget mapping in ccxr_init_basic_logger()
- Add set_target() method to CCExtractorLogger
- Add ccxr_update_logger_target() to update logger after arg parsing
- Call ccxr_update_logger_target() after ccxr_parse_parameters()

Fixes #1956

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 19:49:06 +01:00
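Both halves of the fix in d31ea87c03 can be sketched briefly. The constant values (0, 1, 2) and the names `OutputTarget` and `set_target` come from the commit message; the `Logger` struct and `would_print` are hypothetical simplifications of the real logger.

```rust
/// Log destination; mirrors the three C constants named in the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OutputTarget {
    Quiet,
    Stdout,
    Stderr,
}

/// Corrected mapping: CCX_MESSAGES_QUIET=0, CCX_MESSAGES_STDOUT=1,
/// CCX_MESSAGES_STDERR=2. Unknown values are rejected.
fn output_target_from_c(value: i32) -> Option<OutputTarget> {
    match value {
        0 => Some(OutputTarget::Quiet),
        1 => Some(OutputTarget::Stdout),
        2 => Some(OutputTarget::Stderr),
        _ => None,
    }
}

/// Hypothetical logger whose target can be updated after argument parsing.
struct Logger {
    target: OutputTarget,
}

impl Logger {
    fn set_target(&mut self, target: OutputTarget) {
        self.target = target;
    }

    fn would_print(&self) -> bool {
        self.target != OutputTarget::Quiet
    }
}
```

The intended order is: create the logger with a default target, parse the arguments, then call `set_target` with the parsed value so that `--quiet` takes effect.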
Amrit kumar Mahto
028ce9d0b5 [FIX] DTVCC: Heap Overflow & OOB Read 2026-01-02 18:33:26 +05:30
Amrit kumar Mahto
cc7a43b5e2 [FIX] Teletext decoder: fix OOB read/write and loop overflow (#1965) 2026-01-02 18:09:15 +05:30
Amrit kumar Mahto
3e1424cda8 Fix TS/ES: Integer overflow, stack overflow, heap over-read 2026-01-02 17:52:25 +05:30
Amrit kumar Mahto
82109e6cd9 Fix DTVCC structural type confusion and OOB writes (#1961) 2026-01-02 17:27:15 +05:30
Amrit kumar Mahto
5dc8292dd2 Fix out-of-bounds read in H.264 SEI parsing 2026-01-02 16:58:09 +05:30
Carlos Fernandez Sanz
a5b8bc8bf6 fix(rust): Update palette crate to 0.7 for Fedora compatibility 2026-01-02 10:00:00 +01:00
Rahul Tripathi
29158b2c38 Merge branch 'master' into final 2026-01-02 14:18:45 +05:30
Carlos Fernandez
ad2ee70743 fix(rust): Update palette crate to 0.7 for Fedora compatibility
The palette crate renamed `to_positive_degrees()` to `into_positive_degrees()`
in version 0.7.0. This was causing build failures on Fedora which uses
system-packaged Rust crates with newer versions.

Changes:
- Update palette dependency from 0.6.1 to 0.7
- Change method call from to_positive_degrees() to into_positive_degrees()

Fixes build failure reported in #1954.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-02 08:11:47 +01:00
Carlos Fernandez Sanz
562de8893b Merge pull request #1953 from THE-Amrit-mahto-05/fix/ts-heap-overflow
Fix TS heap overflow
2026-01-02 08:09:39 +01:00
Carlos Fernandez Sanz
12adb5e92b fix(ci): Fix Windows CI cargo build cache path 2026-01-02 08:06:22 +01:00
Carlos Fernandez Sanz
203eb23030 fix(build): Support FFMPEG_INCLUDE_DIR on Linux for hardsubx 2026-01-02 08:02:46 +01:00
Amrit Kumar Mahto
774c3a0d3a Update CHANGES.TXT 2026-01-02 04:31:39 +05:30
Amrit Kumar Mahto
07f1ddc3fe Fix capbufsize and capbuflen assignments to use size_t 2026-01-02 04:26:23 +05:30
Carlos Fernandez
303bec8d5d fix(build): Support FFMPEG_INCLUDE_DIR on Linux for hardsubx
The FFMPEG_INCLUDE_DIR environment variable was only checked inside
the macOS-specific block, so it had no effect on Linux builds.

Changes:
- Move FFMPEG_INCLUDE_DIR check outside platform-specific blocks so
  it works on all platforms
- Add pkg-config fallback on Linux to automatically find FFmpeg
  include paths

This fixes compilation on systems like Fedora where FFmpeg headers
are installed in non-standard locations (e.g., /usr/include/ffmpeg).

Fixes #1954

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 23:24:44 +01:00
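The lookup order described in 303bec8d5d can be sketched as a pure function. Everything here is an assumption for illustration: the function name is hypothetical, and whether the real build script replaces or extends the pkg-config paths when `FFMPEG_INCLUDE_DIR` is set is not stated in the commit.

```rust
use std::path::PathBuf;

/// Chooses FFmpeg include directories for the build.
/// `env_override` is the value of FFMPEG_INCLUDE_DIR, if any;
/// `pkg_config_dirs` are paths a pkg-config probe might return.
/// Assumption: the environment variable wins on every platform and
/// pkg-config is used only as a fallback.
fn ffmpeg_include_dirs(env_override: Option<&str>, pkg_config_dirs: &[&str]) -> Vec<PathBuf> {
    match env_override {
        Some(dir) if !dir.trim().is_empty() => vec![PathBuf::from(dir)],
        _ => pkg_config_dirs.iter().map(|d| PathBuf::from(*d)).collect(),
    }
}
```

In a build script the first argument would come from `std::env::var("FFMPEG_INCLUDE_DIR").ok().as_deref()`, evaluated outside any platform-specific block.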
Amrit kumar Mahto
e43a6b5ced Fix TS Heap Buffer Overflow in copy_payload_to_capbuf (ts_functions.c) 2026-01-02 00:59:31 +05:30
Amrit kumar Mahto
64484af49e [FIX] Prevent stack buffer overflow in ISDB-CC decoder parse_csi 2026-01-02 00:40:07 +05:30
Amrit kumar Mahto
7526da884c Prevent integer overflow in EIA-608 screen buffer reallocation 2026-01-01 23:20:25 +05:30
Carlos Fernandez Sanz
3529bb29b4 fix(avc): Remove unnecessary TODO for idr_pic_id 2026-01-01 13:02:25 +01:00
Carlos Fernandez
925560f773 fix(avc): Remove unnecessary TODO for idr_pic_id
The idr_pic_id is read to advance the bitstream position (required for
correct parsing of subsequent fields), but the value itself is not
needed for caption extraction. CCExtractor uses pic_order_cnt_lsb for
frame ordering and PTS for timing - idr_pic_id serves no purpose here.

Closes #1895

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 12:58:55 +01:00
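The point made in 925560f773, that a field has to be read even when its value is ignored, is easiest to see with a toy bit reader. The sketch below assumes the field is coded as unsigned Exp-Golomb, ue(v), as slice-header fields such as `idr_pic_id` are in H.264; the `BitReader` type is hypothetical and skips details such as emulation-prevention bytes.

```rust
/// Minimal MSB-first bit reader (hypothetical, for illustration only).
struct BitReader<'a> {
    data: &'a [u8],
    pos: usize, // position in bits
}

impl<'a> BitReader<'a> {
    fn new(data: &'a [u8]) -> Self {
        BitReader { data, pos: 0 }
    }

    /// Reads one bit, or returns None at end of input.
    fn read_bit(&mut self) -> Option<u32> {
        let byte = *self.data.get(self.pos / 8)?;
        let bit = (byte >> (7 - self.pos % 8)) & 1;
        self.pos += 1;
        Some(bit as u32)
    }

    /// Reads an unsigned Exp-Golomb value: N leading zeros, a 1 bit,
    /// then N suffix bits. The result is 2^N - 1 + suffix.
    fn read_ue(&mut self) -> Option<u32> {
        let mut zeros: u32 = 0;
        while self.read_bit()? == 0 {
            zeros += 1;
            if zeros > 31 {
                return None; // not representable in u32
            }
        }
        let mut suffix = 0u32;
        for _ in 0..zeros {
            suffix = (suffix << 1) | self.read_bit()?;
        }
        Some((1u32 << zeros) - 1 + suffix)
    }
}
```

Since ue(v) is variable-length, the only way to find where the next field starts is to decode the current one; a caller can write `let _idr_pic_id = reader.read_ue()?;` and discard the value while still advancing `pos`.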
Carlos Fernandez
200eb1750a fix(ci): Fix Windows CI cargo build cache path
- Fix cargo build cache path: rust.bat sets CARGO_TARGET_DIR to the
  windows/ directory, which results in artifacts at
  windows/x86_64-pc-windows-msvc/, not windows/target/
- Remove redundant CARGO_TARGET_DIR from build steps since rust.bat
  overrides it anyway

Note: vcpkg.json builtin-baseline intentionally not changed to avoid
breaking transitive dependencies (libxml2 etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 12:44:18 +01:00
Carlos Fernandez Sanz
6dcdb4b2d8 chore: Bump version to 0.96.4 2026-01-01 10:52:36 +01:00
Carlos Fernandez Sanz
a2d2c4f063 Merge branch 'master' into release/0.96.4 2026-01-01 10:39:12 +01:00
Carlos Fernandez
4ab6c83c27 chore: Bump version to 0.96.4
Update version numbers across all packaging and build files for the
0.96.4 release.

Changes in 0.96.4:
- New: Persistent CEA-708 decoder context
- New: OCR character blacklist options
- New: OCR line-split option
- Fix: 32-bit build failures (i686, armv7l)
- Fix: Legacy argument compatibility (-1, -2, -12, --sc, --svc)
- Fix: Prevent heap buffer overflow in Teletext (security)
- Fix: Lazy OCR initialization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 10:17:56 +01:00
Carlos Fernandez Sanz
e66a0183c3 Merge pull request #1941 from Harshdhall01/cleanup-rust-todos
[RUST] Document EIA-708 buffer size and remove debug logging
2026-01-01 09:59:22 +01:00
Carlos Fernandez Sanz
a8ec28630a Merge pull request #1934 from THE-Amrit-mahto-05/fix/teletext-overflow
prevent heap buffer overflow in Teletext demux path
2026-01-01 09:53:01 +01:00
Carlos Fernandez Sanz
432d4237ec ci(windows): Optimize Windows build workflow for faster CI 2026-01-01 09:42:19 +01:00
Carlos Fernandez
e9519c4a67 fix(ci): Remove broken Chocolatey caching for GPAC
The Chocolatey cache only stored package metadata, not the actual
installed SDK files at C:\Program Files\GPAC\sdk\include. This caused
build failures when the cache hit but GPAC headers weren't available.

GPAC install is fast (~30s) so caching isn't worth the complexity.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 09:31:11 +01:00
Carlos Fernandez Sanz
fef005ddaf perf(dvb): Lazy OCR initialization for DVB subtitle decoder 2026-01-01 02:48:22 +01:00
Carlos Fernandez
546c776e57 ci(windows): Optimize Windows build workflow for faster CI
Major optimizations to reduce Windows build time from ~45 min to ~10 min:

1. **Single consolidated job** - Previously two parallel jobs (Release/Debug)
   duplicated the entire 34-minute vcpkg install. Now builds both
   configurations sequentially in one job, sharing all cached dependencies.

2. **lukka/run-vcpkg action** - Replaces manual git clone + bootstrap with
   the official vcpkg action that has built-in caching and better handling.

3. **Cache vcpkg installed packages** - Separately cache the installed/
   directory with hash-based keys for faster cache hits.

4. **Cargo caching** - Add caching for Rust registry and build artifacts,
   similar to the Linux build workflow.

5. **Chocolatey caching** - Cache gpac package to skip download on hits.

6. **Conditional installs** - Skip vcpkg install and choco install when
   cache is available.

7. **Updated Rust toolchain action** - Replace deprecated actions-rs/toolchain
   with dtolnay/rust-toolchain.

Expected improvements:
- Cold build: ~20 minutes (down from ~45 min)
- Warm build (cache hit): ~5-10 minutes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 02:03:35 +01:00
Carlos Fernandez Sanz
daeed5df71 fix(args): Add legacy aliases for backwards compatibility 2026-01-01 01:49:59 +01:00
Carlos Fernandez
b56ab005a8 perf(dvb): Lazy OCR initialization for DVB subtitle decoder
Previously, Tesseract OCR was initialized eagerly when a DVB subtitle
stream was detected in the transport stream. This caused ~10 second
startup overhead even for files that:
- Have DVB streams but no actual bitmap subtitles
- Have DVB streams alongside CEA-608 text captions (which don't need OCR)
- Have DVB streams but the user only wants raw bitmap output

The initialization also created OpenMP worker threads that generated
hundreds of thousands of futex syscalls, causing valgrind tests to
take 15+ minutes instead of seconds.

This change defers OCR initialization until a DVB bitmap region actually
needs to be processed with OCR. Benefits:

- Files with DVB streams but no bitmap content: 10s → 0.1s
- Files with DVB + CEA-608 captions: 10s → 1-3s
- Valgrind test performance: 15+ min → seconds (no thread pool overhead
  when OCR isn't used)

The ocr_initialized flag ensures init_ocr() is called only once, on
first bitmap encounter.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 01:26:27 +01:00
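The deferred initialization in b56ab005a8 can be sketched as follows. The real change is in C and uses an `ocr_initialized` flag around `init_ocr()`; the Rust types below (`OcrEngine`, `DvbDecoder`) are hypothetical stand-ins, and an `Option` plays the role of the flag.

```rust
/// Stand-in for an expensive OCR engine (e.g. a Tesseract handle).
struct OcrEngine;

impl OcrEngine {
    /// Imagine this taking several seconds and spawning worker threads.
    fn init() -> Self {
        OcrEngine
    }

    /// Placeholder "recognition": returns the number of bytes it saw.
    fn recognize(&self, bitmap: &[u8]) -> usize {
        bitmap.len()
    }
}

/// Hypothetical decoder that creates its OCR engine on first use.
struct DvbDecoder {
    ocr: Option<OcrEngine>,
    init_calls: u32, // only here so the sketch can be checked
}

impl DvbDecoder {
    fn new() -> Self {
        // Detecting a DVB stream no longer pays the OCR startup cost.
        DvbDecoder { ocr: None, init_calls: 0 }
    }

    fn process_bitmap(&mut self, bitmap: &[u8]) -> usize {
        let calls = &mut self.init_calls;
        // Initialize exactly once, on the first bitmap that needs OCR.
        let engine = self.ocr.get_or_insert_with(|| {
            *calls += 1;
            OcrEngine::init()
        });
        engine.recognize(bitmap)
    }
}
```

A stream that never produces a bitmap region never calls `OcrEngine::init()`, which is where the startup and valgrind savings quoted in the commit come from.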
Carlos Fernandez
f1681ee929 fix(args): Add support for legacy -1, -2, -12 numeric options
Map legacy CEA-608 field extraction options to their modern equivalent:
- -1  → --output-field=1 (extract field 1 only)
- -2  → --output-field=2 (extract field 2 only)
- -12 → --output-field=12 (extract both fields)

These options are documented in the help text and were commonly used
but stopped working after the Rust argument parser migration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 01:02:54 +01:00
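The mapping in f1681ee929 is a plain lookup. A minimal sketch, with hypothetical function names, that rewrites the three legacy options before the arguments reach the parser:

```rust
/// Maps a legacy numeric option to its modern spelling, as listed in
/// the commit message. Anything else is left to the normal parser.
fn map_legacy_field_option(arg: &str) -> Option<&'static str> {
    match arg {
        "-1" => Some("--output-field=1"),
        "-2" => Some("--output-field=2"),
        "-12" => Some("--output-field=12"),
        _ => None,
    }
}

/// Rewrites a whole argument list, leaving unknown arguments untouched.
fn rewrite_legacy_args(args: &[&str]) -> Vec<String> {
    args.iter()
        .map(|a| map_legacy_field_option(a).unwrap_or(a).to_string())
        .collect()
}
```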
Carlos Fernandez
031f463b5c fix(args): Add legacy aliases for backwards compatibility
Add aliases for options that were commonly used with single-dash
or without hyphens in older versions of ccextractor:

- --parsePAT: add alias "pat" (for -pat)
- --parsePMT: add alias "pmt" (for -pmt)
- --no-teletext: add alias "noteletext" (for -noteletext)
- --no-rollup: add alias "noru" (for -noru)
- --no-bom: add alias "nobom" (for -nobom)
- --no-autotimeref: add alias "noautotimeref" (for -noautotimeref)
- --no-scte20: add alias "noscte20" (for -noscte20)

These aliases, combined with normalize_legacy_option() which converts
single-dash to double-dash (e.g., -noteletext -> --noteletext), allow
old scripts using legacy syntax to continue working.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2026-01-01 00:42:23 +01:00
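The single-dash to double-dash conversion mentioned in 031f463b5c might look roughly like this. The commit only gives the function name `normalize_legacy_option()` and one example; the rules below (leave two-character short flags and purely numeric options alone) are assumptions, so the sketch uses a different name to avoid implying it is the real implementation.

```rust
/// Turns a legacy long option written with one dash into the
/// double-dash form, e.g. "-noteletext" -> "--noteletext".
/// Assumptions: short flags such as "-o" and numeric options such as
/// "-12" are handled elsewhere and must pass through unchanged.
fn normalize_legacy_option_sketch(arg: &str) -> String {
    let single_dash = arg.starts_with('-') && !arg.starts_with("--");
    if single_dash && arg.len() > 2 {
        let body = &arg[1..];
        if !body.chars().all(|c| c.is_ascii_digit()) {
            return format!("-{}", arg);
        }
    }
    arg.to_string()
}
```

Once the dash is normalized, the aliases added in the commit (for example `noteletext` for `--no-teletext`) let the parser resolve the old spelling.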
Carlos Fernandez Sanz
b23866f5a8 feat(rust): Add persistent DtvccRust context for CEA-708 decoder 2026-01-01 00:21:40 +01:00
Carlos Fernandez
2ec93c3d3d fix(rust): Check dtvcc_rust instead of dtvcc in ccxr_process_cc_data
When Rust CEA-708 decoder is enabled, dec_ctx.dtvcc is set to NULL
and dec_ctx.dtvcc_rust holds the actual DtvccRust context. The null
check was incorrectly checking dtvcc, causing the function to return
early and skip all CEA-708 data processing.

This fixes tests 21, 31, 32, 105, 137, 141-149 which were failing
with exit code 10 (EXIT_NO_CAPTIONS) because no captions were being
extracted from CEA-708 streams.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 19:47:24 +01:00
Harshdhall01
5564aa8a54 Merge upstream/master and resolve CHANGES.TXT conflict 2025-12-31 23:51:24 +05:30
Harshdhall01
868fac5423 Update CHANGES.TXT with Rust documentation improvements 2025-12-31 23:33:49 +05:30
Harshdhall01
9ca26171d6 Document EIA-708 buffer size and remove debug logging
- Added documentation for EIA_708_BUFFER_LENGTH explaining that 2048 bytes
  is 16x the CEA-708 specification minimum of 128 bytes per service
- Removed debug logging of target address from target.rs as per TODO
- References CEA-708-E Section 8.4.3 for buffer specifications

Addresses two TODO items in the Rust codebase cleanup effort.
2025-12-31 23:24:39 +05:30
Carlos
ead4cbb278 fix(rust): remove double-increment of cb_708 counter
The cb_708 counter was being incremented twice for each CEA-708 data block:
1. In do_cb_dtvcc_rust() in Rust (src/rust/src/lib.rs)
2. In do_cb() in C (src/lib_ccx/ccx_decoders_common.c)

Since FTS calculation uses cb_708 (fts = fts_now + fts_global + cb_708 * 1001 / 30),
the double-increment caused timestamps to advance ~2x as fast as expected,
resulting in incorrect milliseconds in start timestamps.

This fix removes the increment from the Rust code since the C code already
handles it in do_cb().

Fixes timestamp issues reported in PR #1782 tests where start times like
00:00:20,688 were incorrectly output as 00:00:20,737.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
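The effect of the double increment in ead4cbb278 follows directly from the formula quoted in the commit. A small sketch; the function name and the choice of milliseconds in `i64` are assumptions, the formula itself is from the commit message.

```rust
/// FTS as given in the commit:
/// fts = fts_now + fts_global + cb_708 * 1001 / 30 (integer arithmetic).
fn fts_ms(fts_now: i64, fts_global: i64, cb_708: i64) -> i64 {
    fts_now + fts_global + cb_708 * 1001 / 30
}
```

If ten CEA-708 blocks have been seen, counting each once contributes `10 * 1001 / 30 = 333` ms, while counting each twice contributes `20 * 1001 / 30 = 667` ms, so the block-derived part of the timestamp advances about twice as fast as it should.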
Carlos
dfd7101f54 chore: Remove plan file from repo and add plans/ to .gitignore
- Move PLAN_PR1618_REIMPLEMENTATION.md to local plans/ folder
- Add plans/ to .gitignore to keep plans local

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
Carlos
9659d3cf4c fix(rust): Use persistent DtvccRust context in ccxr_process_cc_data
The ccxr_process_cc_data function was still accessing dec_ctx.dtvcc
(which is NULL when Rust is enabled), causing a null pointer panic.

Changed to use dec_ctx.dtvcc_rust (the persistent DtvccRust context)
instead, which fixes the crash when processing CEA-708 data.

Added do_cb_dtvcc_rust() function that works with DtvccRust instead
of the old Dtvcc struct.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:18:13 +01:00
Carlos
34c7cd6d2e style(c): Fix clang-format issues in Phase 3 code
- Remove extra space before comment in ccx_decoders_common.c
- Fix comment indentation in mp4.c

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos
7448a260c7 feat(c): Use Rust CEA-708 decoder in C code (Phase 3)
- init_cc_decode(): Initialize dtvcc_rust via ccxr_dtvcc_init()
- dinit_cc_decode(): Free dtvcc_rust via ccxr_dtvcc_free()
- flush_cc_decode(): Flush via ccxr_flush_active_decoders()
- general_loop.c: Set encoder via ccxr_dtvcc_set_encoder() (3 locations)
- mp4.c: Use ccxr_dtvcc_set_encoder() and ccxr_dtvcc_process_data()
- Add ccxr_dtvcc_is_active() declaration to ccx_dtvcc.h
- Fix clippy warnings in tv_screen.rs (unused assignments)
- All changes guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 3 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos
54236f840c feat(c): Add C header declarations for Rust CEA-708 FFI (Phase 2)
- Add void *dtvcc_rust field to lib_cc_decode struct
- Declare ccxr_dtvcc_init, ccxr_dtvcc_free, ccxr_dtvcc_process_data in ccx_dtvcc.h
- Declare ccxr_dtvcc_set_encoder in lib_ccx.h
- Declare ccxr_flush_active_decoders in ccx_decoders_common.h
- All declarations guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 2 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:16:31 +01:00
Carlos Fernandez Sanz
f2aeef167b feat(ocr): Add character blacklist and line-split options for better accuracy 2025-12-31 14:16:15 +01:00
Carlos
6a4a1c97ec fix(rust): Address PR review - use existing DTVCC_MAX_SERVICES constant
- Remove duplicate CCX_DTVCC_MAX_SERVICES constant from decoder/mod.rs
- Import existing DTVCC_MAX_SERVICES from lib_ccxr::common
- Fix clippy uninlined_format_args warnings in avc/core.rs and decoder/mod.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
Carlos
f369959096 style(rust): Apply cargo fmt formatting
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
Carlos
1c2bcb5088 feat(rust): Add persistent DtvccRust context for CEA-708 decoder (Phase 1)
This is Phase 1 of the fix for issue #1499. It adds the Rust-side
infrastructure for a persistent CEA-708 decoder context without
modifying any C code, ensuring backward compatibility.

Problem:
The current Rust CEA-708 decoder creates a new Dtvcc struct on every
call to ccxr_process_cc_data(), causing all state to be reset. This
breaks stateful caption processing.

Solution:
Add a new DtvccRust struct that:
- Owns its decoder state (rather than borrowing from C)
- Persists across processing calls
- Is managed via FFI functions callable from C

Changes:
- Add DtvccRust struct in decoder/mod.rs with owned decoders
- Add CCX_DTVCC_MAX_SERVICES constant (63)
- Add FFI functions in lib.rs:
  - ccxr_dtvcc_init(): Create persistent context
  - ccxr_dtvcc_free(): Free context and all owned memory
  - ccxr_dtvcc_set_encoder(): Set encoder (not available at init)
  - ccxr_dtvcc_process_data(): Process CC data
  - ccxr_flush_active_decoders(): Flush all active decoders
  - ccxr_dtvcc_is_active(): Check if context is active
- Add unit tests for DtvccRust
- Use heap allocation for large structs to avoid stack overflow

The existing Dtvcc struct and ccxr_process_cc_data() remain unchanged
for backward compatibility. Phase 2-3 will add C header declarations
and modify C code to use the new functions.

Fixes: #1499 (partial)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 14:15:29 +01:00
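The ownership pattern behind the init/process/free functions in 1c2bcb5088 is the standard "boxed context behind an opaque pointer". The sketch below is not the `DtvccRust` code: `DemoContext` and the `demo_ctx_*` functions are hypothetical, and the `#[no_mangle]` attribute a real C-callable export needs is left out so the snippet stays edition-independent.

```rust
use std::os::raw::c_int;

/// Hypothetical decoder state that must survive between calls.
pub struct DemoContext {
    pub bytes_seen: u64,
}

/// Creates the context on the heap and hands ownership to the caller.
pub extern "C" fn demo_ctx_init() -> *mut DemoContext {
    Box::into_raw(Box::new(DemoContext { bytes_seen: 0 }))
}

/// Processes a buffer using the persistent context.
/// Returns 0 on success, -1 on a null pointer.
///
/// # Safety
/// `ctx` must come from `demo_ctx_init` and `data` must point to `len` bytes.
pub unsafe extern "C" fn demo_ctx_process(
    ctx: *mut DemoContext,
    data: *const u8,
    len: usize,
) -> c_int {
    if ctx.is_null() || data.is_null() {
        return -1;
    }
    let ctx = unsafe { &mut *ctx };
    let buf = unsafe { std::slice::from_raw_parts(data, len) };
    // State accumulates across calls instead of being reset each time.
    ctx.bytes_seen += buf.len() as u64;
    0
}

/// Frees the context. Passing null is a no-op.
///
/// # Safety
/// `ctx` must come from `demo_ctx_init` and must not be used afterwards.
pub unsafe extern "C" fn demo_ctx_free(ctx: *mut DemoContext) {
    if !ctx.is_null() {
        drop(unsafe { Box::from_raw(ctx) });
    }
}
```

On the C side the pointer would be stored as an opaque `void *` field, created once at decoder init and released once at teardown, which is what keeps state from being reset on every processing call.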
Carlos Fernandez Sanz
da79ee44d9 fix(rust): Fix 32-bit build failures (i686, armv7l) 2025-12-31 13:16:17 +01:00
Carlos Fernandez Sanz
26434a7f89 fix(args): Add --sc alias for --sentencecap for backwards compatibility 2025-12-31 13:02:50 +01:00
Carlos Fernandez
718eb1a37f fix(args): Add --sc alias for --sentencecap for backwards compatibility
The -sc flag was used in older versions (0.94 and earlier) for sentence
capitalization. The Rust argument parser only accepts --sentencecap now.
This adds --sc as an alias to maintain backwards compatibility with
older documentation and user scripts.

Related to #1917

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 12:57:42 +01:00
Carlos Fernandez
ace6361bfb fix(rust): Fix armv7l build failure with 64-bit literal
The literal `0xcdcdcdcdcdcdcdcd` is a 64-bit value used as a "poison"
pattern to detect uninitialized pointers. On 32-bit systems like
armv7l, this causes a compile error because `usize` is only 32 bits.

The fix defines a platform-appropriate constant:
- 64-bit: 0xcdcdcdcdcdcdcdcd
- 32-bit: 0xcdcdcdcd

Fixes #1938

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 12:46:39 +01:00
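A platform-appropriate constant of the kind described in ace6361bfb can be selected with `cfg(target_pointer_width)`. The two literal values come from the commit; the names `POISON_PTR` and `is_poisoned` are hypothetical.

```rust
/// Poison pattern sized to the platform's pointer width.
#[cfg(target_pointer_width = "64")]
const POISON_PTR: usize = 0xcdcd_cdcd_cdcd_cdcd;

#[cfg(target_pointer_width = "32")]
const POISON_PTR: usize = 0xcdcd_cdcd;

/// Returns true if a raw pointer carries the poison pattern,
/// i.e. it was never initialized to a real address.
fn is_poisoned<T>(ptr: *const T) -> bool {
    ptr as usize == POISON_PTR
}
```

Only one of the two `const` items exists in any given build, so the 64-bit literal is never compiled on a target where it would not fit in `usize`.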
Carlos Fernandez
7041441d39 fix(rust): Fix 32-bit x86 (i686) build failure
The code was using `std::arch::x86_64::*` unconditionally for both
x86 and x86_64 architectures. On 32-bit x86 (i686), the correct
module is `std::arch::x86`, not `std::arch::x86_64`.

This caused a build failure on i686:
  error[E0432]: unresolved import `std::arch::x86_64`

The fix uses separate conditional imports:
- `std::arch::x86::*` for 32-bit x86
- `std::arch::x86_64::*` for 64-bit x86_64

Both modules provide the same SSE2 intrinsics used by find_next_zero().
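
As an illustration only, the conditional imports could look roughly like this. `first_zero` and `first_zero_sse2` are hypothetical stand-ins, not the actual find_next_zero() code, and the scalar fallback is an assumption:

```rust
#[cfg(target_arch = "x86")]
use std::arch::x86::*;
#[cfg(target_arch = "x86_64")]
use std::arch::x86_64::*;

/// Hypothetical helper: index of the first zero byte in `data`, if any.
fn first_zero(data: &[u8]) -> Option<usize> {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse2") {
            // SAFETY: SSE2 support was just checked at runtime.
            return unsafe { first_zero_sse2(data) };
        }
    }
    // Scalar fallback for other architectures or CPUs without SSE2.
    data.iter().position(|&b| b == 0)
}

#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse2")]
unsafe fn first_zero_sse2(data: &[u8]) -> Option<usize> {
    let zero = _mm_setzero_si128();
    let mut i = 0;
    // Compare 16 bytes at a time against zero.
    while i + 16 <= data.len() {
        // SAFETY: i + 16 <= data.len(), so the unaligned load stays in bounds.
        let chunk = unsafe { _mm_loadu_si128(data.as_ptr().add(i) as *const __m128i) };
        let mask = _mm_movemask_epi8(_mm_cmpeq_epi8(chunk, zero));
        if mask != 0 {
            return Some(i + mask.trailing_zeros() as usize);
        }
        i += 16;
    }
    // Fewer than 16 bytes remain: finish with a scalar scan.
    data[i..].iter().position(|&b| b == 0).map(|p| i + p)
}
```

The intrinsic names are identical in both modules, so only the `use` lines differ per architecture.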

Fixes #1937

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-31 12:42:12 +01:00
Rahul-2k4
1589c31774 fix: Revert credits text deep-copy to fix CI startcredits regressions 2025-12-31 15:23:55 +05:30
Rahul-2k4
c96d3ff3f1 fix(encoder): Deep copy start/end credits text to prevent memory corruption
The start_credits_text and end_credits_text pointers were being copied
directly from the encoder config options, but free_encoder_context()
would later free them. This caused memory corruption when the pointers
referred to memory owned by ccx_options.

Now these strings are deep-copied in init_encoder() so each encoder
context owns its own copy, fixing the --startcreditstext regression.
2025-12-31 14:18:29 +05:30
Rahul-2k4
598a48e260 style: Apply clang-format to pass CI formatting check 2025-12-31 12:45:56 +05:30
Rahul-2k4
0cc3626261 ci: Trigger workflow run 2025-12-31 12:18:27 +05:30
Rahul-2k4
e0e66bd0ba style: Apply clang-format and update CHANGES.TXT
- Run clang-format on all source files to fix CI formatting check
- Add Issue #447 DVB multi-stream feature to CHANGES.TXT
2025-12-31 12:08:56 +05:30
Rahul-2k4
2642ca8805 Merge upstream/master into final branch
Resolves conflicts while preserving Issue #447 fix for DVB multi-stream handling:
- Kept DVB metadata update logic in ts_tables.c for split mode
- Adapted to upstream's single-param dvbsub_init_decoder signature
- Updated lib_ccx.c and general_loop.c to match new API
2025-12-31 11:42:08 +05:30
Rahul-2k4
a108302dc0 fix(dvb): Reinitialize decoder after PAT change for continuous extraction
After PAT changes, the pipeline's decoder was NULLed out to prevent
crashes, but this caused all subsequent DVB data to be skipped.

Now the decoder is reinitialized when detected as NULL, allowing
subtitle extraction to continue across PAT changes.
2025-12-31 11:19:56 +05:30
Rahul-2k4
ce90b61923 fix(dvb): Add NULL checks to prevent crash after PAT change
Fixes segmentation fault at 99% when PAT changes occur during DVB
subtitle processing. The crash happened because decoder context
private_data was freed but still accessed.

Changes:
- Add NULL check in process_data() before dvbsub_decode call
- Add defensive NULL check at start of dvbsub_decode()
- Add defensive NULL check at start of write_dvb_sub()
- Deep copy DVB bitmap data in copy_subtitle() to avoid aliasing
- Safe DVBSubContext copy that doesn't alias linked list pointers
- Clean up pipeline decoder refs in dinit_cap() after PAT change
- Direct FTS calculation for DVB-only streams

Tested with 11GB TS file with 23 PAT changes - no crash.
2025-12-31 10:44:00 +05:30
Rahul-2k4
18566f2213 fix(dvb): Improve multi-stream DVB subtitle handling for Issue #447
- Replace spin-lock with proper mutex (CRITICAL_SECTION/pthread_mutex)
- Add per-pipeline OCR contexts for thread safety
- Include PID in output filenames to handle duplicate languages
- Add dvbsub_get_context_size() and dvbsub_copy_context() for state management
- Improve language code validation (ISO 639-2 compliant)
- Change fatal error to warning for oversized PES packets
- Better language lookup from potential_streams before cinfo fallback
- Reset potential_stream data in demuxer cleanup
2025-12-30 21:58:40 +05:30
Amrit Kumar Mahto
125c5e8821 Update ts_functions.c 2025-12-30 15:13:19 +05:30
Carlos Fernandez Sanz
64ce4ac84f fix(args): Add --svc alias for --service for backwards compatibility 2025-12-30 09:49:44 +01:00
Carlos Fernandez
674b859284 fix(args): Add --svc alias for --service for backwards compatibility
The help text references -svc for CEA-708 service selection, but the
Rust argument parser only accepted --service. This adds --svc as an
alias to maintain backwards compatibility with older documentation
and user scripts.

Fixes #1917

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-30 09:30:15 +01:00
Carlos Fernandez Sanz
9a761331f8 Merge pull request #1905 from VS7686/fix-networking-warnings
The fix looks correct - properly adding `return;` after Rust calls to prevent the C code from also executing, and using `(void)` to silence return value warnings.

Windows CI passes (which was the target for this MSVC fix). The Linux CI failure appears unrelated since networking code isn't typically part of the regression test suite.

Merging - thanks for the fix!
2025-12-30 09:02:52 +01:00
Carlos Fernandez Sanz
046ee71eda Merge pull request #1921 from ChubbyChipmunk77/simplify-and-document
Excellent work addressing the feedback! The separation of CC_SOLID_BLANK and PARITY_BIT_MASK makes the code much clearer - even though they have the same value, they serve different purposes and that's now well-documented.

The additional documentation for validate_cc_pair is very helpful for understanding the CEA-608/708 validation logic.

Merging - thanks for the thorough fix!
2025-12-30 08:51:30 +01:00
Carlos Fernandez Sanz
b5fc3e63c4 Merge pull request #1924 from Harshdhall01/cleanup-vcl-hrd-todo
Looks good! The explanation is clearer and removing the dead code (commented exit) is a nice cleanup. Tests pass.

Merging - thanks!
2025-12-30 08:49:18 +01:00
VS7686
5eaf805d27 Add missing returns after Rust calls to prevent fallthrough 2025-12-30 09:20:59 +05:30
Amrit kumar Mahto
0ba941e8c0 ts: prevent heap buffer overflow in Teletext demux path 2025-12-30 07:13:04 +05:30
Carlos Fernandez Sanz
a9413a2312 fix(dvb): Enable OCR for all DVB subtitle streams, not just first 2025-12-29 23:09:18 +01:00
Carlos Fernandez Sanz
a2eb03cb73 docs: Add Windows package manager installation instructions 2025-12-29 23:04:41 +01:00
Carlos Fernandez
06063f26a4 docs: Add Windows package manager installation instructions
Add instructions for installing CCExtractor via:
- WinGet (winget install CCExtractor.CCExtractor)
- Chocolatey (choco install ccextractor)
- Scoop (scoop bucket add extras && scoop install ccextractor)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 22:56:45 +01:00
Carlos Fernandez Sanz
82daa7fb2b fix: Properly handle ATSC CC in private MPEG-2 streams 2025-12-29 22:55:00 +01:00
Carlos Fernandez
a71687e19f fix(dvb): Enable OCR for all DVB subtitle streams, not just first
Previously, the `initialized_ocr` flag was stored at the program level
and shared across all DVB subtitle streams within a program. This caused
OCR to only initialize for the first DVB stream, leaving subsequent
streams without an OCR context and unable to extract subtitles.

The fix removes the `initialized_ocr` flag entirely. Each DVB subtitle
decoder now gets its own OCR context, matching the behavior of DVD and
VOBSUB decoders which already worked correctly with multiple streams.

Test results with multi-language DVB sample:
- Before: Second stream (0xCE0) → "No captions were found"
- After: Second stream (0xCE0) → 5 subtitles extracted correctly

Fixes #1067

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 21:26:56 +01:00
Carlos Fernandez
25162fe40a chore: Add build directories to .gitignore
Add build_*/ pattern and linux/build_scan/ to ignore various build
output directories (build_ocr/, build_ocr_asan/, etc.)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 21:11:51 +01:00
Carlos Fernandez
3365a715a6 fix: Properly handle ATSC CC in private MPEG-2 streams
This commit fixes two issues:

1. ATSC CC data in private MPEG-2 streams (stream type 0x06) was not
   being processed. The code returned CCX_PRIVATE_MPEG2_CC buffer type
   which was never properly implemented - it just dumped debug output
   and returned placeholder bytes.

   Fix: Treat ATSC CC in private MPEG-2 streams the same as in
   user-private streams (0x80-0x8F) by returning CCX_PES buffer type.
   Both contain the same CC data format and should use the same
   processing path.

2. Several dump() calls were using CCX_DMT_GENERIC_NOTICES which is
   enabled by default, causing binary output to flood the terminal
   when processing certain files.

   Fix: Changed to appropriate debug-only masks (CCX_DMT_VERBOSE,
   CCX_DMT_PARSE) so binary dumps only appear when debug mode is
   explicitly enabled.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 21:10:11 +01:00
Carlos Fernandez Sanz
26e0f64720 fix(windows): Configure MSI as 64-bit installer 2025-12-29 20:25:41 +01:00
Carlos Fernandez
a1ed940c8b fix(build): Use -arch x64 flag for WiX build instead of Package attribute
The Platform attribute is not valid in WiX v4+. Instead, specify the
target architecture at build time using the -arch x64 flag.

Changes:
- Remove invalid Platform="x64" attribute from Package element
- Add -arch x64 to wix build command in release workflow
- Keep ProgramFiles64Folder for explicit 64-bit installation path

This ensures the MSI is built as a proper 64-bit package that installs
to "Program Files" instead of "Program Files (x86)".

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 19:03:04 +01:00
ChubbyChipmunk77
f5f4768503 style: fix doc comment formatting for Clippy 2025-12-29 22:01:07 +05:30
Carlos Fernandez
e4374204bd fix(windows): Configure MSI as 64-bit installer
Add Platform="x64" to the WiX Package element and use ProgramFiles64Folder
instead of ProgramFiles6432Folder to ensure the MSI:
- Is recognized as a 64-bit installer by tools like winget/komac
- Installs to "Program Files" instead of "Program Files (x86)"

This fixes winget manifest detection issues where the installer was
incorrectly identified as x86 architecture.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 17:05:51 +01:00
ChubbyChipmunk77
7f55ae5c1d Fixed semantic naming and updated doc comments 2025-12-29 21:30:40 +05:30
Harshdhall01
8bf1bc16de Remove blank line to fix formatting check 2025-12-29 21:14:35 +05:30
Harshdhall01
5352a8b877 Fix formatting: use consistent tab indentation and remove trailing whitespace
- Line 908: Changed spaces+tabs to consistent tabs only
- Line 911: Removed trailing tabs on empty line
2025-12-29 21:05:17 +05:30
Carlos Fernandez Sanz
fd155285d2 0.96.3 2025-12-29 14:56:33 +01:00
Carlos Fernandez
a6fd8d468a chore: Bump version to 0.96.3
Update version number across all files:
- src/lib_ccx/lib_ccx.h (main version define)
- linux/configure.ac, mac/configure.ac (autoconf)
- OpenBSD/Makefile
- package_creators/ (PKGBUILD, ccextractor.spec, debian.sh)
- packaging/winget/ (all yaml manifests)
- packaging/chocolatey/ (nuspec and install script)

Note: Checksums in winget/chocolatey will need to be updated
when the actual release MSI is built.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 14:11:23 +01:00
Carlos Fernandez
5b05ce5073 docs: Add changelog entries for version 0.96.3
Document all changes since 0.96.2 including:
- VOBSUB subtitle extraction for MP4 and MKV files
- Native SCC input file support
- SCC output improvements (frame rate, styled PAC codes)
- Various bug fixes for timing, builds, and OCR

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 13:28:24 +01:00
Carlos Fernandez
d28bc4e114 style: Fix formatting issues in ocr.c and options.rs
- Use tabs for continuation indentation in C code (clang-format)
- Remove extra trailing spaces in Rust code (rustfmt)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 12:39:08 +01:00
Carlos Fernandez Sanz
285e81f9a7 Merge pull request #1898 from hridyasadanand/docs-remove-travis-badge
Good cleanup - removing the outdated Travis CI badge and adding a usage example helps new users. Merging.
2025-12-29 12:23:58 +01:00
Carlos Fernandez Sanz
730156f33b Merge pull request #1914 from VS7686/fix-epg-warnings
Clean fix for unused variable warnings. Verified locally. Merging.
2025-12-29 11:49:37 +01:00
Carlos Fernandez Sanz
152bbd308c Merge pull request #1922 from x15sr71/fix/utf8proc-include-path
Excellent fix! The `__has_include()` approach is clean and removes the symlink workaround.

Verified locally:
- Normal build: OK
- `-system-libs` build: OK

Merging.
2025-12-29 11:44:48 +01:00
Carlos Fernandez
8c586bccbd feat(ocr): Add character blacklist and line-split options for better accuracy
Add two new OCR options to improve subtitle recognition:

1. Character blacklist (enabled by default):
   - Blacklists characters |, \, `, _, ~ that are commonly misrecognized
   - Prevents "I" being recognized as "|" (pipe character)
   - Use --no-ocr-blacklist to disable if needed

2. Line-split mode (opt-in via --ocr-line-split):
   - Splits multi-line subtitle images into individual lines
   - Uses PSM 7 (single text line mode) for each line
   - Adds 10px padding around each line for better edge recognition
   - May improve accuracy for some VOBSUB subtitles

Test results with VOBSUB sample:
- Blacklist: Reduces pipe errors from 14 to 0
- Matches subtile-ocr's approach for preventing misrecognition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 11:33:29 +01:00
Carlos Fernandez Sanz
434cd3959a fix(mp4): Use fixed-width integer types in bswap functions 2025-12-29 11:13:38 +01:00
Harshdhall01
3cb0f61b0c Clean up VCL HRD TODO comment
Replace unclear TODO with explanation of why VCL HRD parameters
are skipped. VCL HRD is for video buffering compliance and not
needed for caption extraction.

Changes:
- Replace TODO comment with clear explanation
- Update mprint message to be more informative
- Remove commented-out exit(1)

Addresses #1894
2025-12-29 15:01:40 +05:30
Chandragupt Singh
a18eaa2c96 fix: utf8proc include path for system library builds 2025-12-29 13:37:39 +05:30
Carlos Fernandez
69b7f9f4c3 fix(mp4): Use fixed-width integer types in bswap functions
Change bswap16 and bswap32 to use int16_t and int32_t instead of
short and long for consistent behavior across platforms.

On Windows x64, `long` is 4 bytes (LLP64 model), while on Linux x64
`long` is 8 bytes (LP64 model). This difference could cause
inconsistent NAL unit length parsing in MP4/MOV files, potentially
affecting timestamp calculations.

This fix ensures the byte-swapping functions work identically on
both platforms by using fixed-width integer types from <stdint.h>.
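
The same idea expressed in Rust terms, as a rough sketch rather than the C code from this commit: fixed-width integers make the swap independent of the platform's `long`. The sketch uses unsigned types for simplicity, and `nal_length` is a hypothetical helper assuming a 4-byte big-endian length prefix:

```rust
/// Swap the two bytes of a 16-bit value.
fn bswap16(v: u16) -> u16 {
    v.swap_bytes()
}

/// Swap the four bytes of a 32-bit value.
fn bswap32(v: u32) -> u32 {
    v.swap_bytes()
}

/// Hypothetical helper: decode a 4-byte big-endian NAL unit length prefix.
fn nal_length(prefix: [u8; 4]) -> u32 {
    u32::from_be_bytes(prefix)
}
```

Because `u16` and `u32` have the same size everywhere, the result does not depend on LLP64 versus LP64.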

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-29 08:52:33 +01:00
Carlos Fernandez Sanz
63dde6f3b2 feat(mp4): Add VOBSUB subtitle extraction with OCR for MP4 files 2025-12-29 08:47:33 +01:00
Carlos Fernandez
8f64eeb54f ci: Trigger CI tests
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 19:57:11 +01:00
ChubbyChipmunk77
02d91c4a03 REFACTOR: 1. Simplified verify_parity function. 2. Improved documentation for public function validate_cc_pair. 3. Added constant for 0x7F. 2025-12-29 00:00:38 +05:30
Carlos Fernandez
463a4a85a1 build(windows): Add vobsub_decoder to Windows build
Add vobsub_decoder.c and vobsub_decoder.h to the Visual Studio project
and filters files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 18:44:32 +01:00
Carlos Fernandez
ba2833b819 style: Fix clang-format indentation in vobsub_decoder.c
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:49:34 +01:00
Carlos Fernandez
635a305c37 build: Add vobsub_decoder to autoconf build system
Add vobsub_decoder.c and vobsub_decoder.h to linux and mac Makefile.am
to fix autoconf build failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:42:08 +01:00
Carlos Fernandez
6fe612db3e fix: Guard ocr_text access with ENABLE_OCR preprocessor check
The ocr_text field in struct cc_bitmap is only defined when ENABLE_OCR
is set. Wrap the free() calls with #ifdef ENABLE_OCR to fix build
failures in non-OCR configurations.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:37:05 +01:00
Carlos Fernandez
2930c61420 feat(mp4): Add VOBSUB subtitle extraction with OCR for MP4 files
Add support for extracting VOBSUB (bitmap) subtitles from MP4 files
and converting them to text formats via OCR. This complements the
existing MKV VOBSUB support added in commit 1fccb783.

Changes:
- Add shared vobsub_decoder module for SPU parsing and OCR
- Add process_vobsub_track() function in mp4.c for subp:MPEG tracks
- Detect and count VOBSUB tracks in MP4 container
- Extract palette from decoder config when available
- Process SPU samples through OCR pipeline

The VOBSUB decoder module provides:
- SPU control sequence parsing (timing, colors, coordinates)
- RLE-encoded bitmap decoding (interlaced format)
- Palette parsing from idx header format
- Integration with Tesseract OCR via ocr_rect()

Tested with sample from issue #1349 - successfully extracted 61
subtitles from 128 SPU samples with accurate OCR text output.

Fixes #1349

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 17:32:24 +01:00
Carlos Fernandez Sanz
173db88dcf feat(matroska): Add VOBSUB subtitle extraction support for MKV files 2025-12-28 14:28:02 +01:00
VS7686
29c3f4e684 Trigger CI re-run 2 2025-12-28 18:04:30 +05:30
VS7686
d4a7b1d6ed Trigger CI re-run 2025-12-28 16:05:22 +05:30
Carlos Fernandez
9d14766b0d fix: Use #define instead of const int for VOBSUB_BLOCK_SIZE
MSVC doesn't support variable-length arrays (VLAs). In C, a const int
is not an integer constant expression, so arrays sized with it are VLAs,
causing Windows build failure with errors C2057, C2466, C2133.

Changed to #define which is a true compile-time constant.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 11:32:48 +01:00
Carlos Fernandez
6f2a73d706 docs: Add VOBSUB extraction documentation and subtile-ocr Dockerfile
- Add docs/VOBSUB.md explaining the VOBSUB extraction workflow
- Add tools/vobsubocr/Dockerfile for building subtile-ocr OCR tool
- Document how to convert VOBSUB (.idx/.sub) to SRT using OCR

The Dockerfile uses subtile-ocr (https://github.com/gwen-lg/subtile-ocr),
an actively maintained fork of vobsubocr with better accuracy.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 10:26:41 +01:00
Carlos Fernandez
1fccb783f2 feat(matroska): Add VOBSUB subtitle extraction support for MKV files
Previously, CCExtractor would only print "Error: VOBSUB not supported"
when encountering VOBSUB (S_VOBSUB) subtitle tracks in Matroska files.
This left users without any usable output.

This commit adds full VOBSUB extraction support:
- Generate proper .idx index files with timestamps and file positions
- Generate proper .sub files with PS-wrapped SPU data
- Correct PS Pack header with SCR derived from timestamps
- Correct PES header with PTS for each subtitle
- 2048-byte block alignment (standard VOBSUB format)

The output is compatible with VLC, FFmpeg, and other players that
support VobSub subtitle format.

Tested with sample from issue #1371 - output validates correctly
with FFprobe and produces identical subtitle data to mkvextract.

Fixes #1371

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 10:02:19 +01:00
Carlos Fernandez Sanz
ec30a79be9 fix(mp4): Fix 200ms timing offset for MOV/MP4 caption extraction 2025-12-28 09:37:46 +01:00
Carlos Fernandez Sanz
5beb4389f6 fix: Apply --delay option to DVB/bitmap subtitles 2025-12-28 09:36:59 +01:00
Carlos Fernandez
a6ccf29630 fix: Apply --delay option to DVB/bitmap subtitles
The --delay option was not being applied to DVB and other bitmap-based
subtitles (DVD subtitles, etc.), only to CEA-608 subtitles. This made
it impossible for users to correct timing offsets in DVB subtitle
extraction.

Changes:
- Add subs_delay to sub->start_time and sub->end_time for CC_BITMAP
  subtitles in encode_sub(), matching the behavior for CC_608
- Add bounds checking to skip subtitles whose timestamps become
  negative after applying a negative delay
- Properly free bitmap data when skipping to avoid memory leaks
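
The delay and bounds check can be sketched as follows. This is not the encode_sub() code: the `Timing` type and `apply_delay` are hypothetical, and treating a subtitle as skipped when either timestamp goes negative is an assumption:

```rust
/// Hypothetical stand-in for a subtitle's start/end times in milliseconds.
#[derive(Debug, PartialEq)]
struct Timing {
    start_ms: i64,
    end_ms: i64,
}

/// Shift both timestamps by `subs_delay_ms`; return None when the subtitle
/// should be skipped because a timestamp would become negative.
fn apply_delay(t: Timing, subs_delay_ms: i64) -> Option<Timing> {
    let start_ms = t.start_ms + subs_delay_ms;
    let end_ms = t.end_ms + subs_delay_ms;
    if start_ms < 0 || end_ms < 0 {
        return None;
    }
    Some(Timing { start_ms, end_ms })
}
```

In the C code the skipped case also has to free the bitmap data; returning `None` here simply drops the value.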

This provides a workaround for issue #1248 where DVB subtitles were
extracted with incorrect timing offset. Users can now use --delay to
adjust the timing.

Fixes #1248

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-28 07:58:58 +01:00
Carlos Fernandez Sanz
b6d7c7e778 feat(scc): Add configurable frame rate and styled PAC codes for SCC output 2025-12-28 06:54:29 +01:00
Rahul-2k4
117c2fce69 fix(dvb): Apply 3 code review fixes for Issue #447
- Fix escaped newline in debug print (dvb_subtitle_decoder.c:1861)
- Replace hardcoded PID 0x106 with 0 in debug calls (lines 1822, 1835)
- Accept uppercase letters in language code validation (ts_tables.c:396)
2025-12-28 11:06:31 +05:30
Rahul-2k4
ffd6a34c30 Fix Windows CI: change PlatformToolset from v145 to v143 for VS 2022 2025-12-28 10:34:46 +05:30
Rahul-2k4
70af627078 Fix syntax errors in lib_ccx.c: add missing ocr.h include and fix brace structure 2025-12-28 10:32:08 +05:30
Rahul-2k4
b0a5c069ed style: fix clang-format issues for Linux CI compatibility 2025-12-28 10:22:44 +05:30
Rahul-2k4
53ee63894c style: apply clang-format to fix CI formatting check 2025-12-28 10:12:40 +05:30
Rahul-2k4
50ece42e0a style: apply clang-format and normalize line endings to all source files 2025-12-28 00:47:25 +05:30
Rahul-2k4
3d00e718f6 style: normalize line endings and apply clang-format 2025-12-28 00:26:17 +05:30
Carlos Fernandez
021b788461 feat(scc): Add configurable frame rate and styled PAC codes for SCC output
This commit addresses the remaining items from issue #1191:

1. SCC Output Frame Rate:
   - Added scc_framerate to encoder_cfg and encoder_ctx structs
   - The --scc-framerate option now affects both input parsing AND output
   - Supports 24, 25, 29.97 (default), and 30 fps

2. Styled PAC (Preamble Address Code) Optimization:
   - Added support for styled PACs that encode color/font at column 0
   - When captions start at column 0 with non-default style, uses a single
     styled PAC instead of indent PAC + mid-row code
   - More efficient output that matches professional SCC files

Files changed:
- ccx_common_option.h/c: Added scc_framerate to encoder_cfg
- ccx_encoders_common.h/c: Added scc_framerate to encoder_ctx
- ccx_encoders_scc.c: Added get_scc_fps(), styled PAC functions,
  and optimized write_cc_buffer_as_scenarist()
- common.rs: Copy scc_framerate to enc_cfg

Fixes #1191

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 19:45:05 +01:00
Rahul-2k4
86e5d47141 style: apply clang-format to all source files 2025-12-28 00:14:16 +05:30
Rahul-2k4
5b36356456 style: apply clang-format fixes 2025-12-28 00:04:26 +05:30
Rahul-2k4
ba04aedae1 fix: add missing set_pipeline_pts and dump_rect_and_log functions 2025-12-27 23:58:26 +05:30
Rahul-2k4
5001df0d6c fix(rust): add missing lang field to cap_info initializer 2025-12-27 23:56:26 +05:30
Rahul-2k4
28506fee7b Add lang member to struct cap_info for DVB split mode 2025-12-27 23:49:29 +05:30
Rahul-2k4
47d8aaddb9 Merge upstream/master into final: Resolve conflicts in option structs (kept both split_dvb_subs and scc_framerate) 2025-12-27 23:34:40 +05:30
Rahul-2k4
1b2254f911 Fix DVB split output: include core logic handling and memory safety fixes 2025-12-27 23:27:36 +05:30
Rahul-2k4
dc34b26afb Fix DVB split output: handle empty PBUS and missing OCR init (Issue #447) 2025-12-27 23:21:08 +05:30
Carlos Fernandez
c06102678e fix(mp4): Fix 200ms timing offset for MOV/MP4 caption extraction
Set in_bufferdatatype for MP4/MOV container tracks to prevent incorrect
cb_field counter increments that were adding ~200ms to caption timestamps.

Root Cause:
-----------
The in_bufferdatatype variable was never set in mp4.c, remaining as
CCX_UNKNOWN. This caused the check in do_cb() (ccx_decoders_common.c)
to fail:

  if (ctx->in_bufferdatatype != CCX_H264 && ctx->in_bufferdatatype != CCX_PES)
      cb_field1++;

With in_bufferdatatype == CCX_UNKNOWN, cb_field1 was incremented for
each CEA-608 caption block processed. When get_fts() was called to
timestamp captions, it added cb_field1 * 1001/30 ms to the base time.

With ~6 caption blocks per frame (typical for roll-up captions), this
added approximately 200ms (6 × 33.37ms ≈ 200ms) to caption start times.

Analysis:
---------
Sample file: 1974a299f0502fc8199dabcaadb20e422e79df45972e554d58d1d025ef7d0686.mov

Before fix:
- FFmpeg first caption: 13,847ms
- CCExtractor first caption: 14,047ms
- Offset: 200ms late

The timing flow:
1. MP4 sample has PTS=1246245 (13,847ms at 90kHz)
2. set_fts() correctly sets fts_now based on PTS
3. do_cb() processes caption blocks, incrementing cb_field1 each time
4. get_fts() returns: fts_now + fts_global + cb_field1 * 1001/30
5. With cb_field1=6: adds 6 * 33.37 = 200ms offset

The fix ensures cb_field counters are not incremented for container
formats (MP4, MOV, MKV) because these formats associate all caption
data with the frame's PTS directly - there's no sub-frame timing.
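
The arithmetic in the timing flow above can be reproduced with a short sketch. It assumes integer millisecond math, fts_now equal to the PTS converted at 90 kHz, and fts_global of 0; the function names are hypothetical:

```rust
/// Convert a 90 kHz PTS value to whole milliseconds.
fn pts_to_ms(pts: i64) -> i64 {
    pts / 90
}

/// Offset contributed by the field counter: cb_field1 * 1001/30 ms.
fn cb_field_offset_ms(cb_field1: i64) -> i64 {
    cb_field1 * 1001 / 30
}

/// Simplified model of get_fts(): fts_now + fts_global + field offset.
fn caption_time_ms(pts: i64, fts_global: i64, cb_field1: i64) -> i64 {
    pts_to_ms(pts) + fts_global + cb_field_offset_ms(cb_field1)
}
```

With PTS=1246245, a counter of 6 gives 14,047ms and a counter of 0 gives 13,847ms, matching the before and after figures.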

Fix:
----
Set in_bufferdatatype in the three MP4 track processing functions:
- process_avc_track(): CCX_H264 for H.264/AVC tracks
- process_hevc_track(): CCX_H264 for H.265/HEVC tracks
- process_xdvb_track(): CCX_PES for MPEG-2 video tracks

After fix:
- FFmpeg first caption: 13,847ms
- CCExtractor first caption: 13,847ms
- Offset: 0ms (exact match)

This fix resolves timing issues for tests 226-230 on the sample platform.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 16:34:05 +01:00
Carlos Fernandez Sanz
b0800a112c feat(input): Add native SCC (Scenarist Closed Caption) input support 2025-12-27 16:16:31 +01:00
Carlos Fernandez
2b0d9ed427 chore: trigger CI rebuild
Timing issues in tests 226-230 are pre-existing and unrelated to SCC support.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 15:37:49 +01:00
Carlos Fernandez
fd4db0e7bf chore: Trigger CI re-run 2025-12-27 11:18:02 +01:00
VS7686
00d8c9cb0a Fix unused variable warnings in ts_tables_epg.c 2025-12-27 14:01:13 +05:30
Carlos Fernandez
7829c14c60 fix: Initialize scc_framerate in init_options()
The scc_framerate field was not being initialized in the C init_options()
function, leaving it with an undefined value. This could cause undefined
behavior when the options struct is used before the Rust code initializes
the field.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 08:38:32 +01:00
Rahul-2k4
d3602ec938 Fix: Defensive handling of invalid caption_field in DVB subtitle timing (fixes #447) 2025-12-27 12:48:28 +05:30
Rahul-2k4
f9b5e081a7 Remove duplicate comment in parser.rs 2025-12-27 11:46:24 +05:30
Rahul-2k4
bdc3eaa81b Fix: update Rust parser to allow text based formats for DVB split 2025-12-27 10:16:36 +05:30
Carlos Fernandez
2820042c1d style: Fix formatting and clippy warnings
- Replace tabs with spaces in doc comments
- Use #[derive(Default)] with #[default] attribute
- Use array syntax for char pattern matching
- Apply clang-format to C files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 01:19:00 +01:00
Carlos Fernandez
d4d228125a feat(input): Add native SCC (Scenarist Closed Caption) input support
Add native support for reading SCC files directly, eliminating the need
for external conversion tools like SCC2RAW.exe or Perl scripts.

Implementation:
- New Rust parser module (src/rust/src/demuxer/scc.rs) with:
  - SMPTE timecode parsing (HH:MM:SS:FF format)
  - Configurable frame rates: 29.97 (default), 24, 25, 30 fps
  - CEA-608 hex pair extraction
  - UTF-8 BOM handling
  - 12 comprehensive unit tests
- Stream mode detection in both C and Rust code
- FFI exports for C integration (ccxr_is_scc_file, ccxr_process_scc)
- New --scc-framerate command line option
- Integration in raw_loop() following the McPoodle DVD raw pattern
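
For illustration, timecode parsing along these lines might look like the sketch below. It is not the code in scc.rs: `timecode_to_ms` is a hypothetical name, both `:` and `;` are accepted as separators, and no drop-frame compensation is attempted:

```rust
/// Parse "HH:MM:SS:FF" (or "HH:MM:SS;FF") into milliseconds at `fps`.
/// Returns None on malformed input or out-of-range fields.
fn timecode_to_ms(tc: &str, fps: f64) -> Option<u64> {
    let parts: Vec<&str> = tc.trim().split([':', ';']).collect();
    if parts.len() != 4 || fps <= 0.0 {
        return None;
    }
    let mut nums = [0u64; 4];
    for (slot, part) in nums.iter_mut().zip(&parts) {
        *slot = part.parse().ok()?;
    }
    let [h, m, s, f] = nums;
    let frames = f as f64;
    // Hours are limited to two digits; the frame number must fit in one second.
    if h > 99 || m >= 60 || s >= 60 || frames >= fps.ceil() {
        return None;
    }
    let whole_ms = (h * 3600 + m * 60 + s) * 1000;
    let frame_ms = (frames * 1000.0 / fps).round() as u64;
    Some(whole_ms + frame_ms)
}
```

Passing the frame rate as a parameter mirrors the --scc-framerate option: the same timecode maps to different times at 24, 25, 29.97, and 30 fps.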

Testing performed:
- Round-trip test: video → SRT, video → SCC, SCC → SRT
  Result: 118/118 captions matched (100% accuracy)
- Multiple output formats verified (SRT, WebVTT, transcript)
- Frame rate option tested with 24fps sample
- UTF-8 BOM handling verified
- All 260 Rust tests pass

Usage:
  ccextractor input.scc -o output.srt
  ccextractor input.scc --scc-framerate 25 -o output.srt

Closes #1293

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-27 00:54:44 +01:00
Rahul-2k4
43d5ba2f34 Improve error message for incompatible OutputFormat in Rust parser 2025-12-27 02:03:51 +05:30
Rahul-2k4
557774b202 Apply code style fixes from clang-format 2025-12-27 01:59:48 +05:30
Rahul-2k4
4e0472bddf Fix DVB split critical bugs: per-pipeline state separation and timing sync 2025-12-27 01:56:12 +05:30
Rahul-2k4
9a2fe6221e Switch platform toolset from v145 to v143 for GitHub Actions compatibility 2025-12-27 01:12:40 +05:30
Rahul Tripathi
182b23a283 Merge branch 'CCExtractor:master' into final 2025-12-27 00:13:39 +05:30
Rahul-2k4
77f3fd35f4 Fix #447: Resolve DVB split mode crash and routing logic
- Fixed NULL pointer dereference in dvb_subtitle_decoder.c (sub->prev check).
- Corrected logic in dvbsub_handle_display_segment to prevent dropped subtitles.
- Implemented robust encoder context swapping in general_loop.c for DVB streams.
- Added regression test: tests/regression/dvb_split.txt.
- Verified 100% completion in split mode and correct Teletext/DVB routing.
2025-12-27 00:11:09 +05:30
Carlos Fernandez Sanz
14e6919f2e ci: Add winget and Chocolatey packaging workflows 2025-12-26 18:20:55 +01:00
Carlos Fernandez
353a37010d ci: Add winget and Chocolatey packaging workflows
Add automated package publishing for Windows package managers:

## Winget
- Initial manifest files for CCExtractor.CCExtractor
- Workflow to auto-submit PRs to microsoft/winget-pkgs on release

## Chocolatey
- Package files (nuspec, install/uninstall scripts)
- Workflow to build and push packages on release

## Setup Required
- WINGET_TOKEN secret (GitHub PAT with public_repo scope)
- CHOCOLATEY_API_KEY secret (from chocolatey.org account)

Closes #1308

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 18:19:11 +01:00
Carlos Fernandez Sanz
921cbe0c57 ci(linux): Add workflow for system-libs builds 2025-12-26 18:08:11 +01:00
VS7686
f0523ceaa3 Fix logic error: removed early returns to restore C implementation 2025-12-26 21:44:12 +05:30
Carlos Fernandez
7284430fc6 fix(build): Preserve FFmpeg libs with -system-libs -hardsubx
The -system-libs mode was overwriting BLD_LINKER and losing the FFmpeg
libraries that -hardsubx adds. This fix preserves the FFmpeg libraries
when both flags are used together.

Also add permissions: contents: write to the workflow to allow
uploading assets to releases.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 15:49:51 +01:00
Carlos Fernandez
68d0d4094e ci(linux): Add workflow for system-libs builds
Add a new GitHub Actions workflow that builds CCExtractor using the
-system-libs flag, creating binaries that dynamically link against
system libraries instead of bundling dependencies.

This is useful for:
- Linux distribution packaging (Debian, Ubuntu, Fedora, etc.)
- Homebrew/Linuxbrew packaging
- Users who prefer smaller binaries with system library updates

Two variants are built:
- basic: Standard OCR-enabled build
- hardsubx: Build with HardSubX (burned-in subtitle extraction)

The workflow runs on releases and can be manually triggered.

Related to #1907

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 15:38:41 +01:00
Carlos Fernandez Sanz
7075f6291d build(linux): Add -system-libs flag for package manager compatibility 2025-12-26 15:32:32 +01:00
Carlos Fernandez Sanz
170d769476 Merge branch 'master' into build/linux-system-libs-flag 2025-12-26 15:31:31 +01:00
Carlos Fernandez
1ff3457744 Updated CHANGES.TXT for 0.96.2 2025-12-26 15:27:02 +01:00
Carlos Fernandez Sanz
dc352a2202 fix(windows): Bundle tessdata for OCR support out of the box 2025-12-26 15:23:34 +01:00
Chandragupt Singh
c8750e42d1 build(linux): use pkg-config cflags for system-libs includes 2025-12-26 18:51:16 +05:30
Carlos Fernandez
20448bfeb2 fix(windows): Bundle tessdata for OCR support out of the box
The Windows release was missing Tesseract OCR runtime dependencies
(tessdata files) needed for the HardSubx feature to work. Users had
to manually install Tesseract OCR and set TESSDATA_PREFIX.

Changes:
- Add get_executable_directory() to ocr.c that returns the directory
  containing the executable (works on Windows, Linux, and macOS)
- Update probe_tessdata_location() to search for tessdata in the
  executable directory, enabling bundled tessdata to be found
- Update release workflow to download eng.traineddata and osd.traineddata
  from tesseract-ocr/tessdata_fast during release builds
- Update WiX installer to include tessdata directory with the
  traineddata files

Now the Windows release includes tessdata files, and CCExtractor will
automatically find them in the installation directory without requiring
users to install Tesseract separately or set environment variables.

Fixes #1578

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-26 13:05:46 +01:00
VS7686
807df0339e Fix styling: Apply clang-format 2025-12-26 16:11:57 +05:30
Rahul-2k4
6642973c63 CLI + option plumbing for --split-dvb-subs 2025-12-26 14:43:36 +05:30
Chandragupt Singh
f08fd658e6 build(linux): Add -system-libs flag for Homebrew compatibility 2025-12-26 13:07:09 +05:30
VS7686
5ae3116a6c Fix indentation: reduce to 4 spaces 2025-12-26 10:29:36 +05:30
VS7686
826afcd991 Fix styling: increase indentation inside ifndef 2025-12-26 10:14:18 +05:30
VS7686
46af5ce9bb Fix coding style and formatting 2025-12-26 09:59:38 +05:30
VS7686
123b35ae69 Fix coding style and formatting 2025-12-26 09:49:17 +05:30
Carlos Fernandez Sanz
f6e9d55838 fix(release): Update Flutter GUI files and add versioned filenames 2025-12-25 22:34:24 +01:00
VS7686
6f7d3f6169 Fix C4098 warnings in networking.c 2025-12-26 00:26:11 +05:30
Carlos Fernandez
07cc78c2f1 feat(release): Add version numbers to release asset filenames 2025-12-25 16:36:18 +01:00
Carlos Fernandez
affa34848c fix(installer): Update Flutter GUI files for v0.7.0 2025-12-25 13:47:57 +01:00
Carlos Fernandez Sanz
45ee03aecc fix(release): Support 3-part version numbers (e.g., v0.96.1) 2025-12-25 12:58:04 +01:00
Carlos Fernandez
c6e27ca809 fix(release): Support 3-part version numbers (e.g., v0.96.1)
Update the version extraction logic in the release workflow to properly
handle 3-part semantic versions like v0.96.1 in addition to existing
2-part versions like v0.96.

MSI installers require 4-part versions (major.minor.build.revision):
- v0.96 → 0.96.0.0 (unchanged behavior)
- v0.96.1 → 0.96.1.0 (new support)
- v0.96.1.2 → 0.96.1.2 (passthrough)
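
As a rough illustration of the mapping above (not the workflow's actual script), the padding can be sketched in Python. The helper name `msi_version` is hypothetical, and the sketch assumes tags have an optional leading `v` followed by two to four numeric fields.

```python
def msi_version(tag: str) -> str:
    """Pad a release tag such as 'v0.96.1' to a four-part MSI version."""
    bare = tag[1:] if tag.startswith("v") else tag
    parts = bare.split(".")
    if not 2 <= len(parts) <= 4 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unsupported version tag: {tag!r}")
    # Append "0" fields until there are exactly four parts.
    return ".".join(parts + ["0"] * (4 - len(parts)))


for tag in ("v0.96", "v0.96.1", "v0.96.1.2"):
    print(tag, "->", msi_version(tag))
```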

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-25 12:56:13 +01:00
Carlos Fernandez Sanz
a8f25ce25e fix(installer): Fix Windows MSI installer for WiX v6 2025-12-25 11:53:45 +01:00
Carlos Fernandez Sanz
2781a7f7d6 docs(mac): Add documentation for -system-libs build mode 2025-12-25 11:00:47 +01:00
Carlos Fernandez
903ccc1442 chore: trigger CI rerun 2025-12-25 09:59:16 +01:00
Hridya
857a3bc9c6 docs: add basic usage example to documentation 2025-12-25 13:53:15 +05:30
Hridya
c2c589d6f6 docs: remove outdated Travis CI badge from README 2025-12-25 12:44:02 +05:30
GAURAV KARMAKAR
941604b33c docs(mac): Add documentation for -system-libs build mode 2025-12-25 02:15:02 +05:30
Carlos Fernandez
1950f096b6 fix(workflow): Extract only numeric version for MSI
MSI version numbers must be numeric (major.minor.build format).
Strip everything after the first dash from tag names to get valid
version numbers (e.g., v1.08-test becomes 1.08).
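
A minimal Python sketch of the stripping rule described above; the function name `numeric_version` is an assumption made for illustration and is not taken from the workflow file.

```python
def numeric_version(tag: str) -> str:
    """Return the numeric part of a tag, e.g. 'v1.08-test' gives '1.08'."""
    # Drop an optional leading 'v'.
    bare = tag[1:] if tag.startswith("v") else tag
    # Keep only the text before the first dash; tags without a dash pass through.
    return bare.split("-", 1)[0]


print(numeric_version("v1.08-test"))
```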

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 20:05:20 +01:00
Carlos Fernandez
1fc5ec00d4 fix(installer): Use correct WiX v4+ attribute name 'Scope' not 'InstallScope'
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 19:03:53 +01:00
Carlos Fernandez
c0deae4b0c fix(installer): Add InstallScope=perMachine and update InstallerVersion
- Set InstallScope="perMachine" to ensure proper admin-level registry access
- Bump InstallerVersion from 200 to 500 (Windows Installer 5.0)

This should fix the "Could not write key VersionMinor to Product" error.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 18:20:18 +01:00
Carlos Fernandez
84692b5658 fix(installer): Disable path validation to avoid local drive errors
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 17:23:09 +01:00
Carlos Fernandez
4a51ad114e fix(installer): Use custom UI without license dialog
Instead of trying to override WixUI_InstallDir, create a custom UI
based on it but without the LicenseAgreementDlg. This is the proper
way to remove dialogs from WiX UI sets.

- Add CustomUI.wxs with dialog flow: Welcome -> InstallDir -> VerifyReady
- Update installer.wxs to use CustomInstallDirUI instead of WixUI_InstallDir
- Update workflow to build both .wxs files

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 16:25:35 +01:00
Carlos Fernandez
6789376b92 fix(installer): Try Order=999 to force dialog override to fire last
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 16:01:43 +01:00
Carlos Fernandez
ea5125f030 fix(installer): Use Order attribute to override license dialog navigation
The previous Publish elements without Order didn't override the defaults.
Adding Order="1" ensures our overrides fire after the WixUI defaults,
making our InstallDirDlg navigation take precedence.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 13:45:28 +01:00
Carlos Fernandez Sanz
000b39775c Fix typo: 'sring' -> 'string' in DVB subtitle decoder 2025-12-24 12:02:34 +01:00
Carlos Fernandez
23fe02f0d2 fix(installer): Skip license dialog with Publish overrides
Override the WixUI_InstallDir dialog sequence to skip the license
agreement dialog, restoring the original behavior before WiX v6 migration.

- WelcomeDlg Next button now goes directly to InstallDirDlg
- InstallDirDlg Back button returns to WelcomeDlg

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 11:47:33 +01:00
Carlos Fernandez
394fb39a9c fix(installer): Update DLL list to match current build output
The installer.wxs was referencing old FFmpeg DLLs that no longer exist:
- avcodec-57.dll → avcodec-60.dll
- avformat-57.dll → avformat-60.dll
- avutil-55.dll → avutil-58.dll
- swresample-2.dll → swresample-4.dll
- swscale-4.dll → swscale-7.dll

Added new DLLs that are now part of the build:
- avdevice-60.dll, avfilter-9.dll, postproc-57.dll
- libgpac.dll, OpenSVCDecoder.dll
- libcryptoMD.dll, libsslMD.dll
- desktop_drop_plugin.dll

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 10:07:01 +01:00
Harshdhall01
294bf5bc18 Fix typo: 'sring' -> 'string' in DVB subtitle decoder 2025-12-24 13:54:47 +05:30
Carlos Fernandez
4e52e61c91 fix: Remove duplicate WiX property declarations
The <ui:WixUI Id="WixUI_InstallDir" InstallDirectory="INSTALLFOLDER" />
element already defines WIXUI_INSTALLDIR (via the InstallDirectory attribute)
and ARPNOMODIFY (in the wixlib). Declaring them again causes WIX0091 errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 09:05:55 +01:00
Carlos Fernandez
faaaabf63c fix(installer): Add missing WIXUI_INSTALLDIR property and fix RemoveFolder ID
- Added WIXUI_INSTALLDIR property (required per WiX issue #7105)
- Changed RemoveFolder Id from "DesktopFolder" to "RemoveDesktopShortcut"
  to avoid ID conflict with StandardDirectory element

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 07:28:33 +01:00
Carlos Fernandez
f5a9018ef0 fix(release): Upgrade WiX from v4.0.0-preview.0 to v6.0.2 stable
The WiX build was failing due to several WiX v4 to v6 migration issues.

Workflow changes:
- Uninstall existing WiX before installing v6.0.2 (force clean install)
- WiX version: 4.0.0-preview.0 -> 6.0.2
- Extension: WixToolset.UI.wixext/4.0.0-preview.0 -> WixToolset.UI.wixext/6.0.2
- Fixed extension command syntax: "extension -g add" -> "extension add -g"

installer.wxs changes (WiX v6 migration):
- Added ui namespace: xmlns:ui="http://wixtoolset.org/schemas/v4/wxs/ui"
- Replaced custom inline UI with standard <ui:WixUI Id="WixUI_InstallDir">
  (fixes WIX0094 error for WixUIValidatePath custom action)
- Changed Directory to StandardDirectory for DesktopFolder (fixes WIX5437)

See: https://github.com/orgs/wixtoolset/discussions/6516
     https://github.com/wixtoolset/issues/issues/6998

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-24 07:14:18 +01:00
Carlos Fernandez
e01720c05e fix: Use WiX extension by name instead of hardcoded path
The WiX v4 extension path was hardcoded and didn't match the actual
installed location. WiX v4 allows referencing globally installed
extensions by name directly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 23:35:10 +01:00
Carlos Fernandez
f80b1f26ca fix(ci): Add -Force to Expand-Archive for Flutter GUI
The installer directory already has files from the copy step, so
Expand-Archive needs -Force to overwrite/merge.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 22:43:07 +01:00
GAURAV KARMAKAR
e42bc2b9f9 Fixed the merge conflict in ccx_encoders_common.h 2025-12-24 02:25:53 +05:30
Carlos Fernandez
f9ebfd2a32 fix(ci): Add vcpkg setup and fix permissions in release workflow
- Add permissions: contents: write for upload-release-assets
- Add vcpkg environment variables and setup steps from build_windows.yml
- Add gpac installation
- Add vcpkg clone, bootstrap, and dependency installation
- Add VCPKG_ROOT env var to build step
- Change runner to windows-2022 to match build workflow
- Add msbuild-architecture: x64
- Remove redundant llvm/clang setup (pre-installed on runner)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 21:53:40 +01:00
Gaurav karmakar
bf9841a255 Merge branch 'master' into gaurav-v1 2025-12-24 01:55:53 +05:30
Carlos Fernandez
9f670de8ed fix(windows): Use latest Windows SDK instead of hardcoded version
Changed WindowsTargetPlatformVersion from 10.0.22621.0 to 10.0 to
automatically use whichever Windows 10 SDK is installed on the build
machine. This fixes CI failures when the runner has a different SDK
version installed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 21:20:51 +01:00
Carlos Fernandez
fc4a14e7d6 0.96 release, for real 2025-12-23 21:09:47 +01:00
Carlos Fernandez Sanz
4f13b861cd Merge pull request #1888 from CCExtractor/fix/release-workflow-x64
fix(ci): Update Windows release build to use x64 platform
2025-12-23 21:03:16 +01:00
Carlos Fernandez
df692f296d fix(ci): Update Windows release build to use x64 platform
The solution file only has x64 configurations (Release-Full|x64,
Debug-Full|x64). The workflow was incorrectly trying to build with
Win32 platform which doesn't exist.

Changes:
- Platform=Win32 → Platform=x64
- Output path ./Release-Full/ → ./x64/Release-Full/

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 20:58:56 +01:00
Carlos Fernandez Sanz
419fc4694d Changelog clean up and start of new version
docs: Add Upcoming section to changelog
2025-12-23 19:38:25 +01:00
Carlos Fernandez Sanz
fc230fc217 feat(teletext): Add multi-page extraction with separate output files (#665) 2025-12-23 19:37:12 +01:00
Carlos Fernandez
825e160e72 Clean up CHANGES.TXT 2025-12-23 19:33:23 +01:00
Carlos Fernandez
8e24c17c1e Clean up CHANGES.TXT 2025-12-23 19:30:32 +01:00
Carlos Fernandez
4e21fae053 docs: Add Upcoming section to changelog with teletext multi-page feature
Start new changelog section for unreleased changes. First entry is
the multi-page teletext extraction feature (#665) which allows
extracting multiple teletext pages simultaneously with separate
output files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 17:42:50 +01:00
Carlos Fernandez
be239a5c46 fix: Restore teletext auto-detect mode for single-page extraction
The page update logic at lines 1029-1035 was incorrectly updating
tlt_config.page for all accepted pages, even in single-page auto-detect
mode. This caused the auto-detect logic at line 979 to be bypassed
because the first packet (even with an invalid page number like 0xFF)
would set tlt_config.page, preventing proper auto-detection.

The fix restricts the page update to multi-page mode only. In single-page
mode, tlt_config.page is set exclusively by:
1. User specification (--tpage option)
2. Auto-detect logic (first valid subtitle page found)

This fixes a regression in SP Test 76 which uses sample
8c1615c1a84d4b9b34134bde8085214bb93305407e935edcdfd4c2fc522c215f.mpg
with --autoprogram --out=ttxt --latin1.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 16:36:02 +01:00
Carlos Fernandez
1d9f32239e docs: Add doxygen comments to should_accept_page function
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 15:43:54 +01:00
Carlos Fernandez
cbb5f0b0a8 fix(clippy): Use RangeInclusive::contains() instead of manual range check
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 14:41:18 +01:00
Carlos Fernandez
fd063931ea feat(teletext): Add multi-page extraction with separate output files (#665)
Implement support for extracting multiple teletext pages simultaneously,
with each page output to a separate file.

Changes:
- Support multiple --tpage arguments (e.g., --tpage 397 --tpage 398)
- Create separate output files per page with _pNNN suffix
  (e.g., output_p397.srt, output_p398.srt)
- Maintain backward compatibility for single-page extraction (no suffix)
- Add per-page SRT counters for correct subtitle numbering
- Fix BCD to decimal page number conversion in telxcc.c
- Add --tpages-all mode support for auto-detecting all pages
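
Two of the items above, the BCD conversion and the `_pNNN` suffix, can be sketched in Python. This is an illustration only, not the code in telxcc.c: it assumes the page number is held as three BCD digits packed into the low 12 bits, and both helper names are hypothetical.

```python
def bcd_page_to_decimal(bcd: int) -> int:
    """Convert a three-digit BCD page number (e.g. 0x397) to decimal (397)."""
    digits = [(bcd >> shift) & 0xF for shift in (8, 4, 0)]
    if any(d > 9 for d in digits):
        raise ValueError(f"not a valid BCD page number: {bcd:#x}")
    return digits[0] * 100 + digits[1] * 10 + digits[2]


def page_output_name(base: str, ext: str, page: int, multi_page: bool) -> str:
    """Add the _pNNN suffix only when several pages are being extracted."""
    if multi_page:
        return f"{base}_p{page:03d}.{ext}"
    return f"{base}.{ext}"  # single-page mode keeps the old name


page = bcd_page_to_decimal(0x397)
print(page_output_name("output", "srt", page, multi_page=True))
```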

Tested with 21 teletext samples from the sample platform, all passing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-23 14:28:15 +01:00
Carlos Fernandez Sanz
7a9acb7bd2 Merge pull request #1883 from CCExtractor/dependabot/github_actions/actions/upload-artifact-6
build(deps): Bump actions/upload-artifact from 4 to 6
2025-12-23 10:19:30 +01:00
Carlos Fernandez Sanz
cbf180eb39 build(deps): Bump actions/checkout from 4 to 6 2025-12-23 10:19:16 +01:00
Carlos Fernandez Sanz
614e6c42b5 build(deps): Bump softprops/action-gh-release from 1 to 2 2025-12-23 10:18:50 +01:00
Carlos Fernandez Sanz
38bcb7ed85 Merge pull request #1884 from CCExtractor/dependabot/github_actions/actions/cache-5
Routine dependency update for GitHub Actions
2025-12-23 09:32:05 +01:00
Carlos Fernandez Sanz
d57354830e chore: Bump version to 0.96 2025-12-23 00:06:45 +01:00
Carlos Fernandez Sanz
7b43201ce1 fix(mp4/mkv): Add HEVC/H.265 caption extraction for MP4 and Matroska containers 2025-12-23 00:06:12 +01:00
Carlos Fernandez Sanz
ea1c82ac17 [FIX] Handle NULL bitmap gracefully in OCR instead of crashing (#1010) 2025-12-23 00:05:32 +01:00
dependabot[bot]
b3f1e27f5c build(deps): Bump actions/cache from 4 to 5
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-22 18:02:20 +00:00
dependabot[bot]
82c92d3910 build(deps): Bump actions/upload-artifact from 4 to 6
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-22 18:02:11 +00:00
dependabot[bot]
5bf8e7de0d build(deps): Bump actions/checkout from 4 to 6
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-22 18:02:04 +00:00
dependabot[bot]
5b8a9709df build(deps): Bump softprops/action-gh-release from 1 to 2
Bumps [softprops/action-gh-release](https://github.com/softprops/action-gh-release) from 1 to 2.
- [Release notes](https://github.com/softprops/action-gh-release/releases)
- [Changelog](https://github.com/softprops/action-gh-release/blob/master/CHANGELOG.md)
- [Commits](https://github.com/softprops/action-gh-release/compare/v1...v2)

---
updated-dependencies:
- dependency-name: softprops/action-gh-release
  dependency-version: '2'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-22 18:01:54 +00:00
Carlos Fernandez Sanz
063786c4b7 [FEATURE] Add AppImage build variants and CI workflow (#1348) 2025-12-22 09:12:36 +01:00
GAURAV KARMAKAR
6ed09ea397 SPUPNG: fix formatting to match clang-format 2025-12-22 13:22:25 +05:30
Carlos Fernandez
44363c0acd fix(mkv): Add HEVC/H.265 caption extraction for Matroska containers
Extends HEVC caption extraction support to MKV files.

Changes to matroska.h:
- Add hevc_codec_id constant for V_MPEGH/ISO/HEVC
- Add hevc_track_number field to matroska_ctx structure
- Add process_hevc_frame_mkv() function declaration

Changes to matroska.c:
- Detect HEVC tracks in parse_segment_track_entry()
- Modify parse_simple_block() to route HEVC tracks to HEVC processor
- Add process_hevc_frame_mkv() with is_hevc flag and store_hdcc() call
- Parse HEVCDecoderConfigurationRecord in parse_private_codec_data()
- Initialize hevc_track_number in matroska_loop()
- Update output messages to report HEVC tracks

Tested with HEVC MKV file - extracts 73 captions matching MP4 output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 05:59:23 +01:00
Carlos Fernandez
701271ec82 fix(mp4): Add HEVC/H.265 caption extraction for MP4 containers
PR #1852 added HEVC caption extraction for MPEG-TS containers,
but MP4/MKV containers weren't supported. This adds HEVC support
for MP4 containers using GPAC.

Changes:
- Add HEVC subtype definitions (hev1, hvc1)
- Add process_hevc_sample() to parse HEVC NAL units and extract CC
- Add process_hevc_track() to iterate through HEVC track samples
- Detect and process HEVC tracks in processmp4()
- Add store_hdcc() call to flush buffered CC data after each sample

The key fix was adding store_hdcc() after processing each sample.
Without this, CC data was being parsed but never output because
store_hdcc() is normally called from slice_header() which is
AVC-only.

Closes #1690 (for MP4 containers)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 05:59:23 +01:00
Carlos Fernandez
7c74ea4112 docs: Add 0.96 (Unreleased) section to CHANGES.TXT
Move all changes made after the 0.95 version bump (commit ee232b5)
to a new 0.96 section marked as "Unreleased".

This separates the released 0.95 content from ongoing development
work that will be included in the next release.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 05:58:01 +01:00
Carlos Fernandez
ed42525f44 chore: Bump version to 0.96
Update version strings across all build configurations:
- src/lib_ccx/lib_ccx.h
- linux/configure.ac
- mac/configure.ac
- package_creators/PKGBUILD
- package_creators/ccextractor.spec
- package_creators/debian.sh
- OpenBSD/Makefile

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-22 05:58:01 +01:00
Carlos Fernandez
b88d1ebab2 fix(ci): Fix AppImage build failures for OCR and HardSubX variants
OCR build fix:
- linuxdeploy was failing with "Invalid magic bytes in file header"
  because it was passed the wrapper script instead of the actual binary
- When OCR is enabled, ccextractor is renamed to ccextractor.bin and
  a wrapper script sets TESSDATA_PREFIX before executing the binary
- Now correctly passes ccextractor.bin to linuxdeploy when it exists

HardSubX build fix:
- Add libavdevice-dev to FFmpeg dependencies in CI workflow
- rusty_ffmpeg requires libavdevice which was missing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 22:47:24 +01:00
Carlos Fernandez
ec11b00f9f fix(ci): Use correct Rust toolchain action name 2025-12-21 22:40:06 +01:00
Carlos Fernandez
8c0fe08781 feat: Add AppImage build variants and CI workflow (#1348)
Rewrites the AppImage build script to support three build variants
matching the Docker build options:
- minimal: Basic CCExtractor without OCR (smallest size)
- ocr: CCExtractor with OCR support (default)
- hardsubx: CCExtractor with burned-in subtitle extraction

Changes to build_appimage.sh:
- Add BUILD_TYPE environment variable to select variant
- Fix CMake options (was incorrectly using make flags)
- Bundle tessdata for OCR builds with wrapper script
- Create proper desktop file and icon handling
- Improve error handling and cleanup

New GitHub Actions workflow (build_appimage.yml):
- Builds all three variants on release
- Uploads AppImages as release assets
- Can be manually triggered for specific variants
- Caches GPAC build for faster CI runs

Usage:
  ./build_appimage.sh              # Builds 'ocr' variant
  BUILD_TYPE=minimal ./build_appimage.sh
  BUILD_TYPE=hardsubx ./build_appimage.sh

Closes #1348

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 22:37:22 +01:00
Carlos Fernandez
3304c1b094 fix(ocr): Handle NULL bitmap gracefully instead of crashing (#1010)
When processing DVB subtitles from live streams or corrupted files,
the bitmap clipping operation can fail, resulting in a NULL pix object.
Previously, this would cause a fatal crash with "Failed to perform OCR -
Failed to get text" because the code continued to call TessBaseAPIGetUTF8Text
even when no image was set.

Changes:
- Handle cpix_gs == NULL by logging a message and returning NULL
  (skip this bitmap) instead of continuing and crashing
- Change the fatal error when TessBaseAPIGetUTF8Text returns NULL
  to a non-fatal skip, since this can happen with empty/invalid bitmaps
- Both cases now properly clean up allocated resources before returning

This allows CCExtractor to gracefully skip problematic subtitle frames
instead of crashing, which is especially important for live streams
where packet loss or discontinuities can occur.

Fixes #1010

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 22:25:35 +01:00
Carlos Fernandez
5bad3732c3 chore: Remove plan files from git tracking
The plans/ directory is in .gitignore but these files were added
before that entry existed. Removing from tracking while keeping
files on disk.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 21:46:39 +01:00
Carlos Fernandez Sanz
e3b0defb49 build(rust): Upgrade bindgen to 0.72.1 for Fedora packaging 2025-12-21 21:38:02 +01:00
Carlos Fernandez Sanz
2065c5509d fix(windows): Fix c_long ABI mismatch causing Windows CI failures 2025-12-21 20:16:56 +01:00
Carlos Fernandez
5458370346 refactor: Replace c_longlong with i64 for consistency
For clarity and consistency, use explicit i64 instead of c_longlong.
While c_longlong is 64-bit on all platforms, i64 is clearer and
follows the same pattern as the previous commit that removed c_long.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 17:55:57 +01:00
Carlos Fernandez
9e19c58edf refactor: Replace platform-dependent 'long' with 'int64_t'
The C type 'long' has different sizes on different platforms:
- Linux (64-bit, LP64): 64-bit
- Windows (32-bit and 64-bit): 32-bit

This causes ABI mismatches when interfacing with Rust, since Rust's
c_long matches the platform's long size, but we were treating these
values as 64-bit throughout.

Changed the following fields from 'long' to 'int64_t':
- asf_constants.h: parsebufsize
- avc_functions.h: cc_databufsize, num_nal_unit_type_7, num_vcl_hrd,
  num_nal_hrd, num_jump_in_frames, num_unexpected_sei_length
- ccx_decoders_608.h: bytes_processed_608
- ccx_demuxer.h: capbufsize, capbuflen
- lib_ccx.h: ts_readstream() return type, FILEBUFFERSIZE
- file_functions.c: FILEBUFFERSIZE definition
- ts_functions.c: ts_readstream() implementation

Also updated Rust code in common.rs to remove c_long casts, since
bindgen will now generate i64 for these fields.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 17:52:24 +01:00
Carlos Fernandez Sanz
0bb56d508a fix(timing): Fix --goptime producing compressed timestamps 2025-12-21 17:50:53 +01:00
Carlos Fernandez
2c67381d2b fix(windows): Fix c_long ABI mismatch in demuxer.rs
The extern declaration for ccxr_add_current_pts used c_long, but the
actual implementation in time.rs uses i64. This caused an ABI mismatch
on Windows where:
- c_long = i32 (32-bit)
- i64 = 64-bit

On Linux both are 64-bit so it worked, but on Windows the type
mismatch could cause incorrect parameter passing.

Changes:
- Change extern fn declaration from c_long to i64
- Remove unnecessary cast (FRAME_DURATION_TICKS is already i64)
- Remove unused c_long import

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 17:00:50 +01:00
GAURAV KARMAKAR
2b708c4a31 Enhance SPUPNG offset calculations and XML tag handling in EIA608 encoder
- Introduced a forward declaration for .
- Updated  to calculate and set image dimensions before writing XML tags.
- Adjusted offset calculations based on screen size for better alignment of subtitles.
- Improved handling of the opening XML tag based on subtitle data presence.
2025-12-21 19:20:28 +05:30
Carlos Fernandez
94a43928ad fix(timing): Fix --goptime producing compressed timestamps (Test 163)
When using --goptime, timestamps were compressed to 00:00:01-02 instead
of actual GOP times (17:56:40-47). This was caused by conflicts between:
- GOP timing set from GOP headers (wall-clock time, e.g., 17:56:40)
- PES PTS timing (stream-relative time, e.g., 00:00:02)

The sync detection saw these as 64,598-second "jumps" and kept resetting
timing, corrupting the output.

Fixes:
1. Guard video PES timing in general_loop.c - skip set_current_pts and
   set_fts when use_gop_as_pts == 1 to prevent PES PTS from overwriting
   GOP-based timing
2. Disable sync check in ccextractor.c when use_gop_as_pts == 1 since
   GOP time and PES PTS are in different time bases and sync detection
   is meaningless

Test results:
- Before: 00:00:01,231 --> 00:00:01,729
- After:  17:56:41,319 --> 17:56:43,084
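
The 64,598-second figure quoted above follows directly from the two example times. A small Python check (the helper name `hms_to_seconds` is illustrative only):

```python
def hms_to_seconds(hours: int, minutes: int, seconds: int) -> int:
    """Convert an HH:MM:SS time to whole seconds."""
    return hours * 3600 + minutes * 60 + seconds


gop_time = hms_to_seconds(17, 56, 40)  # wall-clock time from the GOP header
pes_time = hms_to_seconds(0, 0, 2)     # stream-relative time from the PES PTS

# The sync check sees the difference between the two time bases as a jump.
apparent_jump = gop_time - pes_time
print(apparent_jump)
```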

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 12:34:05 +01:00
Carlos Fernandez Sanz
25d68b75bd fix(708): Support Korean EUC-KR encoding in CEA-708 decoder 2025-12-21 12:23:39 +01:00
Carlos Fernandez
73cd19f5d0 fix(rust): Use i64 instead of c_long for Windows compatibility
On Windows, c_long is i32 while on Linux it's i64. The function
ccxr_print_mstime_static expects i64, so casting to c_long caused
a type mismatch error on Windows builds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 09:43:27 +01:00
Carlos Fernandez
d0caf23a82 fix(timing): Use i64 instead of c_long for Windows compatibility
The Rust FFI functions were using c_long for PTS/FTS timestamps, but:
- C code uses LLONG (int64_t, 64 bits on all platforms)
- Rust c_long is 32 bits on Windows, 64 bits on 64-bit Linux

This caused timestamp truncation on Windows when PTS values exceeded
2^31 (~6.6 hours at 90 kHz), resulting in wrong subtitle timestamps.

For example, a file with Min PTS of 23:50:45 (7,726,090,500 ticks)
would have its PTS truncated, breaking the teletext delta calculation
that normalizes timestamps to start at 0.

Changes:
- ccxr_add_current_pts: pts parameter i64
- ccxr_set_current_pts: pts parameter i64
- ccxr_get_fts: return type i64
- ccxr_get_visible_end: return type i64
- ccxr_get_visible_start: return type i64
- ccxr_get_fts_max: return type i64
- ccxr_print_mstime_static: mstime parameter i64
- fts_at_gop_start: extern static i64

Fixes tests 18 and 19 on Windows CI which showed raw PTS timestamps
(23:50:46) instead of normalized timestamps (00:00:00).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 09:43:27 +01:00
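The truncation described in commit d0caf23a82 can be reproduced without any FFI. This is a minimal sketch, assuming that a 32-bit `c_long` simply keeps the low 32 bits of the value; `truncate_to_32_bits` and `pts_to_secs` are hypothetical helper names, not part of the ccextractor API.

```rust
/// PTS clock rate in ticks per second (MPEG 90 kHz clock).
const PTS_HZ: i64 = 90_000;

/// What a 32-bit `c_long` keeps of a 64-bit PTS: the low 32 bits,
/// reinterpreted as a signed value.
fn truncate_to_32_bits(pts: i64) -> i32 {
    pts as i32
}

/// Whole seconds represented by a PTS value.
fn pts_to_secs(pts: i64) -> i64 {
    pts / PTS_HZ
}
```

With the example from the message, 7,726,090,500 ticks is 85,845 whole seconds (23:50:45), and the truncated value comes out negative, which is why the delta calculation downstream could not normalize the timestamps.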
Carlos Fernandez
da3dc52b45 fix(708): Support Korean EUC-KR encoding in CEA-708 decoder
Korean broadcasts use EUC-KR encoding (variable-width) in CEA-708
captions, where ASCII is 1 byte and Korean characters are 2 bytes.
The decoder was always writing 2 bytes per character (UTF-16BE style),
causing NUL (0x00) bytes to be inserted before every ASCII character.

Changes:
- Add is_utf16_charset() to detect fixed-width 16-bit encodings
- Modify write_char() to accept use_utf16 flag:
  - true: Always 2 bytes (UTF-16BE for Japanese, issue #1451)
  - false: 1 byte for ASCII, 2 bytes for extended (EUC-KR for Korean)
- Detect charset type in write_row() before building output buffer

This fixes Korean subtitle extraction when using --service "1[EUC-KR]"
while maintaining compatibility with Japanese UTF-16BE (issue #1451).

Closes #1065

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 09:43:27 +01:00
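The byte-width rule from commit da3dc52b45 can be sketched as follows. The function names follow the commit message, but the signatures and bodies are assumptions made for illustration: in particular, the charset test by name prefix and the "high byte is zero" test are guesses at a reasonable rule, not the decoder's actual code.

```rust
/// Assumption: a charset counts as fixed-width 16-bit when its name starts
/// with "UTF-16" or "UCS-2" (case-insensitive). The real check may differ.
fn is_utf16_charset(name: &str) -> bool {
    let upper = name.to_ascii_uppercase();
    upper.starts_with("UTF-16") || upper.starts_with("UCS-2")
}

/// Append one 16-bit symbol to `out`.
/// - `use_utf16 == true`: always two bytes, big-endian.
/// - `use_utf16 == false`: one byte when the high byte is zero,
///   otherwise two bytes (for example an EUC-KR pair).
fn write_char(sym: u16, use_utf16: bool, out: &mut Vec<u8>) {
    let [hi, lo] = sym.to_be_bytes();
    if use_utf16 || hi != 0 {
        out.push(hi);
    }
    out.push(lo);
}

/// Encode a row of symbols, detecting the charset type once per row.
fn write_row(symbols: &[u16], charset: &str) -> Vec<u8> {
    let use_utf16 = is_utf16_charset(charset);
    let mut out = Vec::new();
    for &sym in symbols {
        write_char(sym, use_utf16, &mut out);
    }
    out
}
```

For the row `[0x0041, 0xB0A1]` the EUC-KR path produces three bytes with no leading zero before the ASCII letter, while the UTF-16BE path still emits two bytes per symbol.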
Carlos Fernandez Sanz
0fdfb751ba fix(708): Handle null timing pointer in CEA-708 settings conversion 2025-12-21 09:41:25 +01:00
Carlos Fernandez Sanz
0b5f13e2c4 feat(wtv): Add DVB teletext stream detection in WTV files 2025-12-21 09:40:59 +01:00
Carlos Fernandez
60cec9e6de style: Fix clang-format indentation
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-21 09:38:50 +01:00
Carlos Fernandez Sanz
d758f3156a fix(windows): Prevent CEA-708 output file truncation on Windows 2025-12-21 09:36:32 +01:00
Carlos Fernandez Sanz
da802a0a39 fix(security): Add bounds checks for buffer overflow vulnerabilities 2025-12-21 09:35:47 +01:00
Carlos Fernandez
8f78a8bbb2 fix(708): Handle null timing pointer in CEA-708 settings conversion
When converting CEA-708 decoder settings from C to Rust via from_ctype(),
a null timing pointer would cause the entire conversion to fail and return
None. This triggered the unwrap_or(default()) fallback, resetting critical
settings like `enabled` and `services_enabled` to false/0.

This prevented CEA-708 captions from being extracted (exit code 10) even
when --service was specified, because the decoder's is_active flag was reset
to 0 during demuxer initialization.

The fix handles null timing pointer gracefully by using a default
CommonTimingCtx instead of propagating None, preserving the other
decoder settings.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 22:34:44 +01:00
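The failure mode in commit 8f78a8bbb2 comes from `?` on a null pointer discarding every other field. A minimal sketch, assuming simplified stand-in structs: `RawSettings`, `RawTiming`, `Settings`, `Timing`, and both conversion functions are hypothetical and much smaller than the real decoder settings and CommonTimingCtx.

```rust
/// Hypothetical stand-ins for the C-side structs; field names are illustrative.
#[repr(C)]
struct RawTiming {
    fts_now: i64,
}

#[repr(C)]
struct RawSettings {
    enabled: i32,
    services_enabled: i32,
    timing: *const RawTiming,
}

#[derive(Debug, Default, PartialEq)]
struct Timing {
    fts_now: i64,
}

#[derive(Debug, Default, PartialEq)]
struct Settings {
    enabled: bool,
    services_enabled: i32,
    timing: Timing,
}

/// Old behaviour: a null `timing` pointer makes the whole conversion fail.
fn from_ctype_strict(raw: &RawSettings) -> Option<Settings> {
    // SAFETY: the pointer is either null or points to a valid RawTiming.
    let t = unsafe { raw.timing.as_ref() }?;
    Some(Settings {
        enabled: raw.enabled != 0,
        services_enabled: raw.services_enabled,
        timing: Timing { fts_now: t.fts_now },
    })
}

/// Fixed behaviour: a null `timing` pointer falls back to a default timing
/// context while the other settings are preserved.
fn from_ctype_lenient(raw: &RawSettings) -> Settings {
    // SAFETY: same contract as above.
    let timing = unsafe { raw.timing.as_ref() }
        .map(|t| Timing { fts_now: t.fts_now })
        .unwrap_or_default();
    Settings {
        enabled: raw.enabled != 0,
        services_enabled: raw.services_enabled,
        timing,
    }
}
```

Calling `from_ctype_strict(..).unwrap_or_default()` with a null pointer yields an all-default `Settings` (decoder disabled), whereas the lenient version keeps `enabled` and `services_enabled` intact.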
Carlos Fernandez
e87807ec27 feat(wtv): Add DVB teletext stream detection in WTV files
This commit adds detection and basic handling of DVB teletext streams
in WTV (Windows TV) files. Previously, teletext streams were silently
ignored.

Changes:
- Add WTV_STREAM_TELETEXT GUID to wtv_constants.h
- Detect teletext streams by examining the format GUID at offset 0x4C
  in MSTVCAPTION stream metadata
- Initialize teletext decoder when teletext stream is found
- Add timing support for teletext streams
- Wrap teletext data in PES headers for the teletext decoder

Limitation: WTV files store teletext in Microsoft's VBI sample format,
which differs from standard DVB teletext data units. The decoder will
process the data but may not extract subtitles from all WTV files.
This is noted in a warning message shown when teletext is detected.
Even FFmpeg's libzvbi fails to decode this format in the test sample.

Addresses: #1391

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 21:58:50 +01:00
Carlos Fernandez
d097ec881c build(rust): Upgrade bindgen to 0.72.1 for Fedora packaging
Fixes #1608 - Update bindgen to enable Fedora Linux packaging.

- Upgrade bindgen from 0.64.0 to 0.72.1
- Fix deprecated CargoCallbacks API
- Replace (?i) regex flags with character classes for compatibility

The inline case-insensitivity flag (?i) causes bindgen 0.72.1 to
silently produce empty bindings. This fix uses [Dd][Tt][Vv][Cc][Cc]
character classes to match both lowercase (dtvcc_*) and uppercase
(DTVCC_*) type/function names.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 21:04:28 +01:00
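The rewrite from `(?i)` to character classes in commit d097ec881c is mechanical. The helper below is a hypothetical illustration of that transformation (it is not part of the build script): it expands ASCII letters only and would mangle escapes such as `\d`, so it suits plain identifier patterns.

```rust
/// Expand each ASCII letter into a two-character class, e.g. 'd' -> "[Dd]".
/// Every other character is copied through unchanged, so regex syntax in the
/// input (such as ".*") is preserved.
fn case_insensitive_classes(pattern: &str) -> String {
    let mut out = String::new();
    for c in pattern.chars() {
        if c.is_ascii_alphabetic() {
            out.push('[');
            out.push(c.to_ascii_uppercase());
            out.push(c.to_ascii_lowercase());
            out.push(']');
        } else {
            out.push(c);
        }
    }
    out
}
```

Applied to `dtvcc`, this yields the `[Dd][Tt][Vv][Cc][Cc]` class sequence quoted in the message.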
Carlos Fernandez Sanz
87c898497a build(linux): Suppress find error when GPAC is not installed 2025-12-20 19:56:30 +01:00
Carlos Fernandez
49b698259d fix(windows): Prevent CEA-708 output file truncation on Windows
On Windows, when processing MP4/MOV files with CEA-708 captions, the
output file was being truncated to only the last subtitle. This occurred
because:

1. C code opened the file using open() and stored the fd in writer->fd
2. At end of processing, Rust's ccxr_flush_decoder was called
3. Rust checked writer->fhandle (a separate Windows-specific field)
4. Since fhandle was null (C only set fd), Rust called File::create()
5. File::create() truncates existing files, losing all previous content

The fix checks if fd is already valid before creating a new file. If fd
is valid, it converts it to a Windows handle using _get_osfhandle(),
avoiding the file truncation.

Fixes #1449

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 19:55:12 +01:00
Carlos Fernandez
5715d6d315 build(linux): Suppress find error when GPAC is not installed
Redirect stderr to /dev/null for the GPAC source file search to avoid
showing "No such file or directory" error when GPAC is not installed.
The build continues to work correctly in both cases.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 19:35:35 +01:00
Carlos Fernandez
9fddaab3b0 fix(security): Add bounds checks for buffer overflow vulnerabilities
Fixes two buffer overflow vulnerabilities reported in issues #1427 and #1428:

- #1428 (Global buffer overflow in slice_header): The slice_type value
  read from H.264 exp-golomb data was used to index slice_types[] array
  without bounds checking. Valid values are 0-9 per H.264 spec Table 7-6.
  Now validates slice_type < 10 before use.

- #1427 (Heap buffer overflow in parse_PMT): ES_info_length from PMT
  descriptor data was trusted without validation against buffer bounds.
  Malformed PMT with excessive ES_info_length could read past buffer end.
  Now validates ES_info_length and descriptor lengths against buffer.

Both issues were discovered using AddressSanitizer with crafted TS files.

Fixes #1427
Fixes #1428

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 19:34:22 +01:00
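Both checks in commit 9fddaab3b0 follow the same pattern: validate an untrusted index or length before using it. The actual fixes are in C; the sketch below restates the idea in Rust with hypothetical helpers `slice_type_name` and `es_info`, using the slice type names from H.264 Table 7-6.

```rust
/// Slice type names indexed by slice_type (H.264 Table 7-6, values 0..=9).
const SLICE_TYPES: [&str; 10] = ["P", "B", "I", "SP", "SI", "P", "B", "I", "SP", "SI"];

/// Look up a slice type name, rejecting out-of-range values.
fn slice_type_name(slice_type: u32) -> Option<&'static str> {
    SLICE_TYPES.get(slice_type as usize).copied()
}

/// Return the ES_info bytes only if `offset + es_info_length` fits in `buf`.
fn es_info(buf: &[u8], offset: usize, es_info_length: usize) -> Option<&[u8]> {
    let end = offset.checked_add(es_info_length)?;
    buf.get(offset..end)
}
```

A caller that receives `None` from either helper can skip the malformed slice header or descriptor instead of reading past the end of the table or buffer.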
Carlos Fernandez Sanz
6fdfde0838 fix(mac): Fix HARDSUBX configure script and add documentation 2025-12-20 19:06:17 +01:00
Carlos Fernandez
8db7fc7a6d fix(mac): Correct leptonica library name in configure.ac
Homebrew installs leptonica as 'libleptonica.dylib', not 'liblept.dylib'.
Changed AC_CHECK_LIB from [lept] to [leptonica] to match the actual
library name on macOS.
2025-12-20 18:56:02 +01:00
Carlos Fernandez
d8504f80bd ci(mac): Set Homebrew paths for autoconf HARDSUBX build
The AC_CHECK_LIB checks in configure.ac need LDFLAGS and CPPFLAGS
to find libraries installed via Homebrew (in /opt/homebrew on Apple
Silicon or /usr/local on Intel Macs).
2025-12-20 18:48:43 +01:00
Carlos Fernandez
70404c29ca fix(mac): Fix HARDSUBX configure script and add documentation
Fixes #1173 - Error in ./configure enabling hardsubx on Mac
Fixes #1306 - Add HARDSUBX compilation docs for macOS

The configure.ac script failed on macOS with "binary operator expected"
because pkg-config output was unquoted. When pkg-config returns multiple
libraries (e.g., "-ltesseract -lcurl"), the unquoted expansion caused
`test ! -z` to receive multiple arguments instead of a single string.

Changes:
- Quote pkg-config output in TESSERACT_PRESENT conditional (mac & linux)
- Add macOS section to docs/HARDSUBX.txt with all build methods
- Add GitHub Actions jobs to test HARDSUBX builds on macOS:
  - build_shell_hardsubx: Tests ./build.command -hardsubx
  - build_autoconf_hardsubx: Tests ./configure --enable-hardsubx

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 18:41:37 +01:00
Carlos Fernandez Sanz
feb2a61c1d fix(ts): Skip broken PES packets instead of terminating file processing 2025-12-20 18:22:22 +01:00
Carlos Fernandez Sanz
6503502624 fix(mcc): Add MCC output support for raw caption files 2025-12-20 18:21:39 +01:00
Carlos Fernandez Sanz
bf271de52c build(mac): Add -system-libs flag for Homebrew compatibility 2025-12-20 18:20:59 +01:00
Carlos Fernandez Sanz
67e560d288 build(autoconf): Add GPAC library detection to configure 2025-12-20 18:19:57 +01:00
Carlos Fernandez Sanz
54bc97a3f8 fix(hevc): Add HEVC/H.265 caption extraction support with B-frame reordering 2025-12-20 18:18:27 +01:00
Carlos Fernandez Sanz
3d7c534824 ci: Add Docker build workflow to test all image variants 2025-12-20 18:13:49 +01:00
Carlos Fernandez
eda489265d fix(mac): Correct lib_hash include path for system-libs build
The include "../lib_hash/sha2.h" in params.c requires an include path
that makes "../lib_hash" resolve to "thirdparty/lib_hash".

Changed -I../src/lib_hash (which doesn't exist) to
-I../src/thirdparty/lib_hash. With this path, the compiler searches
for "../lib_hash/sha2.h" as:
  ../src/thirdparty/lib_hash/../lib_hash/sha2.h
  = ../src/thirdparty/lib_hash/sha2.h ✓

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 18:13:12 +01:00
Carlos Fernandez
0ac093e4b2 ci: Add Docker build workflow to test all image variants
Tests all three Dockerfile build types in parallel:
- minimal: Basic CCExtractor without OCR
- ocr: CCExtractor with Tesseract OCR support
- hardsubx: CCExtractor with burned-in subtitle extraction

Each job builds from local source and verifies the image works
by running --version. Uses GitHub Actions cache for faster rebuilds.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 18:06:27 +01:00
Carlos Fernandez
6838666b79 build(mac): Add -system-libs flag for Homebrew compatibility
Add a new `-system-libs` flag to mac/build.command that uses
system-installed libraries via pkg-config instead of bundled ones.
This enables Homebrew formula compatibility while preserving the
default standalone build behavior.

When `-system-libs` is passed:
- Uses pkg-config for: freetype2, gpac, libpng, libprotobuf-c,
  libutf8proc, zlib
- Does not compile bundled thirdparty sources
- Links against system libraries

Default behavior (no flag):
- Compiles bundled libraries as before
- No change to existing builds

Also adds a CI job `build_shell_system_libs` to test the new flag.

Refs #1580, #1534

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 17:58:46 +01:00
Carlos Fernandez
08d59ecb5f build(autoconf): Add GPAC library detection to configure
Previously, configure would succeed even without GPAC installed,
leading to a confusing compile-time error:
  "gpac/isomedia.h: No such file or directory"

Now configure checks for GPAC via pkg-config and fails early with
a helpful error message listing the package names for common distros:
  - gpac-devel (Fedora/RHEL)
  - libgpac-dev (Debian/Ubuntu)
  - gpac (Arch)

Fixes #1584

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 17:36:54 +01:00
Carlos Fernandez Sanz
2ce3e0c0de fix(docker): Rewrite Dockerfile to fix broken builds 2025-12-20 17:29:14 +01:00
Carlos Fernandez
3f45a4e136 fix(docker): Rewrite Dockerfile to fix broken builds
Fixes #1550 - Docker builds were broken after PR #1535 switched from
vendored GPAC to system GPAC.

Changes:
- Switch from Alpine to Debian Bookworm (Alpine's musl libc has issues
  with Rust bindgen's libclang dynamic loading)
- Support three build variants via BUILD_TYPE argument:
  - minimal: No OCR support
  - ocr (default): Tesseract OCR for bitmap subtitles
  - hardsubx: OCR + FFmpeg for burned-in subtitle extraction
- Support dual source modes via USE_LOCAL_SOURCE argument:
  - 0 (default): Clone from GitHub (standalone Dockerfile)
  - 1: Use local source (faster for developers)
- Add .dockerignore to exclude build artifacts (~2.7GB -> ~900KB context)
- Update README.md with comprehensive build instructions

Tested all three variants successfully:
- minimal: ~130MB image
- ocr: ~215MB image
- hardsubx: ~610MB image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 17:27:42 +01:00
Carlos Fernandez
d0d46fc176 fix(mcc): Add MCC output support for raw caption files
Previously, when using -out=mcc with raw input files (-in=raw),
CCExtractor would print "Output format not supported" and produce
no output. This was because the raw file processing path decoded
CEA-608 data to text, but MCC format requires raw cc_data bytes.

The fix adds a new code path that bypasses the 608 decoder when
MCC output is requested:

- Added process_raw_for_mcc() helper function that:
  - Converts 2-byte raw pairs to 3-byte cc_data format
  - Wraps each CC pair in CDP format via mcc_encode_cc_data()
  - Maintains proper timing at 29.97fps

- Modified raw_loop() to detect MCC output and use the new path

Test results with McPoodle raw files:
- Before: "Output format not supported" (exit code 10)
- After: Valid MCC file with proper timing and CDP-wrapped data

Fixes #1542

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 11:53:50 +01:00
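The pair-to-triplet conversion and the 29.97 fps timing from commit d0d46fc176 can be sketched as below. This is not the body of process_raw_for_mcc(): the helper names are hypothetical, every pair is assumed to belong to field 1, and the CDP wrapping done by mcc_encode_cc_data() is left out.

```rust
/// First byte of a cc_data triplet: five marker bits (0xF8), cc_valid (0x04)
/// and cc_type 0 (NTSC field 1). Assumes every raw pair belongs to field 1.
const CC_VALID_FIELD1: u8 = 0xFC;

/// Convert 2-byte raw caption pairs into 3-byte cc_data triplets.
/// A trailing odd byte is ignored.
fn raw_pairs_to_cc_data(raw: &[u8]) -> Vec<[u8; 3]> {
    raw.chunks_exact(2)
        .map(|pair| [CC_VALID_FIELD1, pair[0], pair[1]])
        .collect()
}

/// Timestamp in milliseconds of frame `n` at 30000/1001 (29.97) fps.
fn frame_time_ms(n: u64) -> u64 {
    n * 1001 / 30
}
```

Using the exact 1001/30 ratio keeps the timeline from drifting: 30 frames map to 1001 ms rather than 1000 ms.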
Carlos Fernandez
3e9ed3043b fix(ts): Skip broken PES packets instead of terminating file processing
Fixes #1455

When read_video_pes_header() encounters a malformed or truncated PES
packet (returns -1), copy_capbuf_demux_data() previously returned
CCX_EOF which terminated the entire file processing. This was overly
aggressive - a single broken PES packet should be skipped, not
terminate the entire file.

UK Freeview DVB recordings from September 2022 onwards contain some
malformed PES packets in the DVB subtitle stream that triggered this
condition, causing ccextractor to stop at 0% with a "Processing ended
prematurely" error even though VLC could display the subtitles.

The fix changes the error handling to skip the broken packet and
continue processing:
- Before: return CCX_EOF (terminates file)
- After: return CCX_OK (skips packet, continues)

Test results with UK Freeview sample:
- Before: 0% processed, 0 subtitles extracted
- After: 100% processed, 10 subtitles extracted correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 11:08:18 +01:00
Carlos Fernandez
1bdd9abd35 fix(clippy): Suppress dead_code warnings for unused HEVC NAL constants
The HEVC NAL type constants are defined for completeness and reference,
but not all are currently used in the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:50:47 +01:00
Carlos Fernandez
9e970fd788 style: Run cargo fmt on avc/core.rs
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:35:18 +01:00
Carlos Fernandez
87bc1d9613 style: Fix clang-format issue in ts_functions.c
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:34:50 +01:00
Carlos Fernandez
440cd5527f fix(hevc): Fix garbled captions by implementing B-frame reordering
HEVC uses B-frames extensively, causing CC data to arrive in decode
order instead of presentation order. This was causing character pairs
to be scrambled (e.g., "MEDIOCRE" became "MIOEDCRE").

Changes:
- Implement PTS-based sequence numbering for HEVC CC data (similar to H.264)
- Change flush logic to only trigger on IDR frames (not every VCL NAL)
- Add HEVC fallback detection for streams without PAT/PMT

Fixes #1639 (ATSC 3.0 HEVC caption extraction)
Tested with issue_1639_sample.ts and caption_test_1690.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:34:50 +01:00
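The reordering idea in commit 440cd5527f is to buffer caption pairs in decode order and release them in PTS order at an IDR frame. A minimal sketch with hypothetical types (`CcPair`, `ReorderBuffer`); it does not reproduce the exact scramble quoted in the message, only the mechanism of sorting by PTS before flushing.

```rust
/// One caption byte pair tagged with the PTS of the frame that carried it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct CcPair {
    pts: i64,
    bytes: [u8; 2],
}

/// Hypothetical reorder buffer: pairs are collected in decode order and
/// released in presentation (PTS) order when an IDR frame arrives.
#[derive(Default)]
struct ReorderBuffer {
    pending: Vec<CcPair>,
}

impl ReorderBuffer {
    fn push(&mut self, pts: i64, bytes: [u8; 2]) {
        self.pending.push(CcPair { pts, bytes });
    }

    /// Sort by PTS (stable, so equal PTS values keep arrival order) and
    /// drain the buffer.
    fn flush_on_idr(&mut self) -> Vec<u8> {
        self.pending.sort_by_key(|p| p.pts);
        self.pending.drain(..).flat_map(|p| p.bytes).collect()
    }
}
```

Pushing the pairs "ME", "OC", "DI", "RE" with PTS values 0, 6006, 3003, 9009 and then flushing returns the bytes of "MEDIOCRE".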
Carlos Fernandez
0fbbc06bcf fix(hevc): Add HEVC/H.265 caption extraction support
Fixes #1690 - Captions fail to extract on HEVC video stream

HEVC video streams with embedded EIA-608/708 captions weren't being
extracted, even though VLC/MPV could display them.

Root causes fixed:
1. HEVC stream type (0x24) wasn't recognized for CC extraction
2. HEVC NAL header parsing used the H.264 format (1-byte header) instead of the HEVC format (2-byte header)
3. HEVC SEI NAL unit types (39/40) weren't handled (only H.264 NAL unit type 6, SEI)
4. CC data accumulation across SEIs caused u8 overflow/garbled output

Changes:
- C code: Add HEVC stream detection, CCX_HEVC buffer type, is_hevc flag
- Rust code: HEVC NAL header parsing (2-byte, type=(byte[0]>>1)&0x3F),
  HEVC SEI handling (PREFIX_SEI=39, SUFFIX_SEI=40), immediate CC flush

Thanks to @trufio465-bot for the initial research in PR #1735.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:34:50 +01:00
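The header difference named in commit 0fbbc06bcf is small enough to show directly. The bit layouts and type numbers below come from the message; the function names are hypothetical and the real parser handles far more than the type field.

```rust
/// HEVC NAL unit types that carry SEI messages.
const HEVC_PREFIX_SEI: u8 = 39;
const HEVC_SUFFIX_SEI: u8 = 40;
/// H.264 NAL unit type for SEI.
const H264_SEI: u8 = 6;

/// H.264: one-byte header, type in the low 5 bits.
fn h264_nal_type(header: u8) -> u8 {
    header & 0x1F
}

/// HEVC: two-byte header, type in bits 1..=6 of the first byte.
fn hevc_nal_type(header: [u8; 2]) -> u8 {
    (header[0] >> 1) & 0x3F
}

/// Whether a NAL unit of this type carries SEI for the given codec.
fn is_sei(nal_type: u8, is_hevc: bool) -> bool {
    if is_hevc {
        nal_type == HEVC_PREFIX_SEI || nal_type == HEVC_SUFFIX_SEI
    } else {
        nal_type == H264_SEI
    }
}
```

An HEVC prefix SEI header starting with byte 0x4E decodes to type 39 under the HEVC rule but to type 14 under the H.264 mask, which is why such units were never treated as SEI before the fix.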
Carlos Fernandez Sanz
5f0c6728bf fix(avc): Handle streams that don't start with NAL start codes 2025-12-20 01:33:37 -08:00
Carlos Fernandez Sanz
b9aabcd60d fix(raw): Fix premature EOF and timing overflow in raw_loop 2025-12-20 01:32:43 -08:00
Carlos Fernandez Sanz
d0243237db fix(args): Add backward compatibility for single-dash long options 2025-12-20 01:32:08 -08:00
Carlos Fernandez Sanz
a86a4ca7ce feat: Add --list-tracks option to list media file tracks 2025-12-20 01:31:38 -08:00
Carlos Fernandez
77624ec678 style: Run cargo fmt on Rust code
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 10:27:22 +01:00
Carlos Fernandez
73db3a2c39 fix(avc): Handle streams that don't start with NAL start codes (#1626)
The AVC parser would fail with "Leading bytes are non-zero" error when
processing HLS/Twitch stream segments that start mid-stream without
proper NAL unit headers at the beginning.

Root cause: When process_avc encountered non-zero leading bytes, it
returned an error with 0 bytes processed. The C code would not remove
any bytes from the buffer, causing subsequent data to accumulate with
the corrupt beginning, leading to infinite errors.

Fix:
- Add find_nal_start_code() to search for valid NAL start codes
- If buffer doesn't start with 0x00 0x00, search for first NAL start
- Skip garbage data before first valid NAL unit
- Return full buffer length when no NAL found (clears the buffer)
- Change forbidden_zero_bit error from fatal to skip-and-continue

Tested with 6 Twitch HLS sample files - all now process correctly.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 09:08:14 +01:00
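The skip-garbage behaviour from commit 73db3a2c39 can be sketched as a search for the 00 00 01 prefix. `find_nal_start_code` takes its name from the message, but this body and the `bytes_to_skip` helper are assumptions; for a four-byte 00 00 00 01 start code the returned offset points at its second byte.

```rust
/// Offset of the first 00 00 01 start-code prefix in `buf`, if any.
fn find_nal_start_code(buf: &[u8]) -> Option<usize> {
    buf.windows(3).position(|w| w == [0x00, 0x00, 0x01])
}

/// Number of leading bytes to discard: the garbage before the first start
/// code, or the whole buffer when no start code is present.
fn bytes_to_skip(buf: &[u8]) -> usize {
    find_nal_start_code(buf).unwrap_or(buf.len())
}
```

Returning the full length when nothing is found matters as much as the search itself: reporting 0 bytes consumed is what let corrupt data pile up in the buffer.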
Carlos Fernandez
dd3dab7d52 fix(args): Add backward compatibility for single-dash long options (#1576)
Old versions of ccextractor accepted single-dash long options like
-quiet, -stdout, -autoprogram. The new Rust-based argument parser
(clap) only accepts double-dash options (--quiet, --stdout, etc.).

When users ran scripts with -quiet, clap parsed it as individual
short options -q -u -i -e -t and failed with exit code 7. Users
with stderr redirected never saw the error, causing silent failures
with zero-length output files.

This adds a normalize_legacy_option() function that pre-processes
arguments before passing them to clap:
- Single-dash long options (e.g., -quiet) convert to --quiet
- Double-dash options remain unchanged
- Short options like -o remain unchanged
- Numeric options like -1, -12 remain unchanged

Includes 6 unit tests for the new function.

Fixes #1576

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 08:54:48 +01:00
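The four rules listed in commit dd3dab7d52 fit in a few lines. This is a sketch of one way to implement them, not the committed function: the name matches the message, but the signature and the "more than one character and not all digits" test are assumptions.

```rust
/// Rewrite a legacy single-dash long option (e.g. "-quiet") as "--quiet".
/// Left unchanged: "--long" options, one-letter short options such as "-o",
/// all-digit options such as "-1" or "-12", and plain arguments.
fn normalize_legacy_option(arg: &str) -> String {
    if arg.starts_with("--") {
        return arg.to_string();
    }
    match arg.strip_prefix('-') {
        Some(rest) if rest.len() > 1 && !rest.chars().all(|c| c.is_ascii_digit()) => {
            format!("-{}", arg)
        }
        _ => arg.to_string(),
    }
}
```

Each command-line argument would be passed through this function before being handed to clap, so `-quiet` reaches the parser as `--quiet` instead of a run of short flags.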
Carlos Fernandez
ebfa31c333 fix(raw): Fix premature EOF and timing overflow in raw_loop (#1565)
Fix raw caption file processing that would stop at exactly 9:43:00 (2MB).

Root causes and fixes:
1. Premature EOF: After processing first chunk (BUFSIZE ~2MB), data->len
   was never reset. On next iteration, general_get_more_data() calculated
   want = BUFSIZE - len = 0 and returned EOF immediately.
   Fix: Reset data->len = 0 after each chunk and change loop condition.

2. 32-bit integer overflow: The calculation cb_field1 * 1001 / 30 * 90
   overflowed for large cb_field1 values (>1M). For example,
   34,989,487 * 90 = 3,149,053,830 exceeds the 32-bit signed maximum (2,147,483,647).
   Fix: Cast cb_field1 to LLONG before multiplication.

3. Timing initialization: Raw mode needs min_pts=0, sync_pts=0, and
   pts_set=MinPtsSet for correct fts_now calculation.

Tested with sample files from issue #1565:
- DTV3.raw: Now processes to 17:59:56 (was stopping at 9:43)
- DTV4.raw: Now processes to 14:00:00 (was stopping at 9:43)
- DTV5.raw: Now processes to 13:19:59 (was stopping at 9:43)

Closes #1565

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 08:37:52 +01:00
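The overflow in commit ebfa31c333 can be demonstrated with checked arithmetic. The function names are hypothetical, and the input 1,048,636 is chosen here (it is not in the message) because it makes the intermediate value equal the 34,989,487 quoted above.

```rust
/// 32-bit version of `cb_field1 * 1001 / 30 * 90`; returns None on overflow.
fn pts_i32(cb_field1: i32) -> Option<i32> {
    cb_field1.checked_mul(1001)?.checked_div(30)?.checked_mul(90)
}

/// Same formula with the operand widened to 64 bits first.
fn pts_i64(cb_field1: i32) -> i64 {
    i64::from(cb_field1) * 1001 / 30 * 90
}
```

In C the 32-bit signed overflow is undefined behaviour rather than a clean `None`, which is why the fix widens `cb_field1` before the first multiplication.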
Carlos Fernandez
d52d26baf8 style: Format Rust code with cargo fmt
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 07:47:17 +01:00
Carlos Fernandez
3a852b7915 feat: Add --list-tracks option to list media file tracks
Add a new --list-tracks (-L) option that lists all tracks found in
media files without processing them. This is useful for exploring
media files before caption extraction.

Supports:
- Matroska (MKV/WebM) files
- MP4/MOV files
- MPEG Transport Stream files

The feature is implemented entirely in Rust with native parsers for
each format, avoiding dependency on external libraries.

Closes #1669

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-20 07:42:38 +01:00
Carlos Fernandez Sanz
c3f637a10e fix(rust): Handle NULL file pointer in ccxr_demuxer_open for UDP/TCP input 2025-12-19 07:44:16 -08:00
Carlos Fernandez Sanz
f3768625c6 fix(wtv): Set sync_pts alongside min_pts to prevent PTS jump detection 2025-12-19 07:43:39 -08:00
Carlos Fernandez
c733902473 fix(wtv): Set sync_pts alongside min_pts to prevent PTS jump detection
The previous WTV timing fix (commit 300f8ca6) set min_pts and pts_set=2
(MinPtsSet) but didn't set sync_pts. This caused the Rust timing code
to detect a massive PTS jump when processing WTV files with large
initial timestamps (e.g., files recorded at 18:38:23).

The PTS jump detection computes (current_pts - sync_pts), and with
sync_pts=0 but current_pts=6039323550 (18:38:23 in PTS units), the
difference exceeded MAX_DIF and triggered the jump handling, resulting
in empty output.

This fix sets sync_pts to the same value as min_pts when first
initializing timing, preventing the false PTS jump detection.

Test results:
- Before: WTV files with large initial PTS produced empty output
- After: Timestamps match expected ground truth exactly
  (e.g., 00:00:00,601 --> 00:00:02,801 for first caption)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 16:40:58 +01:00
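The false jump in commit c733902473 follows directly from comparing the first PTS against a zero sync_pts. A minimal sketch with a hypothetical `TimingState`; the threshold is an assumed value for illustration only, since the real MAX_DIF is not given in the message.

```rust
/// Assumed threshold for illustration only: 5 seconds of 90 kHz ticks.
/// The real MAX_DIF value is not given in the commit message.
const MAX_DIF_TICKS: i64 = 5 * 90_000;

struct TimingState {
    min_pts: i64,
    sync_pts: i64,
}

impl TimingState {
    /// Earlier behaviour: only min_pts is initialised, sync_pts stays 0.
    fn init_min_only(first_pts: i64) -> Self {
        Self { min_pts: first_pts, sync_pts: 0 }
    }

    /// Fixed behaviour: sync_pts starts at the same value as min_pts.
    fn init_with_sync(first_pts: i64) -> Self {
        Self { min_pts: first_pts, sync_pts: first_pts }
    }

    /// Jump detection as described: compare (current_pts - sync_pts)
    /// against the threshold.
    fn is_pts_jump(&self, current_pts: i64) -> bool {
        (current_pts - self.sync_pts).abs() > MAX_DIF_TICKS
    }

    /// Simplified stream-relative time in milliseconds.
    fn fts_ms(&self, current_pts: i64) -> i64 {
        (current_pts - self.min_pts) / 90
    }
}
```

With the first PTS from the message (6039323550), the min-only initialisation reports a jump on the very first packet, while the fixed initialisation does not.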
Carlos Fernandez
6c44100f97 fix(rust): Handle NULL file pointer in ccxr_demuxer_open for UDP/TCP input
When using --udp or --tcp options, ccxr_demuxer_open() was called with
a NULL file pointer, causing a crash in CStr::from_ptr().

The fix checks if the file pointer is NULL before dereferencing it,
and uses an empty string for network input modes.

Fixes #1846

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 15:30:41 +01:00
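The null check from commit 6c44100f97 is the standard guard in front of `CStr::from_ptr`. The helper below is a hypothetical, self-contained version of that guard, not the body of ccxr_demuxer_open().

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Convert a possibly-NULL C string into an owned String.
/// NULL (used here to model network input) becomes the empty string.
///
/// # Safety
/// A non-null `file` must point to a valid NUL-terminated string.
unsafe fn file_name_or_empty(file: *const c_char) -> String {
    if file.is_null() {
        String::new()
    } else {
        // SAFETY: checked non-null above; validity is the caller's contract.
        unsafe { CStr::from_ptr(file) }.to_string_lossy().into_owned()
    }
}
```

`CStr::from_ptr` has no defined behaviour for a null argument, so the check has to come before the call rather than being handled as an error afterwards.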
Carlos Fernandez Sanz
a0593c60e3 fix: RCWT/WTV timing fixes, Latin-1 music note encoding 2025-12-19 06:25:05 -08:00
Carlos Fernandez
300f8ca65a fix(wtv,encoding): Fix WTV timing and Latin-1 music note encoding
WTV timing fix:
- Set min_pts on first valid timestamp to enable fts_now calculation
- Set pts_set = 2 (MinPtsSet) instead of 1 (Received)
- This fixes WTV files where all timestamps were clustered around 1 second
  instead of being spread across the actual video duration

Latin-1 encoding fix:
- Change music note substitution from pilcrow (0xB6) to '#' (0x23)
- Pilcrow caused grep to treat output files as binary
- '#' is a more recognizable substitute for the musical note character

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 14:00:35 +01:00
Carlos Fernandez
8988152fa5 fix(rcwt): Fix timestamp calculation for RCWT/BIN format files
The rcwt_loop() function set min_pts = 0 for RCWT files but did not
set pts_set = 2 (MinPtsSet). This caused the Rust timing code to skip
the fts_now calculation (which checks pts_set == MinPtsSet), resulting
in all captions having timestamps compressed near 0 instead of their
correct times spread across the file duration.

The fix adds pts_set = 2 after setting min_pts, which tells the timing
system that min_pts is valid and fts_now can be calculated properly.

Fixes Test 217 timing issue where:
- Before: 00:00:00,001 --> 00:00:00,091 (wrong)
- After:  00:00:02,402 --> 00:00:04,536 (correct)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 11:50:57 +01:00
Carlos Fernandez
78642bcf02 ci: Retrigger Sample Platform CI 2025-12-19 09:24:12 +01:00
GAURAV KARMAKAR
609a53f373 [BUG] -out=spupng with EIA608/teletext: offset values in XML may be not correct #893 2025-12-19 13:27:08 +05:30
Carlos Fernandez
0c0e44472d ci: Trigger verification run after merging PRs #1847 and #1848
This PR triggers a fresh CI run to verify the combined effect of:
- PR #1847: Hardsubx crash fix, memory leak fixes, rcwt exit code fix
- PR #1848: XDS empty content entries fix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 07:08:59 +01:00
Carlos Fernandez Sanz
2060db99c8 fix(hardsubx): Fix heap corruption from Rust/C allocator mismatch 2025-12-18 22:02:30 -08:00
Carlos Fernandez Sanz
a299d06d97 fix(xds): Don't output empty XDS content entries 2025-12-18 22:02:04 -08:00
Carlos Fernandez
50b51e4234 fix(xds): Don't output empty XDS content entries
When outputting US TV Parental Guidelines ContentAdvisory XDS data,
the code was always calling xdsprint() for both the age rating and
the content flags (violence, language, etc). However, if there are
no content flags (e.g., for TV-G which has no additional advisories),
the content string is empty.

This caused duplicate XDS entries in the output - one with the age
rating and one with an empty string. The fix only outputs the content
string if it is not empty.

Fixes regression test 113 output mismatch.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 05:48:51 +01:00
Carlos Fernandez
0b74c9226a fix(rcwt): Fix incorrect exit code when captions are found in BIN format
The rcwt_loop function was returning exit code 10 (no captions) even
when CEA-608 captions were successfully extracted from RCWT/BIN format
files. This happened because CEA-608 decoding writes directly to the
encoder via printdata() without setting dec_sub->got_output.

Add a check after the main loop (similar to general_loop) that also
considers enc_ctx->srt_counter, enc_ctx->cea_708_counter, and
dec_ctx->saw_caption_block to properly detect when captions were found.

Fixes regression test 217 which was failing with exit code 10.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 05:40:52 +01:00
Carlos Fernandez
80957d645b fix(hardsubx): Fix heap corruption from Rust/C allocator mismatch
The hardsubx code was using C's free() on strings allocated by Rust's
CString::into_raw(). Since Rust and C are not guaranteed to share a memory
allocator, this caused heap corruption that manifested as garbage OCR output
after processing ~27 subtitle frames.

Changes:
- Export free_rust_c_string() from Rust as extern "C" function
- Declare free_rust_c_string() in hardsubx.h for C code
- Replace free(subtitle_text) with free_rust_c_string(subtitle_text)
  in hardsubx_decoder.c for Rust-allocated strings
- Fix memory leaks in process_hardsubx_linear_frames_and_normal_subs()
  where subtitle_text_hard and prev_subtitle_text_hard were not freed
- Remove dummy CI trigger file (no longer needed)

Testing:
- AddressSanitizer: No memory errors detected
- Valgrind: 0 bytes definitely lost, 0 bytes indirectly lost
- Manual testing: OCR output now correct for entire video duration

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 05:29:04 +01:00
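The ownership rule behind commit 80957d645b is that a pointer from `CString::into_raw` must go back through `CString::from_raw`. A minimal sketch of such a pair: `free_rust_c_string` takes its name from the message, but this body is an assumption, `into_c_string` is a hypothetical helper, and the export attribute needed for C linkage is omitted.

```rust
use std::ffi::CString;
use std::os::raw::c_char;

/// Hand a Rust-allocated string to C. Ownership moves to the caller, who must
/// return it through `free_rust_c_string` rather than C's free().
/// Returns NULL if `text` contains an interior NUL byte.
fn into_c_string(text: &str) -> *mut c_char {
    match CString::new(text) {
        Ok(s) => s.into_raw(),
        Err(_) => std::ptr::null_mut(),
    }
}

/// Release a string previously produced by `CString::into_raw`.
/// A real export would also carry a `no_mangle` attribute so that C can link
/// against it.
///
/// # Safety
/// `ptr` must be NULL or a pointer obtained from `CString::into_raw` that has
/// not been freed yet.
pub unsafe extern "C" fn free_rust_c_string(ptr: *mut c_char) {
    if !ptr.is_null() {
        // SAFETY: guaranteed by the caller contract above.
        drop(unsafe { CString::from_raw(ptr) });
    }
}
```

On the C side, the matching change is to call this function wherever the pointer originated in Rust and to keep free() for memory that C itself allocated.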
Carlos Fernandez
80a117e643 fix(hardsubx): Fix memory leaks in hardsubx processing
- Free basefilename in _dinit_hardsubx (allocated by get_basename)
- Free subtitle_text after each frame processing iteration
- Free prev_subtitle_text when replaced and at end of function
- Free sws_ctx with sws_freeContext (was never freed)

Reduces memory leaks from 63,926 bytes to 0 bytes.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 04:46:19 +01:00
Carlos Fernandez
63999369b7 fix(hardsubx): Fix multiple memory bugs causing crashes
1. Remove invalid free(tessdata_path) - probe_tessdata_location() returns
   a pointer to static strings or getenv() result, not heap memory.

2. Fix alloc-dealloc mismatch in OCR text handling:
   - TessBaseAPIGetUTF8Text() allocates with C++ operator new[]
   - The code was freeing with C free() causing allocator mismatch
   - Now properly copy string and use TessDeleteText() before returning
   - Unified all OCR text return paths to use Rust-allocated strings

3. Previous fix: freep(&lctx->dec_sub) instead of freep(lctx->dec_sub)

These fixes resolve Test 241 (Hardsubx) crash on Sample Platform.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 04:40:31 +01:00
Carlos Fernandez
0e815c6e2d fix(hardsubx): Fix crash in _dinit_hardsubx due to incorrect freep usage
The freep() function expects a pointer-to-pointer (void**) so it can
dereference, free, and NULL-out the pointer. The code was passing
lctx->dec_sub directly instead of &lctx->dec_sub.

This caused freep to interpret the first 8 bytes of the cc_subtitle
struct as a pointer and attempt to free() it, resulting in a crash
(SIGABRT/exit code 134) in the memory allocator.
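
The difference between the two calls can be sketched as follows. This is an illustration of the free-and-NULL idiom described above, not the project's actual helper: `freep_sketch`, `struct sub_sketch` and `struct ctx_sketch` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a freep()-style helper. It receives the ADDRESS
 * of a pointer, frees the block that pointer refers to, and resets the
 * caller's pointer to NULL so a later cleanup pass cannot free it twice. */
static void freep_sketch(void *address_of_pointer)
{
    void *block = NULL;
    void *null_value = NULL;

    if (address_of_pointer == NULL)
        return;

    /* Read, then overwrite, the caller's pointer through memcpy so the
     * helper works for any object pointer type. */
    memcpy(&block, address_of_pointer, sizeof(block));
    memcpy(address_of_pointer, &null_value, sizeof(null_value));
    free(block); /* free(NULL) is a no-op */
}

/* Hypothetical types standing in for the subtitle struct and its owner. */
struct sub_sketch { int nb_data; };
struct ctx_sketch { struct sub_sketch *dec_sub; };

static void ctx_cleanup_sketch(struct ctx_sketch *ctx)
{
    assert(ctx != NULL);
    /* Correct: pass &ctx->dec_sub. Passing ctx->dec_sub itself would make
     * the helper read the first bytes of the struct as if they were a
     * pointer and hand that garbage value to free(). */
    freep_sketch(&ctx->dec_sub);
}
```

Because the pointer is reset to NULL, calling the cleanup a second time is harmless.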

Fixes Test 241 (Hardsubx) crash on Sample Platform.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 04:33:11 +01:00
Carlos Fernandez
0ef7227d7e ci: Add dummy C file to trigger Sample Platform CI
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 04:04:58 +01:00
Carlos Fernandez
2fa023b9fe ci: Add triage tracking file for December 2025 CI analysis
This PR triggers a fresh CI run to analyze all failing regression tests
and determine whether each needs a ground truth update or a code fix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-19 04:01:18 +01:00
Carlos Fernandez Sanz
2f0770d45f docs: Update CHANGES.TXT with recent bug fixes 2025-12-18 04:20:41 -08:00
Carlos Fernandez
ee36ac1d4d docs: Update CHANGES.TXT with recent bug fixes
Add changelog entries for recent merged PRs:
- Fix: Garbled captions from HDHomeRun and I/P-only H.264 streams (#1109)
- Fix: Enable stdout output for CEA-708 captions on Windows (#1693)
- Fix: McPoodle DVD raw format read/write (#1524)
- Fix: Variable shadowing in general_loop
- Fix: Double-free crash in teletext cleanup
- Fix: Uninitialized memory and memory leaks (Valgrind)
- Fix: Dangling pointers in Rust FFI
- New: Teletext subtitle pages in -out=report (#1034)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 13:19:13 +01:00
Carlos Fernandez Sanz
e160a533b0 fix: McPoodle DVD raw format read/write (Issue #1524) 2025-12-18 04:16:47 -08:00
Carlos Fernandez Sanz
083c12698f fix: Enable stdout output for CEA-708 captions on Windows 2025-12-18 04:11:42 -08:00
Carlos Fernandez
88fbe9190a style: Fix formatting and clippy warnings
- Fix comment spacing (single space before //)
- Mark is_two_byte_loop_marker as #[cfg(test)] since it's only used in tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 13:08:21 +01:00
Carlos Fernandez
ac49bb5978 fix: McPoodle DVD raw format read/write (Issue #1524)
Reading:
- Migrate DVD raw parser from C to Rust (src/rust/src/demuxer/dvdraw.rs)
- Add FFI exports: ccxr_process_dvdraw(), ccxr_is_dvdraw_header()
- Handle both McPoodle's single-byte and legacy 2-byte loop markers
- Add 15 unit tests covering all edge cases

Writing:
- Fix LC3/LC4 constants from 2-byte to 1-byte to match McPoodle's format
- Output files now have identical size to McPoodle's original

Fixes #1524

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 13:03:29 +01:00
Carlos Fernandez Sanz
138ccd01c2 fix: Fix garbled captions from HDHomeRun and I/P-only H.264 streams 2025-12-18 04:01:44 -08:00
Carlos Fernandez
9fe2dab6d4 style: Remove unused mut from current_index variable
Fix clippy warning: variable does not need to be mutable.
The current_index variable is only assigned once during initialization
and never modified afterward.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 12:57:06 +01:00
Carlos Fernandez Sanz
a28561ad0d Merge pull request #1841 from CCExtractor/fix/general-loop-ret-shadowing
fix: Fix variable shadowing and teletext context refresh issues
2025-12-18 03:26:37 -08:00
Carlos Fernandez
c8f6b565fd fix: Fix garbled captions from HDHomeRun and I/P-only H.264 streams
For I/P-only streams (like HDHomeRun recordings), the caption buffer was
being flushed on every reference frame (I and P). Since ALL frames in these
streams are reference frames, this defeated the caption reordering mechanism,
causing garbled output.

The fix:
- Only flush the buffer and reset reference PTS on IDR frames (NAL type 5),
  not on P-frames
- Initialize currefpts on first frame to avoid huge indices at stream start
- Properly flush buffer and reset reference when large PTS gaps are detected

This allows P-frames to accumulate in the buffer and be sorted by their
PTS-based indices before output.
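
A simplified sketch of the idea, assuming a fixed-size buffer that is sorted by raw PTS at flush time (the real code uses PTS-based indices). All names here (`caption_buffer`, `on_frame_sketch`, `SKETCH_CAPACITY`) are hypothetical; only the NAL unit type values come from H.264.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { NAL_NON_IDR_SLICE = 1, NAL_IDR_SLICE = 5 }; /* H.264 nal_unit_type */
enum { SKETCH_CAPACITY = 64 };                     /* hypothetical limit */

struct caption_entry { long long pts; int payload; };
struct caption_buffer {
    struct caption_entry items[SKETCH_CAPACITY];
    size_t count;
};

static int compare_pts(const void *a, const void *b)
{
    long long pa = ((const struct caption_entry *)a)->pts;
    long long pb = ((const struct caption_entry *)b)->pts;
    return (pa > pb) - (pa < pb);
}

/* Adds one frame's caption entry. The buffer is flushed (sorted by PTS and
 * copied to `out`) only when an IDR frame arrives, so P-frames accumulate
 * and get reordered. Returns the number of entries written to `out`. */
static size_t on_frame_sketch(struct caption_buffer *buf, int nal_unit_type,
                              struct caption_entry entry,
                              struct caption_entry *out)
{
    size_t flushed = 0;

    assert(buf != NULL && out != NULL);
    if (nal_unit_type == NAL_IDR_SLICE && buf->count > 0) {
        qsort(buf->items, buf->count, sizeof(buf->items[0]), compare_pts);
        memcpy(out, buf->items, buf->count * sizeof(buf->items[0]));
        flushed = buf->count;
        buf->count = 0;
    }
    if (buf->count < SKETCH_CAPACITY)
        buf->items[buf->count++] = entry;
    return flushed;
}
```

Flushing on every reference frame would reduce each flush to a single entry, which is why reordering had no effect on I/P-only streams.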

Fixes #1109

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 10:35:20 +01:00
Carlos Fernandez
442ce1015d fix: Fix variable shadowing and teletext context refresh issues
This commit fixes two issues uncovered during Sample Platform testing:

1. Variable shadowing in general_loop() (general_loop.c):
   - The inner `int ret = process_non_multiprogram_general_loop(...)`
     was shadowing the outer `ret` variable
   - This caused the return value to always be 0, making ccextractor
     report "No captions found" even when captions were extracted
   - Also added `ret = 1` when captions are detected via counters,
     needed for CEA-708 which writes directly via Rust

2. Missing private_data refresh in update_decoder_list_cinfo (lib_ccx.c):
   - After PAT changes, dinit_cap() frees the teletext context and
     NULLs dec_ctx->private_data
   - But update_decoder_list_cinfo() returned existing decoder without
     refreshing private_data from the new cap_info
   - This caused all subsequent teletext processing to be skipped
   - Fixed by updating dec_ctx->private_data when returning existing decoder

These fixes resolve Sample Platform test failures in CEA-708 and Teletext
categories where tests returned exit code 10 (no captions) unexpectedly.
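
The shadowing problem in item 1 can be reproduced in a few lines. This is a reduced sketch with hypothetical names (`process_chunk_sketch`, `loop_shadowed`, `loop_fixed`), not the code from general_loop.c.

```c
#include <assert.h>

/* Hypothetical worker: reports 1 when a chunk contained captions. */
static int process_chunk_sketch(int chunk)
{
    return chunk == 2;
}

/* Buggy shape: the inner declaration creates a second variable named ret,
 * so the outer ret that is returned never changes. */
static int loop_shadowed(int chunks)
{
    int ret = 0;

    assert(chunks >= 0);
    for (int i = 0; i < chunks; i++) {
        int ret = process_chunk_sketch(i); /* shadows the outer ret */
        (void)ret;
    }
    return ret; /* always 0 */
}

/* Fixed shape: assign to the outer variable instead of redeclaring it. */
static int loop_fixed(int chunks)
{
    int ret = 0;

    assert(chunks >= 0);
    for (int i = 0; i < chunks; i++) {
        if (process_chunk_sketch(i))
            ret = 1;
    }
    return ret;
}
```

GCC and Clang report this class of bug when compiling with `-Wshadow`.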

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 10:10:25 +01:00
Carlos Fernandez
e2dfdaa6a8 Merge branch 'master' into fix/issue-1693-stdout-crash
Resolved conflict in src/rust/src/lib.rs:
- Kept stderr target change from this branch (for --stdout option)
- Merged safety documentation from master
2025-12-18 09:18:50 +01:00
Carlos Fernandez Sanz
a0809caa94 fix(memory): Fix uninitialized memory and memory leaks found by Valgrind 2025-12-18 00:16:01 -08:00
Carlos Fernandez
859741a22c fix(rust): Remove unused import free_rust_c_string_array
This fixes the clippy error: "unused import: crate::utils::free_rust_c_string_array"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 07:41:29 +01:00
Carlos Fernandez
4429067965 fix(rust): Fix Drop compatibility and formatting issues
- demux.rs: Update dummy_demuxer() to explicitly initialize all fields
  instead of using ..Default::default(), which is not allowed when the
  struct implements Drop
- common.rs, demuxer.rs: Apply cargo fmt formatting fixes

This fixes the Rust test compilation error:
"cannot move out of type CcxDemuxer which implements the Drop trait"

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 07:37:39 +01:00
Carlos Fernandez
d72646ac85 fix(memory): Fix XDS memory leak in rcwt_loop path
Add proper cleanup of xds_ctx in rcwt_loop() for --in=bin and --in=raw
formats. The general_loop() path already frees xds_ctx, but rcwt_loop()
was missing this cleanup, causing an 880-byte leak.

This fixes Valgrind tests 217 (--in=bin) and 218 (--in=raw).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-18 07:31:50 +01:00
Carlos Fernandez
4a304346c9 fix(memory): Fix XDS memory leaks in encoder and decoder cleanup
- XDS encoder leak: Free xds_str when skipping subtitles with invalid timestamps
- XDS decoder cleanup: Add proper cleanup for leftover XDS strings in dinit_cc_decode()
- Remove incorrect free(p) after write_xds_string() - the pointer is stored
  for later use by the encoder and must not be freed immediately
- Remove xds_ctx free from dinit_cc_decode() to avoid double-free

These fixes address the 100-byte XDS leak found in Valgrind test 114.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 16:54:38 +01:00
Carlos Fernandez
627e0855ce fix(memory): Fix 608 decoder memory leak in dec_sub.data
The embedded dec_sub struct in lib_cc_decode had its data field
allocated by write_cc_buffer() but never freed during cleanup.

Added cleanup in dinit_cc_decode() to:
- Free DVB bitmap data (data0/data1) if present
- Free the dec_sub.data field itself

This fixes ~1.7MB to ~2.6MB leaks seen in tests 89, 93, and 96.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 13:58:15 +01:00
Carlos Fernandez
7b1a169b8f fix(memory): Fix use-after-free in Teletext and uninitialized variables
This commit fixes several Valgrind-detected memory issues:

1. Use-after-free in Teletext during PAT changes:
   - When parse_PAT() calls dinit_cap() to reinitialize stream info,
     it freed the Teletext context but dec_ctx->private_data still
     pointed to the freed memory
   - Fixed by NULLing out dec_ctx->private_data in dinit_cap() when
     freeing shared codec private data
   - Also added NULL check in process_data() before calling teletext
     functions to gracefully handle freed contexts

2. Uninitialized variables in general_loop():
   - stream_mode, get_more_data, ret, and program_iter were declared
     without initialization
   - While logically set before use, Valgrind tracked them as
     potentially uninitialized through complex control flow
   - Fixed by initializing all variables at declaration

These fixes eliminate millions of Valgrind errors in teletext tests
(tests 78, 80) and uninitialized value warnings (tests 67, 84, 86).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 13:44:13 +01:00
Carlos Fernandez
3d5d8e2a0a fix(memory): Fix major memory leaks in Rust FFI demuxer and decoder
This commit fixes several significant memory leaks found by Valgrind testing:

1. Dtvcc::new encoder leak (decoder/mod.rs):
   - Previously always allocated a new encoder_ctx even when ctx.encoder
     was not null, then threw away the allocation
   - Fix: Only allocate when ctx.encoder is null
   - Impact: Eliminated 55MB-331MB leaks per video processing run

2. ccxr_demuxer_isopen optimization (demuxer.rs):
   - Previously copied entire demuxer structure just to check infd
   - Fix: Directly check (*ctx).infd != -1
   - Impact: Eliminated repeated allocations during file processing

3. ccxr_demuxer_close optimization (demuxer.rs):
   - Previously did full copy roundtrip (C->Rust->C) to close a file
   - Fix: Work directly on C struct, call close() and activity callback
   - Impact: Eliminated copy-related allocations and leaks

4. CcxDemuxer Drop implementation (common_types.rs):
   - pid_buffers and pids_programs contain raw pointers from Box::into_raw
   - These were never freed when CcxDemuxer was dropped
   - Fix: Implement Drop to free all non-null Box pointers
   - Impact: Eliminates remaining FFI-related leaks

Test results show dramatic improvement:
- Test 24: 55MB leak -> 0 bytes (PERFECT)
- Test 26: 9.75MB leak -> 0 bytes (PERFECT)
- Test 27: 237MB leak -> 0 bytes (PERFECT)
- Test 28: 331MB leak -> 0 bytes (PERFECT)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 12:48:51 +01:00
Carlos Fernandez
683468e233 fix(memory): Fix use-after-free and memory leaks in Rust FFI
This commit fixes critical memory issues found during comprehensive
Valgrind testing:

1. **Use-after-free in inputfile array** (common.rs):
   - Problem: `copy_from_rust` was called multiple times (parse_parameters,
     demuxer_open, demuxer_close), and each call freed and reallocated the
     inputfile array. C code holding references to the old array would then
     access freed memory.
   - Fix: Only set inputfile on the first call (when inputfile is null).
     Subsequent calls skip modifying inputfile since it shouldn't change
     during processing.

2. **Memory leak in enc_cfg strings** (common.rs):
   - Problem: Each call to `copy_from_rust` allocated new encoder config
     strings without freeing the old ones, causing 1,536 bytes leaked per
     demuxer open/close cycle.
   - Fix: Only set enc_cfg on the first call (when output_filename is null).
     Encoder config is static and doesn't need to be re-synced.

3. **Uninitialized memory in telxcc_init** (telxcc.c):
   - Problem: `malloc` was used to allocate TeletextCtx but not all fields
     were explicitly initialized, causing Valgrind to report 400+ errors
     about conditional jumps on uninitialized values.
   - Fix: Changed to `calloc` to zero-initialize all fields.
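
A minimal sketch of the allocation change in item 3, assuming a hypothetical context struct (`teletext_ctx_sketch` and its fields are illustrative, not the real TeletextCtx layout):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with a few fields of different kinds. */
struct teletext_ctx_sketch {
    int page;
    int seen_sub_page;
    char *page_buffer;
};

/* calloc() returns zero-filled memory, so fields that are not assigned
 * explicitly start from a defined value rather than whatever bytes
 * malloc() happened to return. */
static struct teletext_ctx_sketch *ctx_new_sketch(void)
{
    struct teletext_ctx_sketch *ctx = calloc(1, sizeof(*ctx));

    if (ctx == NULL)
        return NULL;
    /* All-zero bytes are a null pointer on mainstream platforms; the
     * explicit assignment keeps the sketch strictly portable. */
    ctx->page_buffer = NULL;
    return ctx;
}

static void ctx_free_sketch(struct teletext_ctx_sketch *ctx)
{
    assert(ctx != NULL);
    free(ctx->page_buffer);
    free(ctx);
}
```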

**Valgrind results improvement (Test 3):**
- Errors: 458 → 21 (95% reduction)
- Definitely lost: 2,304 → 768 bytes (67% reduction)
- Use-after-free bugs: Eliminated
- Double-free bugs: Eliminated

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 11:04:19 +01:00
Carlos Fernandez
89849d321f fix(memory): Fix uninitialized memory and memory leaks found by Valgrind
Addresses memory issues identified during Phase 5 (Runtime Analysis) of
the bug analysis plan using Valgrind memory checking.

## Changes

### C Code (Uninitialized Memory)
- ccx_demuxer.c: Use calloc() instead of malloc() in init_demuxer() to
  ensure all struct fields are zero-initialized before use
- lib_ccx.c: Use calloc() instead of malloc() in init_decoder_setting()
  for consistent initialization

### Rust FFI Code (Memory Leaks)
- utils.rs: Add helper functions for proper FFI string memory management:
  - free_rust_c_string(): Free a Rust-allocated CString
  - replace_rust_c_string(): Free old string before allocating new one
  - free_rust_c_string_array(): Free an array of Rust-allocated CStrings
- common.rs: Update copy_from_rust() to properly manage string memory:
  - Free old strings before allocating new ones for all string fields
  - Add free_encoder_cfg_strings() to clean up encoder config strings
  - Free old inputfile array before allocating new one

## Valgrind Results Comparison

| Metric              | Before    | After     | Improvement     |
|---------------------|-----------|-----------|-----------------|
| Definitely lost     | 2,371 B   | 1,536 B   | 35% reduction   |
| Indirectly lost     | 212 B     | 0 B       | 100% fixed      |
| Uninitialized errors| 131,095   | 0         | 100% fixed      |

The remaining 1,536 bytes are from services_charsets array in
EncoderConfig (low priority, rare use case).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 09:21:51 +01:00
Carlos Fernandez Sanz
588ad5260a fix(rust-ffi): Prevent dangling pointers in copy_from_rust 2025-12-17 00:07:27 -08:00
Carlos Fernandez Sanz
ebd8148cad Merge pull request #1838 from CCExtractor/fix/teletext-double-free-crash
fix(teletext): Prevent double-free crash in teletext cleanup
2025-12-17 00:06:32 -08:00
Carlos Fernandez
ba33f7572d fix(rust-ffi): Prevent dangling pointers in copy_from_rust
The `to_ctype()` implementations for `DecoderDtvccSettings` and
`Decoder608Settings` were creating temporaries on the stack and
returning pointers to them. These pointers became dangling after
the function returned, causing memory corruption when
`copy_from_rust()` was called.

This fix:
- Preserves the original C-managed `report` and `timing` pointers
  in `copy_from_rust()` instead of overwriting them with dangling
  pointers to temporaries
- Adds explicit `settings_dtvcc.timing = NULL` initialization in
  `init_options()` for completeness

Before this fix, valgrind reported:
- "Invalid write of size 4" in `dtvcc_init` (4016 bytes below stack
   pointer)
- "Invalid read" errors in `copy_to_rust` /
   `DecoderDtvccSettings::from_ctype`

After this fix, these critical memory corruption errors are resolved.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 09:05:48 +01:00
Carlos Fernandez
9cf96b1899 fix(teletext): Prevent double-free crash in teletext cleanup
This fixes a double-free bug that caused CCExtractor to crash with
exit code 134 (SIGABRT) when processing teletext streams.

## Root Cause

The teletext context (TeletextCtx) pointer was shared between two
structures:
- `dec_ctx->private_data` (decoder context)
- `cinfo->codec_private_data` (capture info in cinfo_tree)

When `general_loop()` ended, it called `telxcc_close()` which freed
the TeletextCtx and NULLed `dec_ctx->private_data`. However, the
shared pointer in `cinfo->codec_private_data` was NOT NULLed.

Later, during cleanup in `dinit_cap()`, the code would find the
non-NULL `cinfo->codec_private_data` and attempt to free it again,
causing a double-free crash.

## The Fix

After `telxcc_close()` frees the teletext context in `general_loop()`,
iterate through all cinfo entries and NULL out any that shared the
same pointer. This prevents `dinit_cap()` from attempting to free
already-freed memory.
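
The fix can be sketched as below, assuming a flat array of entries instead of the real cinfo tree. `cap_info_sketch`, `close_shared_ctx` and `cleanup_infos` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical entry that may hold an alias of the decoder's context. */
struct cap_info_sketch {
    void *codec_private_data;
};

/* Frees the context owned through *owner and clears every alias of it, so
 * a later cleanup pass finds NULL instead of a dangling pointer. Aliases
 * are cleared before free() so no freed pointer value is ever compared. */
static void close_shared_ctx(void **owner, struct cap_info_sketch *infos,
                             size_t count)
{
    void *ctx;
    size_t i;

    assert(owner != NULL);
    ctx = *owner;
    if (ctx == NULL)
        return;
    for (i = 0; i < count; i++) {
        if (infos[i].codec_private_data == ctx)
            infos[i].codec_private_data = NULL;
    }
    *owner = NULL;
    free(ctx);
}

/* Later cleanup pass: frees whatever is still attached to each entry. */
static void cleanup_infos(struct cap_info_sketch *infos, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++) {
        free(infos[i].codec_private_data); /* free(NULL) is a no-op */
        infos[i].codec_private_data = NULL;
    }
}
```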

## Regression

This bug was exposed by commit 7e1a01447 which added cleanup code
to `dinit_cap()` to free `codec_private_data`. The `telxcc_close()`
call in `general_loop()` has existed since 2015, but the double-free
only became possible after the new cleanup code was added.

## Testing

Validated fix against all 27 teletext-related CI tests that were
failing with exit code 134:

Teletext section (21 tests): 63-83 - all PASS
DVB section: 18, 19 - all PASS
Other teletext tests: 224, 234, 235, 236 - all PASS

Verified with valgrind that no "Invalid free" or "double free"
errors occur after the fix.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-17 08:46:37 +01:00
Carlos Fernandez Sanz
0b3ad40377 Merge pull request #1837 from x15sr71/fix/atsc-vct-xmltv-mapping
[FIX]: Add ATSC VCT virtual channel numbers and call signs to XMLTV output
2025-12-16 22:29:04 -08:00
Chandragupt Singh
ac72625030 Fix ATSC XMLTV output to include VCT virtual channels and call signs 2025-12-17 10:49:41 +05:30
Carlos Fernandez Sanz
f6cb862dcb bump MSRV from 1.54.0 to 1.87.0 (rust) 2025-12-15 23:25:22 -08:00
Carlos Fernandez Sanz
53c0f56b6f Merge pull request #1833 from CCExtractor/dependabot/github_actions/actions/upload-artifact-6
chore(deps): bump actions/upload-artifact from 5 to 6
2025-12-15 23:07:50 -08:00
Carlos Fernandez Sanz
62272e7be6 [FIX] Correct typos in warning message and code comment
2025-12-15 23:06:59 -08:00
Carlos Fernandez Sanz
a7e05c265c fix(ocr): Improve DVB subtitle OCR quality (fixes #243)
2025-12-15 23:05:58 -08:00
Carlos Fernandez Sanz
9ce13cf45f [FIX]: Restore XMLTV generation for ATSC EIT/VCT streams and correct EIT bounds checks
2025-12-15 13:27:41 -08:00
Chandragupt Singh
e0ac99a241 fix(atsc): restore XMLTV generation and ATSC EPG parsing 2025-12-16 01:46:28 +05:30
GAURAV KARMAKAR
6ebf98ea4a Fix typos in encoder warning and comment 2025-12-16 00:59:45 +05:30
dependabot[bot]
9372e15024 chore(deps): bump actions/upload-artifact from 5 to 6
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 5 to 6.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-15 18:02:19 +00:00
Carlos
7e1a01447a fix(ocr): Improve DVB subtitle OCR quality (fixes #243)
This commit addresses Issue #243 where DVB subtitles from Spanish
broadcasts were producing corrupt/garbled OCR output like
"alajentiegaranual dep jemios" instead of "a la entrega anual de premios".

Root cause analysis:
1. Image preprocessing was degrading quality - pixContrastNorm was
   causing issues for some DVB sources
2. Default quantization mode (ocr_quantmode=1) was too aggressive,
   reducing images to just 3 colors which lost important detail

Changes:
- Remove pixContrastNorm calls from ocr.c (both main OCR and color
  detection passes) - these were causing more harm than good
- Change default ocr_quantmode from 1 to 0 (no quantization) in both
  C code (ccx_common_option.c) and Rust code (options.rs)
- Add NULL checks in dvbsub_close_decoder() and telxcc_close() for
  safety
- Add proper cleanup of codec_private_data pointers in lib_ccx.c and
  ts_info.c to prevent double-free crashes

Testing performed:
- Test 21 (English DVB): Completes in ~1 second with good OCR quality
- Test 239 (DVB timing): All 8 subtitles have correct timing
- Spanish DVB (Issue #243): Now produces readable text like
  "¡Bienvenidos a la entrega anual de premios" instead of garbage

Users can still use --quant 1 to restore the old quantization behavior
if needed.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 11:51:30 +01:00
Carlos Fernandez Sanz
b728ddadfa fix: Comprehensive bug fixes - Phases 2-4 (Memory, Buffer, Rust FFI)
Extensive sanitization work: always free allocated memory, validate buffer sizes, etc.
2025-12-15 02:50:06 -08:00
Carlos Fernandez Sanz
300541b873 Merge pull request #1809 from Rahul-2k4/master
Improve -out=report to show detected Teletext subtitle pages (Fixes #1034)
2025-12-14 23:36:41 -08:00
Carlos Fernandez Sanz
2f1c1bf227 Merge pull request #1721 from Ari1009/mcc_encoder
fix: MCC encoder 16-bit sequence
2025-12-14 23:27:08 -08:00
Carlos Fernandez Sanz
0bcb532428 Merge pull request #1829 from CCExtractor/fix/autoconf-hardsubx-tesseract
build(autoconf): add tesseract/leptonica linking for HARDSUBX
2025-12-14 23:18:12 -08:00
Carlos
d8698dc9cb build(autoconf): add tesseract/leptonica linking for HARDSUBX
This is the autoconf equivalent of the CMake fix in PR #1760.

When building with HARDSUBX enabled but OCR disabled, the autoconf
build system was missing explicit tesseract/leptonica linking in the
HARDSUBX block. While configure.ac sets OCR_IS_ENABLED when HARDSUBX
is enabled (so it would work via the OCR block), this change makes
the dependency explicit and consistent with the CMake fix.

Related: PR #1760, Issue #1719

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 08:12:16 +01:00
Carlos Fernandez Sanz
4cc9231fc8 Merge pull request #1760 from DhanushVarma-2/fix-tesseract-linking-1719
build: add tesseract library linking for hardsubx feature
2025-12-14 23:08:00 -08:00
Carlos
d202a66fd0 style(rust): Apply cargo fmt formatting
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 07:07:41 +01:00
Carlos
d8048bc95a fix(rust): Complete Phase 4 - FFI safety and documentation
Phase 4 of the bug analysis cycle addresses all Rust/FFI boundary issues:

Safety Documentation:
- Added # Safety docs to all 83 production FFI functions
- lib.rs: ccxr_init_logger, ccxr_close_handle
- decoder/encoding.rs: 4 G0/G1/G2/G3 conversion functions
- decoder/service_decoder.rs: ccxr_flush_decoder
- hardsubx/imgops.rs: rgb_to_hsv, rgb_to_lab
- hardsubx/utility.rs: convert_pts_to_ns/ms/s

Panic Prevention (FFI function bodies):
- hardsubx/decoder.rs: Replaced 8 .try_into().unwrap() calls with
  safe `as` casts to prevent potential panics across FFI boundary
- libccxr_exports/net.rs: Replaced expect() with safe error handling
- libccxr_exports/mod.rs: Removed panic!/expect(), use defaults
- libccxr_exports/time.rs: Replaced try_into().unwrap() with unwrap_or()

Clippy Fixes:
- Fixed 72 Clippy warnings across the codebase
- Replaced assert!(false) with unreachable!()
- Added #[allow] attributes for acceptable test code patterns

All 269 tests pass, Clippy reports 0 warnings.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 07:03:25 +01:00
Carlos
af3ab5acd4 fix(buffer): Replace unsafe string functions with safe alternatives
Phase 3: Buffer overrun fixes

Changes:
- Replace 17 sprintf calls with snprintf
- Replace 3 strcpy calls with memcpy (known length)
- Replace 9 strcat calls with safer alternatives (snprintf, memcpy, strncat)
- Fix telxcc.c buffer size for page number formatting
- Add bounds checking to eia608_to_str function

Files modified:
- ocr.c: 7 sprintf→snprintf, 2 strcat→snprintf
- ts_tables_epg.c: 4 sprintf→snprintf, 1 strcat→snprintf
- ccx_encoders_spupng.c: 4 sprintf→snprintf, 1 strcpy→memcpy, 2 strcat→strncat/memcpy
- ccx_encoders_splitbysentence.c: 2 sprintf→snprintf (commented debug code)
- utility.c: 2 strcpy→memcpy, 4 strcat→snprintf/memcpy
- telxcc.c: increased buffer size from 4 to 8 bytes
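
As an illustration of the sprintf-to-snprintf change, here is a small sketch of bounded page-number formatting. The function name and the `%03x` format are assumptions for the example, not taken from telxcc.c.

```c
#include <assert.h>
#include <stdio.h>

/* Formats a page number into dst without writing past dst_size bytes.
 * Returns 0 on success and -1 if the output was truncated or an encoding
 * error occurred. snprintf() always NUL-terminates when dst_size > 0. */
static int format_page_sketch(char *dst, size_t dst_size, unsigned page)
{
    int written;

    assert(dst != NULL && dst_size > 0);
    written = snprintf(dst, dst_size, "%03x", page);
    if (written < 0 || (size_t)written >= dst_size)
        return -1;
    return 0;
}
```

Checking the return value against the buffer size is what turns silent truncation into a detectable error.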

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:33:17 +01:00
Carlos
90519e2296 fix(memory): Fix memory issues in final batch of files (Batch 2.7)
Files fixed:
- hardsubx.c: Add free() calls before return NULL at lines 247, 255;
  add null check for dec_sub malloc; free tessdata_path
- ccx_gxf.c: Fix unsafe realloc pattern for ctx->cdp
- wtv_functions.c: Add null checks for malloc calls at lines 143, 192,
  283, 384
- dvd_subtitle_decoder.c: Fix memset before null check; add null checks
  for rect->data0 and rect->data1; add null checks in init_dvdsub_decode
- ts_tables.c: Add null check for PID_buffers malloc; add null check for
  buffer malloc; fix unsafe realloc pattern
- myth.c: Fix unsafe realloc pattern for desp buffer
- ffmpeg_intgr.c: Fix memory leaks in init_ffmpeg error paths; add proper
  cleanup labels; properly allocate codec context instead of using codecpar

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:17:45 +01:00
Carlos
494b14b651 fix(memory): Fix memory issues in helpers, splitbysentence, and output
- ccx_encoders_helpers.c:
  - add_word(): Fix unsafe realloc pattern, preserve original pointer
  - shell_sort(): Add null check for temp buffer allocation

- ccx_encoders_splitbysentence.c:
  - init_sbs_context(): Add null checks for context and buffer allocations
  - sbs_append_string(): Fix unsafe realloc pattern for buffer
  - sbs_append_string(): Add null check for cc_subtitle allocation

- output.c:
  - writeraw(): Fix unsafe realloc pattern, preserve original pointer
    and set to NULL on failure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:07:58 +01:00
Carlos
5b286c5b8d fix(memory): Fix potential memory leaks in encoder files
- ccx_encoders_ssa.c: Fix combined malloc check pattern
  - Check each allocation separately
  - Free first allocation if second fails before calling fatal

- ccx_encoders_webvtt.c: Fix 2 combined check patterns
  - write_stringz_as_webvtt: Separate checks with proper cleanup
  - write_cc_bitmap_as_webvtt: Separate calloc checks with cleanup

- ccx_encoders_smptett.c: Fix combined malloc check pattern
  - Check each allocation separately
  - Free first allocation if second fails before calling fatal
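
The "check each allocation separately" pattern used in these encoders can be sketched as follows. `alloc_pair_sketch` is a hypothetical helper, and it returns an error code where the real code calls fatal.

```c
#include <assert.h>
#include <stdlib.h>

/* Allocates two buffers of n bytes each. A combined check such as
 * `if (!a || !b)` leaks whichever allocation succeeded; checking each
 * result on its own lets the first buffer be released if the second
 * allocation fails. Returns 0 on success, -1 on failure. */
static int alloc_pair_sketch(size_t n, char **first, char **second)
{
    assert(first != NULL && second != NULL);

    *first = malloc(n);
    if (*first == NULL) {
        *second = NULL;
        return -1;
    }

    *second = malloc(n);
    if (*second == NULL) {
        free(*first); /* release the allocation that did succeed */
        *first = NULL;
        return -1;
    }
    return 0;
}
```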

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:04:53 +01:00
Carlos
ea4f884b9d fix(memory): Fix unsafe realloc patterns in asf_functions, telxcc, and ccx_encoders_srt
- asf_functions.c: Fix 2 unsafe realloc patterns
  - Use temporary pointer to preserve original buffer reference
  - Free original buffer before calling fatal on allocation failure

- telxcc.c: Fix 2 unsafe realloc patterns in teletext buffer functions
  - page_buffer_add_string: Use safe realloc pattern with temp pointer
  - ucs2_buffer_add_char: Use safe realloc pattern with temp pointer

- ccx_encoders_srt.c: Fix potential memory leak in write_stringz_as_srt
  - Check each allocation separately
  - Free successful allocation before fatal if second allocation fails
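
A minimal sketch of the safe realloc pattern referred to above, using a hypothetical append helper (`buffer_append_sketch`) that reports failure to the caller instead of calling fatal:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends n bytes from src to a growable, NUL-terminated buffer.
 * The unsafe form `*buf = realloc(*buf, size)` loses the only reference to
 * the old block when realloc() fails. Holding the result in a temporary
 * keeps *buf valid, so the caller can still free it on failure.
 * Returns 0 on success, -1 if the buffer could not be grown. */
static int buffer_append_sketch(char **buf, size_t *len, const char *src,
                                size_t n)
{
    char *grown;

    assert(buf != NULL && len != NULL && src != NULL);
    grown = realloc(*buf, *len + n + 1);
    if (grown == NULL)
        return -1; /* *buf is untouched and still owned by the caller */

    memcpy(grown + *len, src, n);
    *len += n;
    grown[*len] = '\0';
    *buf = grown;
    return 0;
}
```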

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 06:01:36 +01:00
Carlos
3b0a63d9c6 fix(memory): Fix memory leaks and unsafe realloc patterns in lib_ccx, utility, avc_functions
- lib_ccx.c: Fix memory leaks in init_libraries error paths
  - Add proper cleanup for report_608, EPG buffers, and ctx when
    init_decoder_setting fails
  - Add comprehensive cleanup at the `end:` label when init_ctx_outbase fails

- utility.c: Fix unsafe realloc in str_reallocncat
  - Preserve original pointer and free it on realloc failure
  - Prevents memory leak when realloc returns NULL

- avc_functions.c: Fix unsafe realloc patterns in user_data_registered_itu_t_t35
  - Use temporary pointer for realloc result
  - Free original buffer before calling fatal on allocation failure
  - Fixes two instances of unsafe realloc pattern

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 05:58:20 +01:00
Carlos
390c96f00d fix(memory): Fix memory leaks and unsafe realloc patterns in multiple files
Batch 2.2 memory fixes:

dvb_subtitle_decoder.c:
- Fix memory leak in write_dvb_sub: free rect->data1 and rect before fatal
  when data0 allocation fails

general_loop.c:
- Fix unsafe realloc in rcwt_loop: use temp variable to preserve original
  parsebuf pointer on failure
- Fix memory leak: free parsebuf on early return in rcwt_loop

ts_functions.c:
- Fix unsafe realloc in copy_payload_to_capbuf: use temp variable to
  preserve original cinfo->capbuf on failure
- Fix unsafe realloc in hauppauge buffer handling: free original buffer
  before fatal on failure

ccx_decoders_608.c:
- Fix two unsafe realloc patterns in write_cc_buffer_as_transcript and
  write_cc_buffer_to_gui: use temp variable to preserve original sub->data
  on failure

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 05:52:12 +01:00
Carlos
95f6f09659 fix(memory): Fix memory leaks in ocr.c and ts_tables_epg.c
In ocr.c:
- Fix realloc failure leak in search_language_pack (free dirname)
- Fix malloc failure leaks in ocr_bitmap (free histogram, iot, mcit)
- Fix realloc failure leak for new_text_out
- Fix multiple allocation failure paths in ocr_rect with proper cleanup

In ts_tables_epg.c:
- Fix malloc failure leak in EPG_ATSC_decode_multiple_string (free event_name)
- Fix realloc failure leak in parse_EPG_packet (free buffer)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-15 05:46:52 +01:00
Carlos Fernandez Sanz
42885caedd fix(dvb): Multiple fixes for DVB subtitles - timing, OCR quality, memory access bugs (#224) (#1826)
* fix(dvb): Multiple fixes for DVB subtitle extraction from Chinese broadcasts (#224)

This commit addresses multiple issues with DVB subtitle extraction reported in #224:

1. **PMT parsing crash fix** (ts_tables.c):
   - Added minimum length check (16 bytes) to prevent out-of-bounds access
   - Added bounds check before memcpy to prevent buffer overflow when section > 1021 bytes

2. **Negative subtitle timing fix** (general_loop.c):
   - For DVB subtitle streams, properly initialize min_pts from audio/subtitle PTS
   - This fixes the issue where all timestamps were negative (~95000 seconds off)

3. **OCR improvements** (ocr.c):
   - Fixed ignore_alpha_at_edge() which could create invalid crop windows
   - Added image inversion for DVB subtitles (light text on dark background)
     to improve Tesseract OCR accuracy
   - Added contrast normalization to further improve character recognition
   - Fixed nofontcolor check to respect --no-fontcolor parameter
   - Added iteration safety limit in color detection loop

4. **--ocrlang parameter fix** (Rust files):
   - Changed ocrlang from Language enum to String to accept Tesseract language
     names directly (e.g., "chi_tra", "chi_sim", "eng")
   - Added case-insensitive matching for --dvblang parameter
   - Added better error messages for invalid language codes

Tested with 12GB Chinese DVB broadcast file:
- Timing: All timestamps now positive (0.235s, 2.594s, etc.)
- OCR: ~80-90% accuracy with chi_tra traineddata (improved from ~70%)
- No crashes during full file processing

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(ocr): Fix crashes in DVB subtitle color detection

Two issues fixed in the OCR color detection code:

1. Tesseract crash during iteration:
   - The color detection pass used raw color images without preprocessing
   - Tesseract expects dark text on light background, but DVB subtitles
     have light text on dark background
   - Added grayscale conversion, inversion, and contrast enhancement
     (same preprocessing as the main OCR pass)

2. Heap corruption in histogram calculation:
   - The histogram loop had no bounds checking on array accesses
   - Tesseract could return invalid bounding boxes causing buffer overflows
   - Added validation of bounding box coordinates before processing
   - Added safe index checking for copy->data and histogram arrays

Also added skip_color_detection label for clean error handling and
proper cleanup of the preprocessed image.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(dvb): Fix zero-duration subtitles and overlaps during PTS jumps

Add start_pts field to cc_subtitle struct to track raw PTS values
independent of FTS timeline resets. Modify end_time calculation in
dvbsub_handle_display_segment() to cap duration at 4 seconds when
PTS jumps cause timeline discontinuities, preventing zero-duration
and overlapping subtitles.

Also update .gitignore to exclude plans/ directory and temp files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 20:03:55 -08:00
Carlos Fernandez Sanz
8d95ad0e7b chore: Apply code formatting and update changelog (#1825)
- Apply clang-format to all C/H files in src/
- Apply cargo fmt to Rust code
- Update Cargo.lock with latest compatible dependency versions
- Add 24 new entries to CHANGES.TXT for recent fixes and features

Changes in CHANGES.TXT cover:
- CEA-708 bounds checks and UTF-16BE encoding fixes
- New --ttxtforcelatin option for Teletext
- TS files without PAT/PMT fallback support
- Timing accuracy improvements across MP4/MPEG/TS
- Memory safety improvements (null checks, buffer overruns)
- Multi-file processing fixes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 13:34:16 -08:00
Carlos Fernandez Sanz
1f0980185f fix(rust): Add bounds checks to prevent panic on malformed CEA-708 data (#1817)
* fix(rust): Add bounds checks to prevent panic on malformed CEA-708 data

Fixes #1616 - Segmentation fault when extracting from MP4 remuxed from HLS

The CEA-708 decoder could panic when processing truncated or malformed
caption data blocks:

1. Fixed EXT1 command handling in process_service_block():
   - Changed &block[1..] to &block[(i+1)..] for correct slice offset
   - Added bounds check before accessing the next byte after EXT1

2. Added bounds checks in handle_extended_char():
   - Check for empty block before accessing block[0]
   - Check block.len() >= 2 before accessing block[1] for C3 commands

3. Removed unnecessary `as i64` cast in es/pic.rs to fix clippy warning

Added 4 unit tests to verify the bounds checking:
- test_handle_extended_char_empty_block
- test_handle_extended_char_c3_insufficient_bytes
- test_process_service_block_ext1_at_end
- test_process_service_block_ext1_with_truncated_c3
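
The bounds checks listed above can be sketched with `slice::get`, which returns `None` instead of panicking. This is an illustrative sketch only: `byte_after_ext1` and `rest_after` are hypothetical helper names, not functions from the decoder.

```rust
/// Hypothetical helper: the byte that follows a command at index `i`,
/// or `None` when the block is truncated at that point.
fn byte_after_ext1(block: &[u8], i: usize) -> Option<u8> {
    // `checked_add` guards against overflow; `get` guards against
    // reading past the end of the block.
    block.get(i.checked_add(1)?).copied()
}

/// Hypothetical helper: everything after index `i`, or an empty slice
/// when nothing remains (mirrors `&block[(i + 1)..]` without the panic).
fn rest_after(block: &[u8], i: usize) -> &[u8] {
    block.get(i.saturating_add(1)..).unwrap_or(&[])
}
```

A caller can then treat `None` or an empty slice as "malformed data, skip" rather than crashing.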

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(rust): cast c_long to i64 in pic.rs for Windows compatibility

On Windows, c_long is i32 (32-bit), while on 64-bit Linux it is i64 (64-bit).
The addition of fts_at_gop_start + frame_offset_ms was failing on Windows
because fts_at_gop_start (c_long = i32) couldn't be added to frame_offset_ms (i64).

Added explicit cast to i64 with #[allow(clippy::unnecessary_cast)] since
the cast is necessary for Windows even though it's redundant on Linux.
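
A minimal sketch of the cast described above. The function name `frame_fts` is hypothetical; only the two operand names come from the commit message.

```rust
use std::os::raw::c_long;

/// Hypothetical stand-in for the addition in pic.rs.
/// The cast is a no-op where c_long is already i64, hence the allow.
#[allow(clippy::unnecessary_cast)]
fn frame_fts(fts_at_gop_start: c_long, frame_offset_ms: i64) -> i64 {
    // Widen explicitly so the addition type-checks on every platform.
    fts_at_gop_start as i64 + frame_offset_ms
}
```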

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 07:47:47 -08:00
Carlos Fernandez Sanz
6c764aa56c fix: Correct is_decoder_processed_enough() multiprogram logic and suppress false warnings (#1823)
Fixes #1701

The `is_decoder_processed_enough()` function had a bug where it would always
return FALSE in multiprogram mode due to the condition:
  `dec_ctx->processed_enough == CCX_TRUE && ctx->multiprogram == CCX_FALSE`

This caused the "Error in switch_to_next_file()" warning to trigger incorrectly
for files without captions or in multiprogram mode.

Changes:
- Fix `is_decoder_processed_enough()` in C and Rust:
  - In single-program mode: return TRUE if ANY decoder has processed enough
  - In multiprogram mode: return TRUE only if ALL decoders have processed enough
- Add check for empty decoder list in `switch_to_next_file()`:
  - If no decoders exist (no captions found), suppress the premature ending warning
  - This is a normal condition, not an error
- Update Rust tests to verify the new behavior
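
The any/all rule above can be modelled as follows. This is a simplified sketch: it represents each decoder by a single boolean, and returning `false` for an empty list is an assumption made here, not something the commit message states.

```rust
/// Simplified model of the decision: one `processed_enough` flag per decoder.
/// The real function walks a list of decoder contexts.
fn is_decoder_processed_enough(processed: &[bool], multiprogram: bool) -> bool {
    if processed.is_empty() {
        // Assumption for this sketch: no decoders means "not enough".
        return false;
    }
    if multiprogram {
        // Multiprogram mode: every decoder must be done.
        processed.iter().all(|&done| done)
    } else {
        // Single-program mode: one finished decoder is sufficient.
        processed.iter().any(|&done| done)
    }
}
```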

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 07:45:16 -08:00
Carlos Fernandez Sanz
a0129df16c fix(708): Write consistent 2-byte UTF-16BE encoding for CEA-708 captions (#1820)
* fix(708): Write consistent 2-byte UTF-16BE encoding for CEA-708 captions

Previously, the write_utf16_char (C) and write_char (Rust) functions
wrote 1 byte for ASCII characters (high byte = 0) and 2 bytes for
non-ASCII characters. This created an invalid mix of 8-bit and 16-bit
values that iconv/encoding_rs couldn't convert properly when UTF-16BE
encoding was specified.

The fix always writes 2 bytes per character, ensuring consistent
UTF-16BE encoding. This allows iconv to properly convert the data to
UTF-8, fixing garbled output for Japanese and Chinese captions.

Before fix (garbled):
人々が私を知‰挰弰栰䴰Ź섰漠時間管理につい‰晦<U+F830>䐰昰䐰縰

After fix (correct):
人々が私を知 ったとき、私は 時間管理につい て書いています
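
The "always two bytes" rule can be illustrated with the standard library alone. The function names below are hypothetical and do not correspond to the project's write_char.

```rust
/// Hypothetical writer: appends one UTF-16 code unit in big-endian order.
/// ASCII gets a leading 0x00 byte, so every unit occupies exactly 2 bytes.
fn write_utf16be_unit(out: &mut Vec<u8>, unit: u16) {
    out.extend_from_slice(&unit.to_be_bytes());
}

/// Encodes a whole string as UTF-16BE using the writer above.
fn encode_utf16be(text: &str) -> Vec<u8> {
    let mut out = Vec::new();
    for unit in text.encode_utf16() {
        write_utf16be_unit(&mut out, unit);
    }
    out
}
```

With this scheme "A" becomes the bytes 00 41 and "人" (U+4EBA) becomes 4E BA, so a converter never sees a mix of 8-bit and 16-bit values.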

Fixes #1451

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* test(708): Update write_char test to expect 2-byte UTF-16BE output

The test was checking for the old (incorrect) behavior where ASCII
characters were written as 1 byte. The fix for issue #1451 correctly
changed write_char to always write 2 bytes for proper UTF-16BE encoding.
Updated the test to match this correct behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 07:42:46 -08:00
Carlos Fernandez Sanz
d2ab31fe38 fix(teletext): Add --ttxtforcelatin option to force Latin G0 charset (#1821)
Some broadcast streams incorrectly signal Cyrillic character set (via
X/28 or M/29 packets) when the actual content is Latin text. This causes
garbled output where Latin text like "No. Not back then, anyway." appears
as Cyrillic "Но. Нот бацк тхен, анiваi."

This fix adds a new --ttxtforcelatin option that forces the teletext G0
character set to Latin, ignoring any Cyrillic designation in the stream.

Root cause: The broadcast contained triplet 0x1290, which has bits 10-13
set to 0x4 (0x1000 in place, the Cyrillic family) and bits 7-9 set to 0x5
(Ukrainian option), causing CCExtractor to use the CYRILLIC3 charset
instead of Latin.
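
The field extraction can be checked with a few lines of Rust. This is only a sketch: the function name is hypothetical, and it assumes the bit positions quoted above are counted from zero. Under that assumption, 0x1290 gives 4 for bits 10-13 (the value 0x1000 before shifting) and 5 for bits 7-9.

```rust
/// Hypothetical helper: splits a designation triplet into
/// (bits 10-13, bits 7-9), counting bits from zero.
fn charset_fields(triplet: u32) -> (u32, u32) {
    let family = (triplet >> 10) & 0xF; // four bits starting at bit 10
    let option = (triplet >> 7) & 0x7; // three bits starting at bit 7
    (family, option)
}
```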

Usage: ccextractor input.ts --ttxtforcelatin -o output.srt

Before fix (without option):
  Subtitle 3: Но. Нот бацк тхен, анiваi.

After fix (with --ttxtforcelatin):
  Subtitle 3: No. Not back then, anyway.

Fixes #1395

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 07:42:06 -08:00
Carlos Fernandez Sanz
3f6656176e fix(ts): Add fallback for TS files without PAT/PMT tables (#1822)
Some DVR recordings (e.g., Channel Master DVR+) create transport stream
files that contain valid video and audio data but lack PAT (Program
Association Table) and PMT (Program Map Table). Without these tables,
CCExtractor couldn't identify which PIDs contain video streams with
embedded captions.

This change adds a fallback mechanism that:
1. Enables packet analysis mode when no PAT is found after reading ~1000
   TS packets (188KB)
2. Detects video streams by analyzing PES headers (stream_id 0xE0-0xEF)
3. Identifies stream type (MPEG-2 vs H.264) from elementary stream data
4. Registers detected video streams for caption extraction
5. Also detects GA94 caption markers to identify caption-carrying PIDs
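
Step 2 can be sketched as a check on the start of a PES packet. The function name is hypothetical, and the sketch assumes the payload begins with the 00 00 01 start-code prefix followed by the stream_id byte.

```rust
/// Returns true when `data` starts with a PES start code whose stream_id
/// falls in the video range 0xE0..=0xEF.
fn is_video_pes(data: &[u8]) -> bool {
    matches!(data, [0x00, 0x00, 0x01, id, ..] if (0xE0..=0xEF).contains(id))
}
```

Anything shorter than four bytes, or with a different stream_id (for example 0xC0), is rejected without indexing past the end of the buffer.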

The fix allows CCExtractor to extract CEA-608/708 captions from TS files
without PAT/PMT, matching the behavior when FFmpeg is enabled.

Fixes #805

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 07:40:52 -08:00
Carlos Fernandez Sanz
f2f63ed65f fix(timing): Set pts_set to MinPtsSet after PTS jump to continue fts_now updates (#1824)
When a PTS discontinuity (jump) is detected, the code updates fts_offset
and min_pts to establish a new timeline. However, it was not setting
pts_set back to MinPtsSet, which meant fts_now calculation (which only
runs when pts_set == MinPtsSet) would stop working. This caused all
timestamps after the PTS jump to be stuck.

This fixes issue #1277 where DVD VOB files with PTS discontinuities
(common at chapter boundaries) would stop extracting captions after
about 6 minutes. Version 0.84 worked correctly, but 0.85+ had this
regression.

Closes #1277

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2025-12-14 07:36:01 -08:00
Carlos Fernandez Sanz
3738540804 style: use CCX_STREAM_TYPE_VIDEO_HEVC enum instead of raw 0x24 (#1819)
Follow-up to PR #1769 - use the defined enum constant for HEVC stream
type (0x24) instead of magic numbers for better code maintainability.

Also simplifies the case statement in get_printable_stream_type() by
removing redundant assignment since the enum value passes through
unchanged.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:55:38 -08:00
Carlos Fernandez Sanz
31c6e94e25 fix(memory): Add null checks for unchecked memory allocations (#1815)
Add proper null checks after malloc/calloc/realloc calls to prevent
potential NULL pointer dereferences on out-of-memory conditions.

Files fixed:
- general_loop.c: Add null checks for line buffer and parsebuf; remove
  duplicate allocation that shadowed outer variable (memory leak fix)
- ccx_encoders_webvtt.c: Add null check for color_events/font_events
- ccx_decoders_isdb.c: Add null check for text->buf before dereference
- dvb_subtitle_decoder.c: Move null check before memset
- mp4.c: Add null check for dec_sub->data before memcpy
- ccx_decoders_608.c: Add null check for decoder context
- ccx_decoders_xds.c: Add null check for string buffer
- asf_functions.c: Add null check after struct initialization with malloc
- ccx_dtvcc.c: Move null check before dereferences (was checking after use)
- lib_ccx.c: Fix memset-before-check ordering; add checks for pesheaderbuf
  and DVB context allocations

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:55:06 -08:00
Carlos Fernandez Sanz
33f41f6045 fix(rust): Add null checks and handle invalid UTF-8 in FFI functions (#1816)
- ccxr_process_cc_data: Add null pointer checks for dec_ctx, data, and
  dec_ctx.dtvcc before dereferencing. Also check cc_count > 0.
- ccxr_parse_parameters: Add null check for argv pointer and use
  to_string_lossy() instead of expect() to handle invalid UTF-8
  gracefully without panicking.
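
A minimal sketch of the two ideas above: check the pointer before dereferencing it, and convert with `to_string_lossy()` so invalid UTF-8 becomes U+FFFD instead of a panic. `arg_to_string` is a hypothetical helper, not the project's function.

```rust
use std::ffi::CStr;
use std::os::raw::c_char;

/// Hypothetical helper: converts one C string argument to an owned String.
///
/// Safety: `ptr` must be null or point to a NUL-terminated string.
unsafe fn arg_to_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        // Null check first: never build a CStr from a null pointer.
        return None;
    }
    let c_str = unsafe { CStr::from_ptr(ptr) };
    // Lossy conversion replaces invalid sequences rather than failing.
    Some(c_str.to_string_lossy().into_owned())
}
```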

These changes prevent potential crashes when FFI functions are called
with invalid arguments from C code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 03:48:29 -08:00
Chandragupt Singh
137719ebea [FIX]: Add HEVC/H.265 stream type recognition to prevent crashes on ATSC 3.0 streams (#1769)
* Add basic HEVC (0x24) TS stream detection to avoid unknown buffer type errors

* docs: update CHANGES.TXT with HEVC/H.265 stream type fix entry
2025-12-14 03:25:05 -08:00
Carlos
ecb0780af5 fix: Enable stdout output for CEA-708 captions on Windows
Fixes #1693 - ccextractorwinfull.exe can't print captions to stdout

The CEA-708 decoder crashed on Windows when using --stdout because the
dtvcc_writer was not properly initialized for stdout output:

1. Fixed Windows stdout handle initialization in ccx_encoders_common.c:
   - Use GetStdHandle(STD_OUTPUT_HANDLE) instead of NULL for fhandle
   - This allows the Rust writer to detect stdout mode properly

2. Changed env_logger target from Stdout to Stderr in lib.rs:
   - Debug messages no longer pollute stdout when using --stdout
   - This prevents mixing debug output with subtitle content

3. Removed redundant debug statement in service_decoder.rs:
   - The bare `debug!("{}", self.current_window)` was noisy and
     duplicated by a more detailed debug statement below it

Added tests:
- test_writer_output_with_valid_fd: Verifies stdout mode works
- test_writer_output_missing_filename_and_fd: Verifies proper error handling

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 12:09:24 +01:00
Carlos Fernandez Sanz
abce0864a5 fix(rust): prevent panics in timing code when processing multiple files
2025-12-14 02:17:29 -08:00
Carlos Fernandez Sanz
9ff46656be fix(timing): correct caption start/end times to match FFmpeg in mp4 / mpeg / ts 2025-12-14 02:13:03 -08:00
Rahul Tripathi
446923c79d Merge pull request #3 from Rahul-2k4/copilot/apply-clang-format-to-source-files
[FIX] Apply clang-format to ensure CI formatting checks pass
2025-12-14 15:11:57 +05:30
copilot-swe-agent[bot]
cde9e1f842 Initial plan 2025-12-14 09:34:22 +00:00
Rahul Tripathi
6c75b26484 Merge branch 'CCExtractor:master' into master 2025-12-14 14:47:03 +05:30
Rahul Tripathi
9c4d5a8a58 patch on teletext
Added conditional check for printing notice about teletext pages based on file report settings.
2025-12-14 14:45:04 +05:30
Carlos
a49ebf4230 fix(rust): cast c_long to i64 for cross-platform compatibility
On Windows, c_long is i32, while on 64-bit Linux it is i64. This causes
a type mismatch when adding fts_at_gop_start (c_long) to
frame_offset_ms (i64). Fix by explicitly casting to i64.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 09:58:55 +01:00
Carlos
7b8533a2dc Merge branch 'master' into fix/caption-timing-accuracy 2025-12-14 09:58:42 +01:00
Carlos Fernandez Sanz
134cd75d3b Merge pull request #1811 from CCExtractor/fix/multi-file-processing
fix(rust): correctly count and store multiple input files
2025-12-14 00:47:07 -08:00
Carlos
80e21171b1 style: apply cargo fmt formatting
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 09:43:42 +01:00
Carlos
0b262d0e17 fix(rust): prevent panics in timing code when processing multiple files
Replace `.unwrap()` and `.expect()` calls with safe alternatives to prevent
Rust panics when processing multiple files with different characteristics
(e.g., DVD-type followed by HDTV-type).

Changes:
- Use `unwrap_or(0)` for all type conversions that could fail
- Handle RwLock poisoning gracefully in apply_timing_info/write_back_from_timing_info
- Add fps validation and millis capping in GopTimeCode::new()
- Add fallback calculation in ccxr_calculate_ms_gop_time when GopTimeCode
  creation fails
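
The first two changes can be sketched as below. Both function names are hypothetical, and the lock here guards a plain `i64` rather than the real timing structure.

```rust
use std::convert::TryFrom;
use std::sync::RwLock;

/// Narrowing conversion that falls back to 0 instead of panicking.
fn to_i32_or_zero(value: i64) -> i32 {
    i32::try_from(value).unwrap_or(0)
}

/// Reads through a lock even if another thread panicked while holding it.
fn read_counter(lock: &RwLock<i64>) -> i64 {
    match lock.read() {
        Ok(guard) => *guard,
        // A poisoned lock still holds its data; take the guard back.
        Err(poisoned) => *poisoned.into_inner(),
    }
}
```

The trade-off is that an out-of-range value silently becomes 0, which is acceptable only where 0 is a harmless default.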

Fixes #1377

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 09:39:44 +01:00
Rahul Tripathi
f579cbe45d Merge branch 'CCExtractor:master' into master 2025-12-14 14:02:16 +05:30
Carlos Fernandez Sanz
1a83913540 Merge pull request #1806 from CCExtractor/fix/ttxt-timestamp-milliseconds
fix(parser): use HHMMSSFFF format for ttxt output timestamps
2025-12-14 00:11:08 -08:00
Carlos
075ae04f1d fix(rust): correctly count and store multiple input files
Fix two bugs that prevented multi-file processing from working:

1. In common.rs: `options.inputfile.iter()` was iterating over the
   Option itself (yielding 0 or 1 items) instead of the Vec contents,
   causing num_input_files to always be 1.

2. In parser.rs: append_file_to_queue() was using vec.len() as the
   index for new files after resizing with empty strings, causing
   files to be placed at positions 0, 10, 20... instead of 0, 1, 2...
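
Both bugs are easy to reproduce in isolation. The sketch below uses a hypothetical `Options` struct with only the `inputfile` field, and a hypothetical `append_file` that tracks the number of used slots separately from the Vec length; the growth step of 10 is inferred from the positions 0, 10, 20 mentioned above.

```rust
/// Hypothetical stand-in for the options struct.
struct Options {
    inputfile: Option<Vec<String>>,
}

/// Buggy count: `Option::iter` yields the inner Vec at most once.
fn count_buggy(options: &Options) -> usize {
    options.inputfile.iter().count()
}

/// Fixed count: look inside the Vec.
fn count_fixed(options: &Options) -> usize {
    options.inputfile.as_ref().map_or(0, |files| files.len())
}

/// Appends into a queue pre-sized with empty strings, indexing by the
/// number of used slots rather than by `queue.len()`.
fn append_file(queue: &mut Vec<String>, used: &mut usize, name: &str) {
    if *used == queue.len() {
        // Grow in steps of 10 (assumed step size).
        queue.resize(queue.len() + 10, String::new());
    }
    queue[*used] = name.to_string();
    *used += 1;
}
```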

Fixes #1810

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-14 08:52:51 +01:00
Carlos
d4949ccfa3 style: apply clang-format and cargo fmt formatting fixes
Fix formatting issues detected by CI:
- C files: Tab alignment, trailing whitespace, blank line cleanup
- Rust: Import statement grouping in pic.rs
- Cargo.lock: Remove duplicate bindgen dependency entries

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 18:41:42 +01:00
Carlos
588c981184 docs: update timing verification plan with Fix 7 results
- Document Fix 7: MP4 c608 track timing and garbage frame detection
- Mark all regressions as fixed or documented as known limitations
- Update status to "Ready for Merge"
- MPEG-PS 66ms offset documented as known limitation (FFmpeg uses
  different timing reference for MPEG-PS vs TS containers)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 18:38:34 +01:00
Carlos
941b88f3f9 fix(timing): handle MP4 c608 tracks and improve garbage frame detection
- Fix MP4 c608/c708 caption tracks by setting frame type to I-frame
  before calling set_fts(). Without video frames, frame type would stay
  Unknown and min_pts would never be set, causing broken timestamps.

- Fix premature pts_set = MinPtsSet assignment. Now only set after
  min_pts is actually set, preventing fts_now calculation with
  uninitialized min_pts (0x01FFFFFFFF) which caused negative timestamps.

- Add garbage frame detection threshold (100ms). When an I-frame arrives:
  - If gap between pending_min_pts and I-frame PTS > 100ms: use I-frame
    PTS (garbage leading frames from truncated GOP)
  - If gap <= 100ms: use pending_min_pts (valid B-frames)

- Track pending_min_pts for all frames (not just unknown type) to enable
  proper garbage vs valid B-frame detection.
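
The threshold rule can be sketched as a single comparison. The function and constant names are hypothetical, and the sketch assumes both PTS values are in 90 kHz ticks with the I-frame PTS not earlier than the pending one.

```rust
const GARBAGE_GAP_MS: i64 = 100; // threshold from the description above
const TICKS_PER_MS: i64 = 90; // 90 kHz clock

/// Picks the timing reference when the first I-frame arrives.
fn choose_min_pts(pending_min_pts: i64, iframe_pts: i64) -> i64 {
    let gap_ms = (iframe_pts - pending_min_pts) / TICKS_PER_MS;
    if gap_ms > GARBAGE_GAP_MS {
        // Large gap: leading frames are leftovers from a truncated GOP.
        iframe_pts
    } else {
        // Small gap: the earlier frames are valid B-frames.
        pending_min_pts
    }
}
```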

Results:
- 5df914ce...mp4: 666ms -> 0ms (FIXED)
- c032183e...ts: 284ms -> 0ms (FIXED)
- addf5e2f...ts: 68ms -> ~1ms (FIXED)
- 80848c45...mpg: remains 66ms (FFmpeg uses different reference for MPEG-PS)
- da904de3...mpg: remains 66ms (FFmpeg uses different reference for MPEG-PS)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 18:35:15 +01:00
Carlos
071d017b27 docs: update timing verification plan with Fix 6 results
- Added Fix 6: Elementary stream frame-by-frame timing
- Updated Category 3 testing results:
  - dc7169d7...h264: FIXED (~500ms, acceptable for roll-up)
  - 6395b281...asf: FIXED (1ms)
  - 0069dffd...mpg: Comparison invalid (mixed language CC)
  - b2771c84...mp4: No captions in file

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:55:20 +01:00
Carlos
65d9a7ed1a fix(timing): update fts_now for each frame in elementary streams
For elementary streams with GOP timing (use_gop_as_pts=1), fts_now was
only updated when a GOP header was parsed, not for each frame. This
caused all frames within a GOP to have the same timestamp, resulting
in broken caption timing (1ms, 9ms, 17ms instead of proper times).

The fix calculates fts_now for each frame based on:
  fts_at_gop_start + (frames_since_last_gop * 1000 / fps)

Test results for dc7169d7...h264 (raw MPEG-2 elementary stream):
- Before: 1ms, 9ms, 17ms, 25ms (broken)
- After: 2867ms, 4634ms, 6368ms (correct range)
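
The per-frame formula can be written out as follows. This is a sketch under assumptions: the function name is hypothetical, and using `f64` for the frame rate with truncation to whole milliseconds is a choice made here, not taken from the source.

```rust
/// fts_at_gop_start + (frames_since_last_gop * 1000 / fps), in milliseconds.
fn fts_now(fts_at_gop_start: i64, frames_since_last_gop: i64, fps: f64) -> i64 {
    let offset_ms = frames_since_last_gop as f64 * 1000.0 / fps;
    // Truncate to whole milliseconds.
    fts_at_gop_start + offset_ms as i64
}
```

Each frame inside a GOP now gets its own timestamp, for example 40 ms per frame at 25 fps.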

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:51:59 +01:00
Carlos
54df50f4fe fix(timing): preserve CR time during pop-on to roll-up transition
When transitioning from pop-on to roll-up mode, the first CR command
(with only 1 line visible, changes=0) was resetting ts_start_of_current_line
to -1. This caused the next caption's start time to be set when characters
were typed (~133ms later), not when the CR command was received.

The fix preserves the CR time when rollup_from_popon=1 and changes=0,
ensuring the caption start time matches when the display state changed.

Test results:
- c83f765c...ts: 134ms offset → 1ms (fixed)
- 725a49f8...mpg: 133ms offset → 0ms (fixed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:37:57 +01:00
Carlos
bc5d605543 fix(timing): handle pop-on to roll-up mode transition timing
When transitioning from pop-on to roll-up mode, CCExtractor was setting
the caption start time when the first character was typed. FFmpeg uses
the time when the display state changed to show multiple lines. This
caused the first roll-up caption after a mode switch to be timestamped
too early.

Changes:
- Add rollup_from_popon flag to track mode transitions
- Reset ts_start_of_current_line on mode switch
- Defer start time until CR causes scrolling in transition mode
- Use ts_start_of_current_line when buffer scrolls during transition

Test results for 725a49f8...mpg:
- Before: 484ms early
- After: 133ms late (~4 frames, acceptable)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 13:21:20 +01:00
Carlos
a1a0094167 fix(timing): defer min_pts until frame type is known
The previous timing fixes were being bypassed because set_fts() is called
multiple times per frame - first from the PES/TS layer (with unknown frame
type) and later from the ES parsing layer (with known frame type). The first
call was setting min_pts before we knew whether it was an I-frame.

Changes:
- When frame type is unknown, track PTS in pending_min_pts but DON'T set min_pts
- Only set min_pts when frame type is known AND it's an I-frame
- Added unknown_frame_count for fallback handling of H.264 streams
- After 100+ calls with unknown frame type, use pending_min_pts as fallback
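
The deferral logic can be modelled as a small state machine. Everything below is a sketch: the type and method names are hypothetical, only the three field names come from the change list, and treating "100+" as "more than 100" is an assumption.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum FrameType {
    Unknown,
    I,
    Other,
}

struct Timing {
    min_pts: Option<i64>,
    pending_min_pts: Option<i64>,
    unknown_frame_count: u32,
}

impl Timing {
    fn new() -> Self {
        Timing { min_pts: None, pending_min_pts: None, unknown_frame_count: 0 }
    }

    /// Called once per timestamp; sets `min_pts` only when it is safe to.
    fn observe(&mut self, pts: i64, frame_type: FrameType) {
        if self.min_pts.is_some() {
            return; // reference already fixed
        }
        match frame_type {
            FrameType::I => self.min_pts = Some(pts),
            FrameType::Unknown => {
                // Remember the smallest PTS seen, but do not commit yet.
                let smallest = self.pending_min_pts.map_or(pts, |p| p.min(pts));
                self.pending_min_pts = Some(smallest);
                self.unknown_frame_count += 1;
                if self.unknown_frame_count > 100 {
                    // Fallback for streams that never report a frame type.
                    self.min_pts = self.pending_min_pts;
                }
            }
            FrameType::Other => {} // known non-I frame: keep waiting
        }
    }
}
```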

Test results:
- 8e8229b88bc6...mpg: 101ms -> 1ms offset ✓
- c032183ef018...ts: 284ms -> 0ms offset ✓
- add511677cc42...vob: 366ms -> 34ms offset ✓

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 12:12:49 +01:00
Carlos
5b8d8a72d8 fix(timing): add frame type tracking for future timing improvements
Add seen_known_frame_type and pending_min_pts fields to track frame
types during initial stream parsing. This infrastructure supports
distinguishing between MPEG-2 streams (where frame types are set) and
H.264 in MPEG-PS (where frame types remain unknown).

Current behavior maintains compatibility by allowing min_pts to be set
from any frame type, which correctly handles both stream types and
matches FFmpeg timing output.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 11:58:23 +01:00
Carlos
621871eb7c fix(timing): skip leading non-I-frames when setting min_pts
Streams recorded mid-broadcast often start with trailing B/P frames from
a previous GOP. These frames have earlier PTS values than the first
decodable I-frame.

Previously, CCExtractor set min_pts from the first PES packet with a PTS,
which could be an undecodable B/P frame. FFmpeg's cc_dec uses the first
decoded frame (necessarily an I-frame) as its timing reference.

This caused consistent timing offsets. For example, c032183ef01...ts had
a 284ms offset because:
- First PES packet PTS: 2508198438
- First I-frame PTS: 2508223963
- Difference: 25525 ticks = 284ms
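
The tick-to-millisecond arithmetic above can be reproduced directly, given that PTS values run on a 90 kHz clock. The function name is hypothetical, and rounding to the nearest millisecond assumes a non-negative difference.

```rust
const PTS_TICKS_PER_SECOND: i64 = 90_000;

/// Difference between two PTS values in milliseconds, rounded to nearest.
/// Assumes `later >= earlier`.
fn pts_delta_ms(earlier: i64, later: i64) -> i64 {
    let ticks = later - earlier;
    (ticks * 1000 + PTS_TICKS_PER_SECOND / 2) / PTS_TICKS_PER_SECOND
}
```

For the two values quoted above, 25525 ticks divided by 90 ticks per millisecond is about 283.6, which rounds to 284 ms.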

Changes:
- timing.rs: Only set min_pts when current_picture_coding_type == IFrame
- ccx_decoders_common.c: Don't increment cb_field counters for container
  formats (CCX_H264, CCX_PES) since frame PTS is already correct
- sequencing.c: Include CCX_PES in reset_cb logic alongside CCX_H264

Test results for c032183ef01...ts:
- Before: CCExtractor 1,836ms vs FFmpeg 1,552ms = 284ms offset
- After: CCExtractor 1,552ms vs FFmpeg 1,552ms = 0ms offset

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 11:29:07 +01:00
Carlos Fernandez Sanz
ffcb5fe149 Merge pull request #1802 from CCExtractor/fix/utility-buffer-overruns
fix(utility): prevent buffer overruns and add OOM checks in change_filename
2025-12-13 01:36:57 -08:00
Carlos Fernandez Sanz
1b0808b4f3 Merge pull request #1807 from CCExtractor/fix/phase3-buffer-safety-medium-priority
fix(lib_ccx): replace unsafe string functions with bounds-checked versions
2025-12-13 01:25:25 -08:00
Carlos
68da0a044d style: fix clang-format issues
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
87b0d22057 fix(ts_tables_epg): add NULL checks and fix memory leaks
- EPG_output_live: add NULL checks for filename/finalfilename malloc,
  add fopen failure check
- EPG_DVB_decode_string: add NULL checks for decode_buffer and out
  malloc
- EPG_decode_content_descriptor: add NULL check for categories malloc
- EPG_decode_parental_rating_descriptor: add NULL check for ratings
  malloc
- EPG_decode_extended_event_descriptor: add NULL checks for net and
  extended_text malloc
- EPG_ATSC_decode_multiple_string: add NULL checks for event_name and
  text malloc
- parse_EPG_packet: add NULL check for buffer malloc, fix unsafe
  realloc that lost original pointer on failure
- EPG_decode_short_event_descriptor: fix memory leak - free event_name
  on early return
- EPG_DVB_decode_EIT: fix memory leak - call EPG_free_event on early
  return

All OOM conditions now use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) following
the project's coding patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
af5e36cdab style: fix clang-format issues in macro definitions
Fix macro formatting to have 'do' and '{' on separate lines and
align backslashes consistently, as required by clang-format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
8329257b99 fix(708_output): replace sprintf with snprintf for buffer safety
Replace all sprintf calls with snprintf to prevent potential buffer
overflows in CEA-708 output functions. Key changes:

- dtvcc_change_pen_colors: add bounds checking for font color tags
- dtvcc_change_pen_attribs: add bounds checking for italic/underline tags
- dtvcc_write_srt: track buffer length with snprintf
- dtvcc_write_transcript: add bounds checking for CC/mode labels
- dtvcc_write_sami_header: use snprintf macro for all SAMI tags
- dtvcc_write_sami_footer: use snprintf with length check
- dtvcc_write_sami: add bounds checking for sync tags
- dtvcc_write_scc_header: use snprintf for SCC header
- add_needed_scc_labels: add buffer size parameter for safe writes
- dtvcc_write_scc: use snprintf macro for all SCC formatting
- dtvcc_writer_init: use snprintf for filename suffix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
1869c4c713 fix(mcc_encoder): prevent buffer overruns and add OOM checks
- Add NULL checks after malloc calls for compressed_data_buffer and buff_ptr
- Replace sprintf with snprintf for all string formatting operations
- Replace strcat with bounds-checked direct character assignment
- Replace vsprintf with vsnprintf in debug_log function
- Replace sprintf loop in random_chars with direct character lookup table
- Increase buffer sizes for date_str (50->64), time_str (30->32), tcr_str (25->32)
- Initialize tcr_str in default case to prevent uninitialized use
- Add lib_ccx.h include for fatal() function declaration

Functions modified:
- mcc_encode_cc_data: OOM check + sprintf -> snprintf + strcat -> direct assignment
- generate_mcc_header: sprintf -> snprintf for uuid_str, date_str, time_str, tcr_str
- add_boilerplate: OOM check for buff_ptr
- random_chars: sprintf -> direct character lookup (more efficient)
- debug_log: vsprintf -> vsnprintf + safer strlen check

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
b3c3bdcdac fix(ocr): add NULL checks and fix memory leaks
- search_language_pack: add NULL check after strdup(), fix unsafe
  realloc() that lost original pointer on failure
- init_ocr: fix memory leak where ctx wasn't freed on early return
  when tessdata not found, add NULL checks for strdup() calls
- ocr_bitmap: fix memory leak when pixCreate partially fails, add
  missing boxDestroy for crop_points on early return, add NULL checks
  for histogram/iot/mcit allocations, fix unsafe realloc() calls,
  add NULL check for text_out strdup
- ocr_rect: add NULL check for copy allocation, initialize copy->data
  to NULL to prevent freep on uninitialized pointer, add NULL check
  for copy->data allocation
- paraof_ocrtext: use fatal() on malloc failure for consistent OOM
  handling

All OOM conditions now use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) following
the project's coding patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
6e295ac374 fix(ccx_encoders_spupng): add NULL checks and fix memory leaks
This commit addresses multiple memory safety issues in ccx_encoders_spupng.c:

**NULL pointer dereference fixes (crash prevention):**

1. write_cc_bitmap_as_spupng() line 440: Added NULL check after malloc
   for pbuf - previously would crash on memset if allocation failed.

2. write_image() line 541: Added NULL check after malloc for row buffer
   with proper cleanup via goto finalise.

3. center_justify() line 611: Added NULL check after malloc for
   temp_buffer - previously would crash immediately on use.

4. utf8_to_utf32() line 718: Added NULL check after calloc for
   string_utf32 - previously would crash on use by iconv.

5. spupng_export_string2png() line 780: Fixed existing NULL check that
   printed error but did not return/exit - code would continue to
   memset(NULL, ...) causing a crash.

**Memory leak fixes:**

6. spupng_export_string2png() line 789: Fixed leak where buffer was not
   freed when strdup(str) failed and function returned early.

7. spupng_export_string2png() line 901: Fixed leak on realloc failure
   where buffer, tmp, and string_utf32 were leaked. Now properly frees
   all three before calling fatal().

All fatal() calls include diagnostic information (function name and
bytes requested where applicable) to aid debugging OOM conditions.
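
Item 5 above (a NULL check that reported the error but kept going) is easy to reproduce in miniature. The sketch below uses a hypothetical `clear_buffer` function and returns -1 instead of calling the project's fatal(); only the control flow is the point.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical function: zero a pixel buffer. Zero-length clears are
 * treated as a caller error in this sketch. */
static int clear_buffer(unsigned char *buf, size_t size)
{
	assert(size > 0);
	if (buf == NULL)
	{
		fprintf(stderr, "clear_buffer: no buffer (%zu bytes requested)\n", size);
		return -1; /* without this return, memset(NULL, ...) would run next */
	}
	memset(buf, 0, size);
	return 0;
}
```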

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:38:23 +01:00
Carlos
468bd2c156 style: fix clang-format issues in macro definitions
Fix macro formatting to have 'do' and '{' on separate lines and
align backslashes consistently, as required by clang-format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:31:10 +01:00
Carlos
bcf7eb2a50 fix(708_output): replace sprintf with snprintf for buffer safety
Replace all sprintf calls with snprintf to prevent potential buffer
overflows in CEA-708 output functions. Key changes:

- dtvcc_change_pen_colors: add bounds checking for font color tags
- dtvcc_change_pen_attribs: add bounds checking for italic/underline tags
- dtvcc_write_srt: track buffer length with snprintf
- dtvcc_write_transcript: add bounds checking for CC/mode labels
- dtvcc_write_sami_header: use snprintf macro for all SAMI tags
- dtvcc_write_sami_footer: use snprintf with length check
- dtvcc_write_sami: add bounds checking for sync tags
- dtvcc_write_scc_header: use snprintf for SCC header
- add_needed_scc_labels: add buffer size parameter for safe writes
- dtvcc_write_scc: use snprintf macro for all SCC formatting
- dtvcc_writer_init: use snprintf for filename suffix
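
As a rough illustration of "track buffer length with snprintf", the sketch below appends tags to a fixed buffer and refuses writes that would not fit. `append_tag` and its parameters are assumptions made for this example, not the actual helpers or macros used in the CEA-708 output code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: append "<tag>" at buf[*len] without writing past cap.
 * Returns 0 on success, -1 if the text would be truncated. */
static int append_tag(char *buf, size_t cap, size_t *len, const char *tag)
{
	assert(buf != NULL && len != NULL && *len < cap);
	int n = snprintf(buf + *len, cap - *len, "<%s>", tag);
	if (n < 0 || (size_t)n >= cap - *len)
	{
		buf[*len] = '\0'; /* discard the partial write */
		return -1;
	}
	*len += (size_t)n;
	return 0;
}
```

snprintf returns the length the full output would have had, so comparing it with the remaining space detects truncation.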

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:31:10 +01:00
Carlos
54c7dfa45f fix(mcc_encoder): prevent buffer overruns and add OOM checks
- Add NULL checks after malloc calls for compressed_data_buffer and buff_ptr
- Replace sprintf with snprintf for all string formatting operations
- Replace strcat with bounds-checked direct character assignment
- Replace vsprintf with vsnprintf in debug_log function
- Replace sprintf loop in random_chars with direct character lookup table
- Increase buffer sizes for date_str (50->64), time_str (30->32), tcr_str (25->32)
- Initialize tcr_str in default case to prevent uninitialized use
- Add lib_ccx.h include for fatal() function declaration

Functions modified:
- mcc_encode_cc_data: OOM check + sprintf -> snprintf + strcat -> direct assignment
- generate_mcc_header: sprintf -> snprintf for uuid_str, date_str, time_str, tcr_str
- add_boilerplate: OOM check for buff_ptr
- random_chars: sprintf -> direct character lookup (more efficient)
- debug_log: vsprintf -> vsnprintf + safer strlen check
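
The "direct character lookup" change to random_chars might look roughly like the sketch below. The hex alphabet, the use of rand(), and the name `fill_random_hex` are assumptions for illustration; the commit message does not say which characters or random source the real function uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for random_chars(): fill out[0..count-1] with hex
 * digits and NUL-terminate. out must have room for count + 1 bytes. */
static void fill_random_hex(char *out, size_t count)
{
	static const char digits[] = "0123456789ABCDEF";
	assert(out != NULL);
	for (size_t i = 0; i < count; i++)
		out[i] = digits[rand() % 16]; /* one table lookup, no formatting call */
	out[count] = '\0';
}
```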

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:31:10 +01:00
Carlos
984123521d fix(ocr): add NULL checks and fix memory leaks
- search_language_pack: add NULL check after strdup(), fix unsafe
  realloc() that lost original pointer on failure
- init_ocr: fix memory leak where ctx wasn't freed on early return
  when tessdata not found, add NULL checks for strdup() calls
- ocr_bitmap: fix memory leak when pixCreate partially fails, add
  missing boxDestroy for crop_points on early return, add NULL checks
  for histogram/iot/mcit allocations, fix unsafe realloc() calls,
  add NULL check for text_out strdup
- ocr_rect: add NULL check for copy allocation, initialize copy->data
  to NULL to prevent freep on uninitialized pointer, add NULL check
  for copy->data allocation
- paraof_ocrtext: use fatal() on malloc failure for consistent OOM
  handling

All OOM conditions now use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) following
the project's coding patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:31:10 +01:00
Carlos
a2cb65f181 fix(ccx_encoders_spupng): add NULL checks and fix memory leaks
This commit addresses multiple memory safety issues in ccx_encoders_spupng.c:

**NULL pointer dereference fixes (crash prevention):**

1. write_cc_bitmap_as_spupng() line 440: Added NULL check after malloc
   for pbuf - previously would crash on memset if allocation failed.

2. write_image() line 541: Added NULL check after malloc for row buffer
   with proper cleanup via goto finalise.

3. center_justify() line 611: Added NULL check after malloc for
   temp_buffer - previously would crash immediately on use.

4. utf8_to_utf32() line 718: Added NULL check after calloc for
   string_utf32 - previously would crash on use by iconv.

5. spupng_export_string2png() line 780: Fixed existing NULL check that
   printed error but did not return/exit - code would continue to
   memset(NULL, ...) causing a crash.

**Memory leak fixes:**

6. spupng_export_string2png() line 789: Fixed leak where buffer was not
   freed when strdup(str) failed and function returned early.

7. spupng_export_string2png() line 901: Fixed leak on realloc failure
   where buffer, tmp, and string_utf32 were leaked. Now properly frees
   all three before calling fatal().

All fatal() calls include diagnostic information (function name and
bytes requested where applicable) to aid debugging OOM conditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:31:10 +01:00
Carlos Fernandez Sanz
fe7a4b3f45 Merge pull request #1799 from CCExtractor/fix/ts-tables-epg-memory-safety
fix(ts_tables_epg): add NULL checks and fix memory leaks
2025-12-12 23:30:02 -08:00
Carlos Fernandez Sanz
d4ec0fe49b Merge pull request #1800 from CCExtractor/fix/708-output-buffer-safety
fix(708_output): replace sprintf with snprintf for buffer safety
2025-12-12 23:24:26 -08:00
Carlos Fernandez Sanz
4a98bf5290 Merge pull request #1804 from CCExtractor/fix/mcc-encoder-buffer-overruns
fix(mcc_encoder): prevent buffer overruns and add OOM checks
2025-12-12 23:23:25 -08:00
Carlos Fernandez Sanz
249cac359f Merge pull request #1798 from CCExtractor/fix/ocr-memory-safety
fix(ocr): add NULL checks and fix memory leaks
2025-12-12 23:21:11 -08:00
Carlos
69e521b320 fix(timing): correct caption start/end times to match video frame PTS
The get_visible_start() and get_visible_end() functions were adding a
cb_field offset (cb_field * 1001/30 ms) to caption timestamps. This
offset was designed for broadcast MPEG-TS streams where caption data
arrives continuously at field rate (59.94 fields/sec).

However, for container formats like MP4, all caption data for a video
frame is bundled together and should use the frame's PTS directly. The
offset was causing caption start times to be ~300ms (9 frames) later
than the actual video frame timestamp.

Root cause analysis:
1. Previous caption ends → get_visible_end() returns inflated time
   due to cb_field offset → minimum_fts set to this inflated value
2. New caption starts → get_visible_start() constrained by
   minimum_fts + 1 → start time incorrectly pushed forward

Fix:
- Add new Rust FFI functions ccxr_get_visible_start() and
  ccxr_get_visible_end() that return base FTS (fts_now + fts_global)
  without the cb_field offset
- Update C wrappers to call the new Rust functions
- Update Rust decoder timing to use base FTS

Verification against ffmpeg:
- Before fix: 00:16:06,799 (300ms late)
- After fix:  00:16:06,499 (matches ffmpeg exactly)
- ffmpeg ref: 00:16:06,499

The get_fts() function is unchanged - it still returns the
offset-adjusted time for use cases that need it (like extraction
time boundary checking).
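
The arithmetic behind the 300 ms discrepancy can be checked with a small sketch. The function names are hypothetical, the real fix lives in Rust FFI functions, and a cb_field value of 9 is an assumption taken from the "9 frames" figure above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset formula quoted in the commit message: cb_field * 1001/30 ms. */
static int64_t cb_field_offset_ms(int cb_field)
{
	return (int64_t)cb_field * 1001 / 30;
}

/* Behaviour described as the bug: base FTS plus the field offset. */
static int64_t visible_time_with_offset(int64_t fts_now, int64_t fts_global, int cb_field)
{
	return fts_now + fts_global + cb_field_offset_ms(cb_field);
}

/* Behaviour described as the fix: base FTS only. */
static int64_t visible_time_base(int64_t fts_now, int64_t fts_global)
{
	return fts_now + fts_global;
}
```

With a base time of 966499 ms (00:16:06,499), the offset variant gives 966799 ms (00:16:06,799), which matches the before and after values listed in the verification.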

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 08:01:29 +01:00
Carlos
8af19df556 fix(lib_ccx): replace remaining unsafe string functions with bounds-checked versions
Replace sprintf/strcpy with snprintf/memcpy in LOW priority files:
- general_loop.c: proper buffer allocation with OOM check, snprintf
- ccx_encoders_g608.c: snprintf with sizeof for timeline buffer
- lib_ccx.c: fix buffer size calculation, add missing null check, snprintf
- ccx_common_timing.c: snprintf with documented max size for time functions
- ts_functions.c: snprintf with sizeof in debug code
- matroska.c: bounded memcpy to prevent overflow from malformed language codes
- output.c: snprintf with known allocated size

This completes Phase 3.1 of the buffer safety audit.
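
A bounded copy of the kind described for matroska.c could be sketched as follows. `LANG_CAP` and `copy_lang` are hypothetical; the actual buffer size and field names in the parser may differ.

```c
#include <assert.h>
#include <string.h>

#define LANG_CAP 4 /* hypothetical: three-letter language code plus NUL */

/* Copy at most LANG_CAP - 1 bytes of src into dst and always terminate. */
static void copy_lang(char dst[LANG_CAP], const char *src)
{
	assert(dst != NULL && src != NULL);
	size_t n = strlen(src);
	if (n > LANG_CAP - 1)
		n = LANG_CAP - 1; /* truncate oversized input from a malformed file */
	memcpy(dst, src, n);
	dst[n] = '\0';
}
```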

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 07:10:54 +01:00
Carlos
bff08bec9e fix(encoders): replace unsafe string functions with bounds-checked versions
Replace sprintf/strcpy/strcat with snprintf/strncat/memmove in:
- ccx_encoders_common.c: 4 sprintf -> snprintf
- ccx_encoders_helpers.c: 3 strcat -> strncat, 1 strcpy -> memcpy
- telxcc.c: 3 sprintf -> snprintf
- asf_functions.c: 3 sprintf -> snprintf
- ccx_encoders_ssa.c: 3 sprintf -> snprintf
- ccx_encoders_curl.c: 1 sprintf -> snprintf, strcpy+strcat -> snprintf with OOM check
- ccx_encoders_splitbysentence.c: 1 strcpy -> memmove (overlapping memory fix), 2 strcat -> strncat

This is part of Phase 3.1 of the buffer safety audit, addressing MEDIUM priority files.
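
The strcpy -> memmove change matters because strcpy has undefined behaviour when source and destination overlap. A minimal sketch with a hypothetical `drop_prefix` helper (not the function from ccx_encoders_splitbysentence.c):

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: remove the first n bytes of s in place. */
static void drop_prefix(char *s, size_t n)
{
	assert(s != NULL);
	size_t len = strlen(s);
	if (n > len)
		n = len;
	/* s and s + n overlap, so strcpy(s, s + n) would be undefined.
	 * memmove handles overlap; the + 1 also moves the NUL terminator. */
	memmove(s, s + n, len - n + 1);
}
```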

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 07:00:44 +01:00
Carlos
a66fb8c661 fix(utility): prevent buffer overruns and add OOM checks in change_filename
- Add NULL checks after malloc calls for temp_encoder, current_name, and newname
- Replace sprintf with snprintf for safe string formatting
- Replace strcpy/strcat with strncpy and snprintf to prevent buffer overflows
- Increase buffer sizes from 6/10/15 to 16 chars to safely hold extension numbers
- Use proper size tracking with filename_len and buffer size variables

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:46:28 +01:00
Carlos Fernandez Sanz
042716adde fix(xds_decoder): prevent buffer overruns and fix sprintf logic bug (#1803)
- Replace sprintf with snprintf for all string formatting operations
- Replace strcpy/strcat chains with snprintf for bounds-safe concatenation
- Replace strcpy with strncpy + null terminator for fixed-size buffers
- Fix bug in xds_do_private_data: sprintf in loop was overwriting instead
  of appending hex bytes to output string

Functions modified:
- xds_do_copy_generation_management_system: 3 sprintf -> snprintf
- xds_do_content_advisory: 5 sprintf -> snprintf, strcpy/strcat chain fixed
- xds_do_current_and_future: strcpy -> strncpy for program description
- xds_do_channel: strcpy -> strncpy for network name
- xds_do_private_data: fixed loop to properly append hex bytes
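
The overwrite bug described for xds_do_private_data comes from formatting at the start of the buffer on every iteration. The sketch below shows the appending form with a running offset; `hex_dump` and its "XX " output format are assumptions for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write each byte as "XX " into out.
 * Returns the number of characters written, stopping early if out is full. */
static size_t hex_dump(char *out, size_t cap, const unsigned char *data, size_t n)
{
	assert(out != NULL && cap > 0);
	size_t used = 0;
	out[0] = '\0';
	for (size_t i = 0; i < n; i++)
	{
		/* Buggy form: sprintf(out, "%02X ", data[i]) restarts at out[0]
		 * each time, so only the last byte survives. */
		int w = snprintf(out + used, cap - used, "%02X ", data[i]);
		if (w < 0 || (size_t)w >= cap - used)
		{
			out[used] = '\0';
			break;
		}
		used += (size_t)w;
	}
	return used;
}
```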

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 21:40:54 -08:00
Carlos
1342e4edee fix(ocr): add NULL checks and fix memory leaks
- search_language_pack: add NULL check after strdup(), fix unsafe
  realloc() that lost original pointer on failure
- init_ocr: fix memory leak where ctx wasn't freed on early return
  when tessdata not found, add NULL checks for strdup() calls
- ocr_bitmap: fix memory leak when pixCreate partially fails, add
  missing boxDestroy for crop_points on early return, add NULL checks
  for histogram/iot/mcit allocations, fix unsafe realloc() calls,
  add NULL check for text_out strdup
- ocr_rect: add NULL check for copy allocation, initialize copy->data
  to NULL to prevent freep on uninitialized pointer, add NULL check
  for copy->data allocation
- paraof_ocrtext: use fatal() on malloc failure for consistent OOM
  handling

All OOM conditions now use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) following
the project's coding patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:26:59 +01:00
Carlos
4d1d874243 fix(ccx_encoders_spupng): add NULL checks and fix memory leaks
This commit addresses multiple memory safety issues in ccx_encoders_spupng.c:

**NULL pointer dereference fixes (crash prevention):**

1. write_cc_bitmap_as_spupng() line 440: Added NULL check after malloc
   for pbuf - previously would crash on memset if allocation failed.

2. write_image() line 541: Added NULL check after malloc for row buffer
   with proper cleanup via goto finalise.

3. center_justify() line 611: Added NULL check after malloc for
   temp_buffer - previously would crash immediately on use.

4. utf8_to_utf32() line 718: Added NULL check after calloc for
   string_utf32 - previously would crash on use by iconv.

5. spupng_export_string2png() line 780: Fixed existing NULL check that
   printed error but did not return/exit - code would continue to
   memset(NULL, ...) causing a crash.

**Memory leak fixes:**

6. spupng_export_string2png() line 789: Fixed leak where buffer was not
   freed when strdup(str) failed and function returned early.

7. spupng_export_string2png() line 901: Fixed leak on realloc failure
   where buffer, tmp, and string_utf32 were leaked. Now properly frees
   all three before calling fatal().

All fatal() calls include diagnostic information (function name and
bytes requested where applicable) to aid debugging OOM conditions.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:26:59 +01:00
Carlos
155f56ede7 style: fix clang-format issues in macro definitions
Fix macro formatting to have 'do' and '{' on separate lines and
align backslashes consistently, as required by clang-format.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:24:27 +01:00
Carlos
fb49d9460d fix(708_output): replace sprintf with snprintf for buffer safety
Replace all sprintf calls with snprintf to prevent potential buffer
overflows in CEA-708 output functions. Key changes:

- dtvcc_change_pen_colors: add bounds checking for font color tags
- dtvcc_change_pen_attribs: add bounds checking for italic/underline tags
- dtvcc_write_srt: track buffer length with snprintf
- dtvcc_write_transcript: add bounds checking for CC/mode labels
- dtvcc_write_sami_header: use snprintf macro for all SAMI tags
- dtvcc_write_sami_footer: use snprintf with length check
- dtvcc_write_sami: add bounds checking for sync tags
- dtvcc_write_scc_header: use snprintf for SCC header
- add_needed_scc_labels: add buffer size parameter for safe writes
- dtvcc_write_scc: use snprintf macro for all SCC formatting
- dtvcc_writer_init: use snprintf for filename suffix

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:24:27 +01:00
Carlos
37fed5e5b5 fix(mcc_encoder): prevent buffer overruns and add OOM checks
- Add NULL checks after malloc calls for compressed_data_buffer and buff_ptr
- Replace sprintf with snprintf for all string formatting operations
- Replace strcat with bounds-checked direct character assignment
- Replace vsprintf with vsnprintf in debug_log function
- Replace sprintf loop in random_chars with direct character lookup table
- Increase buffer sizes for date_str (50->64), time_str (30->32), tcr_str (25->32)
- Initialize tcr_str in default case to prevent uninitialized use
- Add lib_ccx.h include for fatal() function declaration

Functions modified:
- mcc_encode_cc_data: OOM check + sprintf -> snprintf + strcat -> direct assignment
- generate_mcc_header: sprintf -> snprintf for uuid_str, date_str, time_str, tcr_str
- add_boilerplate: OOM check for buff_ptr
- random_chars: sprintf -> direct character lookup (more efficient)
- debug_log: vsprintf -> vsnprintf + safer strlen check
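
For the vsprintf -> vsnprintf change, a bounded variadic formatter can be sketched like this. `format_log` is a hypothetical function and does not reproduce the real debug_log signature.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format into the caller's buffer.
 * Returns 0 if the message fit, -1 if it was truncated or formatting failed. */
static int format_log(char *out, size_t cap, const char *fmt, ...)
{
	assert(out != NULL && cap > 0 && fmt != NULL);
	va_list args;
	va_start(args, fmt);
	int n = vsnprintf(out, cap, fmt, args);
	va_end(args);
	if (n < 0)
	{
		out[0] = '\0';
		return -1;
	}
	return (size_t)n < cap ? 0 : -1;
}
```

Unlike vsprintf, vsnprintf never writes more than cap bytes, and its return value reveals whether the output was cut short.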

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:23:45 +01:00
Carlos
7113036719 fix(parser): use HHMMSSFFF format for ttxt output timestamps
The Rust parser was incorrectly setting date_format to HHMMSS (no
milliseconds) instead of HHMMSSFFF (with milliseconds) for --out=ttxt.

This bug was introduced in PR #1619 when porting the parser to Rust.
The original C code correctly used ODF_HHMMSSMS which includes
milliseconds in the timestamp format (HH:MM:SS,mmm).

Before: 10:25:16 (missing milliseconds)
After:  10:25:16,000 (correct format matching original C behavior)
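
The difference between the two formats can be shown with a small C sketch. The real fix is in the Rust parser; the function names here are hypothetical and non-negative millisecond inputs are assumed.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* HHMMSS: drops the millisecond part. */
static void format_hhmmss(char *out, size_t cap, int64_t ms)
{
	assert(out != NULL && ms >= 0);
	snprintf(out, cap, "%02u:%02u:%02u", (unsigned)(ms / 3600000),
		 (unsigned)(ms / 60000 % 60), (unsigned)(ms / 1000 % 60));
}

/* HHMMSSFFF: same fields plus ",mmm". */
static void format_hhmmssfff(char *out, size_t cap, int64_t ms)
{
	assert(out != NULL && ms >= 0);
	snprintf(out, cap, "%02u:%02u:%02u,%03u", (unsigned)(ms / 3600000),
		 (unsigned)(ms / 60000 % 60), (unsigned)(ms / 1000 % 60),
		 (unsigned)(ms % 1000));
}
```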

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 06:21:09 +01:00
Carlos Fernandez Sanz
d93d6731ba fix(encoders): replace sprintf/strcpy with bounds-checked versions (#1805)
Replace unsafe string functions with safer alternatives:
- ccx_encoders_sami.c: sprintf -> snprintf (10 fixes)
- ccx_encoders_srt.c: sprintf -> snprintf (6 fixes)
- mp4.c: sprintf/strcpy/strcat -> snprintf (6 fixes, including
  buffer overflow fix in format_duration where 20-byte buffer
  was too small for long duration strings)
- ccx_encoders_webvtt.c: sprintf -> snprintf (6 fixes), plus:
  - Fixed malloc size bug (allocation used +4 where +5 was needed for the null terminator)
  - Added OOM checks for css_file_name and outline_css_file
  - Fixed memory leaks (css_file_name and outline_css_file not freed)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 21:16:20 -08:00
Carlos Fernandez Sanz
77e1dff779 fix(smptett): replace unsafe string operations with bounds-checked versions (#1801)
Replace sprintf, strcpy, and strcat calls with snprintf and bounds-checked
operations to prevent potential buffer overflows. Key changes:

- write_stringz_as_smptett: use snprintf for timestamp formatting
- write_cc_bitmap_as_smptett: use snprintf with INITIAL_ENC_BUFFER_CAPACITY
- write_cc_buffer_as_smptett:
  - Add NULL checks for malloc allocations
  - Track buffer size and use snprintf throughout
  - Replace strcpy/strcat chains with bounds-checked memcpy/snprintf
  - Use snprintf for style tag and color code formatting

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 21:15:39 -08:00
Carlos Fernandez Sanz
58dedba93f fix(scc): Always emit position codes at start of caption (fixes #1776) (#1791)
* fix(scc): Always emit position codes at start of caption (fixes #1776)

The SCC encoder was initializing current_row=14 and current_column=0,
which caused the first position code (PAC) to be skipped when caption
content started at row 14 (the last row), column 0. This happened because
the condition checking if row/column changed would be false.

For example, a caption starting at row 15 (1-indexed), column 0 should
output the PAC code 9470/{1500} but this was being omitted.

Fix by initializing current_row and current_column to UINT8_MAX, which
is an impossible value that will never match any valid row (0-14) or
column (0-31), ensuring the position code is always written for the
first character of each caption.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(rust): Remove unused assignments to fix clippy warnings

Remove unnecessary `time_show.time_in_ms += 1000 / 29.97` operations
that were restoring values that were never read afterwards.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 21:13:02 -08:00
Carlos
9eb266914a fix(ts_tables_epg): add NULL checks and fix memory leaks
- EPG_output_live: add NULL checks for filename/finalfilename malloc,
  add fopen failure check
- EPG_DVB_decode_string: add NULL checks for decode_buffer and out
  malloc
- EPG_decode_content_descriptor: add NULL check for categories malloc
- EPG_decode_parental_rating_descriptor: add NULL check for ratings
  malloc
- EPG_decode_extended_event_descriptor: add NULL checks for net and
  extended_text malloc
- EPG_ATSC_decode_multiple_string: add NULL checks for event_name and
  text malloc
- parse_EPG_packet: add NULL check for buffer malloc, fix unsafe
  realloc that lost original pointer on failure
- EPG_decode_short_event_descriptor: fix memory leak - free event_name
  on early return
- EPG_DVB_decode_EIT: fix memory leak - call EPG_free_event on early
  return

All OOM conditions now use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) following
the project's coding patterns.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-13 02:00:22 +01:00
Carlos Fernandez Sanz
1510396aa0 fix(ccx_decoders_common): add NULL checks and fix memory safety issues (#1796)
- Add NULL checks after malloc calls in copy_encoder_context(),
  copy_decoder_context(), copy_subtitle(), and init_cc_decode()
- Fix buffer overflows in copy_encoder_context() where string
  allocations were missing +1 for null terminator
- Call fatal(EXIT_NOT_ENOUGH_MEMORY, ...) on allocation failure
  following the pattern used in matroska.c
- Initialize pointers to NULL after memcpy to prevent use of
  stale pointers from the copied structure
- Prevent null pointer dereference in init_cc_decode() when dtvcc_init
  returns NULL
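
Two of the points above (the missing +1 and resetting pointers after memcpy) can be combined in one sketch. `struct ctx` and `copy_ctx` are hypothetical, and the sketch returns NULL where the project would call fatal(EXIT_NOT_ENOUGH_MEMORY, ...).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical context with one owned string. */
struct ctx
{
	char *name;
	int id;
};

/* Deep-copy src. Returns NULL on allocation failure. */
static struct ctx *copy_ctx(const struct ctx *src)
{
	assert(src != NULL);
	struct ctx *dst = malloc(sizeof *dst);
	if (dst == NULL)
		return NULL;
	memcpy(dst, src, sizeof *dst);
	dst->name = NULL; /* the copied pointer still belongs to src */
	if (src->name != NULL)
	{
		size_t n = strlen(src->name) + 1; /* + 1 for the NUL terminator */
		dst->name = malloc(n);
		if (dst->name == NULL)
		{
			free(dst);
			return NULL;
		}
		memcpy(dst->name, src->name, n);
	}
	return dst;
}
```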

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 16:01:47 -08:00
dependabot[bot]
a7dfaea559 chore(deps): bump actions/cache from 4 to 5 (#1790)
Bumps [actions/cache](https://github.com/actions/cache) from 4 to 5.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-12 14:31:22 -08:00
Carlos Fernandez Sanz
e8383c84ee fix(rust): remove unused assignments in tv_screen.rs (#1795)
Remove three unused assignments to `time_show.time_in_ms` that were
flagged by Clippy as "value assigned is never read".

The pattern was: subtract frame delay, use the value, then restore it.
However, since `time_show` is not used after the match statement, the
restoration assignments were unnecessary dead code.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 13:56:03 -08:00
Carlos Fernandez Sanz
810c869bc5 fix(dvb_subtitle_decoder): add NULL checks after malloc calls (#1794)
* fix(matroska): add memory safety checks and fix memory leaks

This commit addresses multiple memory safety issues in the Matroska
parser identified through static analysis (cppcheck).

## Null pointer dereference after malloc (15 fixes)

Added null checks after all malloc/calloc calls to prevent crashes
when memory allocation fails:

- read_byte_block(): line 28
- read_bytes_signed(): line 38
- generate_timestamp_ass_ssa(): line 267
- parse_segment_cluster_block_group_block(): lines 306, 361
- parse_segment_cluster_block_group_block_additions(): line 405
- parse_segment_cluster_block_group(): line 476
- parse_segment_track_entry(): lines 958, 973
- parse_private_codec_data(): line 1019
- generate_filename_from_track(): line 1167
- ass_ssa_sentence_erase_read_order(): line 1191
- save_sub_track(): lines 1264, 1271, 1303, 1310
- matroska_loop(): lines 1496, 1505

## Buffer overflow fixes (3 fixes)

- generate_timestamp_ass_ssa(): Increased buffer from 15 to 32 bytes,
  changed sprintf to snprintf. GCC warned output could be 11-23 bytes.
- save_sub_track(): Increased number[] buffer from 9 to 16 bytes,
  changed sprintf to snprintf.
- generate_filename_from_track(): Now calculates required buffer size
  dynamically instead of using fixed 200 bytes.

## Memory leak fixes (7 fixes)

- parse_ebml(): Fixed leak of read_vint_block_string() return value
- parse_segment_info(): Fixed 4 leaks of read_vint_block_string()
  returns (filename, title, muxing_app, writing_app)
- parse_segment_track_entry(): Added free(lang) before reassignment
- save_sub_track(): Fixed leak where text pointer was advanced,
  losing original allocation

## Realloc error handling (3 fixes)

Fixed realloc calls to use temporary variable, preventing loss of
original pointer if realloc fails:

- parse_segment_cluster_block_group_block(): line 366
- parse_segment_cluster_block_group(): line 475
- parse_segment_track_entry(): line 973

## Use-after-free fix (1 fix)

- matroska_loop(): Saved avc_track_number and dec_sub.got_output
  before calling matroska_free_all(), then used saved values
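
The use-after-free pattern and its fix can be sketched as below. `struct parser_ctx`, `parser_free`, and `finish` are hypothetical stand-ins, not the Matroska parser's own types or functions.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical parser state. */
struct parser_ctx
{
	int track_number;
	int got_output;
};

static void parser_free(struct parser_ctx *ctx)
{
	free(ctx);
}

/* Frees ctx and returns the track number if output was produced, else 0. */
static int finish(struct parser_ctx *ctx)
{
	assert(ctx != NULL);
	/* Copy what the return value needs while ctx is still valid. */
	int track = ctx->track_number;
	int got = ctx->got_output;
	parser_free(ctx);
	/* Reading ctx->track_number here would be a use-after-free. */
	return got ? track : 0;
}
```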

## Missing free fixes (2 fixes)

- free_sub_track(): Added free(track->sentences) for the array itself
- matroska_free_all(): Added free(mkv_ctx->sub_tracks) for the array

## Other improvements

- Initialized sub_track->sentences to NULL in parse_segment_track_entry()
  to ensure safe NULL check in free_sub_track()

All changes use EXIT_NOT_ENOUGH_MEMORY (exit code 500) for
out-of-memory conditions, consistent with the rest of the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* fix(dvb_subtitle_decoder): add NULL checks after malloc calls

Add missing NULL checks for 9 malloc() calls in the DVB subtitle decoder
that could cause crashes or undefined behavior if memory allocation fails.

All checks use fatal(EXIT_NOT_ENOUGH_MEMORY, ...) to terminate gracefully
with an appropriate error message, consistent with the approach used in
matroska.c and other parts of the codebase.

Affected functions and allocations:
- dvbsub_init_decoder(): DVBSubContext allocation
- dvbsub_parse_clut_segment(): DVBSubCLUT allocation
- dvbsub_parse_region_segment(): DVBSubRegion, pbuf, DVBSubObject,
  and DVBSubObjectDisplay allocations
- dvbsub_parse_page_segment(): DVBSubRegionDisplay allocation
- write_dvb_sub(): cc_bitmap (rect), data1, and data0 allocations
- dvbsub_handle_display_segment(): private_data allocation

This also fixes a potential memory leak in write_dvb_sub() where rect
and rect->data1 would be leaked if the rect->data0 allocation failed
(previously returned -1 without cleanup, now terminates via fatal()).

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 13:49:09 -08:00
Carlos Fernandez Sanz
b32c120e89 fix(matroska): add memory safety checks and fix memory leaks (#1792)
This commit addresses multiple memory safety issues in the Matroska
parser identified through static analysis (cppcheck).

## Null pointer dereference after malloc (15 fixes)

Added null checks after all malloc/calloc calls to prevent crashes
when memory allocation fails:

- read_byte_block(): line 28
- read_bytes_signed(): line 38
- generate_timestamp_ass_ssa(): line 267
- parse_segment_cluster_block_group_block(): lines 306, 361
- parse_segment_cluster_block_group_block_additions(): line 405
- parse_segment_cluster_block_group(): line 476
- parse_segment_track_entry(): lines 958, 973
- parse_private_codec_data(): line 1019
- generate_filename_from_track(): line 1167
- ass_ssa_sentence_erase_read_order(): line 1191
- save_sub_track(): lines 1264, 1271, 1303, 1310
- matroska_loop(): lines 1496, 1505

## Buffer overflow fixes (3 fixes)

- generate_timestamp_ass_ssa(): Increased buffer from 15 to 32 bytes,
  changed sprintf to snprintf. GCC warned output could be 11-23 bytes.
- save_sub_track(): Increased number[] buffer from 9 to 16 bytes,
  changed sprintf to snprintf.
- generate_filename_from_track(): Now calculates required buffer size
  dynamically instead of using fixed 200 bytes.
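
One common way to size a buffer dynamically is to ask snprintf for the required length first. The commit message does not say which method generate_filename_from_track() uses, so the sketch below is an assumption; `make_track_filename` and its name pattern are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: build "<base>_<track>.<ext>" in a heap buffer of
 * exactly the required size. The caller frees it. Returns NULL on failure. */
static char *make_track_filename(const char *base, unsigned track, const char *ext)
{
	assert(base != NULL && ext != NULL);
	/* With a NULL buffer and size 0, snprintf only reports the length. */
	int need = snprintf(NULL, 0, "%s_%u.%s", base, track, ext);
	if (need < 0)
		return NULL;
	size_t cap = (size_t)need + 1; /* + 1 for the NUL terminator */
	char *out = malloc(cap);
	if (out == NULL)
		return NULL;
	snprintf(out, cap, "%s_%u.%s", base, track, ext);
	return out;
}
```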

## Memory leak fixes (7 fixes)

- parse_ebml(): Fixed leak of read_vint_block_string() return value
- parse_segment_info(): Fixed 4 leaks of read_vint_block_string()
  returns (filename, title, muxing_app, writing_app)
- parse_segment_track_entry(): Added free(lang) before reassignment
- save_sub_track(): Fixed leak where text pointer was advanced,
  losing original allocation

## Realloc error handling (3 fixes)

Fixed realloc calls to use temporary variable, preventing loss of
original pointer if realloc fails:

- parse_segment_cluster_block_group_block(): line 366
- parse_segment_cluster_block_group(): line 475
- parse_segment_track_entry(): line 973

## Use-after-free fix (1 fix)

- matroska_loop(): Saved avc_track_number and dec_sub.got_output
  before calling matroska_free_all(), then used saved values

## Missing free fixes (2 fixes)

- free_sub_track(): Added free(track->sentences) for the array itself
- matroska_free_all(): Added free(mkv_ctx->sub_tracks) for the array

## Other improvements

- Initialized sub_track->sentences to NULL in parse_segment_track_entry()
  to ensure safe NULL check in free_sub_track()

All changes use EXIT_NOT_ENOUGH_MEMORY (exit code 500) for
out-of-memory conditions, consistent with the rest of the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
2025-12-12 13:29:55 -08:00
Vidit
3d7553349f remove label -without-rust (#1780)
* fix minor issue

* remove -without-rust

* fixed
2025-12-09 20:38:55 +05:30
Rahul Tripathi
d524a0247f Merge pull request #2 from Rahul-2k4/copilot/fix-teletext-page-detection-issue-1034 2025-12-09 13:00:03 +05:30
copilot-swe-agent[bot]
f30f276456 Apply code style fixes from clang-format
Co-authored-by: Rahul-2k4 <216878448+Rahul-2k4@users.noreply.github.com>
2025-12-09 06:28:15 +00:00
copilot-swe-agent[bot]
17a8e1ec7b Remove unintended Cargo.lock changes
Co-authored-by: Rahul-2k4 <216878448+Rahul-2k4@users.noreply.github.com>
2025-12-09 06:23:19 +00:00
copilot-swe-agent[bot]
ebe25af476 Fix indentation to use tabs consistently
Co-authored-by: Rahul-2k4 <216878448+Rahul-2k4@users.noreply.github.com>
2025-12-09 06:16:17 +00:00
copilot-swe-agent[bot]
1f7120f32f Apply teletext page detection fix from fix branch
Co-authored-by: Rahul-2k4 <216878448+Rahul-2k4@users.noreply.github.com>
2025-12-09 06:15:23 +00:00
copilot-swe-agent[bot]
9e9023c258 Initial plan 2025-12-09 06:05:32 +00:00
Dhanush
b2930178be Fix G608 output extra NULL character (#1777) (#1786)
Co-authored-by: dhanush varma <dhanushvarma@dhanushs-MacBook-Air.local>
2025-12-08 20:37:29 -08:00
rudera-byte
759c3f5d41 fix: Issue #1162 TESSDATA_PREFIX requires path separator at its end (#1674) 2025-12-09 04:30:26 +05:30
moveman
3c51fb6536 Handle row_count decrease (#1702)
Co-authored-by: ewong <Edmond.Wong@harmonicinc.com>
Co-authored-by: Prateek Sunal <prtksunal@gmail.com>
Co-authored-by: Carlos Fernandez Sanz <carlos@ccextractor.org>
2025-12-09 04:19:13 +05:30
Deepnarayan Sett
494df3edae [FEAT] added demuxer and file_functions module (#1662)
* feat: added demuxer module

* Cargo Lock Update

* Completed file_functions and demuxer

* Completed file_functions and demuxer

* written extern functions for demuxer

* Removed libc completely, added tests for gxf and ported gxf to C

* Hardsubx error fixed

* Fixing format issues

* clippy errors fixed

* fixing format issues

* fixing format issues

* Windows failing tests

* Windows failing tests

* demuxer: added demuxer data transfer functions and removed some structs

* made Demuxer and File Functions

* Minor formatting changes

* Minor Rebasing changes

* demuxer: format rust and unit test rust checks

* C formatting

* Windows Failing test

* Windows Failing test

* Update CHANGES.TXT

* Update CHANGES.TXT

* Windows Failing Tests

* Windows Failing Tests

* Problem in Copy to Rust and some typos that copilot review suggested

* Minor Formatting Error

* Windows Failing Regressions

* Windows Failing Regressions

* Minor Comment Change

* Data transfer module for DemuxerData added and more rustlike syntax to ctorust.rs

* Minor Formatting Changes

* demuxer: Rebase and a few tweaks to file_functions

* demuxer: Minor Formatting Error

* [FIX] 134 Codes in XDS and General Tests (#1708)

* Made pointers valid in Unit Tests of Decoder

* fix: test_do_cb

* Copilot Suggestions

* Suggestions about Redundancy

* Suggestions about Redundancy

* [FEAT] Add `bitstream` module in `lib_ccxr` (#1649)

* feat: Add bitstream module

* run code formatters

* Run cargo clippy --fix

* Run cargo fmt --all

* refactor: remove rust pointer from C struct

* feat: Add bitstream module

* run code formatters

* Run cargo clippy --fix

* Run cargo fmt --all

* refactor: remove rust pointer from C struct

* Added Bitstream to libccxr_exports

* Minor Formatting Issue

* Bitstream: Removed redundant CType

* bitstream: recommended changes for is_byte_aligned

* bitstream: recommended changes for long comments

* bitstream: comment fix

* bitstream: removed redundant comparison comments

---------

Co-authored-by: Deepnarayan Sett <depnra1@gmail.com>
Co-authored-by: Deepnarayan Sett <71217129+steel-bucket@users.noreply.github.com>

* demuxer: minor formatting changes

* Demuxer: Changes to mistakes in CHANGES.txt

* Demuxer: Removed extra newline in ccextractor.c

* Demuxer: Changes to Encoding resolved

* Demuxer: Moved CCX_NOPTS to common structs and some changes to Demuxer Data regd. MPEG_CLOCK_FREQ

* some refactoring to CCX_NOPTS

* Demuxer: Minor Mistake regarding CHANGES.txt

* Demuxer: Unit test rust failing because of CCX_NOPTS

* Demuxer: changed common_structs to common_types

* Demuxer: Removed redundant libraries from Cargo.toml and moved tempfile to dev-dependencies

* Demuxer: Removed to_vec function and renamed PSIBuffer/PMTEntry from_ctype functions

* Demuxer: Renamed Stream_Type, improved time complexity of the default() function and removed redundant comments

* Demuxer: Removed two repeated code blocks and removed redundant comments

* Demuxer: Removed two code blocks

* Demuxer: Review Changes

* Demuxer: Removed redundant tests

* Update src/rust/src/demuxer/demux.rs

Co-authored-by: Prateek Sunal <prtksunal@gmail.com>

* Demuxer: Errors due to Rebase

* Demuxer: Removed get_stream_mode

* Demuxer: Errors due to rebasing and removing redundant CType Functions

* Demuxer: Failing ES regressions

* Demuxer: MythTV failing regression

* Demuxer: Removed redundant comments

* Demuxer: Unplugged ES for now

* Demuxer: Replugged in ES

* Demuxer: Formatting error

* Demuxer: Windows failing CI

* Demuxer: Windows failing CI

* Demuxer: Windows failing Regressions

* Demuxer: Formatting

* Demuxer: Minor Cargo Clippy change

* Demuxer: running regressions again

* Demuxer: Cargo Lockfile Change

* Demuxer: running regressions again

* Demuxer: running regressions again

---------

Co-authored-by: Swastik Patel <swastikpatel29@gmail.com>
Co-authored-by: Prateek Sunal <prtksunal@gmail.com>
2025-12-08 22:26:20 +05:30
Carlos Fernandez Sanz
810e02f7fa Fix Issue#1235: Sanitize XML comment to prevent invalid token errors (#1783)
Original description:

Pull Request Description:
Added logic to detect and replace any occurrence of "--" in comments with a single "-" to ensure valid XML.
Used a bulk write ('fwrite') to efficiently handle portions of the string that don't contain invalid sequences.
Ensured that comments are written correctly without altering the original structure of the code.
Updated function 'write_spucomment' to handle the sanitization process efficiently.
2025-12-07 22:41:11 -08:00
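The sanitization described in 810e02f7fa can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the project's `write_spucomment`: the helper name `write_sanitized_comment` is hypothetical, and collapsing a whole run of dashes (rather than only pairs) is an assumption made here so that inputs such as `---` also come out without a `--` sequence. Clean stretches of the string are written with a single `fwrite` call each, as the description suggests.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `comment` to `out`, replacing every run of
 * two or more '-' with a single '-'. Returns 0 on success, -1 on a write
 * error. */
static int write_sanitized_comment(FILE *out, const char *comment)
{
	assert(out != NULL && comment != NULL);

	const char *start = comment; /* start of the pending clean chunk */
	const char *p = comment;

	while (*p != '\0')
	{
		if (p[0] == '-' && p[1] == '-')
		{
			/* Bulk-write the clean chunk plus the first dash of the run. */
			size_t len = (size_t)(p - start) + 1;
			if (fwrite(start, 1, len, out) != len)
				return -1;
			/* Skip the remaining dashes of the run. */
			while (*p == '-')
				p++;
			start = p;
		}
		else
		{
			p++;
		}
	}

	/* Write whatever is left after the last dash run. */
	size_t tail = (size_t)(p - start);
	if (tail > 0 && fwrite(start, 1, tail, out) != tail)
		return -1;
	return 0;
}
```

A caller would emit the `<!--` and `-->` delimiters itself and pass only the comment text to the helper. Single dashes are left untouched, so ordinary hyphenated words survive unchanged.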
dependabot[bot]
2720448e87 chore(deps): bump actions/checkout from 4 to 6 (#1766)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-07 18:54:15 -08:00
dependabot[bot]
5fceac5e90 chore(deps): bump actions/upload-artifact from 4 to 5 (#1757)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 4 to 5.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-07 18:53:35 -08:00
Carlos Fernandez Sanz
60ae6fb760 [FIX] Fix Windows build by updating vcpkg baseline and other packages (#1778)
* [FIX] Update vcpkg baseline and use forked rsmpeg for FFmpeg 7

Update vcpkg baseline from Feb 2024 to Dec 2025 to resolve libxml2
hash mismatch. GitLab regenerates archives dynamically, causing
SHA512 verification failures with old baselines.

Switch to CCExtractor's forked rsmpeg (github.com/CCExtractor/rsmpeg)
which pins rusty_ffmpeg to 0.16.4 for FFmpeg 7.1 compatibility.
This provides consistent FFmpeg 7 support across all platforms.

Changes:
- Update vcpkg baseline in workflow and vcpkg.json
- Use forked rsmpeg from git for all platforms
- Use ffmpeg7_1 feature instead of ffmpeg6/ffmpeg8
- Use link_vcpkg_ffmpeg for Windows

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Enable use_prebuilt_binding feature for rsmpeg

This ensures consistent FFmpeg 7 API signatures across all platforms,
regardless of the system FFmpeg version installed. Ubuntu's FFmpeg 6
has different function signatures than FFmpeg 7.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Standardize on FFmpeg 6.1.1 across all platforms

Use FFmpeg 6 consistently:
- Linux: uses apt packages (libavcodec-dev, etc.) which provide FFmpeg 6
- Windows: vcpkg baseline pinned to FFmpeg 6.1.1 (commit 5a58e645)
- macOS: uses system FFmpeg 6

This ensures consistent behavior and API compatibility across all platforms.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Use platform-appropriate FFmpeg versions

- Linux: FFmpeg 6 (from Ubuntu apt packages)
- Windows: FFmpeg 7 (from vcpkg with recent baseline)
- macOS: FFmpeg 7 (from Homebrew)

This fixes the Windows build which was failing due to vcpkg
baseline hash mismatch for libxml2 in older baselines.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Use FFmpeg 7 with prebuilt bindings for Linux

Use ffmpeg7 feature everywhere and use_prebuilt_binding for Linux
to ensure FFmpeg 7 API signatures regardless of system FFmpeg version.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix library names for Windows build with updated vcpkg

- Update leptonica library name from 1.83.1 to 1.85.0
- Update tesseract library name from tesseract53 to tesseract55 (v5.5.1)
- Update libiconv library names: charset.lib -> libcharset.lib, iconv.lib -> libiconv.lib

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix iconv library name for vcpkg static build

vcpkg libiconv for x64-windows-static produces only iconv.lib
with charset functionality bundled in, not separate libcharset.lib
and libiconv.lib files.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix iconv library names: use charset.lib and iconv.lib

Restores the correct vcpkg libiconv library names:
- charset.lib (libcharset library)
- iconv.lib (libiconv library)

These are the original names from vcpkg libiconv package for x64-windows-static.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* try: New Hash

Updated the builtin baseline hash for ccextractor.

* Remove charset.lib and iconv.lib from dependencies

The project has its own win_iconv.c implementation in src/thirdparty/win_iconv/
which provides iconv functionality. With the updated vcpkg baseline (ab2977be),
the libiconv library doesn't produce charset.lib or libcharset.lib files.

FFmpeg is also built with --disable-iconv in this vcpkg configuration, so
the external iconv libraries are not needed by any of the vcpkg dependencies.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Deepnarayan Sett <71217129+steel-bucket@users.noreply.github.com>
2025-12-07 13:20:41 -08:00
dhanush varma
c9d80e12b8 bump: update MSRV from 1.54.0 to 1.87.0
- Update all build configuration files to require Rust 1.87.0+
- Add clippy.toml with MSRV configuration as requested
- Maintain modern Rust features like is_multiple_of()
- Fixes build compatibility issue #1765
2025-11-23 00:10:04 +05:30
dhanush varma
a0aa9e4616 fix(rust): revert is_multiple_of to maintain MSRV 1.54.0
- Reverts is_multiple_of(2) to stable % 2 == 0 check to maintain
  compatibility with Rust 1.54.0 (project MSRV)
- Adds clippy.toml with msrv = '1.54.0' to prevent Clippy from
  suggesting APIs that aren't available in the MSRV

Fixes: #1765
2025-11-21 22:39:26 +05:30
dhanush varma
1515f5c1be build: add tesseract library linking for hardsubx feature
Fixes #1719 - build was failing with --enable-hardsubx due to missing
tesseract library linking. Added pkg_check_modules for tesseract and
leptonica in the HARDSUBX section of CMakeLists.txt.

Tested with: cmake -DWITH_HARDSUBX=ON -DWITH_OCR=ON -DWITH_FFMPEG=ON
2025-11-08 11:42:54 +05:30
Prateek Sunal
42d750950a [FIX] add mac-ocr-hardsubx workflow & ffmpeg variants support (#1745)
## Fix
- Update params and their docs

## Mac OS:
- Fix FFMpeg, tesseract compilation
- Re-add Mac os build hardsubx workflow

## FFMpeg used in workflow:
- MacOS: `8.*`
- Windows: `6.*` (pinned VCPKG supports this)
- Linux: `6.*` (Latest ubuntu runner supports this)
2025-11-03 23:47:42 +05:30
Deepnarayan Sett
5338c15f8d fix: Cargo Clippy failing on 1.91 (#1758) 2025-10-31 23:38:10 -07:00
Hridesh MG
ee232b5ded bump version 0.94 -> 0.95 (#1751) 2025-10-26 20:19:55 +05:30
pszemus
654d00a54e [FIX] Rust: fix unsetting source udp address when not specified by the user (#1750)
* Rust: fix unsetting source udp address when not specified by the user

* Rust: Fix `--udp [[src@]host:]port` parameter
2025-10-22 12:44:55 +05:30
pszemus
d86ee721df Rust: fix setting psm (#1752) 2025-10-22 12:44:45 +05:30
Chandragupt Singh
da03c1ec9d Fix ARM64 build: c_char initialization (#1756) 2025-10-22 12:44:34 +05:30
Deepnarayan Sett
ebd8252b88 Fix: Rust Clippy failing on 1.90 (#1753)
* Fix: Rust Clippy failing on 1.90

* Fix: Format Sourcecode in ES
2025-09-29 19:29:05 -07:00
rboy1
1c7e2a0995 [FIX] Fixed issue with cross compiling using MINGW-w64 (#1731)
* Fixed issue with cross compiling using MINGW-w64

* Update ts_tables_epg.c

* Update ccx_encoders_common.h

* Update ccx_common_platform.h

* Update ts_tables_epg.c

formatting changes as recommended by the clang test

---------

Co-authored-by: Prateek Sunal <prtksunal@gmail.com>
2025-09-13 23:08:14 +05:30
Hridesh MG
fb6a8301f6 fix: ocr luminance calculation fix (#1746) 2025-09-13 23:04:06 +05:30
pszemus
f2168b4c79 dockerfile: fix gpac version to 2.4.0 (#1747)
GPAC renamed its libraries to `libgpac.so.13`, causing the image build to fail:

```
Error: building at STEP "COPY --from=builder /usr/local/lib/libgpac.so.12 /usr/local/lib/": checking on sources under "/home/pszemus/.local/share/containers/storage/overlay/faa4f2b5c39251a5cf42a97234d2d5652336a2388c96a64d85fc1922c4c43a71/merged": copier: stat: "/usr/local/lib/libgpac.so.12": no such file or directory
```
so let's fix the gpac version to the latest release (2.4.0)
2025-09-13 23:03:21 +05:30
Deepnarayan Sett
24f718427f [Rust] Fixes to Net Module (#1725)
* chore(cargo): Add dependencies

* feat: Create new module `net` in lib_ccxr

* feat: Add block related functionality in `block.rs`

* feat: Add `target.rs` module for sending data blocks related functions

* feat(modules): Add all necessary modules

* feat: Add `source.rs` module for reading data blocks related functions from source

* feat: Add C equivalent functions in rust

* feat(module): Add `net` module in `libccxr_export`

* chore(cargo): Update Cargo.lock

* feat: Add C equivalent code in `libccxr_exports` & use in `networking.c`

* chore: Remove unused imports

* chore(clippy): Fix clippy warnings

* Net Module: Fixes in parser.rs - removed an extra check

* Net Module: Fixes in block.rs - fixed formatting issues

* Net Module: Fixes in source.rs - rewrote UDP implementation and a few other fixes

* Net Module: Fixes in target.rs - fixed formatting issues

* Net module: Rebasing and formatting changes

* Net module: Clashing names after rebase

* Net module: Clippy errors

---------

Co-authored-by: IshanGrover2004 <groverishan2004@gmail.com>
2025-09-06 22:11:17 +05:30
Deepnarayan Sett
c2a1f0d91f [Rust]Ported ES Module to Rust (#1736)
* Ported ES Module to Rust

* Windows Failing CI

* ES module: Clippy changes

* ES module: Cmake failing CI

* ES module: Cmake failing CI

* ES Module: Fixed mistake in read_gop_info

* ES Module: Minor mistakes in pic.rs and seq.rs

* ES Module: Goptime regression failing

* ES Module: Windows failing CI

* ES Module: ASCII value change in userdata.rs

* ES Module: Formatting issues
2025-09-06 22:11:03 +05:30
Deepnarayan Sett
12a27f34a0 [Rust]Ported AVC Module to Rust (#1730)
* AVC Module: ported AVC Module to Rust

* AVC module: Minor semantic changes

* AVC Module: Failing CI

* AVC Module: SIMD Optimisations

* AVC Module: Optimization in SEI

* AVC Module: removed panic
2025-09-06 20:18:20 +05:30
Deepnarayan Sett
ba59eb0887 [FEAT] Removed C code already ported to Rust (#1738)
* Removal: Removed redundant C code already ported to Rust

* Removal: C formatting

* Removal: More Removal and CI issues in Mac

* Removal: CI issues in Mac

* Removal: Changes due to Rebase

* Removal: Failing CI on mac

* Removal: Failing regression test on dvdraw
2025-09-06 20:16:39 +05:30
Hridesh MG
3f441150b4 Fix Hardsubx OCR (#1741)
* fix: hardsubx segmentation fault

* fix: hardsubx garbage output

* chore: enable hardsubx on test builds
2025-09-02 13:58:02 +05:30
Hridesh MG
f09b6ff446 fix: ocrlang argument not working (#1742) 2025-09-02 03:40:23 +05:30
Hridesh MG
8c23447d35 Merge pull request #1740 from hrideshmg/fix_windows_dvb
Fix DVB Regressions on windows
2025-09-02 03:17:44 +05:30
Deepnarayan Sett
4b5f68a6a4 [FEAT] Remove share module (#1737)
* replaced nanomsg with nanomsg_sys

* feat: Share Module - squash commits

* Share Module: Added Documentation

* Share Module: Removed Sharing Service

* Share: formatting issues

* Share: failing CI

* Share: failing CI

* Share: Removed protobuf

* Share Module: Update CHANGES.txt

* Share Module: Update Cargo.lock

* Share Module: Update CHANGES.txt

* Share Module: Update Cargo.toml

* Share Module: Update Cargo.toml
2025-08-24 23:23:44 +05:30
dmo
25a447d42e Fix build with ffmpeg 8 (#1739) 2025-08-24 23:22:21 +05:30
Hridesh MG
7eba462b67 fix: unicode encoding regression (#1733) 2025-08-24 20:40:10 +05:30
Hridesh MG
a34ba0f6b7 fix: rust bitstream segfault (#1732) 2025-08-24 20:37:30 +05:30
rboy1
1ac3f05765 [FIX] Regression bug failing to compile with ENABLE_FFMPEG (#1728)
* Fix hardsubx_decoder.c compilation with ENABLE_FFMPEG

Fix unresolved function reference when compiling with ENABLE_FFMPEG

* Fix regression compilation ffmpeg_intgr.c to support ffmpeg 5

Fix regression bug for compiling with ENABLE_FFMPEG and ffmpeg 5, introduced in https://github.com/CCExtractor/ccextractor/issues/1418

* Update CHANGES.TXT

* Update ffmpeg_intgr.c

Update for changes to FFMPEG 5 API
2025-08-24 20:34:39 +05:30
Hridesh MG
39e051b731 fix: dvd regressions (#1714)
* fix: dvd regressions

* chore: fix clippy errors
2025-08-18 20:07:10 +05:30
Hridesh MG
7d95b0574d fix: CEA-708 segmentation faults on MP4 files (#1729)
* fix: CEA-708 segmentation faults on MP4 files

* chore: fix clippy errors
2025-08-18 20:04:48 +05:30
Hridesh MG
6300bb7bca refactor: remove api structures (#1722)
* refactor: remove api structures

* docs: add change to changes.txt
2025-08-11 07:48:38 +05:30
Deepnarayan Sett
afde4d601f feat(rust): Added Encoder Module (#1710)
* Added Encoder Module

* Encoder: Windows Compatibility

* Encoder: C formatting

* Encoder: recommended changes to the encoding module - logic reduction

* Encoder: Minor stylistic change

* Encoder: Review changes, renamed Line21 to Ascii

* Encoder: Slight modification in C version of write_cc_buffer_as_simplexml

* Encoder: Renamed 2 files

* Encoder: Minor Capitalization Change

* Encoder: Review Suggestions
2025-08-06 14:44:13 +05:30
Ari1009
5a016d09b1 fix: MCC encoder 16-bit sequence 2025-07-29 13:25:09 +05:30
Hridesh MG
b63a29cd2e fix: elementary stream regressions 2025-07-24 12:36:14 +02:00
Swastik Patel
81fdecd5af [FEAT] Add bitstream module in lib_ccxr (#1649)
* feat: Add bitstream module

* run code formatters

* Run cargo clippy --fix

* Run cargo fmt --all

* refactor: remove rust pointer from C struct

* feat: Add bitstream module

* run code formatters

* Run cargo clippy --fix

* Run cargo fmt --all

* refactor: remove rust pointer from C struct

* Added Bitstream to libccxr_exports

* Minor Formatting Issue

* Bitstream: Removed redundant CType

* bitstream: recommended changes for is_byte_aligned

* bitstream: recommended changes for long comments

* bitstream: comment fix

* bitstream: removed redundant comparison comments

---------

Co-authored-by: Deepnarayan Sett <depnra1@gmail.com>
Co-authored-by: Deepnarayan Sett <71217129+steel-bucket@users.noreply.github.com>
2025-07-07 04:41:31 +05:30
Deepnarayan Sett
099fa059c7 [FIX] 134 Codes in XDS and General Tests (#1708)
* Made pointers valid in Unit Tests of Decoder

* fix: test_do_cb

* Copilot Suggestions

* Suggestions about Redundancy

* Suggestions about Redundancy
2025-07-07 04:38:45 +05:30
Hridesh MG
e663eca763 fix: XDS segmentation faults (#1707)
* fix: XDS segmentation faults

* fix: memory leaks in unit tests for service decoder
2025-06-29 20:12:47 -07:00
Hridesh MG
77b93e5ced fix: cargo tests failing on windows (#1704) 2025-06-29 23:07:09 +05:30
Deepnarayan Sett
2260165682 Fixed Clippy Errors on 1.88 (#1706) 2025-06-26 16:35:44 -07:00
Hridesh MG
715597e325 fix: trigger windows builds in PRs (#1705) 2025-06-26 16:34:38 -07:00
Hridesh MG
407d0f4e93 fix: windows builds not triggering on rust changes 2025-05-20 09:43:27 +02:00
Deepnarayan Sett
9d1718f85f Fix Unit Test Rust based on the new changes on Rust 1.86.0 (#1694) 2025-05-18 19:39:11 +05:30
Hridesh MG
5b327c78fa fix: replace iconv with encoding_rs 2025-05-04 23:00:08 +02:00
Yasser
17247daf8b [IMPROVEMENT] Refactor and optimize Dockerfile (#1696)
* [FIX] Corrected bitness check for 64-bit systems

* Improve Dockerfile: cleanup, parallel build, and remove redundancies
- Replaced cd with WORKDIR for clarity and Docker best practices.
- Removed unused LIB_CLANG_PATH export, as it only affected a single build layer; the library is automatically detected during build.
- Parallelized the GPAC build using make -j$(nproc).
- Removed redundant CMD instruction, as ENTRYPOINT already defines the container's execution command.

* [DOCS] Update CHANGES.TXT for Dockerfile improvements

---------

Co-authored-by: AhmedYasserrr <ahmdyasrj@gamil.com>
2025-04-27 11:30:25 -07:00
Vatsal Keshav
888ffa4ee0 fix prepoc dir for compilation on mac silicon - autogen, cmake, build.command (#1688)
Co-authored-by: vats004 <=>
2025-04-10 22:44:08 -07:00
Yasser
3851d24315 [FIX] Corrected bitness check for 64-bit systems (#1680)
Co-authored-by: AhmedYasserrr <ahmdyasrj@gamil.com>
2025-03-30 11:15:59 -07:00
Vatsal Keshav
e597f01994 fix(rust): replaced deprecated std::intrinsics with std::ptr (#1668) 2025-03-30 00:54:53 +05:30
tank0nf
b62027a0ae [FIX] Issue#1665 Enhanced Matroska Language Tag Handling (#1671)
* fix unknown element for IETF tag

* added documentation changes

* added formatting for clang-format
2025-03-23 00:12:23 -07:00
Hridesh MG
9685ad6149 [FIX] DVB OCR: Memory Leak & Quantization Issues (#1675)
* fix: do not free ocr text before return

* fix(OCR): erode and dilate function
2025-03-22 16:53:16 -07:00
Hridesh MG
d7231d4567 fix: CMake builds failing due to outdated corrosion (#1677) 2025-03-21 19:06:14 -07:00
Hridesh MG
a84256da01 fix: debugdvb arg typo (#1673) 2025-03-14 05:39:10 -07:00
jstrot
9e2a594bca fix(ocr_bitmap): out of buffer memory copying the "last font tag" & use memmove (#1586)
* ASAN: process_spu copies overlapping buffers

* ocr_bitmap: Make sure there is enough room for the last_font_tag

* Update CHANGES.TXT

* Baseline formatting fixes

* fixup! Baseline formatting fixes

* fixup! fixup! Baseline formatting fixes

* Fix rust comment formatting

* cxx_options.copy_from_rust: Avoid "mutable reference to mutable static" warning
2025-03-11 19:21:33 +05:30
Mohd Umar Khan
fc01fa05bd Update dockerfile (#1652)
added /usr for three libs
2025-03-09 16:41:54 -07:00
hjrgrn
9ea3c9fd41 [FIX] Fix vulnerability in url crate (#1670)
* Update url crate

* Fix vulnerability discovered with `cargo-audit` by upgrading `url` crate to version `2.5.4`

* Update url crate in lib_ccxr submodule

* Fix vulnerability discovered with `cargo-audit` by upgrading `url` crate to version `2.5.4`

* Update Cargo.toml

* Update Cargo.toml with latest compatible version of every crate
2025-03-09 16:40:43 -07:00
dmo
d276fb17f7 Add support leptonica >= 1.83 (#1645) 2025-03-01 12:36:33 -08:00
Vatsal Keshav
8c90bda9a2 fix : ccxr compilation on macos (#1661) 2025-02-23 10:48:49 -08:00
Punit Lodha
27e1a3c849 Pass raw pointer to avoid mut ref to global variable warning 2025-02-22 17:50:11 +01:00
canihavesomecoffee
0912ac8de0 fix: fix clippy warning/error in lib_ccxr 2025-02-22 11:55:39 +01:00
canihavesomecoffee
65a0348b4f fix: fix 2 rust clippy warnings/errors 2025-02-22 11:55:39 +01:00
canihavesomecoffee
564795cdd3 fix: reformat C code according to latest clang-format guidelines 2025-02-22 11:55:39 +01:00
tank0nf
ffe075b1f3 [IMPROVEMENT] Clarify CEA-608/708 Subtitle Extraction Behavior #1448 (#1663)
* made changes to the file src/lib_ccx/params.c

* made changes to the help message.
2025-02-14 08:20:36 -08:00
Colin Cogle
b08c5faa74 Fix compile-time issue involving implicit declaration of mapclut_paletee() (#1648)
* Fix implicit declaration error on some systems.

This commit fixes a compile-time error regarding an implicit declaration
of mapclut_paletee() on some compilers and compiler versions.  Notably,
Arch Linux and Ubuntu 24.10 seem to be affected.

The error resolved is:

```
../src/lib_ccx/ocr.c: In function 'ocr_rect':
../src/lib_ccx/ocr.c:922:9: error: implicit declaration of function 'mapclut_paletee' [-Wimplicit-function-declaration]
  922 |         mapclut_paletee(palette, alpha, (uint32_t *)rect->data1, rect->nb_colors);
      |         ^~~~~~~~~~~~~~~
```

This was resolved by `#include`-ing "ccx_encoders_spupng.h" in the file
src/lib_ccx/ocr.c.  Thanks to GitHub user @steel-bucket for sharing the
fix in this issue's comments.

Fixes: #1646

* Update CHANGES.TXT.

Mention the fix for #1646.

Fixes: #1646
2024-11-27 11:47:41 -08:00
Ishan Grover
cbd8e27fe3 [FEAT] Add timing module in lib_ccxr (#1640)
* feat: Add new module for timings functionality

* feat: Add timing functionality in `timing.rs` module

* feat: List all module & function conversion

* chore: Clippy fixes

* feat: Equivalent `ccx_common_timing.h` functions in rust module

* feat: Add static constants & include struct in `build.rs`

* feat: Add extern C functions

* feat: Include & use rust extern functions in C

* fix: Windows build

* fix: Windows build

---------

Co-authored-by: Prateek Sunal <prtksunal@gmail.com>
2024-09-14 11:50:21 +02:00
Neo2SHYAlien
349020ece9 Add flag for Page Segmentation Modes control (#1601)
* Add flag for Page Segmentation Modes control

I added a flag --psm for controlling PSM (Page Segmentation Modes) in Tesseract. The default option (3) gives me quite bad results. When I use 6, 11, or 12 for Bulgarian, it gives me much better OCR results. I haven't tested other languages yet, but I expect improvements there as well if another mode is used.

* feat: add psm for rust parser

* fix: add psm to options

* fix: add default value of psm to 3

* fix: correct type of ocr oem

* fix(rust): use fatal! instead of exit

---------

Co-authored-by: Prateek Sunal <prtksunal@gmail.com>
2024-09-03 19:09:56 +02:00
Prateek Sunal
1a13bbb071 [FIX] Issues in Tests (#1638)
* fix: add ucla checks for millis_separator

* fix: reassign back profane and capitalization lists to c

* fix: C formatting

* fix(rust): clippy warnings
2024-09-02 22:09:17 +02:00
Prateek Sunal
90f9f0a183 [FEAT] add teletext and encoders_helpers module (#1635)
* create lib_ccxr and libccxr_exports

* add bits and levenshtein module

* add log module

* add encoding module

* add common constants module

* add time units module

* add options module

* add teletext module

* chore: remove outdated

* chore: update lock files

* chore: fix naming

* fix: reference to TeletextConfig

* fix: issue with ts_forced_program default value

* fix: use correct definition

* chore: lint warnings

* fix: example code

* fix(rust): adjust defaults, more accurate logging, use safe functions, add encoders_helper module

* fix: tests and formatting

* fix: allow hex values for streamtype

* chore: format files

* fix: naming of fields and docs

* fix: defaults for options

* fix: memory leak in vector to string

* fix(c): init logger before running parser

---------

Co-authored-by: Elbert Ronnie <elbert.ronniep@gmail.com>
2024-08-27 15:21:25 +05:30
Prateek Sunal
98a85e1be3 [rust] add options module (#1632)
* create lib_ccxr and libccxr_exports

* add log module

* add encoding module

* add common constants module

* add time units module

* add options module

* chore: update Cargo lock files

* fix: remove duplicacy

* fix: doc error

* fix: errors

* fix: remove time folder

* chore: lint fix

* chore: lint fix

* fix: errors

* fix: add time mod to utils

* fix: unreachable code

* fix: logging function

* chore: update lock file

* chore: remove duplicate comment

* feat: blend parser and options

* chore: lint fix

* chore: lint fix

* fix: imports

* fix: error in version

* chore: lint fixes

* chore: more lint fixes

* fix: error in svc

* chore: remove from options function

---------

Co-authored-by: Elbert Ronnie <elbert.ronniep@gmail.com>
2024-08-19 22:04:01 +05:30
dependabot[bot]
92f2ce0fa0 chore(deps): bump actions/cache from 3 to 4 (#1633)
Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 23:42:44 +05:30
Emily
b92ca87835 [IMPROVEMENT] Use Corrosion to build Rust code (#1630) 2024-08-12 16:35:32 +02:00
dependabot[bot]
8d4fdd7f3e Bump actions/cache from 3 to 4 (#1613)
Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 19:36:24 +05:30
dependabot[bot]
b679215752 Bump microsoft/setup-msbuild from 1.3.1 to 2.0.0 (#1614)
Bumps [microsoft/setup-msbuild](https://github.com/microsoft/setup-msbuild) from 1.3.1 to 2.0.0.
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Changelog](https://github.com/microsoft/setup-msbuild/blob/main/building-release.md)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v1.3.1...v2.0.0)

---
updated-dependencies:
- dependency-name: microsoft/setup-msbuild
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-08-12 19:34:29 +05:30
Mykhailo Yavorskyi
25e8b3642d [IMPROVEMENT] Ignore version check on mxf essence container (#1631)
* Ignore version check on mxf essence container

* Fix codestyle

* Update Changelog
2024-08-11 14:08:26 -07:00
Ishan Grover
f8001ae295 [FEATURE] Create unit test for rust code (#1615)
* feat: Add new function to allocate any zero-initialized object on the heap

* feat: Add unit tests for `decoder/commands.rs`

* docs: Mention about PR in changelogs

* feat: Add unit tests for `decoder/windows.rs`

Refactor the code and use Default where needed
Implement `PartialEq` also

* fix: Initialise tmp extern C values for easy mocking

* feat: Add unit tests for `decoder/timing.rs`

* feat: Add unit tests for `decoder/output.rs`

* feat: Add unit tests for `decoder/mod.rs`

* feat: Add unit tests for `decoder/tv_screen.rs`

* feat: Add unit tests for `lib.rs`

* fix: Failing test

* feat: [WIP] Add unit tests for `decoder/service_decoder.rs`

* feat: Add unit tests for `decoder/service_decoder.rs`

* feat: Add unit tests for `hardsubx/imgops.rs`

* feat: Add unit tests for `hardsubx/utility.rs`

* fix: cargo clippy

* fix: doctest for `lib_ccxr` module

* feat: Add test `lib_ccxr/util/mod.rs`

* feat: Add test `lib_ccxr/util/levenshtein.rs`

* feat: Add test `lib_ccxr/util/bits.rs`

* feat: Add test `lib_ccxr/time/units.rs`

* chore: Change function name

* fix: Failing of missing values `tlt_config`

* ci: Run unit test cases in `lib_ccxr` module also

* ci: Run clippy & fmt in `lib_ccxr` module also

* chore(clippy): Fix clippy warnings
2024-08-11 12:14:47 +02:00
Ishan Grover
5f9b395bc6 [FEAT] Add encoding module in lib_ccxr (#1628)
* feat: Add new module `encoding`

* feat: Add code for `encoding.rs`

A module for working with different kinds of text encoding formats

* feat: Add code for function `line21_to_utf8`

* feat: Add code for remaining todos function
2024-08-10 13:09:12 +02:00
Prateek Sunal
9340cc7df6 [rust] add parser (#1619)
* feat: unpack gpac

* fix: linux ci

* fix: mac build

* fix: remove unused [no ci]

* fix: ignore config.h [no ci]

* temp commit, will drop this soon

* fix: install gpac

* fix: gpac

* fix: formatting

* fix: preprocessor directive

* fix: comment display version for now

* fix: display dlls code

* fix: bundle vcruntime in hardsubx windows

* fix: again

* fix: errors in ci

* fix: ci

* fix: add vcruntime in additional dependencies

* fix: try to copy vcruntime after build

* fix: space in runtime library

* fix: remove for now [no ci]

* fix: things in vcxproj

* fix: ci for leptonica sys

* fix: docs

* fix: copy dlls on post build event

* fix: copy vcruntime after build

* feat: add arguments through clap

* fix: type of some arguments

* fix: "-" and "--" in comments

* fix: format files

* fix: add argument parsing till mkvlang

* fix: one todo item

* chore: lint fixes

* fix: nocodec value

* fix: for nocodec

* fix: add cfg feature for hardsubx

* feat: complete till startcreditstext

* fix: add more notes, args: option affect processed

* feat: port all till network stuff

* fix: complete almost all argument parsing

* fix: error free code

* fix: complete params port

* fix: hardsubx errors

* feat: clean up main function

* fix: pr reviews

* fix: make input,output function better

* fix: variant not used warning

* fix: warnings

* fix: all clippy warnings

* feat: add tests

* feat: add tests

* chore: lint fixes

* fix: move unit tests to correct folder

* fix: remove unnecessary files

* fix: make function for parse_args

* fix: review changes

* fix: Impl CcxOptions whenever I could

* fix: try to convert rust to c

* chore: push c code

* fix: add more rust to c conversions

* fix: use set methods for bitfield

* fix: errors

* fix: arguments parsing

* fix: all issues

* fix: many errors

* chore: lint fix

* fix: err

* fix: unsafe function error

* fix: unsafe warning

* fix: safety lint

* chore: add docs

* fix: windows build

* fix: function

* fix: dependencies

* fix: set_binary_mode

* chore: lint fix

* fix: set_binary_mode for windows

* fix: error

* fix: undefined reference error

* chore: remove comment

* fix: output field

* chore: fix lint

* fix: ru1, ru2, ru3

* fix: undef before

* fix: parameter and update deps

* chore: update vcpkg

* feat: add release-with-debug profile

* fix: uncomment code

* fix: update visual studio to 2022

* chore: update docs

* fix: use default vcpkg

* fix: caching logic on release ci

* fix: vcpkg caching

* fix: add setup vcpkg

* chore: remove unnecessary formatting

* fix: Always write 2 bytes for UTF-16BE

* fix: formatting

* feat: add rest of the notes to bring continuity

* fix: remove extra line

* fix: add hardsubx note

* fix: source code format error

* chore: lint fixes acc to rustfmt

* feat: add unit test ci

* fix: conversion of strings, add file queue handling

* fix: decoder cfg

* fix: update dependencies

* chore: lint fix

* chore: add safety doc

* fix: default value for CcxOptions

* fix(rust): default value for teletext

* fix: leptonica version for windows

* fix: format errors

* fix: workflow

* Revert "fix: leptonica version for windows"

This reverts commit 461ef55e7b.

* fix: pin ffmpeg to 6 for mac

* fix(parser): default values and unwrap's

* fix(parser): hardsubx fixes

* chore(parse): lint fixes

* fix(windows): switch back to sdk 2019

* fix(workflow): windows workflow revert

* fix(windows): revert to old files which were working before

* fix(workflow): pin vcpkg packages

* chore(rust): downgrade leptonica

* fix(windows): move vcpkg.json to correct place

* fix(windows): improve vcxproj

* fix(windows): workflow

* fix(windows): workflow

* fix(windows): workflow clone from vcpkg everytime

* fix(workflow): error

* fix(workflow): don't skip building vcpkg

* fix: remove depth from vcpkg

* temporary commit

* fix(windows): pin gpac and use local vcpkg manifest properly

* fix(windows): install vcpkg dependencies manually

* fix(windows): update dll names

* fix(windows): dependencies copy

* fix(windows): don't continue on error for release

* fix(macos): build ffmpeg for mac workflow

* fix: move ffmpeg to current workspace

* fix: re-add profile for windows

* fix: pkg config for mac

* fix(mac): use ffmpeg@6 from brew

* fix(macos): there is no ffmpeg_prebuilt

* fix(macos): specify ffmpeg pkg config

* fix(macos): globally define pkg config

* fix(macos): add ffmpeg include and libs dir

* fix(macos): include ffmpeg headers in makefile

* fix: include ffmpeg libraries and include directories

* fix: try to manually specify ffmpeg header in rust

* fix: also include leptonica headers

* fix: leptonica name

* fix: test

* fix: string null when output_filename is empty

* fix: error

* fix: remove cflags

* fix(mac): disable cmake ocr hardsubx

* chore: update gitignore

* fix: null if string is empty

* fix: allow --in

* chore: bump version to 1.0 in rust

* chore: add space to trigger sp

* fix: don't panic with rust

* fix: add double dashes to indicate parameters

* chore: update CHANGES.txt

* fix: test

* fix(workflow): update workflow name

* fix(rust): linux output_filename in sampleplatform

* fix(rust): parser default values

* fix(rust): exit with MalformedParameter instead of panic

* fix(decoder): revert always write 2 bytes

* chore(rust): format

* chore: update lock file

* fix(test): test lib_ccxr and rename to test

* fix(mac): remove failing cmake_ocr test

* fix: ci errors

* fix: feature related changes

* fix: trim down default features

* fix: don't check clippy for all features
2024-08-10 12:55:21 +02:00
Ishan Grover
90204d4cc6 [DOCS] Add C to Rust code migration guide (#1629)
* docs: Add C-to-Rust migration guide docs

* docs: Update suggested typos in `docs/Rust_migration_guide.md`

Co-authored-by: Punit Lodha <48253287+PunitLodha@users.noreply.github.com>

---------

Co-authored-by: Punit Lodha <48253287+PunitLodha@users.noreply.github.com>
2024-08-07 12:08:24 +02:00
Ishan Grover
34bb9dd20d [FEATURE]: Create Docker image for CCExtractor (#1611)
* docs: Create a README for docker image usage

* docs: Update `COMPILATION.md` for adding docker instruction

* docs: Add detailed docker building & usage guide

* feat: Add dockerfile

* feat: Make dockerfile to build CCExtractor

* fix: dockerfile

* feat: Optimize docker image size

* docs: fix some commands usage

* docs: Mention docker image creation in CHANGES.txt

* docs: Update readme to remove dockerhub method
2024-07-16 20:17:57 -07:00
Ishan Grover
8d9bf42be2 [FEAT] Add time units module in lib_ccxr (#1623)
* chore: Add cargo dependencies

* feat: Make time module in `lib_ccxr`

* feat: Add conversion guide in `time/mod.rs` module & Create `units` module

* feat: Add time units code

* feat: Make time module in `lib_ccxr/util` & Add helper function

* feat: Add utils time related functions

* feat: Add extern functions in `libccxr_exports`

* feat: Add extern functions in C and use in proper place

* docs: Mention in Changelogs
2024-07-17 00:04:48 +02:00
Ishan Grover
8e4c07ed97 [FEAT] Add bits and levenshtein module in lib_ccxr (#1627)
* feat: Add 2 new modules

* feat: Add `levenshtein` module & code

* feat: Add `bits` module & code

* feat: Add `extern "C"` function which are equivalent in C-RUST

* feat: Call extern ccxr_ functions in C code

* docs: Mention in Changelogs
2024-07-16 20:00:15 +02:00
Ishan Grover
cf9c9dde53 [FEAT] Add constants module in lib_ccxr (#1624)
* feat: Add common module

common module is made for all `ccx_common_*` files

* feat: Add constants module within common module

Used to have all constants enums listed in ccx_common_constants C file

* feat: Add all constants, enums in Rust equivalent to `ccx_common_constants` C file

* docs: Mention in Changelogs

* docs: Add more conversion data
2024-07-16 18:12:03 +02:00
Ishan Grover
f5da158935 [FEAT] Add log module in lib_ccxr (#1622)
* chore: Add bitflags crate as dependency

* feat: Add function to initialize Rust logger using options in C

* feat: Add new module `log`

* refactor: Add ccx_s_option into list of bindgen struct

* feat: Add Initialize logger function

* feat: All logging functions & macros

* chore: Fix clippy

* docs: Mention in Changelogs

* chore: format issue fix

* fix: Remove activity_header from rust & use initially to print in C

* refactor: Remove debugging statements

* fix: Add `\n` in info!
2024-07-16 17:45:24 +02:00
Ishan Grover
f12f12b916 [FEAT] Create lib_ccxr and libccxr_exports (#1621)
* create lib_ccxr and libccxr_exports

* chore: Fix bindgen crate version

* chore: Fix rsmpeg crate version

* docs: Add PR info in Changelogs

---------

Co-authored-by: Elbert Ronnie <elbert.ronniep@gmail.com>
2024-07-03 10:31:39 -07:00
Ishan Grover
d6ccf1bfcb [FEATURE] Port 708 decoder encoding module to RUST (#1607)
* feat: Add `decoder/encoding` new module

This `decoder/encoding.rs` file will contain the content of
`lib_ccx/ccx_708_decoder_encoding.c` file

* feat: Add encoding functions

* feat: Add conditional compilation to include Rust functions

* fix: conditional compilation logic

* refactor: Use of match statement instead of if-else

* fix: Calling C function for rust

* feat: Enable `derive_default` feature
2024-05-29 09:28:24 +02:00
Prateek Sunal
8e3b145477 [FIX (Windows)] CI build (#1612) 2024-05-28 21:13:05 +02:00
Ishan Grover
5748042f6d [FIX] Unexpected behavior of get_write_interval (#1609)
* fix: Unexpected behavior of get_write_interval

Addresses issue #1606

* docs: Add changes to `CHANGES.TXT`
2024-05-24 11:20:48 -07:00
Sberm
3f504412f5 Add gpac package in compilation guide on Archlinux (#1605) 2024-04-04 21:07:39 -07:00
superbonaci
312d10c001 Update COMPILATION.MD: Add gpac-devel dependency for RHEL/Fedora (#1602) 2024-03-24 10:34:20 -07:00
Ishan Grover
f08febfd61 [FEATURE] Create linux AppImage for building CCExtractor (#1592)
* feat!: Add script for building AppImage

* chore(delete): Remove `build-static.sh` file

* refactor: Add link for logo photo

* chore: Replace dead link
2024-03-03 14:59:44 -08:00
Ishan Grover
89a12a7dd0 Bump rsmpeg to latest version for ffmpeg bindings (#1600)
* chore(deps): bump `rsmpeg` to latest version

* docs: Mention in CHANGES.TXT
2024-03-03 08:58:20 -08:00
Ishan Grover
2ada36d50e [FEATURE] Add SCC support to CEA-708 decoder (#1595)
* feat: Add timing functions for SCC format in C & Rust

* feat: Add SCC support to Rust 708 decoder

* feat: Add SCC support to C 708 decoder

* docs: fix symbol in scc_time format

* chore: clippy fixes

* docs: Add new feature in Changelog

* fix: update SCC timing functions according to need

* feat: Add new member(old caption end time) for overlapping situations

* fix: update SCC timing functions according to need

* feat: Add support for overlapping captions situations

* fix: frame formula for timings

* feat: Add support for orientation of subtitles in C

by adding necessary labels needed for it

* feat: Add support for orientation of subtitles in Rust

by adding necessary labels needed for it

* docs: Add info for scc labels

* chore: clippy fixes

* docs: Add what `add_needed_scc_labels` do and correct parameters name
2024-02-17 17:58:01 -08:00
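The SCC timing helpers mentioned in 2ada36d50e are not shown in this log. As a rough sketch only, the following Rust function formats a millisecond timestamp as an `HH:MM:SS:FF` timecode. The name `scc_timecode`, the 29.97 fps frame rate, and the truncating frame formula are assumptions made for illustration, not the project's implementation; true drop-frame timecode needs additional handling that is omitted here.

```rust
/// Hypothetical helper (not the project's API): formats `ms` milliseconds as
/// an `HH:MM:SS:FF` timecode, assuming a nominal rate of 29.97 frames/second.
fn scc_timecode(ms: u64) -> String {
    let hours = ms / 3_600_000;
    let minutes = (ms / 60_000) % 60;
    let seconds = (ms / 1_000) % 60;
    // Assumed frame formula: fraction of the current second times 29.97,
    // truncated. Integer arithmetic (2997 / 100000) avoids float rounding.
    let frame = (ms % 1_000) * 2_997 / 100_000;
    format!("{:02}:{:02}:{:02}:{:02}", hours, minutes, seconds, frame)
}
```

With these assumptions the frame field always stays in the range 0 to 29, since the sub-second remainder is at most 999 ms.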
dependabot[bot]
2d2a210c54 Bump actions/cache from 3 to 4 (#1589)
Bumps [actions/cache](https://github.com/actions/cache) from 3 to 4.
- [Release notes](https://github.com/actions/cache/releases)
- [Changelog](https://github.com/actions/cache/blob/main/RELEASES.md)
- [Commits](https://github.com/actions/cache/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-04 20:02:21 +01:00
dependabot[bot]
deaa4a68e0 Bump microsoft/setup-msbuild from 1.3.1 to 2.0.0 (#1593)
Bumps [microsoft/setup-msbuild](https://github.com/microsoft/setup-msbuild) from 1.3.1 to 2.0.0.
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Changelog](https://github.com/microsoft/setup-msbuild/blob/main/building-release.md)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v1.3.1...v2.0.0)

---
updated-dependencies:
- dependency-name: microsoft/setup-msbuild
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-04 19:59:52 +01:00
Prateek Sunal
f449d06cd1 chore: lint fixes acc to rustfmt (#1598) 2024-02-04 19:57:04 +01:00
Asher
c550726778 Typo in compilation docs (#1588)
* Typo in compilation docs

* [Fix] Deprecated leptonica name

With version 1.84.0, the library is changed from `liblept` to
`libleptonica`.
http://www.leptonica.org/source/version-notes.html
2024-01-15 00:50:36 -08:00
Prateek Sunal
bce63b88dc [FIX] Compatibility of Arguments in C (#1564)
* feat: breaking all parameters

* fix: some parameters

* fix: many things

* fix: error

* fix: -h

* fix: more parameters

* fix: add dash to help commands

* fix: help for output-field

* fix: single dash

* fix: --out and --in

* fix: move notes to the end of help menu

* fix: final changes to notes

* fix: extra spacing

* fix: wrong formatting of parenthesis
2024-01-14 09:47:13 -08:00
dependabot[bot]
63a259a313 Bump actions/checkout from 3 to 4 (#1567)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-09 20:01:49 +01:00
dependabot[bot]
eef2591c25 Bump actions/upload-artifact from 3 to 4 (#1587)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 3 to 4.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-09 20:00:59 +01:00
dependabot[bot]
870e8bb6ac Bump AButler/upload-release-assets from 2.0 to 3.0 (#1577)
Bumps [AButler/upload-release-assets](https://github.com/abutler/upload-release-assets) from 2.0 to 3.0.
- [Release notes](https://github.com/abutler/upload-release-assets/releases)
- [Commits](https://github.com/abutler/upload-release-assets/compare/v2.0...v3.0)

---
updated-dependencies:
- dependency-name: AButler/upload-release-assets
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-08 19:33:21 +01:00
Vitaly Lysenkov
d2f17deb2c **[FIX]** fix infinite loop in MP4 file type detector and processor (#1566)
* Update stream_functions.c: fix MP4 file type detector

On bad inputs containing, for example, the byte sequence "ff ff ff ff 6d 65 74 61" within the first 1 MiB, `detect_stream_type` entered an infinite loop: "ff ff ff ff" was interpreted as the length of the candidate "meta" MP4 box, which caused a size_t overflow inside `isValidMP4Box`. The overflow pointed `nextBoxLocation` at the previous byte, so the same "meta" box was processed again.

* Update CHANGES.TXT

* Treat a candidate MP4 box as invalid instead of bailing out

* Fix stuck mp4 processing in `process_avc_sample`

On corrupted inputs it could read data past the sample end and also get stuck in an infinite loop.

* Fix the stats code to not count zero-sized NALs and avoid dereferencing memory past the NAL end

* Add comment.

* Format changes
2024-01-08 19:30:14 +01:00
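The overflow described in d2f17deb2c can be illustrated with a minimal Rust sketch. This is not the C code from `stream_functions.c`; the function name `next_box_location` and the slice-based interface are assumptions made for illustration. The idea is to compute the next box offset with checked arithmetic and to treat the candidate box as invalid when the addition overflows or the length cannot advance the scan. As a simplification, a box that extends past the end of the buffer is also rejected here.

```rust
/// Hypothetical helper (not the project's API): returns the offset of the box
/// that follows the candidate box at `pos`, or `None` if the candidate is
/// invalid.
fn next_box_location(buf: &[u8], pos: usize) -> Option<usize> {
    // A box header is a 4-byte big-endian length followed by a 4-byte type.
    let header = buf.get(pos..pos.checked_add(8)?)?;
    let len = u32::from_be_bytes([header[0], header[1], header[2], header[3]]) as usize;
    // A length shorter than the header itself could never move the scan forward.
    if len < 8 {
        return None;
    }
    // checked_add returns None instead of wrapping around on overflow.
    let next = pos.checked_add(len)?;
    // Simplification for this sketch: reject boxes that run past the buffer.
    if next > buf.len() {
        return None;
    }
    Some(next)
}
```

Because the function only ever returns an offset strictly greater than `pos`, a caller that loops on its result cannot revisit the same box.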
Om Thorat
376ff83161 [FIX] Compilation.md - Added a note for Ubuntu 23.10 (#1581)
* [FIX] Added a note for Ubuntu 23.10

libgpac-dev isn't available on Ubuntu 23.10 (Mantic); added a note with instructions to build it from source instead.

* [FIX] Added build instructions for Ubuntu 23.10 and later

libgpac-dev isn't available in Ubuntu 23.10 and later, which causes the build to fail. Added instructions to build it from source.
2023-12-11 05:26:06 -08:00
Prateek Sunal
79aaf86593 [FIX] #1549 Configure Script (#1574)
* fix: #1549 backticks

* fix: use single equal to
2023-10-23 07:24:48 -07:00
Prateek Sunal
280939df75 [FIX] Windows CI (#1568)
* fix: undef before

* chore: bump rust packages

* chore: update vcpkg
2023-09-12 15:30:52 +00:00
Prateek Sunal
af6308b167 [IMPROVEMENT] Mac CI (#1546)
* feat: unpack gpac

* fix: linux ci

* fix: mac build

* fix: remove unused [no ci]

* fix: ignore config.h [no ci]

* temp commit, will drop this soon

* fix: install gpac

* fix: gpac

* fix: formatting

* fix: preprocessor directive

* fix: comment display version for now

* fix: display dlls code

* fix: bundle vcruntime in hardsubx windows

* fix: again

* fix: errors in ci

* fix: ci

* fix: add vcruntime in additional dependencies

* fix: try to copy vcruntime after build

* fix: space in runtime library

* fix: remove for now [no ci]

* fix: things in vcxproj

* fix: ci for leptonica sys

* fix: docs

* fix: copy dlls on post build event

* fix: copy vcruntime after build

* feat: mac ci

* fix: ci dependencies

* fix: more dependencies

* fix: libavcodec not found

* fix: include directories in mac

* fix: error in endif()
2023-08-18 20:16:45 +00:00
Prateek Sunal
aa4a76a941 [FEAT] Use system gpac library instead of vendoring gpacmp4 (#1535)
* feat: unpack gpac

* fix: linux ci

* fix: mac build

* fix: remove unused [no ci]

* fix: ignore config.h [no ci]

* temp commit, will drop this soon

* fix: install gpac

* fix: gpac

* fix: formatting

* fix: preprocessor directive

* fix: comment display version for now

* fix: display dlls code

* fix: bundle vcruntime in hardsubx windows

* fix: again

* fix: errors in ci

* fix: ci

* fix: add vcruntime in additional dependencies

* fix: try to copy vcruntime after build

* fix: space in runtime library

* fix: remove for now [no ci]

* fix: things in vcxproj

* fix: ci for leptonica sys

* fix: docs

* fix: copy dlls on post build event

* fix: copy vcruntime after build
2023-08-17 20:03:03 +00:00
Prateek Sunal
35e73c1c90 [FIX] Rename generic bitstream.h to cc_bitstream.h #1436 (#1543)
* fix: rename bitstream

* fix: update vcpkg commit hash

* fix: try to fix the linker error
2023-07-05 03:52:51 -07:00
Willem
5b7666965f Cleanup vs configs (#1539)
* Delete (probably) wrongly committed vs config file

* Remove Nuklear GUI

* Clean up SLN configs (Reduce to 64 bit full debug & release)

* Sync bat scripts, prepare to move

* Build rust in release when release

* Update changelog

* Delete rustx86.bat
2023-05-29 18:34:15 +00:00
Prateek Sunal
3efb2b1a68 [FIX] Update Windows build (#1540)
* fix: update windows ci

* fix: update docs for compilation

* fix: build runtime library
2023-05-22 15:08:52 +02:00
Elbert Ronnie
6bcc53ecf9 Provide unique values to enums (#1538) 2023-05-21 08:35:45 -07:00
Mohnish Deshpande
7b873e1902 Fix typo in ffmpeg_intgr.h (#1527) 2023-04-09 11:04:26 -07:00
Elbert Ronnie
005ef5a731 [FIX] Incorrect skipping of packets (#1528)
* don't skip entire packet on undefined window

* always clear packet before starting new one

* mention in CHANGES.TXT
2023-04-09 11:03:26 -07:00
Prateek Sunal
72e769b145 fix: update vcpkg ref (#1529) 2023-04-09 01:35:54 -07:00
Daniel Houck
cf2d207ba1 Fix McPoodle broadcast raw format output (#1523)
The broadcast raw format *must* contain data from only one field, or
neither `ccextractor` nor McPoodle's tools can actually read it.  Since
we don't actually get XDS data from `writeraw`, there's no reason to
keep the call for field 2.
2023-03-30 08:08:34 -07:00
Elbert Ronnie
d768474e50 [FIX] encoding of solid block in latin-1 and unicode (#1522)
* Fix encoding of solid block in latin1 and unicode
2023-03-29 16:06:09 -07:00
Daniel Houck
4a7dd139ec [FIX] #1520 keep webvtt-full formatting in sync (#1521) 2023-03-27 16:24:00 -07:00
Chidam
fa85a5270d [FIX] #1516 in webvtt added support for two-, three-, and four-byte UTF-8 sequences (#1518)
* in webvtt added support for two-, three-, and four-byte UTF-8 sequences
2023-03-26 18:39:46 -07:00
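The multi-byte handling referred to in fa85a5270d depends on knowing how many bytes a UTF-8 sequence occupies. The following is a minimal Rust sketch of that classification based on the standard UTF-8 lead-byte bit patterns; the function name `utf8_sequence_len` is hypothetical and the code is not taken from the WebVTT encoder.

```rust
/// Hypothetical helper (not the project's API): returns the total length in
/// bytes of the UTF-8 sequence that starts with `lead`, or `None` if `lead`
/// cannot start a sequence.
///
/// Only the bit pattern of the lead byte is inspected; overlong or otherwise
/// ill-formed sequences are not rejected here.
fn utf8_sequence_len(lead: u8) -> Option<usize> {
    match lead {
        0x00..=0x7F => Some(1), // 0xxxxxxx: single-byte (ASCII)
        0xC0..=0xDF => Some(2), // 110xxxxx: two-byte sequence
        0xE0..=0xEF => Some(3), // 1110xxxx: three-byte sequence
        0xF0..=0xF7 => Some(4), // 11110xxx: four-byte sequence
        _ => None,              // 10xxxxxx continuation byte or invalid lead
    }
}
```

A writer can use the returned length to copy a whole sequence at once instead of emitting it byte by byte.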
Willem
7994096669 Apply formatting (again) (#1519) 2023-03-26 09:39:20 -07:00
Punit Lodha
d379d72685 Add avfilter for hardsubx (#1514) 2023-03-22 17:12:56 -07:00
Donough Liu
9b2215d9c2 hardsubx: Add missing -lavfilter for hardsubx linking (#1513) 2023-03-22 12:18:35 -07:00
Carlos Fernandez Sanz
29562759d2 Specify POSIX locale for numerics 2023-03-22 09:22:47 -07:00
Carlos Fernandez Sanz
0b6a8987ca Fix memory leak processing mp4 files with GPAC (with sample from #1410) 2023-03-21 21:47:57 -07:00
Carlos Fernandez Sanz
a679aadd3a Fix ocr.c writing outside allocated memory #1251 2023-03-21 21:25:15 -07:00
Carlos Fernandez Sanz
77b9696a37 Fix memory leaks (#1511) 2023-03-21 21:13:42 -07:00
Carlos Fernandez Sanz
f21d9e8737 Add address sanitizer on debug build 2023-03-21 20:06:20 -07:00
ziexess
fb3da4cd3a add erosion then dilation after quantization (#1510) 2023-03-21 14:01:59 -07:00
Prateek Sunal
b983de6a54 [IMPROVEMENT] Make Environment variables for Hardsubx optional (#1508)
* feat: automatically link ffmpeg

* fix: ci

* chore: documentation update for vcpkg and hardsubx

* fix: add ffmpeg5 feature

* fix: remove ffmpeg5 feature

* fix: update rsmpeg
2023-03-21 09:53:55 -07:00
Punit Lodha
260052b68c update compilation docs for hardsubx (#1507) 2023-03-20 07:06:39 -07:00
Apteryks
8105bc0b73 linux/configure.ac: Fix tesseract conditional problem. (#1504)
Fixes #1503.

Using tesseract-ocr's stock pkg-config, it would produce an error due to
unquoted whitespace:

  $ test ! -z `pkg-config --libs-only-l --silence-errors tesseract`
  bash: test: syntax error: `-larchive' unexpected

* linux/configure.ac: Use a positive test, and double-quote the $() command
substitution.

Co-authored-by: Carlos Fernandez Sanz <carlos@ccextractor.org>
2023-03-17 07:56:41 -07:00
Apteryks
ea4998f635 linux/Makefile.am: Add missing generated header. (#1505)
This header is generated by the pre-build.sh script.  The compilation
fails if it is missing.

* linux/Makefile.am (ccextractor_SOURCES): Add
../src/lib_ccx/compile_info_real.h.
2023-03-17 07:54:49 -07:00
Archit Bhonsle
cb496a7119 [IMPROVEMENT] getting rid of the warnings during rust builds (#1497) 2023-03-16 00:13:00 +01:00
Prateek Sunal
79958f7393 [IMPROVEMENT] Update documentation for windows build (#1498)
* fix: update instructions for FFMpeg on windows

* fix: update docs in COMPILATION.md

* fix: error in doc
2023-03-13 15:17:30 -07:00
Prateek Sunal
0264e7da2b [IMPROVEMENT] Update Rust and fix windows build (#1480)
* fix: bump leptonica-sys to 0.4.3 and update Cargo.lock

* fix: bump rust version to 1.57.0 and build vcpkg for window hardsubx builds

* fix: add Bcrypt dependency

* fix: switch to rust stable

* chore: bump package versions

* fix: try to remove i686 to fix error

* fix: install tesseract and lint fixes

* fix: try using ffmpeg the third

* fix: include headers

* fix: add rsmpeg

* fix: switch default triplet to static md

* fix: import errors

* fix: directory path

* fix: pre build commands

* fix: update vcxproj

* fix: linux ci

* fix: ci fixes

* chore: lint fixes

* fix: error

* fix: copy include files

* fix: ci error

* fix: link swresample lib

* fix: some errors

* fix: include directory path and include all libraries

* fix: try to add library directories

* fix: fixes in libraries

* fix: formatting ci

* fix: mflat errors

* fix: libcurl

* fix: preprocessor definitions

* fix: add libcrypto

* fix: remove lib_hash to fix conflicts (we have libcrypto already)

* fix: add avcodec and avformat dependencies on windows

* fix: add remaining deps that may fix the build

* fix: add crypt dependency

* fix: rename conflicting names

* Revert "fix: remove lib_hash to fix conflicts (we have libcrypto already)"

This reverts commit f57ff716ed.

* fix: prefix with CC_

* fix: post build actions

* fix: ocr error

* Revert "fix: ocr error"

This reverts commit 92599454b6.

* fix: xcopy error

* fix: generated file name for x64

* fix: ocr error

* fix: add item group at top to see if it works

* fix: remove unwanted headers, removed \\ from VCPKG_ROOT, remove unwanted includes in vcxproj

* fix: add libpng for non hardsubx, comment the broken ocr code again

* fix: libpng path

* feat: add lib png headers in ClCompile

* fix: png.h not found

* fix: last try for ocr fix

* fix: libpng not found

* fix: cl compile headers

* fix: libpng and ocr

* fix: libpng error

* fix: redefinition error

* fix: zlib for non hardsubx

* fix: lib names

* fix: zlib.h not found
2023-03-12 13:45:21 -07:00
Archit Bhonsle
257388bad3 reverting names of the secondary linux build scripts (#1496) 2023-03-12 11:23:28 -07:00
Archit Bhonsle
1604572995 [IMPROVEMENT] linux/build script revamp (#1494)
* improving `linux/build` script

* docs for the improved `linux/build` script
2023-03-12 08:38:06 -07:00
Ibrahim M. Akrab
9125165231 [FIX] tesseract 5.x traineddata location in ocr (#1493)
* fix traineddata location with tesseract version 5.x in ocr

* Add the fix to changelog
2023-03-10 11:14:36 -08:00
Prateek Sunal
b1cbfcea9b fix: ffmpeg 5 and tesseract 5 compatibility (#1479)
* fix: replace deprecated `codec` property with `codecpar`

* fix: replace deprecated method `avcodec_decode_video2` with `avcodec_receive_frame` and `avcodec_send_packet`

* Update CHANGES.TXT

* fix: remove deprecated `av_register_all` function

* fix: formatting

* fix: add support for tesseract 5

* fix: tesseract v5

* fix: hardsubx codec context error

* fix: lint const warning
2023-03-08 12:14:53 -08:00
Prateek Sunal
8bb52fa6d5 fix: broken -hardsubx flag (#1491) 2023-03-08 10:27:38 -08:00
Archit Bhonsle
7bd3f7e788 Adding Arch Linux instructions and other minor fixes to COMPILATION.MD (#1482) 2023-03-07 19:57:19 -08:00
Elbert Ronnie
f4bf40b05d Fix missing # in color attribute of font tag (#1486)
Co-authored-by: Elbert Ronnie <elbertronnie@gmail.com>
2023-03-07 11:01:21 -08:00
dependabot[bot]
b488126d09 Bump microsoft/setup-msbuild from 1.1.3 to 1.3.1 (#1475)
Bumps [microsoft/setup-msbuild](https://github.com/microsoft/setup-msbuild) from 1.1.3 to 1.3.1.
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Changelog](https://github.com/microsoft/setup-msbuild/blob/main/building-release.md)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v1.1.3...v1.3.1)

---
updated-dependencies:
- dependency-name: microsoft/setup-msbuild
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-02-06 18:17:48 +00:00
Willem
1c6160f548 Run clang-format on all source files (#1465) 2022-12-14 22:17:57 +00:00
abhi-kr-2100
40145abccf [FIX] Fix issue #1453: Respect -stdout if multiple CC tracks are found in a Matroska input file (#1460)
* Respect `-stdout` if multiple CC tracks are found

When passed the `-stdout` flag, CCExtractor should write the
subtitles to standard output, instead of an output file.

However, as noted in Issue #1453, CCExtractor doesn't
respect the `-stdout` flag when multiple CC tracks are present in
a Matroska input file (usually .mkv).

This commit ensures that output is written to standard output if `-
stdout` is present even if the input file is a Matroska container
with multiple CC tracks.

Signed-off-by: Abhishek Kumar <abhi.kr.2100@gmail.com>

* Mention fixing of issue #1453 in changelog

Signed-off-by: Abhishek Kumar <abhi.kr.2100@gmail.com>

* Correctly spell Matroska

Signed-off-by: Abhishek Kumar <abhi.kr.2100@gmail.com>

Signed-off-by: Abhishek Kumar <abhi.kr.2100@gmail.com>
2022-12-14 13:46:16 -08:00
emkman99
492f0d5197 [FIX] WebVTT X-TIMESTAMP-MAP header placement (#1463) (#1464)
* [FIX] WebVTT X-TIMESTAMP-MAP header placement (#1463)
* Fixed --no-timestamp-map flag
* Disable X-TIMESTAMP-MAP by default
* X-TIMESTAMP-MAP is only part of the HLS spec, and is not valid WebVTT, so it should be disabled by default.
* Write second WebVTT newline when timing info is missing
2022-12-14 13:44:17 -08:00
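The header behaviour described in 492f0d5197 can be sketched in Rust as follows. This is an illustration under assumptions, not the encoder's actual code: the function name `webvtt_header`, its `Option<u64>` parameter, and the fixed `LOCAL:00:00:00.000` value are all hypothetical. The sketch writes the `WEBVTT` line first, emits `X-TIMESTAMP-MAP` directly beneath it only when a value is supplied (off by default), and always ends the header with a blank line.

```rust
/// Hypothetical helper (not the project's API): builds a WebVTT file header.
/// `mpegts` is the MPEG-TS timestamp to map to local time zero; `None`
/// omits the HLS-specific X-TIMESTAMP-MAP line entirely.
fn webvtt_header(mpegts: Option<u64>) -> String {
    let mut out = String::from("WEBVTT\n");
    if let Some(ts) = mpegts {
        // The HLS-only header belongs in the header block, right after "WEBVTT".
        out.push_str(&format!("X-TIMESTAMP-MAP=LOCAL:00:00:00.000,MPEGTS:{}\n", ts));
    }
    // A blank line terminates the header block in both cases.
    out.push('\n');
    out
}
```

Modelling the option as `Option<u64>` makes the "disabled by default" case the one that needs no extra input from the caller.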
dependabot[bot]
4b0928ad9b Bump microsoft/setup-msbuild from 1.0.2 to 1.1.3 (#1456)
Bumps [microsoft/setup-msbuild](https://github.com/microsoft/setup-msbuild) from 1.0.2 to 1.1.3.
- [Release notes](https://github.com/microsoft/setup-msbuild/releases)
- [Changelog](https://github.com/microsoft/setup-msbuild/blob/master/building-release.md)
- [Commits](https://github.com/microsoft/setup-msbuild/compare/v1.0.2...v1.1.3)

---
updated-dependencies:
- dependency-name: microsoft/setup-msbuild
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-12-01 09:38:16 +01:00
Shashwat Singh
0e3dfdc73b [WIP] Port hardsubx classifier (#1446)
* add tesseract-sys in dependencies of rust modules

* add appropriate feature flags and required packages to cargo toml

* expose classifier

* Redefine structs that are required for hardsubx

Note: rust-bindgen isn't being used directly for this because it will also redefine structures of leptonica, tesseract, and ffmpeg and we don't want that.
We want to use definitions of structs as in the Rust interface libraries we are importing

* write code to generate bindings for mprint

* - write a function to convert rust strings to c strings
- write a memory safe wrapper to mprint that uses above function

* - add helper function to deal with tess strings in a memory safe manner
- port get_ocr_text_simple
- port get_ocr_text_wordwise

* improve conversion of C string to Rust string by using built-in functions

* replace mprint usage with warn!

* port get_ocr_text_letterwise

* remove redundant mprint function

* improve readability _tess_string_helper by using more general variable names inside

* make get_ocr_text_simple call get_ocr_text_simple_threshold to remove redundant code; fix bugs

* remove manual definition of cc_subtitle and use bindgen bindings

* style changes to rust hardsubx classifier

* add get_ocr_text_letterwise_threshold and make get_ocr_text_letterwise call it appropriately

* move hardsubx context struct to mod.rs

* add get_ocr_text_wordwise_threshold and make get_ocr_text_wordwise call it

* use the ffmpeg-sys definition of Pix

* hide ported functions under macros

* use the AVPacket from bindings and not ffmpeg to make compatibility work for now.
TODO: rewrite init_hardsubx and also deal with the ffmpeg stuff when that is done

* improve _tess_string_helper by using appropriate built-in functions

* linter recommended changes

* clang style change

* fix loop bug that didn't allow for re-evaluation of it on usage of continue statement

* start porting of decoder with the _process_frame_color_basic function and related code

* hide the C version of _process_frame_color_basic behind an #ifdef

* add _process_frame_tickertext

* hide the C version of _process_frame_tickertext behind ifdef and add #[no_mangle] to the rust version

* check if word is empty as soon as word is detected

* port _process_frame_white_basic

* hide the C version _process_frame_white_basic behind compiler macros

* stylistic changes

* safety docs for hardsubx classifier

* safety docs for decoder as of now

* safety docs for utils.rs

* style changes

* format and style changes

* modify safety docs

* formatting fix
2022-10-24 08:13:28 +02:00
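The string helpers mentioned in 0e3dfdc73b (Rust string to C string, and C string back to Rust via built-in functions) might look roughly like the sketch below. The function names `to_c_string` and `from_c_string` are hypothetical and are not taken from the commit; only `CString` and `CStr` from the standard library are relied on.

```rust
use std::ffi::{CStr, CString};
use std::os::raw::c_char;

/// Convert a Rust string into an owned, NUL-terminated C string.
/// Returns `None` if the input contains an interior NUL byte.
/// (Hypothetical helper name, for illustration only.)
fn to_c_string(s: &str) -> Option<CString> {
    CString::new(s).ok()
}

/// Copy a C string into an owned Rust `String`, replacing invalid UTF-8
/// sequences with U+FFFD. A null pointer yields `None`.
/// (Hypothetical helper name, for illustration only.)
///
/// # Safety
/// `ptr` must be null or point to a valid NUL-terminated string.
unsafe fn from_c_string(ptr: *const c_char) -> Option<String> {
    if ptr.is_null() {
        return None;
    }
    // SAFETY: the caller guarantees `ptr` is a valid NUL-terminated string.
    let c_str = unsafe { CStr::from_ptr(ptr) };
    Some(c_str.to_string_lossy().into_owned())
}
```

Keeping the `CString` alive for as long as the C side uses the pointer is the caller's job; the pointer from `as_ptr()` dangles once the `CString` is dropped.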
Willem
4cb474c5a3 Update build_windows.yml 2022-07-18 20:54:40 +00:00
Willem
19f6ef43ef Update build_linux.yml 2022-07-18 20:51:22 +00:00
Willem
4dbcbe083e Update build_linux.yml 2022-07-18 20:50:15 +00:00
Willem
2a9a922d1a Update build_linux.yml 2022-07-18 20:46:50 +00:00
Willem
0d3e1d003d Update build_linux.yml
Try to fix bad behaviour for pushes
2022-07-18 20:43:26 +00:00
Shashwat Singh
170066f046 Port hardsubx utility (#1443)
* set up bindings conversion of hardsubx utility functions (and structs) and set up the module

* add low level ffmpeg rust binding

* Methods ported:

- convert_pts_to_ns
- convert_pts_to_ms
- convert_pts_to_s

A pure rust method was added called _edit_distance_rec that implements Levenshtein distance calculation using recursion and dynamic programming

The port of edit_distance_rec is simply a wrapper that calls above function.

This redundancy won't be necessary as more downstream modules are ported to Rust

* put C code of hardsubx_utility under define rust flag

* run formatter

* make compilation of hardsubx rust modules conditional on the HARDSUBX and the OCR flags. Make ffmpeg a conditional dependency based on those flags

* remove namespaced dependency in cargo because that is a nightly feature

* add conditional compilation of ffmpeg related bindings in build.rs

* make clang argument of -DENABLE_HARDSUBX conditional on cargo feature of hardsubx_ocr

* enable specific relevant features for ffmpeg-sys-next

* enable hardsubx_ocr feature in windows build

* add build feature in ffmpeg-sys-next

* ffmpeg build feature is conditional on platform

* Revert "ffmpeg build feature is conditional on platform"

This reverts commit e456fee942.

This is because conditional features do not work in cargo toml

* install yasm in the linux build github action for ocr and hardsubx enabled cmake

* turn globals to locals to reduce code

* remove redundant attributes

* style changes

* make import of ffmpeg-sys-next conditional on hardsubx_ocr flag

* add --all-features flag in clippy for github workflow

* run formatter

* fix clippy command

* install yasm as part of rust format build check

* install libtesseract-dev etc. for clippy build test

* readability change

* declare the function edit_distance as unsafe

* remove commented code

* formatting changes

* combine declaration and assignment

* add build command for building hardsubx rust

context to issue: #1445

* make hardsubx rust work with autoconf build. For issue: #1445

* update autoconf for mac for issue #1445
2022-07-13 14:36:30 +05:30
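For readers unfamiliar with the technique named in 170066f046, here is a minimal sketch of Levenshtein distance computed with recursion plus a memo table (top-down dynamic programming). It is not the code from the commit; the names `edit_distance` and `rec` and the table layout are assumptions made for illustration.

```rust
/// Levenshtein distance between `a` and `b`, computed top-down with a memo table.
/// (Illustrative sketch; not the function from the commit.)
fn edit_distance(a: &str, b: &str) -> usize {
    let a: Vec<char> = a.chars().collect();
    let b: Vec<char> = b.chars().collect();
    // memo[i][j] caches the distance between a[i..] and b[j..].
    let mut memo = vec![vec![None; b.len() + 1]; a.len() + 1];
    rec(&a, &b, 0, 0, &mut memo)
}

fn rec(a: &[char], b: &[char], i: usize, j: usize, memo: &mut [Vec<Option<usize>>]) -> usize {
    if let Some(d) = memo[i][j] {
        return d;
    }
    let d = if i == a.len() {
        b.len() - j // only insertions remain
    } else if j == b.len() {
        a.len() - i // only deletions remain
    } else if a[i] == b[j] {
        rec(a, b, i + 1, j + 1, memo) // characters match, no cost
    } else {
        let deletion = rec(a, b, i + 1, j, memo);
        let insertion = rec(a, b, i, j + 1, memo);
        let substitution = rec(a, b, i + 1, j + 1, memo);
        1 + deletion.min(insertion).min(substitution)
    };
    memo[i][j] = Some(d);
    d
}
```

The memo table bounds the work at one computation per `(i, j)` pair, which is what separates this from plain exponential recursion.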
Punit Lodha
0bd213e789 Fix file extension for IDX files (#1444)
* Fix file extension

* Vobsub not supported

* Fix formatting

* More formatting

Co-authored-by: Punit Lodha <punitlodha@pm.com>
2022-07-09 19:04:01 +05:30
Carlos Fernandez Sanz
4712d85190 Maybe make format checker happy? 2022-07-05 14:05:17 -07:00
dependabot[bot]
d95a3b3354 Bump regex from 1.5.4 to 1.5.6 in /src/rust (#1440)
Bumps [regex](https://github.com/rust-lang/regex) from 1.5.4 to 1.5.6.
- [Release notes](https://github.com/rust-lang/regex/releases)
- [Changelog](https://github.com/rust-lang/regex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/regex/compare/1.5.4...1.5.6)

---
updated-dependencies:
- dependency-name: regex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-06-18 11:23:03 +05:30
nikolabr
39724fe6a7 [FIX] Fix issue #1421 (#1431)
* [Fix] Fix issue #1421

* Fix header offset
2022-06-17 18:12:23 +00:00
Shashwat Singh
0f90afaa1b Port hardsubx imgops (#1439)
* add hardsubx rust module and expose it

* port rgb_to_hsv to rust

* add dependency fast-math and extern it

* port rgb_to_lab to rust

also make preprocessor to not allow compilation of hardsubx_imgops
if WITHOUT_RUST is OFF

* improve if-else constructs for readability

* unroll  macros that were only used once and remove their definition

* Improve readability of rgb_to_lab function (and fixes)

The function in Rust behaves slightly differently than its C counterpart

* remove fast math library, use palette library and rewrite imgops using it

* run formatter

* replace destructuring assignment statement with normal assignment statements because of build rust compiler issues

* run formatter on C code for imgops

* remove extern for modules because it is not required

* improve comment placement in rust imgops

Co-authored-by: Punit Lodha <48253287+PunitLodha@users.noreply.github.com>
2022-06-15 10:35:03 +05:30
Shashwat Singh
689d92ab59 put generated files of the rust project in the .gitignore (#1441) 2022-06-07 19:50:35 +00:00
dependabot[bot]
ca303d6942 Bump actions/upload-artifact from 2 to 3 (#1430)
Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2 to 3.
- [Release notes](https://github.com/actions/upload-artifact/releases)
- [Commits](https://github.com/actions/upload-artifact/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/upload-artifact
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-04-11 09:48:03 +02:00
Shashwat Singh
6a9a16e611 add option to extract closed captions and burnt in subs in the same pass (#1422)
* [NEW] add functionality to allow extraction of cc and burnt-in subs in the same pass
- add flag under hardsubx called -hcc that calls this method
- minor refactoring of moving some code from general_loop to a new function
- appropriate addition to the header files to expose certain methods

* add change log

* run clang formatter
2022-03-27 07:51:23 -07:00
dependabot[bot]
30bc27aa0c Bump actions/cache from 2 to 3 (#1424)
Bumps [actions/cache](https://github.com/actions/cache) from 2 to 3.
- [Release notes](https://github.com/actions/cache/releases)
- [Commits](https://github.com/actions/cache/compare/v2...v3)

---
updated-dependencies:
- dependency-name: actions/cache
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-21 21:27:00 +00:00
Aditya Pratap Singh
c3fc323150 [IMPROVEMENT] Deprecated the --with-gui flag for linux/configure and mac/configure (#1415)
* Deprecated the --with-gui flag for linux/configure and mac/configure

* Update docs/CHANGES.TXT

Co-authored-by: Willem <github@canihavesome.coffee>

Co-authored-by: Willem <github@canihavesome.coffee>
2022-03-07 13:20:11 +01:00
Willem
b5fe0609fc Update build_windows.yml 2022-03-02 14:31:48 +01:00
Punit Lodha
0a4049c97c Fix clippy warning and Use rust 1.56.0 for CI (#1420)
* Fix clippy warning

* Use rust 1.56.0 for CI

Co-authored-by: Punit Lodha <punitlodha@pm.com>
2022-03-02 14:16:29 +01:00
dependabot[bot]
6e4ac56e9c Bump actions/checkout from 2.4.0 to 3 (#1419)
Bumps [actions/checkout](https://github.com/actions/checkout) from 2.4.0 to 3.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2.4.0...v3)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-03-01 18:51:56 +00:00
Shashwat Singh
e6503d5c81 [FIX] segmentation fault when using hardsubx (#1417)
* Fix segmentation fault when using hardsubx
* initialize library before hardsubx call
2022-02-21 09:14:34 -08:00
Willem
1717cbb44d Update CHANGES.TXT 2022-01-24 17:54:00 +00:00
Arpan Kapoor
caa960e657 Fix #1407 (#1409) 2022-01-24 17:52:19 +00:00
Willem
290e2f10f9 Update debian.sh 2021-12-23 19:41:44 +00:00
Willem
325464f793 Update ccextractor.spec 2021-12-23 19:41:20 +00:00
Willem
f533a53902 Update PKGBUILD
Update PKGBUILD based on https://aur.archlinux.org/cgit/aur.git/tree/PKGBUILD?h=ccextractor
2021-12-23 19:40:23 +00:00
Punit Lodha
97b381a2b0 Switch to rustc 1.56.0 (#1404)
* Update CHANGES.TXT

* Update release flow

* use 1.56.0 compiler as 1.57.0 is bugged

Co-authored-by: PunitLodha <punitlodha@pm.com>
2021-12-15 17:03:45 +00:00
Punit Lodha
03b0749e91 Update release flow (#1403)
* Update CHANGES.TXT

* Update release flow

Co-authored-by: PunitLodha <punitlodha@pm.com>
2021-12-15 16:41:11 +00:00
Carlos Fernandez Sanz
7bcdd6729f Bump version 0.93 -> 0.94 2021-12-14 09:46:01 -08:00
Punit Lodha
3dd3d5f6aa Update CHANGES.TXT (#1402)
Co-authored-by: PunitLodha <punitlodha@pm.com>
2021-12-14 09:37:23 -08:00
Ritesh Maurya
ba37cc41c8 Update COMPILATION.MD (#1401)
Most users run Ubuntu 18.04 or later, so the bash command now installs `libtesseract-dev` rather than `tesseract-ocr-dev`. Previously the NOTE explaining this came after the command, so new users could run into errors.
2021-12-13 08:29:53 -08:00
Punit Lodha
6efa41a7e6 Extract 708 subs by default (#1398)
* Extract 708 subs by default

* fix fmt
2021-12-05 06:21:34 -08:00
Manolis Miminas
9b90c91f07 Update COMPILATION.MD (#1397)
Add missing slash character.
2021-12-01 12:16:07 +01:00
Carlos Fernandez Sanz
35936618e3 Display explicit message if text:text is found 2021-11-21 10:16:07 -08:00
Carlos Fernandez Sanz
e98a584e98 Exit build if rust part fails 2021-11-21 09:31:05 -08:00
Punit Lodha
1a8c8a86f3 Check start/end at param before encoding DVB subs (#1396) 2021-11-21 07:13:48 -08:00
Punit Lodha
57663b8cf1 Fix Carriage Return command (#1394)
* Fix Carriage Return command

* fix fmt

* Fix rollup
2021-11-20 09:29:19 -08:00
Willem
2b3d759e20 Update CHANGES.TXT
Add links to GH issues for 2 improvements in new version
2021-11-20 16:25:27 +00:00
Punit Lodha
ed1b5dddce Update windows build (#1393)
* Compile rust in a pre-build event

* Add msbuild to windows compilation docs

* Update CHANGES.TXT
2021-11-14 10:03:39 -08:00
Punit Lodha
86fede6af8 Fix negative delay bug, and other miscellaneous changes (#1392)
* Add message for detected version

* Update rust build scripts for windows

* Fix bug with negative delay values

* fix formatting
2021-11-13 06:38:50 -08:00
dependabot[bot]
68e6390c76 Bump actions/checkout from 2.3.4 to 2.4.0 (#1388)
Bumps [actions/checkout](https://github.com/actions/checkout) from 2.3.4 to 2.4.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2.3.4...v2.4.0)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-11-10 07:03:06 +00:00
Hugh Mackworth
0ebeec4183 Repair Mac Build processes (#1390)
* Fix Mac Build processes

For all:
  Add Neon files to libpng for Apple Silicon
  Update compilation.md documentation

For autoconf:
  Make Linux and Mac Makefile.am and configure.ac identical
  Fix wrong location for zvbi/bcd.h in both Mac/Linux

For cmake:
  Include GPAC config for Darwin in Mac version

For mac/build.command:
  Update for new zvbi location

* Update CHANGES.TXT for Mac Build commit
2021-11-09 17:30:21 -08:00
Punit Lodha
8c10ded107 Add check for MSRV and update compilation docs (#1387)
* Add check for MSRV

* Update docs

* fix docs
2021-10-28 09:08:50 -07:00
Punit Lodha
3a1851f904 Make rust decoder default (#1375)
* Use rust by default and add -WITHOUT_RUST flag

* Fix for shell and autoconf builds

* change directory for version check

* change to  staticlib

* Update windows to build rust

* fix formatting

* add information about 708 decoder in version flag

* revert file mode to 644

* Use x86 for OCR releases

* fix flushing bug

* fix formatting

* update lib names

* remove bazel

* update changelog
2021-10-15 08:15:51 -07:00
Punit Lodha
50aceb45fb remove default BOM from windows (#1383)
* remove default BOM from windows

* update changelog
2021-10-08 15:30:54 -07:00
Willem
cad6b0495c Release flow improvements (#1381)
- Create .tar.gz for Linux that excludes the Windows & Git folders
- Create portable (zipped) version of CCExtractor (closes #1376)
2021-10-08 12:26:21 +02:00
Carlos Fernandez Sanz
c7ebd45d9f Version bump, 0.92 -> 0.93 2021-08-16 11:31:28 -07:00
Carlos Fernandez Sanz
77abe01885 Fix warning about using a keyword (new) as identifier 2021-08-16 11:30:59 -07:00
Punit Lodha
98cec31516 Rust updates:- Update documentation (#1374)
* Fix warning

* Update documentation

* Fix typo

* fmt
2021-08-14 08:44:23 -07:00
Jayesh Nirve
46b145a396 Update README.md (#1373) 2021-08-10 14:13:09 +02:00
Carlos Fernandez Sanz
ccf2a031e9 Bump version on Mac 0.91 -> 0.92 2021-08-10 04:24:39 -07:00
Carlos Fernandez Sanz
9784cd5bd1 Update CHANGES.TXT with last-minute Rust updates 2021-08-10 04:22:11 -07:00
Punit Lodha
5d8dc3b9eb Rust updates:- Add writers for transcripts and SAMI (#1372)
* add rust-iconv

* Add writer for transcripts and SAMI

* consistent import ordering
2021-08-10 04:21:22 -07:00
Carlos Fernandez Sanz
a42e847bcb Bump version 0.91 -> 0.92 2021-08-10 04:10:43 -07:00
Jayesh Nirve
b7a1dd1030 Update ISSUE_TEMPLATE.md (#1370) 2021-08-05 20:43:32 -07:00
Punit Lodha
b18e696c85 Rust updates: Added srt writer (#1368)
* pass -DENABLE_RUST to clang

* impl Default

* Update time str format

* Add SRT writer

* fmt
2021-08-03 13:59:31 -07:00
Willem
d58f078c38 Add missing DLL to the installer
Fixes #1367
2021-07-29 14:02:52 +02:00
PunitLodha
0bbdfc13ee Rust updates (#1364)
* add copy to screen

* Add tv_screen and more functions
2021-07-27 23:55:24 -07:00
Carlos Fernandez Sanz
5127da50d1 Push version 0.90 -> 0.91 2021-07-26 09:53:45 -07:00
PunitLodha
352f035214 [rust] Add Pen Presets and timing functions (#1363)
* add pen presets and timing functions

* fix typo

* fix formatting
2021-07-22 19:26:06 -07:00
PunitLodha
f04ba8d0c4 Rust updates (#1361)
* add handlers for CLW, HDW, TGW, DLW, and CR

* refactor rust code

* fix clippy warnings

* add ccxr_  prefix to rust functions

* Add SPA, SPC, SPL and CWx commands

* Add DSW and DFx commands

* Add more C0 and extended commands

* Use slice instead of sending whole packet and pos
2021-07-18 10:48:14 -07:00
Willem
1ea94d0b14 Update installer.wxs
Make ID's unique
2021-07-14 09:53:52 +02:00
Willem
7f99603859 Update installer.wxs
Add missing DLL's to the installation folder
2021-07-14 09:41:36 +02:00
Carlos Fernandez Sanz
3713283dfc Bump version 0.89 -> 0.90 2021-07-14 00:16:09 -07:00
PunitLodha
09129f1e63 [Rust] Add few commands and refactor the code (#1360)
* add handlers for CLW, HDW, TGW, DLW, and CR

* refactor rust code

* fix clippy warnings

* add ccxr_  prefix to rust functions

* Add SPA, SPC, SPL and CWx commands
2021-07-10 09:32:53 -07:00
PunitLodha
c56840ff2c Add functions to rust (#1358)
* add process_current_packet

* add process_service_block

* Add handle_G0 and G1 code sets

* remove unnecessary return

* Add C0 and C1 commands and their handlers
2021-07-04 11:11:03 -07:00
Willem
2a34bd99e6 [IMPROVEMENT] Automate release process for installer (#1357)
* Do not run push/pull request workflows for tags

* Stop including the old UI into artifacts for Windows

* Introduce WiX installer and release flow
2021-06-28 14:18:33 -07:00
PunitLodha
c7886ed615 Add CI and docs for rust lib (#1355) 2021-06-27 17:58:51 +00:00
PunitLodha
948531a4be Update win_iconv path (#1356) 2021-06-26 11:43:32 -07:00
PunitLodha
022987c804 Add rust library (#1351)
* Add rust lib

* add steps for building rust lib

* use rust lib

* add conditional flag for rust

* use cargo config.toml

* add decoder module and update bindings

* use match instead of if else

* add target directory flag

* add env_logger

* use env_logger

* Process data first and then pass to safe function
2021-06-25 18:03:00 -07:00
PunitLodha
db6c852fae Add -DGPAC_CONFIG_LINUX for UNIX platforms (#1353) 2021-06-23 06:22:30 +00:00
PunitLodha
b793f16343 Update function declarations and naming style (#1350)
* Add declarations of functions and update names

* fix formatting

* update function signature for dtvcc_process_data
2021-06-19 08:32:34 -07:00
carlos@ccextractor.org
ceaaa65a26 Remove confusing commits from build-static 2021-06-13 19:28:19 +00:00
carlos@ccextractor.org
1d7589e653 Bump version to 0.89 2021-06-13 19:05:33 +00:00
PunitLodha
e09abe7a83 Fix column length (#1345)
* Fix column length
Don't take column length from curr_window, as the row could be from any window

* update CHANGES.TXT
2021-06-11 07:39:31 -07:00
canihavesomecoffee
e86e8692a8 Fix formatting for mp4.c 2021-06-11 00:01:10 +02:00
canihavesomecoffee
961bfda727 Clang-format mp4.c, ocr.c and ts_functions.c 2021-06-10 23:57:18 +02:00
canihavesomecoffee
8218d5ff73 Do not run format on thirdparty or zvbi libraries 2021-06-10 23:54:47 +02:00
canihavesomecoffee
5850ef073d Apply clang-format
Apply to:
- ccextractor.c
- lib_ccx:
-- ccx_common_option.c
-- ccx_common_timing.c
-- ccx_encoders_common.c
-- general_loop.c
-- mp4.c
-- output.c
-- sequencing.c
2021-06-10 23:47:17 +02:00
Willem
7347440277 [FIX] Attempt to fix long-running regression in TeleText (#1341)
* Attempt to fix long-running regression in TeleText

Regression test 78 (https://sampleplatform.ccextractor.org/regression/test/78/view)
has been broken since #614 was merged to fix other issues.

It's been traced back to be caused by not setting t0 at the correct time
(setting it using a calculated PTS time rather than taking it from the video frame),
and this commit attempts to fix that.

* Add changes

* Clang-format changes

* Improved fix

This uses the current_pts rather than the min_pts because the value
of the delta should be relative to when the packet was received.

If min_pts wasn't set yet, it'll be retrieved and set as current_pts

* Fixup
2021-06-10 14:38:03 -07:00
PunitLodha
4007198342 fix for missing subtitles (#1344)
Avoid overwriting data, by processing it first
2021-06-10 08:01:28 -07:00
carlos@ccextractor.org
c09524d043 Add build notes for hardsubx on debian 2021-06-08 17:57:22 +00:00
carlos@ccextractor.org
d81c692bbb Fix frame number calculation in SCC. Closes #1340 2021-06-08 15:00:21 +00:00
PunitLodha
6d366bfdc6 fix for timing 0:00, when -dru is selected (#1342) 2021-06-05 08:30:29 -07:00
Carlos Fernandez Sanz
ceb0110378 Initialize MXFContext when the input format is manually specified.
Closes #1336
2021-06-01 15:02:51 -07:00
PunitLodha
f06436c1fe Fix min and max fts when PTS resets (#1338) 2021-05-27 08:03:43 -07:00
Carlos Fernandez Sanz
67e15aaf80 memset write structure on allocation.
Closes #1337
2021-05-26 15:12:57 -07:00
emkman99
5b29ef281a [FIX] Multitrack, WebVTT, and Segfault issues (#1332)
* [FIX] Must have two newlines after WEBVTT header

Bug introduced in #1092

* [FIX] segfault with multitrack reports

* [FIX] segfault with unsupported file reports

* [FIX] Write subtitle header to multitrack outputs

* [FIX] Write multitrack files to the output file directory
2021-05-19 14:28:06 -07:00
Jayesh Nirve
24b90970c7 modify gui output for easier parsing (#1335)
* modify gui output for easier parsing

* fix formatting

* make time tag consistent with subtitle
2021-05-18 21:21:47 -07:00
dependabot[bot]
84e6891922 Bump actions/checkout from 2 to 2.3.4 (#1334)
Bumps [actions/checkout](https://github.com/actions/checkout) from 2 to 2.3.4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v2...v2.3.4)

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2021-05-11 20:11:18 +02:00
Werner Robitza
0dcdf72042 fix links (#1333)
* fix links

Fix various links to the new website

* Update CHANGES.TXT
2021-05-10 11:58:02 -07:00
Suvigya
e3b939baad [FIX] Removed extra lines of code which added extra p tags (#1331)
* [FIX] Removed extra lines of code which added extra p tags

* [FIX] Removed extra lines of code which added extra p tags
2021-05-02 10:53:10 -07:00
KW781
0e5910ebee Update ts_functions.c (#1326)
Get rid of get_video_min_pts() as described in issue #1303
2021-05-02 10:51:32 -07:00
pranavrajpal
8a81a57a24 [IMPROVEMENT] Update GPAC to version 1.0.1 (#1328)
* Add update_gpac.py

Add a Python script that partially automates updating GPAC to a newer
version.

* Update GPAC to version 1.0.1

Update the vendored version of GPAC to version 1.0.1.

* Add necessary GPAC header files

Add some GPAC header files that GPAC needs to compile.

* Define _GF_CONFIG_H_ to fix Linux build failing

gpac/configuration.h has a series of default configuration options for
various platforms, but it doesn't have a case for Linux and it results
in a compilation error if it encounters an unknown platform.

The settings in configuration.h don't appear to try to set any defaults
for Linux anyway, so we can disable all use of those configuration.h
settings by defining _GF_CONFIG_H_.

* Add some more necessary GPAC header files

Add a few more header files necessary to get GPAC to compile.

* Fix renamed and removed media types

Some mp4 media types ("clcp", "c608") were renamed by GPAC. "c708"
appears to have been removed, so we can just add the definition of that
to the top of mp4.c.

* Remove Remotery from updated GPAC

Remotery appears to be some code for profiling GPAC which we aren't
using, and including Remotery.c and Remotery.h ends up pulling in a lot
of files, so it's easier to just remove the include of Remotery.h and
the single use of it in os_divers.c

* Remove unused box definitions

Remove box definitions that we don't use from box_funcs.c in order to
avoid adding too many files from GPAC.

* Replace alloc function declarations with defines

Replace the GPAC wrappers around the malloc-style functions (gf_malloc,
gf_free, etc.) with defines that use the standard C versions of these
functions so that we can avoid including GPAC's alloc.c

* Remove WebVTT handling code in gf_isom_dump_srt_track

Remove the code that handles WebVTT in gf_isom_dump_srt_track to avoid
needing to pull in a lot of other files from GPAC.

gf_isom_dump_srt_track doesn't appear to be used by ccextractor directly
or indirectly (it's only called in gf_isom_text_dump which doesn't
appear to be called anywhere else) so it should be fine removing it.

* Disable use of Remotery and gzip on Linux

Use GPAC_DISABLE_REMOTERY and NO_GZIP to disable Remotery because we
aren't interested in profiling (see
5c0c9cf71e for more info) and gzip
compression through gzio.c respectively.

* Fix compilation errors in GPAC on linux

GPAC on linux after the update requires some threading functions and
dynamic loading functions in pthread and dl respectively.

* Add necessary files for GPAC to compile

Add several C and header files that GPAC needs to compile

* Disable Remotery and Gzip in all build systems

Disable Remotery and gzip (using the same method as
f49dc371b5) for:

- The linux build script (linux/build)
- The mac build script (mac/build.command)
- The mac makefile
- cmake
- bazel
- Visual Studio

* Add extra GPAC files to several build systems

Add the names of several GPAC files that were added in the update to the
linux and mac Makefiles and to the Windows Visual Studio project.

Adding these filenames isn't necessary for CMake, Bazel, or the linux or
mac build scripts because all of them compile all C files recursively in
the src/thirdparty/gpacmp4 directory instead of having an explicit list
of files to compile.

* Change NO_GZIP to GPAC_DISABLE_ZLIB in VS project

Instead of defining NO_GZIP to disable gzip support, define
GPAC_DISABLE_ZLIB, which does the same thing but also prevents the
compiler from trying to use zlib.

* Avoid using GPAC's configuration.h completely

GPAC's configuration.h has a few problems with the defaults that it
sets:
- It defines GPAC_MEMORY_TRACKING on Windows, which switches to an
  alternate implementation of malloc, meaning that we would have to pull
  in alloc.c
- It causes compilation errors on Linux (see 9164c08979)

This disables using configuration.h by:
- Defining GPAC_HAVE_CONFIG_H to make GPAC use a separate config.h file
  instead of the default configuration.h file
- Making an essentially empty config.h file to make attempts to include
  it not fail

This commit also removes configuration.h from the repo to make sure we
don't accidentally include it, and removes the _GF_CONFIG_H_ hack from
the previously mentioned commit because we don't need it anymore (its
sole purpose was avoiding using configuration.h).

* Link pthread and dl on Mac and Linux

Add -lpthread and -ldl to link pthread and dl respectively on Mac and
Linux. Needed because the update to GPAC 1.0.1 introduced os_thread.c
(which uses pthread) and os_module.c (which uses dlsym and related
functions).

* Remove unused Remotery.h header file

5c0c9cf71e removed the only use of
Remotery.h in the GPAC files that we pulled in, so there's no need to
keep it around.

* Add GPAC update to changelog

* Fix cmake build error

Building with CMake currently fails because it can't find functions from
dl (dlopen, dlsym, etc.)

* Fix bazel build error

Bazel currently doesn't find the header files in gpac/modules/ when
building gpac, most likely because it isn't searching all directories in
gpac/ recursively for header files

* Define GPAC_HAVE_CONFIG_H in lib_ccx BUILD file

lib_ccx indirectly includes gpac/tools.h, which tries to include
gpac/configuration.h, which was removed in
b46c4e8a2d. This just copies the solution
from that commit to the bazel BUILD file (defining GPAC_HAVE_CONFIG_H so
GPAC uses gpac/config.h instead).

* Link to dl and pthread in bazel GPAC BUILD file

The updated GPAC version requires functions from dl and pthread, which
weren't linked to previously when building with bazel.
2021-04-30 04:59:13 -07:00
dependabot-preview[bot]
02d84d27d0 Upgrade to GitHub-native Dependabot (#1330)
Co-authored-by: dependabot-preview[bot] <27856297+dependabot-preview[bot]@users.noreply.github.com>
2021-04-29 23:03:48 +00:00
Suvigya
a2af0d7044 [FIX] segmentation fault on encoding McPoodle's raw to WebVTT (#1329) 2021-04-27 08:21:07 -07:00
Venkata Shravan
4f5bd7bf37 Add bazel build to Github Actions (#1321) 2021-04-12 21:16:35 +00:00
PunitLodha
91ef488dff Revert "Ignore extra padding data in the current_packet (#1304)" (#1325)
This reverts commit 7f4acae74b.
2021-04-12 10:45:14 -07:00
PunitLodha
1af107aef8 Fix 708 timing issue (#1319)
* Fix 708 timing issue
Process packet as soon as the packet len is equal to the specified len

* check if cc_valid

* fix formatting

* Check if header is parsed before parsing pkt data
2021-04-12 09:42:15 -07:00
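The idea in 1af107aef8 (process a 708 packet as soon as the buffered length equals the length announced in its header, skip invalid data, and parse the header first) can be sketched as below. This is not the code from the commit. The `PacketAssembler` type is hypothetical, and the header layout used here (low 6 bits are a size code, 0 meaning 128 bytes, otherwise twice the code) is stated as an assumption rather than taken from the commit text.

```rust
/// Hypothetical accumulator for one caption packet (illustration only).
#[derive(Default)]
struct PacketAssembler {
    buf: Vec<u8>,
}

impl PacketAssembler {
    /// Total packet length announced by the header byte, if a header has arrived.
    /// Assumed layout: low 6 bits are a size code; 0 means 128 bytes,
    /// any other value means `code * 2` bytes.
    fn expected_len(&self) -> Option<usize> {
        let header = *self.buf.first()?;
        let code = usize::from(header & 0x3F);
        Some(if code == 0 { 128 } else { code * 2 })
    }

    /// Feed one chunk of data. Returns the finished packet as soon as the
    /// announced length has been buffered, instead of waiting for the next
    /// packet start. Chunks flagged as invalid are ignored.
    fn push(&mut self, cc_valid: bool, data: &[u8]) -> Option<Vec<u8>> {
        if !cc_valid {
            return None;
        }
        self.buf.extend_from_slice(data);
        // The header must be present before the length can be checked.
        let expected = self.expected_len()?;
        if self.buf.len() < expected {
            return None;
        }
        // Hand out the packet and reset the buffer; anything beyond the
        // announced length is discarded in this sketch.
        let mut packet = std::mem::take(&mut self.buf);
        packet.truncate(expected);
        Some(packet)
    }
}
```

Emitting the packet at the moment it is complete means its timestamp reflects when the last byte arrived, not when the following packet begins.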
Matej Plavevski
9a60796674 [IMPROVEMENT]Update LibPNG to 1.6.37 (#1271)
* Update LibPNG to 1.6.37
2021-04-05 16:25:54 -07:00
PunitLodha
7f4acae74b Ignore extra padding data in the current_packet (#1304)
* Ignore extra padding data in the current_packet

* refactor to avoid buffer overflow
2021-04-04 16:20:04 -07:00
Carlos Fernandez Sanz
fa8b0a3023 Build with Bazel (#1316)
Initial Bazel build files

Co-authored-by: Willem <github@canihavesome.coffee>
Co-authored-by: Divyam Ahuja <39771050+DivyamAhuja@users.noreply.github.com>
2021-04-04 16:07:12 -07:00
Sivaram D
acb55470f6 [DOCS] Documentation fix and mentioned alternatives that ccx accepts for -stdin and -cc2 options (#1295)
* added alternate params for -stdin and -cc2
* change readme text file to markdown
* deleted README.TXT
2021-04-04 12:46:17 -07:00
Abhik Jain
97da554da6 remove '-nots' flag from file-format parsing (#1315) 2021-04-03 01:53:12 -07:00
pranavrajpal
a121823adc [FIX] Fix segfault on Windows (#1313)
* Fix segfault on Windows

Printing size with the %d format specifier is undefined behavior: size
is defined as a u64, while %d expects an int, which is 32 bits on most
machines. On Windows the size mismatch apparently causes the wrong
pointer to be passed in for the filename, which results in a segfault.

* Add change to changelog
2021-04-02 10:09:40 -07:00
Carlos Fernandez Sanz
cb85740690 Remove -cf (#1312) 2021-03-31 12:28:47 -07:00
Carlos Fernandez Sanz
e91a13bb60 Remove python (#1311)
Since this code is both unused and unmaintained I'm making the executive decision to get rid of it to make our life easier.
2021-03-31 09:55:06 -07:00
Carlos Fernandez Sanz
a063be996b Minor file structure reorg (#1310)
Moved zvbi from thirdparty to lib_ccx.
Moved mp4 from gpacmp4 to libccx.
Adjusted build files as needed.
2021-03-31 09:39:54 -07:00
Abdul Malik
19da837232 docs : Fixed a typo (#1307) 2021-03-25 16:09:14 +00:00
Sivaram D
22a494d834 mentioned debug info on compilation docs (#1300) 2021-02-16 07:26:20 +00:00
Nils
2e68e9f600 Remove -Wimplicit-function-declaration warning #1296 (#1297) 2021-02-08 17:42:16 +00:00
Sivaram D
b1c22e5034 added block for if statement (#1291) 2021-01-14 09:13:46 -08:00
Venkata Shravan
e3c54327e8 Updated Github actions and reduced steps required to upload artifacts. (#1289)
Updated Github actions, reduced upload artifact steps [Windows]. Closes #1284.
2020-12-28 09:59:46 +01:00
Venkata Shravan
9e62f8c557 Documentation fix (#1290) 2020-12-26 08:49:08 -08:00
VaishnaviC
6216247ecb Created block of code for single line branches at lines between 660-670. (#1287)
* Commit 2 ocr.c

Added {} to single-line conditional statements to create blocks instead of keeping them as single line branches.

* Update ocr.c
2020-12-24 15:18:37 -08:00
Tim Gates
082100a0d4 docs: fix simple typo, commmon -> common (#1283)
There is a small typo in src/thirdparty/gpacmp4/gpac/isomedia.h.

Should read `common` rather than `commmon`.
2020-12-23 01:44:39 -08:00
Willem
cf828471d6 Fix Windows build pipeline
warrenbuckley/Setup-MSBuild has been deprecated in favour of microsoft/setup-msbuild, which includes a fix for the failure of the build pipelines (refer to https://github.blog/changelog/2020-10-01-github-actions-deprecating-set-env-and-add-path-commands/)
2020-12-21 10:26:56 +01:00
MackeyStingray
cf84757e02 Fix hardsubx segmentation fault (#1280) 2020-09-13 10:10:02 -07:00
Nils
f486efbb57 [FIX] -Wunused-result warnings (#1269)
* Fix -Wunused-result warnings

* Wrap checked writes into a function

* In write_wrapped, continue writing in case of partial write

If a partial write occurs, it doesn't necessarily mean that something
failed, according to write(2). If this is the case, then the following
write will return -1.

* Fix build on MSVC

https://stackoverflow.com/questions/37460579/error-c2036-void-unknown-size
2020-06-28 14:29:35 -07:00
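The partial-write handling described in f486efbb57 was done in C around write(2). As a rough Rust analogue only, the same loop could look like the sketch below. The name `write_wrapped` is borrowed from the commit message, but the signature and error handling here are assumptions.

```rust
use std::io::{self, Write};

/// Keep calling `write` until every byte of `buf` has been accepted.
/// A short write is not treated as a failure; the loop simply retries
/// with the remaining bytes. (Sketch only; signature is an assumption.)
fn write_wrapped<W: Write>(out: &mut W, mut buf: &[u8]) -> io::Result<()> {
    while !buf.is_empty() {
        match out.write(buf) {
            // No progress at all: report it instead of spinning forever.
            Ok(0) => {
                return Err(io::Error::new(
                    io::ErrorKind::WriteZero,
                    "writer accepted no bytes",
                ))
            }
            // Partial (or full) write: advance past the bytes that were taken.
            Ok(n) => buf = &buf[n..],
            // Interrupted calls are retried.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(())
}
```

In real Rust code `Write::write_all` already behaves this way; the loop is spelled out here only to show the retry logic the commit describes.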
Nils
0db5b0c838 [Formatting] Remove trailing whitespace (#1270) 2020-05-20 15:09:00 +02:00
hamelg
e411a75dcd [FIX] Disable BOM in non-Windows build (#1268)
* Disable BOM in non-Windows build
2020-05-09 15:21:45 -07:00
hamelg
33ecccedce [FIX] Allow all oem modes with tesseract v4 (#1267)
* Allow all oem modes with tesseract v4

* Fix formatting
2020-05-08 14:52:47 -07:00
Willem
28dd35b040 Add DLLs to artifact (#1263)
Expands the Windows build steps to include DLLs in the artifact, making out-of-the-box use of said artifacts easier. The new artifacts will allow running ccextractor (not the GUI yet) directly.
2020-04-28 22:31:15 +02:00
Willem
e82a492c94 Update build_windows.yml
Add version information for all builds
2020-04-26 21:09:19 +02:00
Willem
4509b9daf5 Update build_windows.yml 2020-04-26 21:04:09 +02:00
Willem
d330b78f37 Update build_windows.yml 2020-04-26 20:54:19 +02:00
Willem
ab89f88aea Update build_windows.yml
Add build information to the release build.
2020-04-26 20:50:09 +02:00
Willem
0227c2787a Update build_linux.yml
Correct path for version check for building with cmake
2020-04-26 20:48:25 +02:00
Willem
84dec36845 Update build_linux.yml
Add version information step on all Linux builds
2020-04-26 20:42:44 +02:00
Willem
b4f692807a Update build_linux.yml
Add a step to show version information
2020-04-26 20:38:59 +02:00
apovalyaev
1f5ec6cd8d Update VS project build settings (issue #1254) (#1261)
Improves the build for 32 bit variants.

Contains fixes:
- `/SAFESEH:NO`: needed for linking precompiled ffmpeg-lib libraries
- add paths from $(ProjectDir)libs\lib\ffmpeg-lib and avcodec.lib; avformat.lib; avutil.lib; swscale.lib
- add extra post-build actions to copy libraries
- add $(vcpkg) paths
2020-04-25 17:13:33 +02:00
Willem
6f375cd9b3 Update build_windows.yml
Split up artifacts for easier re-use; ensure paths are correct.
2020-04-25 13:00:27 +02:00
Willem
e959654c6f Update build_windows.yml
Fix wrong paths
2020-04-25 12:39:06 +02:00
Willem
18484d555f Add OCR build to Windows action
Adds a (likely non-working) build stage for building with OCR to the Windows GitHub actions, so we can ensure that Windows keeps building with OCR just fine.
2020-04-25 12:32:04 +02:00
Carlos Fernandez Sanz
1534d81ae7 Added new utf8proc location to Windows project 2020-04-12 15:13:48 -07:00
Nils
84b5df2713 Mention where to send private invitation in the ISSUE_TEMPLATE (#1253)
Makes a small update to the ISSUE_TEMPLATE to clarify instructions for sending samples that cannot be made public.

Co-authored-by: Willem <github@canihavesome.coffee>
2020-04-07 13:55:49 +02:00
Anshul Maheshwari
8e729cc62c Merge pull request #1246 from anshul1912/master
put check for DVB duration with pagetimeout
2020-03-30 22:35:04 +05:30
Willem
0f1f4d889f Apply suggestions from code review 2020-03-29 22:22:49 +02:00
Willem
487b521c9b Merge branch 'master' into master 2020-03-29 22:19:26 +02:00
Willem
1aed90e42c [IMPROVEMENT] Apply clang-format to all remaining files (#1247)
Apply clang-format to all files aside from the icon file in the GUI and modify the action appropriately.
2020-03-29 22:16:39 +02:00
Anshul Maheshwari
e2d387bfa9 put check for DVB duration with pagetimeout 2020-03-28 22:26:40 +05:30
Nils
b974a7ed81 Remove installation of clang (#1244)
This is possible thanks to
https://github.com/actions/virtual-environments/pull/447
2020-03-20 13:08:19 +01:00
vishwesh-D-kumar
522ebae65e [FIX] Fixed paths in MakeFile, fixing the AutoConf compile error (#1242)
Closes #1241. 

Co-authored-by: Willem <github@canihavesome.coffee>
2020-03-03 20:50:55 +01:00
Willem
1b17a04b25 [FIX] Fix Mac build error for reproducible builds (#1232)
* Fix Mac build error for reproducible builds
* Shorten solution with vr8hub's suggestion

Closes #1230
2020-02-16 01:08:21 +01:00
Willem
588c4a8187 Merge pull request #1231 from NilsIrl/remove_branch_specification
[IMPROVEMENT] Remove the need for the push to be on the master branch
2020-02-15 23:50:34 +01:00
Nils André-Chang
88830e6c58 Remove the need for the push to be on the master branch
Contributors don't have branches called master, so it isn't
possible to manually trigger workflows as suggested by
https://github.community/t5/GitHub-Actions/GitHub-Actions-Manual-Trigger-Approvals/m-p/31517.

Also removed the workflow file from the path as it is implicitly set.
2020-02-15 22:12:55 +00:00
Carlos Fernandez Sanz
db646f50ac Update ISSUE_TEMPLATE.md 2020-02-12 17:39:12 -08:00
Nils
b1c9540085 [IMPROVEMENT] Comment out issue (#1178)
* [ISSUE_TEMPLATE.md] Comment out instructions

* [PULL_REQUEST_TEMPLATE.md] Comment out instructions

* Mention in ISSUE_TEMPLATE.md that only useful arguments should be put

* Follow feedback
2020-02-12 17:36:05 -08:00
Nils
e98137e059 [FIX] Fix tags displaying incorrectly (#1229)
This was caused by 19241744d7, which moved from
`unsigned char` to `enums` for colors and fonts. Because the color values
are no longer single bytes stored next to each other, memcpy and memset
didn't work anymore.

The problem:

```patch
6812,6813c6812,6813
< EDITION OF AMERICA'S NEXT TOP
< <i> MODEL</i> ON WEDNESDAYS.<i>          </i>
---
> EDITION OF<i> AMERICA'S NEXT TOP</i>
> <i> MODEL</i> ON WEDNESDAYS.
6817c6817
< EDITION OF AMERICA'S NEXT TOP
---
> EDITION OF<i> AMERICA'S NEXT TOP</i>
6819c6819
< >><i> THE VAMPIRE DIARIES         </i>
---
> >><i> THE VAMPIRE DIARIES</i>
6824,6825c6824,6825
< >><i> THE VA</i>MPIRE DIARIES
< AND<i> THE SECRET CIRCLE          </i>
---
> >><i> THE VAMPIRE DIARIES</i>
> AND<i> THE SECRET CIRCLE</i>
6829,6831c6829,6831
< >><i> THE VA</i>MPIRE DIARIES
< AND<i> THE S</i>ECRET CIRCLE
< ON THURSDAYS.<i>                  </i>
---
> >><i> THE VAMPIRE DIARIES</i>
> AND<i> THE SECRET CIRCLE</i>
> ON THURSDAYS.
6835c6835
< AND<i> THE S</i>ECRET CIRCLE
---
> AND<i> THE SECRET CIRCLE</i>
```
2020-02-12 15:01:15 -08:00
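The failure mode described in e98137e059 can be illustrated with a small sketch. The enum and function names below are hypothetical stand-ins, not the project's actual types. The point is that `memset` writes its value into every byte, and `memcpy` counts bytes, so code written for `unsigned char` arrays stops working once each element is wider than one byte.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a color enum. */
enum color
{
	COL_WHITE = 0,
	COL_GREEN = 1,
	COL_TRANSPARENT = 9
};

/* Fill a row of enum values element by element. memset() would instead
 * store the value in every byte, which only gives the intended result
 * when sizeof(enum color) == 1 or the value is 0. */
void fill_colors(enum color *row, size_t n, enum color value)
{
	assert(row != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		row[i] = value;
}

/* Copy n enum values: the byte count must be scaled by the element size. */
void copy_colors(enum color *dst, const enum color *src, size_t n)
{
	assert(dst != NULL && src != NULL);
	memcpy(dst, src, n * sizeof *dst);
}
```

On a platform where the enum occupies four bytes, `memset(row, COL_GREEN, ...)` would leave each element holding 0x01010101 rather than 1.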
Willem
3c37d49764 Merge pull request #1228 from CCExtractor/canihavesomecoffee-patch-1
[FIX] Remove Windows XP workaround
2020-02-08 21:01:54 +01:00
Willem
a8d6b81baf Remove Windows XP workaround
Removes the workaround that was put in place while waiting for actions/virtual-environments#288 to be fixed.
2020-02-08 20:56:45 +01:00
Willem
b8321cac0f Finetune formatting action
Only trigger action when the action is edited, or when source code is actually being changed.
2020-02-08 20:52:36 +01:00
Ed Marshall
6697ed3496 [FIX] Fix multiple definitions with new -fno-common default in GCC 10 (#1226)
* Fix multiple definitions with new -fno-common default in GCC 10

* Add GCC 10 fix to changelog
2020-02-01 22:26:48 -08:00
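For context on 6697ed3496: with `-fno-common` (the default since GCC 10), a variable defined without `extern` in a header that several `.c` files include produces "multiple definition" errors at link time. The usual remedy, shown below as a single-file sketch, is to declare the variable `extern` in the header and define it in exactly one source file. The commit message does not say which approach was taken, and `frame_counter` and `next_frame` are hypothetical names.

```c
#include <assert.h>

/* In a shared header: declaration only, no storage is allocated.
 * 'frame_counter' is a hypothetical name used for illustration. */
extern int frame_counter;

/* In exactly one .c file: the single definition. */
int frame_counter = 0;

/* Any file that includes the header can then use the variable. */
int next_frame(void)
{
	assert(frame_counter >= 0);
	return ++frame_counter;
}
```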
Nils ANDRÉ-CHANG
722d52420c [IMPROVEMENT] Clang format (#1222)
* Add .clang-format

* Add clang-format github action

* Set more explicit name to GitHub workflow

Co-Authored-By: Willem <github@canihavesome.coffee>

Co-authored-by: Willem <github@canihavesome.coffee>
2020-01-30 09:00:00 -08:00
Nils ANDRÉ-CHANG
af6d8282cb [IMPROVEMENT] Move dependencies to a third party directory (#1219)
* Move dependencies in a folder

* Windows

* MacOS
2020-01-30 04:58:37 -08:00
kdrag0n
732b20aefa [FIX] Clang warning fixes (#1205)
* file_buffer: Fix uninitialized variable usage warning

Clang warns:

In file included from src/lib_ccx/asf_functions.c:5:
src/lib_ccx/file_buffer.h:76:7: warning: variable 'result' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
                if (buffer)
                    ^~~~~~
src/lib_ccx/file_buffer.h:86:9: note: uninitialized use occurs here
        return result;
               ^~~~~~
src/lib_ccx/file_buffer.h:76:3: note: remove the 'if' if its condition is always true
                if (buffer)
                ^~~~~~~~~~~
src/lib_ccx/file_buffer.h:73:15: note: initialize the variable 'result' to silence this warning
        size_t result;
                     ^
                      = 0

* common_timing: Fix uninitialized variable usage warning

The vast majority of the code is already using fatal(), so I don't see
why this should be an exception.

Clang warns:

src/lib_ccx/ccx_common_timing.c:274:3: warning: variable 'fts' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
                default:
                ^~~~~~~
src/lib_ccx/ccx_common_timing.c:280:9: note: uninitialized use occurs here
        return fts;
               ^~~
src/lib_ccx/ccx_common_timing.c:261:11: note: initialize the variable 'fts' to silence this warning
        LLONG fts;
                 ^
                  = 0

* encoders: Fix handling of multibyte characters in UTF-8 converter

This is actually incorrect because characters longer than 1 byte will be
butchered.

Clang warns:

src/lib_ccx/ccx_encoders_common.c:178:12: warning: result of comparison of constant 256 with expression of
type 'unsigned char' is always true [-Wtautological-constant-out-of-range-compare]
                                        if (c < 256)
                                            ~ ^ ~~~
src/lib_ccx/ccx_encoders_common.c:193:12: warning: result of comparison of constant 256 with expression of
type 'unsigned char' is always true [-Wtautological-constant-out-of-range-compare]
                                        if (c < 256)
                                            ~ ^ ~~~
src/lib_ccx/ccx_encoders_common.c:209:12: warning: result of comparison of constant 256 with expression of
type 'unsigned char' is always true [-Wtautological-constant-out-of-range-compare]
                                        if (c < 256)
                                            ~ ^ ~~~
src/lib_ccx/ccx_encoders_common.c:229:12: warning: result of comparison of constant 256 with expression of type 'unsigned char' is always true [-Wtautological-constant-out-of-range-compare]
                                        if (c < 256)
                                            ~ ^ ~~~

* gxf: Fix tautological comparison warnings

Clang warns:

src/lib_ccx/ccx_gxf.c:425:17: warning: result of comparison of constant 256 with expression of type 'unsigned char' is always false [-Wtautological-constant-out-of-range-compare]
                                if (tag_len > STR_LEN)
                                    ~~~~~~~ ^ ~~~~~~~
src/lib_ccx/ccx_gxf.c:542:17: warning: result of comparison of constant 256 with expression of type 'unsigned char' is always false [-Wtautological-constant-out-of-range-compare]
                                if (tag_len > STR_LEN)
                                    ~~~~~~~ ^ ~~~~~~~
src/lib_ccx/ccx_gxf.c:617:17: warning: result of comparison of constant 256 with expression of type 'unsigned char' is always false [-Wtautological-constant-out-of-range-compare]
                                if (tag_len > STR_LEN)
                                    ~~~~~~~ ^ ~~~~~~~

* gxf: Fix uninitialized variable usage warnings

Clang warns:

src/lib_ccx/ccx_gxf.c:1449:8: warning: variable 'first_field_nb' is used uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
                case TRACK_TYPE_MPEG1_525:
                     ^~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1475:35: note: uninitialized use occurs here
        debug("first field number %d\n", first_field_nb);
                                         ^~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1450:8: warning: variable 'first_field_nb' is used uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
                case TRACK_TYPE_MPEG2_525:
                     ^~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1475:35: note: uninitialized use occurs here
        debug("first field number %d\n", first_field_nb);
                                         ^~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1456:3: warning: variable 'first_field_nb' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
                default:
                ^~~~~~~
src/lib_ccx/ccx_gxf.c:1475:35: note: uninitialized use occurs here
        debug("first field number %d\n", first_field_nb);
                                         ^~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1410:30: note: initialize the variable 'first_field_nb' to silence this warning
        unsigned char first_field_nb;
                                    ^
                                     = '\0'
src/lib_ccx/ccx_gxf.c:1449:8: warning: variable 'last_field_nb' is used uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
                case TRACK_TYPE_MPEG1_525:
                     ^~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1476:34: note: uninitialized use occurs here
        debug("last field number %d\n", last_field_nb);
                                        ^~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1450:8: warning: variable 'last_field_nb' is used uninitialized whenever switch case is taken [-Wsometimes-uninitialized]
                case TRACK_TYPE_MPEG2_525:
                     ^~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1476:34: note: uninitialized use occurs here
        debug("last field number %d\n", last_field_nb);
                                        ^~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1456:3: warning: variable 'last_field_nb' is used uninitialized whenever switch default is taken [-Wsometimes-uninitialized]
                default:
                ^~~~~~~
src/lib_ccx/ccx_gxf.c:1476:34: note: uninitialized use occurs here
        debug("last field number %d\n", last_field_nb);
                                        ^~~~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:28:115: note: expanded from macro 'debug'
                                                                                                                  ^~~~~~~~~~~
src/lib_ccx/ccx_gxf.c:1411:29: note: initialize the variable 'last_field_nb' to silence this warning
        unsigned char last_field_nb;
                                   ^
                                    = '\0'

* ts_functions: Fix incorrect enumeration type in get_buffer_type

Clang warns:

src/lib_ccx/ts_functions.c:127:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_PES;
                ~~~~~~ ^~~~~~~
src/lib_ccx/ts_functions.c:131:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_H264;
                ~~~~~~ ^~~~~~~~
src/lib_ccx/ts_functions.c:135:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_DVB_SUBTITLE;
                ~~~~~~ ^~~~~~~~~~~~~~~~
src/lib_ccx/ts_functions.c:139:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_ISDB_SUBTITLE;
                ~~~~~~ ^~~~~~~~~~~~~~~~~
src/lib_ccx/ts_functions.c:143:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_HAUPPAGE;
                ~~~~~~ ^~~~~~~~~~~~
src/lib_ccx/ts_functions.c:147:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_TELETEXT;
                ~~~~~~ ^~~~~~~~~~~~
src/lib_ccx/ts_functions.c:151:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_PRIVATE_MPEG2_CC;
                ~~~~~~ ^~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ts_functions.c:155:10: warning: implicit conversion from enumeration type 'enum ccx_bufferdata_type' to different enumeration type 'enum ccx_stream_type' [-Wenum-conversion]
                return CCX_PES;
                ~~~~~~ ^~~~~~~
src/lib_ccx/ts_functions.c:491:24: warning: implicit conversion from enumeration type 'enum ccx_stream_type' to different enumeration type 'enum ccx_bufferdata_type' [-Wenum-conversion]
        ptr->bufferdatatype = get_buffer_type(cinfo);
                            ~ ^~~~~~~~~~~~~~~~~~~~~~

* utility: Fix tautological comparison warnings

Clang warns:

src/lib_ccx/utility.c:605:24: warning: result of comparison of constant 65536 with expression of type 'unsigned short' is always true [-Wtautological-constant-out-of-range-compare]
        } else if (utf16_char < 0x010000) {
                   ~~~~~~~~~~ ^ ~~~~~~~~
src/lib_ccx/utility.c:610:24: warning: result of comparison of constant 1114112 with expression of type 'unsigned short' is always true [-Wtautological-constant-out-of-range-compare]
        } else if (utf16_char < 0x110000) {
                   ~~~~~~~~~~ ^ ~~~~~~~~

* ocr: Fix floating point -> integer abs() warning

Clang warns:

src/lib_ccx/ocr.c:529:8: warning: using integer absolute value function 'abs' when argument is of floating point type [-Wabsolute-value]
                                if(abs(h-h0)>50) // Color has changed
                                   ^
src/lib_ccx/ocr.c:529:8: note: use function 'fabsf' instead
                                if(abs(h-h0)>50) // Color has changed
                                   ^~~
                                   fabsf
src/lib_ccx/ocr.c:529:8: note: include the header <math.h> or explicitly provide a declaration for 'fabsf'

* encoders: Fix incorrect string types when EIA-608 is in use

Clang warns:

src/lib_ccx/ccx_encoders_helpers.c: In function ‘clever_capitalize’:
src/lib_ccx/ccx_encoders_helpers.c:186:4: warning: case label value exceeds maximum value for type
  186 |    case 0x89: // This is a transparent space
      |    ^~~~

* ocr: Fix implicit struct declaration warning

Clang warns:

In file included from src/lib_ccx/dvd_subtitle_decoder.c:10:
src/lib_ccx/ocr.h:18:54: warning: ‘struct encoder_ctx’ declared inside parameter list will not be visible outside of this definition or declaration
   18 | char *paraof_ocrtext(struct cc_subtitle *sub, struct encoder_ctx *context);
      |                                                      ^~~~~~~~~~~
2020-01-29 21:39:40 -08:00
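As an illustration of the `-Wsometimes-uninitialized` fixes listed in 732b20aefa, the sketch below follows the compiler's own suggestion of initializing the variable at its declaration. `copy_if_buffer` is a hypothetical function loosely shaped like the code in the warning, not the actual contents of `file_buffer.h`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy 'bytes' bytes when a destination buffer is
 * given, and report how many bytes were copied. */
size_t copy_if_buffer(unsigned char *buffer, const unsigned char *src, size_t bytes)
{
	size_t result = 0; /* defined on every path, including buffer == NULL */

	assert(src != NULL || bytes == 0);
	if (buffer && bytes > 0)
	{
		memcpy(buffer, src, bytes);
		result = bytes;
	}
	return result;
}
```

Without the initializer, the `buffer == NULL` path would return an indeterminate value, which is exactly what Clang reports.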
Nils ANDRÉ-CHANG
54318d0402 Allow the user to choose between CRLF and LF (#1220)
Defaults to CRLF
2020-01-28 21:18:10 -08:00
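A selectable line terminator of the kind 54318d0402 describes might look roughly like this. All names here are hypothetical, and the project's actual option name and storage are not shown in the log. The only detail taken from the commit is that CRLF is the default.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical names; the project's actual option and enum may differ. */
enum line_ending
{
	LINE_ENDING_CRLF = 0, /* default, per the commit message */
	LINE_ENDING_LF = 1
};

const char *line_terminator(enum line_ending le)
{
	return le == LINE_ENDING_LF ? "\n" : "\r\n";
}

/* Format one subtitle line followed by the selected terminator.
 * Returns the number of characters needed, like snprintf(). */
int format_line(char *out, size_t size, const char *text, enum line_ending le)
{
	assert(out != NULL && text != NULL);
	return snprintf(out, size, "%s%s", text, line_terminator(le));
}
```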
kdrag0n
5f61fae0c7 scc: Switch to CRLF line endings (#1209)
All the SCC and CCD examples I can find have CRLF line endings. VLC and
libavformat (used by MPV) don't care, so just go with the popular
convention and switch to CRLF. There's no reason a user would want to
choose their line endings in this scenario.
2020-01-25 19:33:22 -08:00
kdrag0n
0afba56a26 scc: Implement colors (#1213) 2020-01-25 16:16:00 -08:00
Carlos Fernandez Sanz
0873953d9f Update CHANGES.TXT 2020-01-25 15:35:34 -08:00
Carlos Fernandez Sanz
75af5f2e8c Applied clang formatting to our .c files. Tried to leave everyone else's alone. 2020-01-25 13:29:18 -08:00
Nils ANDRÉ-CHANG
8d8dc9ccc2 Improve and simplify dprintf implementation (#1185)
It now returns a value like the rest of the printf family. It doesn't
brute force the amount of memory that needs to be allocated.

It also removes a warning.

I do not believe there should be any performance concerns with this
implementation as it is what `glibc` does:

https://code.woboq.org/userspace/glibc/libio/iovdprintf.c.html
2020-01-24 23:58:44 -08:00
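One way to write a `dprintf`-like function that returns a count and sizes its buffer exactly, as 8d8dc9ccc2 describes, is the two-pass `vsnprintf` pattern below. This is a sketch under assumptions: `fd_printf` is a hypothetical name, and the commit's actual implementation (which it likens to glibc's) may work differently.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical fd_printf(): format into an exactly sized buffer and write
 * it to a file descriptor. Returns the bytes written, or -1 on error. */
int fd_printf(int fd, const char *fmt, ...)
{
	va_list ap, ap2;
	int len;
	char *buf;
	ssize_t written;

	assert(fmt != NULL);
	va_start(ap, fmt);
	va_copy(ap2, ap);

	/* First pass: with a NULL buffer of size 0, vsnprintf() reports the
	 * length needed, so no guessing or retry loop is required. */
	len = vsnprintf(NULL, 0, fmt, ap);
	va_end(ap);
	if (len < 0)
	{
		va_end(ap2);
		return -1;
	}

	buf = malloc((size_t)len + 1);
	if (buf == NULL)
	{
		va_end(ap2);
		return -1;
	}

	/* Second pass: format for real, then hand the bytes to write(2). */
	vsnprintf(buf, (size_t)len + 1, fmt, ap2);
	va_end(ap2);

	written = write(fd, buf, (size_t)len);
	free(buf);
	return written < 0 ? -1 : (int)written;
}
```

For brevity this sketch does not retry after a short write; a caller that needs every byte delivered would loop on the return value.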
Nils ANDRÉ-CHANG
e37a21aace Fix longer subtitles (#1216) 2020-01-23 19:05:19 -08:00
Nils ANDRÉ-CHANG
40a603d366 Fix documentation (#1218) 2020-01-23 18:49:58 -08:00
kdrag0n
c5bed1e3b2 [FIX] GCC warning fixes (#1204)
* cea708: Fix missing new line in log message

* subtype: Remove unused CC_708 type

CEA-708 inputs are coerced to CC_608 before hitting encode_sub.

GCC warns:

src/lib_ccx/ccx_encoders_common.c: In function ‘encode_sub’:
src/lib_ccx/ccx_encoders_common.c:1119:2: warning: enumeration value ‘CC_708’ not handled in switch [-Wswitch]
 1119 |  switch (sub->type)
      |  ^~~~~~

* build: Disable pointer-sign warning

This warning triggers all over the codebase due to the widespread use of
unsigned char arrays for parsed subtitle strings and them being passed
to string functions that expect signed ones. Since this won't actually
cause issues, silence the warning across the entire codebase.

* splitbysentence: Fix warnings

GCC warns:

src/lib_ccx/ccx_encoders_splitbysentence.c: In function ‘sbs_is_pointer_on_sentence_breaker’:
src/lib_ccx/ccx_encoders_splitbysentence.c:170:7: warning: variable ‘p’ set but not used [-Wunused-but-set-variable]
  170 |  char p = *(current - 1);
      |       ^
src/lib_ccx/ccx_encoders_splitbysentence.c: In function ‘sbs_find_insert_point_partial’:
src/lib_ccx/ccx_encoders_splitbysentence.c:231:1: warning: multi-line comment [-Wcomment]
  231 | //   sprintf(fmtbuf, "SBS: sbs_find_insert_point_partial: compare\n\
      | ^
src/lib_ccx/ccx_encoders_splitbysentence.c:263:1: warning: multi-line comment [-Wcomment]
  263 | //   LOG_DEBUG("SBS: sbs_find_insert_point_partial: LEFT CHANGED,\n\tbuf:[%s]\n\tstr:[%s]\n\
      | ^
src/lib_ccx/ccx_encoders_splitbysentence.c:297:1: warning: multi-line comment [-Wcomment]
  297 | //   sprintf(fmtbuf, "SBS: sbs_find_insert_point_partial: REPLACE ENTIRE TAIL !!\n\
      | ^
src/lib_ccx/ccx_encoders_splitbysentence.c:222:6: warning: unused variable ‘i’ [-Wunused-variable]
  222 |  int i; // top level indexer for strings
      |      ^
src/lib_ccx/ccx_encoders_splitbysentence.c: In function ‘reformat_cc_bitmap_through_sentence_buffer’:
src/lib_ccx/ccx_encoders_splitbysentence.c:730:8: warning: unused variable ‘str’ [-Wunused-variable]
  730 |  char *str;
      |        ^~~
src/lib_ccx/ccx_encoders_splitbysentence.c:729:6: warning: unused variable ‘i’ [-Wunused-variable]
  729 |  int i = 0;
      |      ^
src/lib_ccx/ccx_encoders_splitbysentence.c:728:6: warning: unused variable ‘used’ [-Wunused-variable]
  728 |  int used;
      |      ^~~~
src/lib_ccx/ccx_encoders_splitbysentence.c:727:18: warning: unused variable ‘ms_end’ [-Wunused-variable]
  727 |  LLONG ms_start, ms_end;
      |                  ^~~~~~
src/lib_ccx/ccx_encoders_splitbysentence.c:727:8: warning: unused variable ‘ms_start’ [-Wunused-variable]
  727 |  LLONG ms_start, ms_end;
      |        ^~~~~~~~
src/lib_ccx/ccx_encoders_splitbysentence.c:726:20: warning: unused variable ‘rect’ [-Wunused-variable]
  726 |  struct cc_bitmap* rect;
      |                    ^~~~

* spupng: Fix warnings

GCC warns:

src/lib_ccx/ccx_encoders_spupng.c: In function ‘init_face’:
src/lib_ccx/ccx_encoders_spupng.c:644:6: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
  644 |  if (error = FT_New_Face(ft_library, font, 0, face))
      |      ^~~~~
src/lib_ccx/ccx_encoders_spupng.c:651:6: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
  651 |  if (error = FT_Set_Pixel_Sizes(*face, 0, FONT_SIZE))
      |      ^~~~~
src/lib_ccx/ccx_encoders_spupng.c: In function ‘spupng_export_string2png’:
src/lib_ccx/ccx_encoders_spupng.c:698:7: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
  698 |   if (error = FT_Init_FreeType(&ft_library))
      |       ^~~~~
src/lib_ccx/ccx_encoders_spupng.c:706:6: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
  706 |  if (error = init_face(&face_regular, ccx_options.enc_cfg.render_font))
      |      ^~~~~
src/lib_ccx/ccx_encoders_spupng.c:708:6: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
  708 |  if (error = init_face(&face_italics, ccx_options.enc_cfg.render_font_italics))
      |      ^~~~~
src/lib_ccx/ccx_encoders_spupng.c:850:9: warning: unused variable ‘height’ [-Wunused-variable]
  850 |     int height = slot->bitmap.rows;
      |         ^~~~~~
src/lib_ccx/ccx_encoders_spupng.c:849:9: warning: unused variable ‘width’ [-Wunused-variable]
  849 |     int width = slot->bitmap.width;
      |         ^~~~~
src/lib_ccx/ccx_encoders_webvtt.c: In function ‘write_webvtt_header’:
src/lib_ccx/ccx_encoders_webvtt.c:263:1: warning: control reaches end of non-void function [-Wreturn-type]
  263 | }
      | ^

* webvtt: Fix missing return warning

The return value of this function is never used, so just drop the
values.

GCC warns:

src/lib_ccx/ccx_encoders_webvtt.c: In function ‘write_webvtt_header’:
src/lib_ccx/ccx_encoders_webvtt.c:263:1: warning: control reaches end of non-void function [-Wreturn-type]
  263 | }
      | ^

* gxf: Fix MIN macro redefinition warning

GCC warns:

src/lib_ccx/ccx_gxf.c:23: warning: "MIN" redefined
   23 | #define MIN(a, b) ( (a < b) ? a : b)
      |
In file included from src/lib_ccx/ccx_demuxer.h:8,
                 from src/lib_ccx/ccx_gxf.h:4,
                 from src/lib_ccx/ccx_gxf.c:13:
src/lib_ccx/utility.h:8: note: this is the location of the previous definition
    8 | #define MIN(X, Y) (((X) < (Y)) ? (X) : (Y))
      |

* dvd: Fix unused variable warnings

GCC warns:

src/lib_ccx/dvd_subtitle_decoder.c: In function ‘get_bitmap’:
src/lib_ccx/dvd_subtitle_decoder.c:133:9: warning: unused variable ‘discard’ [-Wunused-variable]
  133 |     int discard = get_bits(ctx, &nextbyte, &pos, &m);
      |         ^~~~~~~
src/lib_ccx/dvd_subtitle_decoder.c:172:9: warning: unused variable ‘discard’ [-Wunused-variable]
  172 |     int discard = get_bits(ctx, &nextbyte, &pos, &m);
      |         ^~~~~~~
src/lib_ccx/dvd_subtitle_decoder.c: In function ‘write_dvd_sub’:
src/lib_ccx/dvd_subtitle_decoder.c:320:6: warning: unused variable ‘ret’ [-Wunused-variable]
  320 |  int ret =0;
      |      ^~~

* es_functions: Fix unused variable warning

This also removes the stale commented code that used this variable.

GCC warns:

src/lib_ccx/es_functions.c: In function ‘read_pic_info’:
src/lib_ccx/es_functions.c:682:7: warning: unused variable ‘frame_type_to_char’ [-Wunused-variable]
  682 |  char frame_type_to_char[] = { '?', 'I', 'P','B', 'D', '?', '?','?' };
      |       ^~~~~~~~~~~~~~~~~~

* dvb: Fix unused variable warning when OCR is disabled

GCC warns:

src/lib_ccx/dvb_subtitle_decoder.c: In function ‘write_dvb_sub’:
src/lib_ccx/dvb_subtitle_decoder.c:1509:6: warning: unused variable ‘ret’ [-Wunused-variable]
 1509 |  int ret = 0;
      |      ^~~

* general_loop: Fix warnings

GCC warns:

src/lib_ccx/general_loop.c: In function ‘general_loop’:
src/lib_ccx/general_loop.c:1113:15: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
 1113 |      (enc_ctx && (enc_ctx->srt_counter || enc_ctx->cea_708_counter) ||
      |       ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
At top level:
src/lib_ccx/general_loop.c:25:28: warning: ‘DO_NOTHING’ defined but not used [-Wunused-const-variable=]
   25 | const static unsigned char DO_NOTHING[] = {0x80, 0x80};
      |                            ^~~~~~~~~~

* networking: Fix unknown pragma warning for non-MSVC compilers

GCC warns:

src/lib_ccx/networking.c:22: warning: ignoring #pragma warning  [-Wunknown-pragmas]
   22 | #pragma warning( suppress : 4005)
      |

* networking: Fix unused variable warnings on non-Windows platforms

GCC warns:

src/lib_ccx/networking.c: In function ‘net_udp_read’:
src/lib_ccx/networking.c:342:12: warning: variable ‘addr’ set but not used [-Wunused-but-set-variable]
  342 |  in_addr_t addr;
      |            ^~~~
src/lib_ccx/networking.c:340:12: warning: unused variable ‘len’ [-Wunused-variable]
  340 |  socklen_t len = sizeof(source_addr);
      |            ^~~
src/lib_ccx/networking.c:338:7: warning: unused variable ‘ip’ [-Wunused-variable]
  338 |  char ip[INET_ADDRSTRLEN];
      |       ^~

* params: Fix unused variable warning when OCR is disabled

GCC warns:

src/lib_ccx/params.c: In function ‘version’:
src/lib_ccx/params.c:1015:8: warning: unused variable ‘leptversion’ [-Wunused-variable]
 1015 |  char *leptversion;
      |        ^~~~~~~~~~~

* params_dump: Fix empty encoding when ASCII is used

GCC warns:

src/lib_ccx/params_dump.c: In function ‘params_dump’:
src/lib_ccx/params_dump.c:110:2: warning: enumeration value ‘CCX_ENC_ASCII’ not handled in switch [-Wswitch]
  110 |  switch (ccx_options.enc_cfg.encoding)
      |  ^~~~~~

* params_dump: Fix comparison between mismatching enums

GCC warns:

src/lib_ccx/params_dump.c: In function ‘print_file_report’:
src/lib_ccx/params_dump.c:402:18: warning: comparison between ‘enum ccx_stream_type’ and ‘enum ccx_stream_mode_enum’ [-Wenum-compare]
  402 |    (info->stream == CCX_SM_TRANSPORT ||
      |                  ^~
src/lib_ccx/params_dump.c:403:18: warning: comparison between ‘enum ccx_stream_type’ and ‘enum ccx_stream_mode_enum’ [-Wenum-compare]
  403 |     info->stream == CCX_SM_PROGRAM ||
      |                  ^~
src/lib_ccx/params_dump.c:404:18: warning: comparison between ‘enum ccx_stream_type’ and ‘enum ccx_stream_mode_enum’ [-Wenum-compare]
  404 |     info->stream == CCX_SM_ASF ||
      |                  ^~
src/lib_ccx/params_dump.c:405:18: warning: comparison between ‘enum ccx_stream_type’ and ‘enum ccx_stream_mode_enum’ [-Wenum-compare]
  405 |     info->stream == CCX_SM_WTV))
      |                  ^~

* telxcc: Fix unused variable warning

GCC warns:

src/lib_ccx/telxcc.c: In function ‘process_telx_packet’:
src/lib_ccx/telxcc.c:928:10: warning: unused variable ‘flag_subtitle’ [-Wunused-variable]
  928 |  uint8_t flag_subtitle;
      |          ^~~~~~~~~~~~~

* ts_functions: Fix unused variable warnings

GCC warns:

src/lib_ccx/ts_functions.c: In function ‘get_pts’:
src/lib_ccx/ts_functions.c:642:11: warning: variable ‘pes_packet_length’ set but not used [-Wunused-but-set-variable]
  642 |  uint16_t pes_packet_length;
      |           ^~~~~~~~~~~~~~~~~
src/lib_ccx/ts_functions.c:641:10: warning: variable ‘pes_stream_id’ set but not used [-Wunused-but-set-variable]
  641 |  uint8_t pes_stream_id;
      |          ^~~~~~~~~~~~~

* ts_tables_epg: Fix warnings

GCC warns:

src/lib_ccx/ts_tables_epg.c: In function ‘EPG_add_event’:
src/lib_ccx/ts_tables_epg.c:380:6: warning: unused variable ‘isnew’ [-Wunused-variable]
  380 |  int isnew=true, j;
      |      ^~~~~
src/lib_ccx/ts_tables_epg.c: In function ‘EPG_DVB_decode_string’:
src/lib_ccx/ts_tables_epg.c:469:6: warning: variable ‘ret’ set but not used [-Wunused-but-set-variable]
  469 |  int ret=-1;
      |      ^~~
src/lib_ccx/ts_tables_epg.c: In function ‘EPG_ATSC_decode_EIT’:
src/lib_ccx/ts_tables_epg.c:802:25: warning: variable ‘emt_location’ set but not used [-Wunused-but-set-variable]
  802 |   uint8_t title_length, emt_location;
      |                         ^~~~~~~~~~~~
src/lib_ccx/ts_tables_epg.c:764:10: warning: variable ‘table_id’ set but not used [-Wunused-but-set-variable]
  764 |  uint8_t table_id;
      |          ^~~~~~~~
src/lib_ccx/ts_tables_epg.c: In function ‘EPG_ATSC_decode_VCT’:
src/lib_ccx/ts_tables_epg.c:837:10: warning: variable ‘table_id’ set but not used [-Wunused-but-set-variable]
  837 |  uint8_t table_id;
      |          ^~~~~~~~
src/lib_ccx/ts_tables_epg.c: In function ‘EPG_DVB_decode_EIT’:
src/lib_ccx/ts_tables_epg.c:883:10: warning: variable ‘segment_last_section_number’ set but not used [-Wunused-but-set-variable]
  883 |  uint8_t segment_last_section_number;
      |          ^~~~~~~~~~~~~~~~~~~~~~~~~~~
src/lib_ccx/ts_tables_epg.c:882:10: warning: variable ‘last_section_number’ set but not used [-Wunused-but-set-variable]
  882 |  uint8_t last_section_number;
      |          ^~~~~~~~~~~~~~~~~~~
src/lib_ccx/ts_tables_epg.c: In function ‘parse_EPG_packet’:
src/lib_ccx/ts_tables_epg.c:1041:11: warning: unused variable ‘transport_error_indicator’ [-Wunused-variable]
 1041 |  unsigned transport_error_indicator = (tspacket[1]&0x80)>>7;
      |           ^~~~~~~~~~~~~~~~~~~~~~~~~

* matroska: Fix unused variable warning

The call is left alone since it might create a decoder context.
GCC warns:

src/lib_ccx/matroska.c: In function ‘matroska_save_all’:
src/lib_ccx/matroska.c:1182:27: warning: unused variable ‘dec_ctx’ [-Wunused-variable]
 1182 |     struct lib_cc_decode *dec_ctx = update_decoder_list(mkv_ctx->ctx);
      |                           ^~~~~~~

* utility: Only define MIN when necessary

GCC warns:

In file included from src/lib_ccx/ccx_demuxer.h:8,
                 from src/lib_ccx/lib_ccx.h:15,
                 from src/gpacmp4/mp4.c:6:
src/lib_ccx/utility.h:8: warning: "MIN" redefined
    8 | #define MIN(X, Y) (((X) < (Y)) ? (X) : (Y))
      |
In file included from src/gpacmp4/gpac/tools.h:33,
                 from src/gpacmp4/gpac/isomedia.h:50,
                 from src/gpacmp4/mp4.c:5:
src/gpacmp4/gpac/setup.h:324: note: this is the location of the previous definition
  324 | #define MIN(X, Y) ((X)<(Y)?(X):(Y))
      |
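A minimal sketch of the guard this commit title describes: `MIN` is defined only if no earlier header has already provided it. The macro body matches the one shown in the warning above; `clamp_len` is a hypothetical helper used only for illustration.

```c
#include <assert.h>

/* Define MIN only when no earlier header (for example GPAC's setup.h)
 * has already provided it, which avoids the "MIN redefined" warning. */
#ifndef MIN
#define MIN(X, Y) (((X) < (Y)) ? (X) : (Y))
#endif

/* Hypothetical helper, for illustration only: cap a length at a limit. */
int clamp_len(int len, int limit)
{
    assert(limit >= 0); /* precondition assumed by this sketch */
    return MIN(len, limit);
}
```

Because both definitions compute the same result, it should not matter which header wins; the guard only removes the duplicate definition.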
2020-01-23 18:49:16 -08:00
Nils ANDRÉ-CHANG
8db3398eb7 [IMPROVEMENT] Implement subtitle modifications for each encoder (#1214)
* Implement subtitle modification for all 608 encoders

This is done by modifying the subtitles in `ccx_encoders_common.c`
rather than per encoder.

* Use `char *` instead of subtitle data to capitalize

* Implement subtitle modification for OCR encoders

* Remove signedness warnings

* Remove two-word profanity

They do not work for the moment

* Deal with different encoding

* Mention in changelog
2020-01-23 18:45:56 -08:00
Nils ANDRÉ-CHANG
7b038ab649 Fix use-after-free (#1215) 2020-01-23 09:39:45 -08:00
kdrag0n
7d0c2ede26 [IMPROVEMENT] Clean up SCC control codes (#1212)
* scc: Reformat control code list

- Separate sections with a blank line
- Align with 4-wide tabs rather than spaces
- Rewrite some comments

* scc: Revamp control code handling

This can be made much more readable by adding a small info struct that
contains all the information about a control code (first byte odd &
even, second byte, and assembly). Information is stored in and retrieved
from an array, created using an array initializer with the enum values
as indices.

This allows us to remove the massive switch-case blocks, leading to much
cleaner and more streamlined code.
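The table-driven approach described above can be sketched roughly as follows. This is an illustration under assumptions, not the code from the commit: the enum is a small hypothetical subset, "odd" and "even" are read here as odd and even channels, and the byte values are the EIA-608 codes without the parity bit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical subset of control codes; the real list is longer. */
enum control_code
{
    RCL, /* Resume Caption Loading */
    EDM, /* Erase Displayed Memory */
    ENM, /* Erase Non-displayed Memory */
    EOC, /* End Of Caption */
    CONTROL_CODE_MAX
};

/* Everything known about one control code. */
struct control_code_info
{
    unsigned char byte1_odd;  /* first byte on odd channels (assumed meaning) */
    unsigned char byte1_even; /* first byte on even channels (assumed meaning) */
    unsigned char byte2;      /* second byte */
    const char *assembly;     /* mnemonic */
};

/* Array initializer with the enum values as indices. */
static const struct control_code_info control_codes[CONTROL_CODE_MAX] = {
    [RCL] = {0x14, 0x1c, 0x20, "RCL"},
    [EDM] = {0x14, 0x1c, 0x2c, "EDM"},
    [ENM] = {0x14, 0x1c, 0x2e, "ENM"},
    [EOC] = {0x14, 0x1c, 0x2f, "EOC"},
};

/* A single lookup replaces a switch-case block per property. */
const struct control_code_info *get_control_code_info(enum control_code code)
{
    assert((unsigned)code < CONTROL_CODE_MAX);
    return &control_codes[code];
}

/* Pick the first byte for a 1-based channel number. */
unsigned char first_byte(enum control_code code, int channel)
{
    const struct control_code_info *info = get_control_code_info(code);
    return (channel % 2 == 1) ? info->byte1_odd : info->byte1_even;
}
```

Adding a control code then means adding one enum value and one table row, rather than one new case in every switch.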
2020-01-22 23:26:23 -08:00
Nils ANDRÉ-CHANG
60773bb859 [IMPROVEMENT] Add noreturn attribute to fatal (#1179)
* Set no return

* Add MSVC
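A sketch of how a noreturn annotation can be made portable across GCC/Clang and MSVC. The macro name `CCX_NORETURN` and the functions `fatal_sketch` and `checked_divide` are hypothetical and chosen for illustration; the commit may spell things differently.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Pick the compiler-specific spelling of "this function never returns". */
#if defined(_MSC_VER)
#define CCX_NORETURN __declspec(noreturn)
#elif defined(__GNUC__)
#define CCX_NORETURN __attribute__((noreturn))
#else
#define CCX_NORETURN
#endif

CCX_NORETURN void fatal_sketch(int exit_code, const char *message);

/* Print the message and terminate; control never returns to the caller. */
void fatal_sketch(int exit_code, const char *message)
{
    assert(message != NULL);
    fprintf(stderr, "Error: %s\n", message);
    exit(exit_code);
}

/* With the annotation, the compiler knows the division below is only
 * reached when b is non-zero, so it does not warn about that path. */
int checked_divide(int a, int b)
{
    if (b == 0)
        fatal_sketch(EXIT_FAILURE, "division by zero");
    return a / b;
}
```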
2020-01-22 23:25:43 -08:00
kdrag0n
a919ef4410 [FIX] SCC character pair writing (#1210)
* scc: Fix character pair writing

The space was being inserted in the wrong position, so the first
character of each caption was being cut off. The last character was also
cut off in captions with even lengths.

Reported-By: Nils ANDRÉ-CHANG <nils@nilsand.re>

* scc: Apply pair writing to control codes

The same mandatory pair logic applies here.
2020-01-22 23:23:00 -08:00
kdrag0n
424e67f5f4 [FIX] Fix SCC timing and lingering captions (#1211)
* scc: Fix timing and lingering captions

- Write EDM codes at end times to clear them from the screen as intended
  by the captioners
- Show captions at the correct times:
  - EOC+ENM *shows* the caption. It doesn't clear it -- that's EDM's job.
  - The caption is *not* shown immediately after loading. EOC (End Of
    Caption) is required for it to actually show.

Old behavior:
Start time: Load caption
End time: Show loaded caption

New behavior:
Start time: Load and show caption
End time: Clear displayed caption

These changes fix the issue where captions were always one line off --
that is, caption 1 would show when caption 2 was supposed to show.

* scc: Calculate frame number using a more precise frame rate

* scc: Fix timecode format specifiers

These ints are unsigned.
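As an illustration of the last two points, here is a sketch that derives a frame number from a millisecond timestamp and prints it with unsigned format specifiers. It assumes a frame rate of 30000/1001 (about 29.97 fps) and a non-drop-frame `HH:MM:SS:FF` layout; `format_timecode` is a hypothetical name, and the commit itself may compute the frame differently.

```c
#include <assert.h>
#include <stdio.h>

/* Format a millisecond timestamp as HH:MM:SS:FF.
 * Returns the number of characters written (excluding the terminator),
 * as reported by snprintf. */
int format_timecode(char *out, size_t out_size, unsigned long long ms)
{
    assert(out != NULL);

    unsigned hours = (unsigned)(ms / 3600000ULL);
    unsigned minutes = (unsigned)(ms / 60000ULL % 60);
    unsigned seconds = (unsigned)(ms / 1000ULL % 60);
    /* Frame within the current second, at 30000/1001 frames per second,
     * using integer arithmetic to avoid rounding 29.97 to 30. */
    unsigned frame = (unsigned)((ms % 1000ULL) * 30000ULL / (1001ULL * 1000ULL));

    /* All four values are unsigned, so %u is the matching specifier. */
    return snprintf(out, out_size, "%02u:%02u:%02u:%02u",
                    hours, minutes, seconds, frame);
}
```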
2020-01-22 23:18:18 -08:00
Nils ANDRÉ-CHANG
4097831b9b Remove useless O(N) operations and memory allocations (#1207) 2020-01-22 09:03:21 -08:00
kdrag0n
1764aa1f92 scc: Write all characters in pairs (#1208)
This is how every example appears to be structured. MPV doesn't display
anything without this.

Before: "e5 f2 e5 20"
After: "e5f2 e520"
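The before/after strings above can be reproduced with a small formatter that emits two bytes per group. This is a sketch under assumptions: `write_hex_pairs` is a hypothetical name, and odd-length input is simply rejected here, since how a lone trailing byte is padded is not stated in this commit.

```c
#include <assert.h>
#include <stdio.h>

/* Write bytes as lowercase hex, two bytes per space-separated group.
 * Returns the number of characters written, or -1 on odd length or
 * insufficient buffer space. */
int write_hex_pairs(char *out, size_t out_size,
                    const unsigned char *bytes, size_t len)
{
    size_t used = 0;

    assert(out != NULL && bytes != NULL);
    if (len % 2 != 0 || out_size == 0)
        return -1;

    out[0] = '\0';
    for (size_t i = 0; i < len; i += 2)
    {
        int n = snprintf(out + used, out_size - used, "%s%02x%02x",
                         i == 0 ? "" : " ",
                         (unsigned)bytes[i], (unsigned)bytes[i + 1]);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1; /* output buffer too small */
        used += (size_t)n;
    }
    return (int)used;
}
```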
2020-01-22 08:01:53 -08:00
kdrag0n
19de49763a [FIX] Fix minor memory leak in OCR code (#1206)
* ocr: Fix minor memory leak

Detected by Valgrind:

==1203168== 2,880 bytes in 57 blocks are definitely lost in loss record 3 of 4
==1203168==    at 0x483877F: malloc (vg_replace_malloc.c:309)
==1203168==    by 0x51ADBEE: strdup (in /usr/lib/libc-2.30.so)
==1203168==    by 0x24D1F8: ocr_bitmap (ocr.c:569)
==1203168==    by 0x24E25B: ocr_rect (ocr.c:907)
==1203168==    by 0x284832: write_dvb_sub (dvb_subtitle_decoder.c:1665)
==1203168==    by 0x284B7A: dvbsub_handle_display_segment (dvb_subtitle_decoder.c:1720)
==1203168==    by 0x285024: dvbsub_decode (dvb_subtitle_decoder.c:1828)
==1203168==    by 0x2406AF: process_data (general_loop.c:648)
==1203168==    by 0x2416D0: general_loop (general_loop.c:1025)
==1203168==    by 0x1AC89A: api_start (ccextractor.c:214)
==1203168==    by 0x16EC03: main (ccextractor.c:536)

* changes: Document OCR memory leak fix
2020-01-21 08:19:19 -08:00
kdrag0n
a0b4e389f9 [FIX] EIA-608 screen clearing fix (#1203)
* eia608: Re-use constant rather than hard-coding length in arrays

Hard-coding them is less clear and more prone to breakage.

* eia608: Add and use constant for max number of rows

Hard-coding it everywhere is unclear and prone to breakage.

* eia608: Initialize colors and fonts properly with a loop

memset is for single-byte types; an enum is defined to be the size of an
int, so using memset to fill an array of enum values is incorrect.

Fix it by using a simple loop to fill the elements, as there is no
memset-like function for arbitrary item lengths in C.

GCC warns:

src/lib_ccx/ccx_decoders_608.c: In function ‘clear_eia608_cc_buffer’:
src/lib_ccx/ccx_decoders_608.c:111:3: warning: ‘memset’ used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
  111 |   memset(data->colors[i], context->settings->default_color, CCX_DECODER_608_SCREEN_WIDTH + 1);
      |   ^~~~~~
src/lib_ccx/ccx_decoders_608.c:112:3: warning: ‘memset’ used with length equal to number of elements without multiplication by element size [-Wmemset-elt-size]
  112 |   memset(data->fonts[i], FONT_REGULAR, CCX_DECODER_608_SCREEN_WIDTH + 1);
      |   ^~~~~~
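A minimal sketch of the loop-based fix described above. The enum and function names are hypothetical stand-ins, not the identifiers from `ccx_decoders_608.c`. The point is that `memset` writes a byte value into each byte, so with a typical 4-byte enum a call such as `memset(row, 1, count)` would touch only the first `count` bytes and leave each affected element as `0x01010101` rather than `1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the decoder's color enum. */
enum color_sketch
{
    COL_WHITE_SKETCH = 0,
    COL_GREEN_SKETCH = 1,
    COL_BLUE_SKETCH = 2
};

/* Assign every element individually, which is correct regardless of
 * the size the compiler picks for the enum. */
void fill_colors(enum color_sketch *row, size_t count, enum color_sketch value)
{
    assert(row != NULL);
    for (size_t i = 0; i < count; i++)
        row[i] = value;
}
```

The old code only appeared to work when the fill value was 0, because an all-zero byte pattern is also the enum value 0.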
2020-01-20 19:06:06 -08:00
Nils ANDRÉ-CHANG
2281051d3d Remove warning when calling paraof_ocrtext (#1199) 2020-01-19 16:51:41 -08:00
Willem
fc21280857 Merge pull request #1201 from NilsIrl/ocr_hardsubx_cmake_actions
[IMPROVEMENT] Test with OCR and HARDSUBX
2020-01-19 21:36:06 +01:00
Nils André-Chang
746806dcef Cmake with OCR and Hardsubx in different job 2020-01-19 20:30:04 +00:00
Nils André-Chang
812734fd2a Add dependencies 2020-01-19 20:00:03 +00:00
Nils ANDRÉ-CHANG
66d59e498b Make -ocrlang work (#1200) 2020-01-19 11:44:16 -08:00
Nils André-Chang
5599ce9eaf Test with OCR and HARDSUBX 2020-01-19 19:16:15 +00:00
Willem
2e2075ca52 Add GitHub Action for Windows
Adds a GitHub Action that will build CCExtractor for Windows with msbuild. It will build in Release mode and Debug mode, without OCR or other features enabled.
2020-01-19 15:34:25 +01:00
Sam Poder
c69d2db52b [FEATURE] Simple MacOS GUI (#1138)
* Create info.md

* Add files via upload

* Update

* Rename info.md to README.md

* Delete InstallCCExtractor.zip

* Add files via upload

* fix bugs

* Update InstallCCExtractorMacGUI.zip

* Create placeholder.md

* Add Source Files

* Create HowToGenerateApp

* Rename HowToGenerateApp to HowToGenerateApp.md

* Done Alert
2020-01-18 17:34:19 -08:00
Willem
54ecce8b86 Merge pull request #1193 from NilsIrl/cmake_github_action
[IMPROVEMENT] Add Cmake job to github action
2020-01-18 21:45:30 +01:00
Nils André-Chang
82b60988bb Parallelize 2020-01-18 20:20:40 +00:00
Nils André-Chang
ab1af7c678 Add Cmake job to github action 2020-01-18 20:16:15 +00:00
Nils ANDRÉ-CHANG
84ba7c5238 Fix segfault (#1192) 2020-01-18 12:15:40 -08:00
Willem
676be1f193 Add GitHub Action for Linux
Adds a GitHub Action that will build CCExtractor for Linux (Ubuntu in this case) using the shell script and the autoconf option.
2020-01-18 20:05:42 +01:00
Nils ANDRÉ-CHANG
e8cb55e739 [FIX] Fix free segfault (#1190)
* Fix free segfault

I restricted the scope and used free because the features of freep
aren't needed here.

Restricting the scope makes it clear when freeing the variable should be
done.

* Mention that freeing should be done
2020-01-18 09:29:58 -08:00
Nils ANDRÉ-CHANG
30613b224a Fix memory leak (#1187)
Addresses https://github.com/CCExtractor/ccextractor/pull/402#discussion_r368041348
2020-01-18 08:53:43 -08:00
Nils ANDRÉ-CHANG
19241744d7 [FEATURE] SCC and CCD encoder (#1154)
* Fix indentation, use switch instead of if

* Remove confusing comment

Enums are abstractions and should be used as such. They shouldn't be
used like integers.

* Return a const char* instead of char * allocated on heap

* Test return value inline

* Add SCC output

* Add CCD format

* Add channel header to CCD

* Return const pointer

* Revert formatting change

* Colour -> Color

* Fix formatting

* Move comment to relevant place

* Improve readability

* Fix formatting

* Fix erroneous comment

* Use different parity function not requiring GNU extension

* Use enum instead of int

* Fix bug

* Implement channel functionality

* Fix CI errors

* Fix CI build

* Add options to help menu

* Mention change in changelog

* Add file to build systems

* Remove unneeded link against zlib

* Remove the use of <stdbool.h> and use const char

* Rewrite SCC formatter

* Use fdprintf
2020-01-18 08:52:03 -08:00
Willem
27288ccf89 Merge pull request #1189 from NilsIrl/warning_filter_word
[IMPROVEMENT] Fix implicit declaration of function 'add_word'

Closes #1188
2020-01-18 17:31:17 +01:00
Nils André-Chang
34282c17b8 Fix implicit declaration of function 'add_word'
Fix #1188
2020-01-18 16:16:34 +00:00
Nils ANDRÉ-CHANG
227f149670 [FIX] Allow -dvblang that doesn't follow ISO 639-2 (#1183)
* Allow `-dvblang` that doesn't follow ISO 639-2

Fix #1161

* Allows 'und' to be specified to `-dvblang`
2020-01-16 12:03:13 -08:00
Nils ANDRÉ-CHANG
27477e9f7c [IMPROVEMENT] Remove warnings (#1186)
* [Warning] Make subtitle modification work on unsigned char *

* Remove LOG_DEBUG no side effect warning
2020-01-16 08:25:25 -08:00
Jacob Shin
b3018e083e [FIX] Add FT_Done_Face to destroy face objects after they're used (#1184)
* Add FT_Done_Face to destroy face objects after they're used

* Update CHANGES.TXT
2020-01-14 17:11:18 -08:00
Nils ANDRÉ-CHANG
96de55429d Remove freep warnings (#1182) 2020-01-14 11:22:31 -08:00
Nils ANDRÉ-CHANG
863eacc440 Revert "Remove freep warning (#1180)" (#1181)
This reverts commit 78249045f8.
2020-01-13 14:12:39 -08:00
Nils ANDRÉ-CHANG
78249045f8 Remove freep warning (#1180) 2020-01-13 12:16:42 -08:00
Nils ANDRÉ-CHANG
dad108b7e1 Fix wrong format string (#1177) 2020-01-13 07:54:15 -08:00
Dhrumil Patel
79f18b996b [FIX] Added the option to disable timestamps for WebVTT (#1176)
* Added the option to disable timestamps for WebVTT

* Mentioned in changelog

* Added the option to params.c

* Encoder checks its context now

* Encoder checks its context
2020-01-12 18:06:26 -08:00
Nils ANDRÉ-CHANG
987c5cd301 Remove useless nulling of pointer (#1171) 2020-01-09 17:36:10 -08:00
Nils ANDRÉ-CHANG
34d0df1d96 [Fix] Make -delay work for all output formats (#1167)
* Fix indentation

* Calculate subs_delay in encode_sub rather than in the individual encoders

Fix #1103

* Use precalculated times when sub->type == CC_TEXT

* Use calculate delay in encode_sub when sub->type == CC_608
2020-01-09 17:35:19 -08:00
Willem
1db731a7a8 Update CHANGES.TXT 2020-01-05 18:44:05 +01:00
Willem
af67596e66 Merge pull request #1139 from NilsIrl/filter_bad_words
Adds a built-in method to filter bad words to the program.
2020-01-05 18:41:37 +01:00
Jacob Shin
86f98ddf5f Used the INET_ADDRSTRLEN constant for network functions (#1172) 2020-01-04 07:34:10 +01:00
eshandhawan51
bba6c4fcfd [FIX] Solved issue #1131 (#1169)
* Removed invalid free condition for multiple files

* Apply suggestions from code review

statement to free pointer

Co-Authored-By: Nils ANDRÉ-CHANG <nils@nilsand.re>

Co-authored-by: Nils ANDRÉ-CHANG <nils@nilsand.re>
2020-01-02 17:56:02 +01:00
Nils André-Chang
af64fa8a3d Remove multi word profanity 2020-01-01 21:44:02 +00:00
Nils André-Chang
e1d3060232 Fix crash 2020-01-01 17:15:53 +00:00
Willem
3a1815163f Merge pull request #1164 from NilsIrl/patch-1
[IMPROVEMENT] Mention -DWITH_OCR in compilation instruction
2019-12-31 05:26:08 +01:00
Willem
0954b47a24 Merge pull request #1165 from jshin313/xp
[FIX] Change inet_ntop to inet_ntoa for Windows XP compatibility
2019-12-30 19:32:58 +01:00
Jacob Shin
594a83cc4e Update CHANGES.TXT 2019-12-30 11:59:58 -05:00
Jacob Shin
ecec3ea22b Change inet_ntop to inet_ntoa for Windows XP compatibility 2019-12-30 11:55:30 -05:00
Nils ANDRÉ-CHANG
f9cfc7219d Mention -DWITH_OCR 2019-12-30 14:12:15 +00:00
Jacob Shin
c854d25963 [FIX] Get rid of a few compilation warnings (#1160)
* Added underline support

* Added changes to CHANGES.TXT

* Delete CHANGES.TXT~

* Delete .CHANGES.TXT.un~

* Update CHANGES.TXT

* Changed strncpy to memcpy when the size of the data being transferred is known

* Add declaration of struct image_copy before function

* Used strdup for duplicating strings

* Added error checking for strdup
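The last three items can be illustrated together. This is a sketch, not the code from the pull request: the struct and function names are hypothetical. It uses `strdup` with a NULL check for a string of unknown length, and `memcpy` for a field whose size is known at compile time.

```c
#define _POSIX_C_SOURCE 200809L /* for strdup() on strict C compilers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record with one heap string and one fixed-size field. */
struct label_sketch
{
    char *name;  /* owned copy, released by label_free() */
    char tag[4]; /* exactly 4 bytes, not necessarily a C string */
};

/* Returns 0 on success, -1 if the string could not be duplicated. */
int label_init(struct label_sketch *label, const char *name, const char tag[4])
{
    assert(label != NULL && name != NULL && tag != NULL);

    label->name = strdup(name);
    if (label->name == NULL)
        return -1; /* allocation failed; report instead of crashing later */

    /* The size is known, so memcpy states the intent directly and does
     * not depend on where a terminator happens to be. */
    memcpy(label->tag, tag, sizeof(label->tag));
    return 0;
}

void label_free(struct label_sketch *label)
{
    assert(label != NULL);
    free(label->name);
    label->name = NULL;
}
```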
2019-12-29 22:26:30 +01:00
Nils André-Chang
4fe32b1482 Fix syntax error because of forgotten brace 2019-12-28 23:34:55 +00:00
Nils ANDRÉ-CHANG
5fcb31d279 Rename spell_correct to capitalization_list 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
b2d3a2fefc Fix error where wrong return value is checked 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
70ac7f9a40 Sort both capitalization and profanity lists 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
f739d54cbc Remove checking if function is called twice 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
fc78fc3192 Rename fix_subtitles to correct_spelling_and_censor_words_608 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
b0e5eb03e1 Feedback 2019-12-28 23:24:04 +00:00
Nils ANDRÉ-CHANG
84cff4d6d8 Fix subtitles for more encoders 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
e5575a0f50 Remove useless wrappers 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
f4961a0bd8 Remove lower_spell list as it's useless 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
e3e810f34e Fix bug with asterisk 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
57eb1795aa Make a fix_subtitles function 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
37e4d4163f Fix '\0' in output file 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
59a8c7a049 Censor word when in dictionary 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
8ef89f6bf1 Fix double free error 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
2739602575 Add missing continue 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
a7d2264cc1 Use correct function 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
7d8499a7fb Rename profanity_file to filter_profanity_file. Dump params 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
99a12b8737 Add --kf option and parse files 2019-12-28 23:21:13 +00:00
Nils ANDRÉ-CHANG
5b29db341f Remove space before ';' 2019-12-28 22:56:40 +00:00
MarcusGaiusPompey
777ce98aa5 Initialize fatal_ftn before first use (#1155) 2019-12-27 17:47:59 +01:00
Nils ANDRÉ-CHANG
fe9c94d50c Make hardsubx_classifier non executable as it's a C file (#1158) 2019-12-27 09:37:38 +01:00
Jacob Shin
6d074928b6 [FIX] Added underline support for -out=spupng with EIA608/teletext (#1157)
* Added underline support

* Added changes to CHANGES.TXT

* Delete CHANGES.TXT~

* Delete .CHANGES.TXT.un~

* Update CHANGES.TXT
2019-12-26 22:47:52 +01:00
Jacob Shin
1e32bee8e5 [FIX] Added support for font colors and italics (#1132)
* Added support for <i> and <b> tags

* Deleted code supporting bold

* Added -italics flag to specify italics font

* Added function for initializing freetype font face objects

* Added support for color
2019-12-22 19:36:50 -08:00
Nils ANDRÉ-CHANG
6281e128aa Use shebang line that can work on different distributions (#1156) 2019-12-22 13:43:57 -08:00
Fonseverin
c1c0627dab [IMPROVEMENT] Add fatals to params without args (#1152)
* Update cnf file. Correct and comment constants.

* Add URLs to standards.

* Add fatals.

* Add brackets to if-else.

* Update CHANGES.
2019-12-21 19:58:05 -08:00
MarcusGaiusPompey
9cfc345041 [IMPROVEMENT] Removed redundant check_configuration_file function (#1153)
* Removed redundant function

* Updated changelog
2019-12-21 19:51:56 -08:00
Jacob Shin
f3a72bff3d Added back define to make building on Windows work again (#1151) 2019-12-19 11:13:56 -08:00
Fonseverin
e906585287 [IMPROVEMENT] Minor styling improvement. (#1149)
* Add, remove spaces. Optimise if-clause.

* Update cnf file. Correct and comment constants.

* Unite style. Add/Remove spaces. Correct errors.

* Add URLs to standards.

* Correct order in enum.
2019-12-15 09:39:43 -08:00
Carlos Fernandez Sanz
b27c6fe415 Removed duplicated MIN / MAX #define's 2019-12-15 09:37:17 -08:00
Sudoxo
5e888ee895 [FIX] Hang while processing video #1121 (#1146) 2019-12-10 12:53:21 -08:00
Sam Poder
c9f55f5a39 [FIX]Update utf8proc (#1145)
* Create info.md

* Add files via upload

* Update

* Rename info.md to README.md

* Delete InstallCCExtractor.zip

* Add files via upload

* fix bugs

* Update InstallCCExtractorMacGUI.zip

* Create placeholder.md

* Add Source Files

* Create HowToGenerateApp

* Rename HowToGenerateApp to HowToGenerateApp.md

* To Remove Previous Commits to Fork

* UpdateFreeType

* Revert "UpdateFreeType"

This reverts commit fee2da1615.

* hi

* Revert "hi"

This reverts commit dfcd3aec13.

* UpdateFreeType

* fixmistake

* reboot

* reboot
2019-12-10 08:21:19 -08:00
Sudoxo
1e9939bc8a [FIX] Segmentation fault on VOB #1128 (#1142) 2019-12-09 21:18:09 -08:00
Nils ANDRÉ-CHANG
df66746e89 [FIX] Make header respect -lf for the webvtt encoder (#1134)
* Make header respect `-lf`

* [ccx_encoders_webvtt.c] Use the ternary operator to select line endings

* Use sprintf for choosing line ending and use ternary operator

* Revert
2019-12-08 16:46:01 -08:00
Fonseverin
5dac23f156 [FIX] Compilation warnings (#1133)
* Add comments clarifying ccextractor.cnf and locale

* Comments on unobvious ctx entries

* no_rollup explanation in ccx_s_options

* Unified mprint format. Removed obvious comment.

* Commented out unused lines and corrected if-clause

* Changed unsigned char * to char *

* Returned to unsigned buffers

* Unsigned buf converted to signed

* Correct some lines causing warnings

* Added cases TODO. Some minor corrections.

* Better fixes for some warnings

* Convert explicit convert unsigned to signed

* Update CHANGES.TXT

* Update CHANGES.TXT

* Fix typos. Initialization for variables.

* Change comment on no_rollup. No more magic sizeof

* Fix typos. Delete question-comments.

* Change comments.

* Fix vital bug with wrong memset.

* No ugly defines.

* Stash change on extern lib. Correct internal files
2019-12-08 16:44:34 -08:00
Willem
a3148f07ac Merge pull request #1136 from NilsIrl/patch-1
[IMPROVEMENT] Fix typo
2019-12-06 12:06:18 +01:00
Nils ANDRÉ-CHANG
75e21feee3 Fix typo 2019-12-06 09:48:34 +00:00
grave-panda
334a87aed1 [IMPROVEMENT] Update FFMpeg guide to use markdown. (#1130)
* Rename FFMPEG.TXT to FFMPEG.md

* Update FFMPEG.md

update file to use markdown.
2019-12-02 18:37:59 -08:00
Willem
ee3418cd60 Merge pull request #1129 from sampoder/add-tv-samples
[IMPROVEMENT] Add TV Samples to README
2019-12-02 11:29:39 +01:00
Sam Poder
b9ca8a1291 Add TV Samples to README
For people new to the software it can be a challenge to use it for the first time. By adding this to the README they can see the file formats supported and how the software works without having to search for their own file. This will be especially helpful to the many new GCI students who likely don't have much experience in the TV industry but want to learn how the software works.
2019-12-02 16:56:33 +08:00
Prabodh Ranjan Swain
280b4308f7 [FIX] Fixed X-TIMESTAMP-MAP formatting error (#1126)
* Fixed X-TIMESTAMP-MAP formatting error

* Removed reformatting of whole file

* Removed reformatting of whole file
2019-11-25 21:30:16 -08:00
rboy1
45eec1c919 Fix for #1115 (#1123)
Sentence case crash (-sc)
2019-11-11 18:01:19 -08:00
rboy1
7ad5859629 Fix for crash while fixing sentence case (#1122)
Check for null pointer before extracting data
2019-11-11 17:59:56 -08:00
Willem
bdfe4ca25b Merge pull request #1110 from thealphadollar/improve_contributionmd
[IMPROVEMENT] Make COMPILATION.md Easier To Use
2019-10-19 13:37:10 +02:00
thealphadollar
3020fd24e7 Improve COMPILATION.md
- Improve the structure of package installation command to make it easy to copy and paste
- Improve the formatting of code blocks by mentioning language as specified by MD
2019-09-28 07:49:54 +05:30
Carlos Fernandez Sanz
0f2a5b3b96 Make CCExtractor great again (as in at least compile on Windows) 2019-09-22 15:03:05 -07:00
Rob
8fec59e753 [FEATURE] Added support for encoding into an MCC File. (CCExtractor#733) (#1097)
* [FEATURE] Added support for encoding into an MCC File. (CCExtractor#733)

* Missed deleting an unused variable declaration as part of a refactor.
2019-09-20 19:58:56 -07:00
Daniel Barea
7598225ee1 [FIX] Fix several memory leaks using Leptonica API for hardcoded subtitle extraction (#1105)
* Rewritten Tesseract and Leptonica imports

* Fixed memory leak extracting hardcoded subtitles

* Minor code enhancements and cleanups

* Fixed memory leak using function pixSauvolaBinarize

* Updated changelog
2019-09-12 08:24:42 -07:00
Eric Mesa
8a9d924fc1 **[FIX]** Enable RPM creation to work correctly (#1106)
* edited Makefile so that RPMs can actually be created

* added what I intend for the pull request to changes.txt
2019-09-11 21:44:11 -07:00
Justin Greer
2bcd993c0f [IMPROVEMENT] MXF caption frame rates (#1101)
* Decode cdp frame rates in mxf files for accurate caption timings.

* Update changelog re: MXF frame rate parsing.
2019-08-15 20:54:05 -07:00
djaydev
e461c14b48 Update OCR.md (#1100)
I had to add "r" or I would get "configure: WARNING: unrecognized options: "--enable-oc""
2019-08-15 08:42:12 -07:00
Richard
c9a6707fdc avfiltergraph.h merged with avfilter.h. (#1098) 2019-08-05 07:55:56 -07:00
Ray Foss
6cb70be4a4 Add RHEL based distros instructions. (#1094)
These are CentOS 7 based, but should work across the board, specifically including 8. I've tested in CENTOS 7 and Fedora 30
2019-06-11 15:29:46 -07:00
Ray Foss
403581462e [FIX] Remove webvtt styling when not using webvtt-full (#1092)
* no styling unless in full mode

* part 1 of moving style to here

* no style header unless requested with webvtt-full

* only one new line to support x-timestamp-map

* move x-timestamp-map up to abide by specifications

and support ffmpeg and brightcove

* remove stray new line, crlfs are added upstream

297 seems to contain a null bug

* don't write null characters to sub file

* needed space after -full mode style

* typo
2019-06-05 16:55:31 -07:00
Willem
9e212fa104 Merge pull request #1089 from MatejMecka/patch-1
Fix Video links for not converting to Markdown
2019-05-23 14:10:35 +02:00
Matej Plavevski
b6978f2fd8 Fix Video links for not converting to Markdown 2019-05-23 13:55:30 +02:00
Willem
513372978c Merge pull request #1088 from aadibajpai/patch-3
[IMPROVEMENT] Update badge with total download count
2019-05-22 09:44:01 +02:00
Aadi Bajpai
de9b198496 Update badge with total download count
And link to latest release
2019-05-22 12:29:24 +05:30
Willem
dac9de4d67 Update Makefile
Bump version
2019-05-21 20:41:08 +02:00
Willem
d56728bd7f Update configure.ac
Bump version
2019-05-21 20:39:54 +02:00
Willem
7fe8ab767c Update configure.ac
Bump version
2019-05-21 20:36:05 +02:00
Carlos Fernandez
403d2fd8a4 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2019-05-21 11:35:19 -07:00
Willem
a9c3207773 Update PKGBUILD
Bump version
2019-05-21 20:34:41 +02:00
Carlos Fernandez
bf478c0ee2 build and builddebug: Updated reference to package in Ubuntu (tesseract has been renamed) 2019-05-21 11:34:18 -07:00
Willem
324667b3e6 Update ccextractor.spec
Bump version
2019-05-21 20:33:52 +02:00
Willem
3ccb250d18 Update debian.sh
Bump version to 0.88
2019-05-21 20:32:44 +02:00
Willem
15b42e2d0c Update README.md
Remove the Google Code In from the readme
2019-05-21 20:28:27 +02:00
Carlos Fernandez Sanz
181b8650ab Added release date for 0.88 in CHANGES.TXT 2019-05-21 11:21:35 -07:00
Carlos Fernandez
724d756aa6 Version bump (0.87 -> 0.88) 2019-05-21 10:59:20 -07:00
Artem Fedoskin
2f096879d5 Fixes #1086 by adding -latrusmap option that maps Latin symbols to Cyrillic ones in some of the Russian Teletext files (#1087) 2019-05-18 10:41:34 -07:00
Artem Fedoskin
d3543ff1a2 Fixes #1084 by adding check for NULL string in ocr.c (#1085) 2019-04-30 17:45:31 -07:00
Carlos Fernandez Sanz
17a6779146 Moved tesseract .h files (which we use for windows builds) into their own directory so the reference is consistent when building on linux from a tesseract in a standard location. 2019-04-04 09:54:39 -07:00
Artem Fedoskin
116f308a0b Improve the way Tesseract is initialized in hardsubx. Fix segfault during the cleaning the frame data in hardsubx. (#1083) 2019-03-25 16:20:45 -07:00
Saurabh Shrivastava
414a57d97e [FIX] Travis build for macOS. Also add cmake build test. (#1063)
* Fix macOS travis build and remove linux builds.

* Add Apple logo for macOS build badge.

* Link the apple logo to travis build.

* Correct redundant compiler type.
2019-03-23 15:24:42 -07:00
Matej Plavevski
6d7c60fe14 [IMPROVEMENT]Update Protobuf-c (#1022)
* Update Protobuf-c

* Try changing place?

* Remove Spaces
2019-03-23 15:21:02 -07:00
Artem Fedoskin
718cf55131 [FEATURE] Added support for DVB inside MKV (#1082)
* [FIX] Fix incorrect comparison of strings for AVC codec id in .mkv

* Initial work on adding DVB support to .mkv

* [REQUEST] Finished adding support for DVB inside MKV (#1000)

* Update CHANGES.TXT
2019-03-23 08:27:34 -07:00
Artyom Fedoskin
4d24568a0b [FEATURE] Added support for EIA-608 inside MKV (#1080)
* Initial work on adding EIA-608 support to Matroska

* [REQUEST] Finished adding support for EIA-608 inside MKV (#1068)
2019-03-15 17:30:48 -07:00
Artyom Fedoskin
ab4f3d0d26 Fixes #1077 by adding check for empty streams (#1079) 2019-03-09 11:04:39 -08:00
cweickhmann
9f308271b9 Update COMPILATION.MD (#1073)
At least in Ubuntu 18.04 (possibly the related Debian version and newer Ubuntus) the package `tesseract-ocr-dev` does not exist anymore. It was replaced by `libtesseract-dev`.
2019-02-20 10:52:23 -08:00
Matej Plavevski
b2d97eb627 Add Google Code-in Winners / finalists (#1066) 2019-01-20 22:03:52 +01:00
ShouryaAggarwal
6209c63ccf [FIX] for issue #668 (Windows and Multicast) (#1059)
* Fixed udp multicast stream issue on windows

* Optimized OS detection for source multicast.

* Moved the udp read code to networking.c file
2018-12-26 18:09:41 +01:00
ShouryaAggarwal
fbf99e8a5e [FIX] #995 for All platforms and removed the icons folder in source package (#1057)
* Fixed the icon file not found error for windows and linux.

* Optimized distribution of icons and removed CSW dependency while running GUI

* Font and icons are now loaded directly from memory

* Added source icons and icons to C array convertor
2018-12-25 23:08:06 +01:00
Anshul Maheshwari
ebcd2bc9ca Merge pull request #1060 from anshul1912/master
Add more tapping points while debugging tesseract
2018-12-22 18:00:07 +05:30
Anshul Maheshwari
0b76cc1991 Add more tapping points while debugging tesseract
When OCR_DEBUG is defined, code will dump more images
to find root cause of failed OCR
2018-12-22 15:12:46 +05:30
Saurabh Shrivastava
9c20e0afb1 Don't add files with empty filename. Also better message for multiple input files. (#994) 2018-12-18 12:16:04 -08:00
MakarovGCI2018
74eefaeea7 [FEATURE] Added support for non-Latin characters in stdout (#1056)
* Update ccextractor.c

* Update CHANGES.TXT
2018-12-14 12:36:35 -08:00
MakarovGCI2018
5a8758fdd2 [FIX] Fixed a minor reporting stats bug (#1054)
* Fixed a minor bug

* Update CHANGES.TXT

* Update CHANGES.TXT
2018-12-12 10:35:56 -08:00
Samuel Deng
7b4bf0b15a **[FIX]** Fix typos; **[IMPROVEMENT]** Update .gitignore for Visual Studio databases (#1052)
* Fix many typos

* Ignore Visual Studio temporary project files

* Log previous 2 commits in CHANGES.TXT
2018-12-11 12:29:21 -08:00
Carlos Fernandez Sanz
be34781a64 Clarify bugfix on encoder being NULL (CHANGES.TXT) 2018-12-02 12:50:25 -08:00
MakarovGCI2018
e3c14991b3 [FIX] Fixed /dev/null bug (#1048)
* Update general_loop.c

* Update general_loop.c

* Update CHANGES.TXT

* Update general_loop.c
2018-12-02 12:48:18 -08:00
Anshul Maheshwari
38fc6e5623 Merge pull request #1038 from anshul1912/master
Add support for 4.0 tesseract
2018-11-07 18:53:45 +05:30
Anshul Maheshwari
5dbbe654f0 Add support for 4.0 tesseract 2018-11-07 14:13:36 +05:30
Anshul Maheshwari
5df1dbb922 Merge pull request #1037 from anshul1912/master
[IMPROVEMENT]Remove multiple RGB to grey conversion while OCR
2018-11-05 16:31:02 +05:30
Anshul Maheshwari
ef3d25c25b Indentocr.
Some Space and Indentation
2018-11-05 15:09:08 +05:30
Anshul Maheshwari
d22ab6f9a1 remove multiple RGB to grey conversion while OCR 2018-11-05 15:03:10 +05:30
Anshul Maheshwari
b8a15f6f9d Merge pull request #1036 from AZtheAsian/master
[IMPROVEMENT] Added missing options to help text
2018-11-05 07:46:22 +05:30
Alan Zhu
ebf06a9c2b Added missing options to help text 2018-11-04 10:46:07 -07:00
Anshul Maheshwari
04abf755c2 Merge pull request #1035 from T1duS/cmake_warnings
[FIX] Removed some CMake Warnings
2018-11-02 19:52:26 +05:30
T1duS
a99bc37d88 Removed some CMake Warnings 2018-11-02 14:48:49 +05:30
Matej Plavevski
1807ea9098 [IMPROVEMENT] Warn instead of Crash for Missing Final 0xFF Marker! (#1032)
* Be less harsh when the marker is missing

* Update changelog

* Skip Block by breaking

* Reference GitHub issue

* Forgot {}

* Update docs/CHANGES.TXT

Co-Authored-By: MatejMecka <matej.plavevski+github@gmail.com>
2018-11-01 17:07:53 -07:00
Anshul Maheshwari
0b29fc2329 Merge pull request #1031 from T1duS/autotool_warning_removal
[FIX] Autotool warning removal
2018-11-01 15:58:47 +05:30
T1duS
ced636025e Merge branch 'autotool_warning_removal' of https://github.com/T1duS/ccextractor into autotool_warning_removal 2018-11-01 15:05:41 +05:30
Aadi Bajpai
3cfe406a79 Add downloads badge for v0.87 (#1033) 2018-10-31 14:50:51 -07:00
Udit Sanghi
23a745dcec Merge pull request #1 from CCExtractor/master
.
2018-10-31 22:34:34 +05:30
T1duS
03b1f5bfd2 Removed compile warnings caused by autotools 2018-10-31 22:23:23 +05:30
Udit Sanghi
b8c1499111 Fixed some warnings (#1030) 2018-10-29 17:06:06 -07:00
T1duS
3189fc915e Fixed some warnings 2018-10-29 16:14:16 +05:30
MakarovGCI2018
127756b838 Added Visual Studio compilation guide (#1029)
* Updated COMPILATION.MD

* Delete COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD

* Update COMPILATION.MD
2018-10-28 20:26:44 -07:00
Anshul Maheshwari
e0b909a67e Correct compilation guide for ocr 2018-10-28 13:27:45 +05:30
Willem
475865b3be Merge pull request #1009 from saurabhshri/gci
[IMPROVEMENT] Add GCI banner and change text.
2018-10-27 12:28:05 +02:00
Willem
3a6fd3450d Merge pull request #1012 from saurabhshri/changelog
Add a requirement in PR to update changelog.
2018-10-27 12:27:33 +02:00
Matej Plavevski
81a00ddf55 [IMPROVEMENT]Update LibPNG to 1.6.35 (#1017)
* Update LibPNG to 1.6.35

* Convert to Unix format
2018-10-26 14:17:05 -07:00
Carlos Fernandez
4ff40f1be8 Fix compilation of utf8proc on Windows 2018-10-26 14:07:59 -07:00
Matej Plavevski
8861f7b40a Take 2 on upgrading (#1019) 2018-10-26 13:57:39 -07:00
Saurabh Shrivastava
96edd9031e Add a requirement in PR to update changelog. 2018-10-24 04:42:39 +05:30
Carlos Fernandez Sanz
11f87f2b6d Added all relevant changes since 0.86 2018-10-23 14:36:20 -07:00
Anshul Maheshwari
86de4151d2 Release 0.87 2018-10-23 06:45:02 +05:30
Saurabh Shrivastava
edae5a3cea Add GCI banner and change text.
- Corrected and improved text
- Added link to GCI post on ccextractor website.
2018-10-22 23:48:51 +05:30
Anshul Maheshwari
d3cc65ce4e Numbered string for Visual Studio steps 2018-10-21 22:21:21 +05:30
Anshul Maheshwari
6593fc1d32 Highlight bash command as code blocks 2018-10-21 22:00:53 +05:30
Anshul Maheshwari
633a1e8bb1 Add a few spaces as md recommends
Add a few spaces so that a newline renders as a newline in md.
2018-10-21 21:52:58 +05:30
Neetika Rathi
6a058e69e7 Renamed txt file to md 2018-10-21 21:27:42 +05:30
Anshul Maheshwari
e3c5156de9 Fix Compilation using cmake on linux (#1004)
Linux Distribution Details:
Distributor ID: Ubuntu
Description:    Ubuntu 14.04.5 LTS
Release:        14.04
Codename:       trusty

Failing cmake command:
cmake -DWITH_FFMPEG=ON ../src/

Passing cmake command:
cmake ../src/
2018-09-22 10:38:52 -07:00
Aadi Bajpai
8d379f7b5c Add GCI 2018 info (remove GSoC 2018) (#1003) 2018-09-20 17:35:46 -07:00
Krushan Bauva
45ed8456ee [FIX] Fix some minor memory leaks (#989)
* Destroy pix after use and release memory

* Free the frame and any dynamically allocated objects in it

* Fix typographical error

* Free the packet that was allocated by av_read_frame

* Add missing declarations
2018-07-17 22:45:27 -07:00
Naveen Saini
ef63d61f3d Update patch for windows priority with functions (#990)
* update for windows priority

* Update ccx_decoders_708.h

* to solve timing issue bugs

one of many instances where data is received without any window defined in decoder.

* Update ccx_decoders_708.c
2018-07-17 15:21:57 -07:00
Naveen Saini
6b1ad9951f Fix caption loss due to CW command (#991)
* add code to copy window data before CW command

* Update ccx_decoders_708.c

* Update ccx_decoders_708.c
2018-07-17 15:20:43 -07:00
Willem
c2c692fe0a Merge pull request #993 from aadibajpai/patch-1
[IMPROVEMENT] Update Logo
2018-07-17 08:17:20 +02:00
Aadi Bajpai
662299b324 [IMPROVEMENT] Update Logo
Replaced old logo with the new one.
2018-07-17 02:13:28 +05:30
Krushan Bauva
25a8b53ff5 Correct typographical error (#984) 2018-05-31 09:46:00 -07:00
atrottmann
466b50bca6 added missing break statement (#987) 2018-05-31 09:45:23 -07:00
Willem
1fac910c3e Merge pull request #980 from thealphadollar/correct_help
[IMPROVEMENTS] Update CLI help and HARDSUBX Installation Instructions
2018-05-20 18:50:52 +02:00
thealphadollar
68e6f2616d Update instructions to install with HardSubx
Currently the instructions to install with hardsubx are vague and a new method was added in PR #966
This method makes installation with HARDSUBX easy and hence has been added to the documentation.
2018-05-15 12:39:23 +05:30
thealphadollar
d0d8529afa [FIX] Remove instance of o1 and o2 from help
There are no parameters named o1 or o2 in ccextractor, but they are mentioned in the help.

This commit removes such instances from file(s).

Change severity: trivial
2018-05-06 20:13:53 +05:30
Chris Lamb
b36429879d Make the build reproducible (#976)
Whilst working on the Reproducible Builds effort [0], we noticed
that ccextractor could not be built reproducibly.

This is due to it including the current date.

This was originally filed in Debian as #896867 [1] and uses the
SOURCE_DATE_EPOCH environment variable [2].

 [0] https://reproducible-builds.org/
 [1] https://bugs.debian.org/896867
 [2] https://reproducible-builds.org/specs/source-date-epoch/

Signed-off-by: Chris Lamb <lamby@debian.org>
2018-04-25 11:17:50 -07:00
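A minimal sketch of how a build could honor `SOURCE_DATE_EPOCH` instead of embedding the current date, as the commit above describes. The helper names `parse_source_date_epoch`, `build_timestamp` and `format_build_date` are hypothetical and are not taken from ccextractor's sources; the actual patch may well be structured differently.

```c
#include <errno.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: parse the value of SOURCE_DATE_EPOCH (seconds since
 * the Unix epoch). Returns `fallback` when the value is missing or invalid. */
time_t parse_source_date_epoch(const char *value, time_t fallback)
{
    if (value == NULL || *value == '\0')
        return fallback;

    char *end = NULL;
    errno = 0;
    long long seconds = strtoll(value, &end, 10);
    if (errno != 0 || *end != '\0' || seconds < 0)
        return fallback;

    return (time_t)seconds;
}

/* Timestamp to embed in the binary: the environment variable wins,
 * otherwise the current time is used (which is not reproducible). */
time_t build_timestamp(void)
{
    return parse_source_date_epoch(getenv("SOURCE_DATE_EPOCH"), time(NULL));
}

/* Format the timestamp in UTC so the result does not depend on the
 * builder's time zone. Returns the number of characters written. */
size_t format_build_date(time_t when, char *buf, size_t size)
{
    struct tm *utc = gmtime(&when);
    if (utc == NULL)
        return 0;
    return strftime(buf, size, "%Y-%m-%d", utc);
}
```

Two builds that export the same `SOURCE_DATE_EPOCH` would then embed the same date string, which is the property the Reproducible Builds effort asks for.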
Shivam Kumar Jha
c7bc2b78ac [IMPROVEMENT] Update libGPAC (#974)
* Add Updated GPAC

File changes have been directly inserted from libGPAC master into ccextractor's libGPAC.

This has resulted in the removal of multiple custom functions and minor changes. These will be rectified in the next step of the update.

Change severity: Very High

* Update libGPAC dependency

We use libGPAC for all our MP4 operations, and this commit updates it to the latest version.

All previous changes to the original library were restored after the straight file update, and bugs have been removed.

change severity: very high

* Add Guide To Updating Dependencies

A small textual guide on how to update dependencies easily and efficiently.
2018-04-23 11:24:03 -07:00
Carlos Fernandez
1afe08af08 Extra line 2018-04-23 11:22:32 -07:00
Carlos Fernandez
52707267fc linux/builddebug: Added non-local directories to the include search path so we don't require a locally compiled tesseract or leptonica 2018-04-02 17:24:47 -07:00
Shivam Kumar Jha
e507b2092b [FIX] Correct -HARDSUBX Bug In CMake (#971) 2018-04-01 13:58:43 -07:00
Saurabh Shah
d23cb8571d Fix possible segfaults in hardsubx_classifier.c due to strdup (#963)
strdup will give a segmentation fault if the argument passed to it is
NULL. TessResultIteratorGetUTF8Text returns a char* which can be NULL
and we should not call strdup directly over it. Once we check if the
value returned is not NULL, then we can call strdup.
2018-03-16 11:22:20 -07:00
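The guard described in the commit above can be sketched as a small wrapper. This is an illustration only: `dup_or_null` is a hypothetical name, and the copy is done with `malloc`/`memcpy` as a stand-in for `strdup` so that the sketch also builds in strict ISO C mode.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: duplicate `s`, or return NULL when `s` is NULL.
 * Passing NULL straight to strdup() is undefined behavior and typically
 * crashes, so the check has to happen before the copy. */
char *dup_or_null(const char *s)
{
    if (s == NULL)
        return NULL;

    size_t len = strlen(s) + 1; /* include the terminating NUL */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}
```

A caller would pass the possibly-NULL result of the OCR iterator through this wrapper and check the return value before using it; the caller owns the copy and must `free()` it.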
Shivam Kumar Jha
b2e83ea1a6 allow build with hardsubx using cmake (#966) 2018-03-13 12:40:09 -07:00
Shivam Kumar Jha
3d6a9f4d57 [IMPROVEMENT] Add LICENSE File (#959)
* [IMPROVEMENT] Add LICENSE File

We should be adding a LICENSE File to the root of the project. We do mention that we follow GPL v2 and hence can include its declaration file.

* Rename LICENSE to LICENSE.txt
2018-03-12 11:19:47 -07:00
Saurabh Shah
86356ba4d2 Improve the start and end timestamps of extracted burned in captions (#962)
The start and end timestamps of extracted burned in captions are flawed
and off by a large difference. Also, the start time of the first burned
in caption extracted is always zero, which is not always the case. And
the extracted captions always appear in continuous timestamps.

This commit improves the start and end timestamps of the extracted
burned in captions and reduces the error significantly, bringing the
timestamps fairly close to the actual timings as they appear in the
media file.
2018-03-12 11:19:24 -07:00
Shivam Kumar Jha
801f9e8dc8 [IMPROVEMENT] Update COMPILATION.md (#960)
Add instructions to make the installation systemwide (on Linux) which can allow CCExtractor to be used from anywhere with just the below command in terminal:
`ccextractor [videofile]`
2018-03-08 13:33:52 -08:00
Carlos Fernandez
cbcedaf2bd Fixed crash with "-out=report" and "-out=null" 2018-03-07 17:26:08 -08:00
Shivam Kumar Jha
5ada966010 [FIX]-nocf not working with OCR'ing (#958)
* Fix -nocf not working with OCR'ing

* remove dvbcolor and nodvbcolor parameter
2018-03-06 13:16:26 -08:00
Shivam Kumar Jha
587f0b8609 Display quantisation mode in info box (#954) 2018-03-06 11:35:09 -08:00
Saurabh Shah
f46e3dcfc2 Fix segfault in add_cc_sub_text and initialize to NULL in init_encoder (#950)
This commit adds some checks to avoid segmentation faults.

* In `add_cc_sub_text()`, strdup will cause a segfault if we duplicate an
  empty string.

* In `init_encoder()`, initialize pointer fields to NULL to avoid random
  addressing so we can avoid illegal memory accessing and segfaults in
  other places.
2018-03-05 13:55:01 -08:00
Carlos Fernandez
a0e7ddd632 ccx_decoders_common.c: Copy data type when creating a copy of the subtitle structure 2018-02-28 19:42:38 +00:00
Krushan Bauva
3267c68c3b [FIX] Implicit declaration of these functions throws warning during build (#948)
* Declare gf_lang_find function in isom_write.c

* Declare gf_lang_get_3cc function in isom_write.c
2018-02-28 10:51:16 -08:00
Carlos Fernandez
c829c94e54 ccx_decoders_common.c: Properly release allocated resources on free_subtitle() 2018-02-27 22:28:01 +00:00
Carlos Fernandez
39b96cc544 Added a datatype member to struct cc_subtitle - needed so we can properly free all memory when void *data points to a structure that has its own pointers. 2018-02-27 14:15:52 -08:00
Carlos Fernandez
5a79b71e70 dvb_subtitle_decoder.c: When combining image regions verify that the offset is never negative.
ts_functions.c: Added an #ifdef block to save TS packets in a temp file. Just for debug purposes.
2018-02-27 20:48:13 +00:00
Saurabh Shah
57b230e91d Add instruction required to build ccextractor with HARDSUBX support (#946)
To build ccextractor with hardsubx support on linux, we need to configure
ccextractor with the `-enable-hardsubx` switch along with the
`ENABLE_HARDSUBX` flag passed during compilation with make. This commit
adds the missing configure instruction.
2018-02-27 12:17:14 -08:00
achaudhary997
b7a2aca34e Updated .travis.yml to fix osx build (#947) 2018-02-27 11:34:26 -08:00
Manish Mahalwal
cbda2deda2 [FIX] Add utf8proc src file to cmake, updated header file (#944)
* Add utf8proc source file to cmakelists

* Update utf8proc header file

* change "utf8proc.h" to "utf8proc/utf8proc.h"
2018-02-27 10:59:20 -08:00
Carlos Fernandez
779e9c64c1 Added required pointers on freep() calls 2018-02-27 02:49:11 +00:00
Carlos Fernandez
26d488a979 - Removed dvb_debug_traces_to_stdout and used the usual dbg_print instead
- struct cc_bitmap removed data[2] and replaced with two separate variables, since they are unrelated
2018-02-26 18:29:43 -08:00
Carlos Fernandez
ef7d4a2b4b Additional debug traces for DVB
Fix minor memory leak
2018-02-27 02:07:17 +00:00
Carlos Fernandez
c6102d3b2a Fix issue with displaying utf8proc version. 2018-02-26 11:17:13 -08:00
Manish Mahalwal
4d7d4cc109 Fix failing cmake due to liblept/tesseract header files (#941) 2018-02-26 11:11:59 -08:00
Manish Mahalwal
f717624bfa [FEATURE] Added version no. of libraries to --version (#939)
* Added version no. of libraries to --version

* Fix link

* ifdef used for tesseract/leptonica

* fix spelling
2018-02-25 09:16:48 -08:00
Carlos Fernandez Sanz
2114a80dbb Added GSoC to README.md (removed GCI) 2018-02-23 14:56:54 -08:00
Carlos Fernandez
5df3500a9f Added missing \n in params.c 2018-02-23 19:49:07 +00:00
Sourav Sahoo
393fbd30b0 [IMPROVEMENT] Corrected the tags file format of ctags in .gitignore (#908)
* Added tags file and removed the previously wrongly written file

* Added .vscode to visual code section

* Added the .tags in .gitignore

* Changed *.tags to *.tags*
2018-02-23 11:36:59 -08:00
Shivam Kumar Jha
9dc1e0a9e2 Modify -quant 0 option (#932) 2018-02-23 11:34:43 -08:00
Carlos Fernandez
4875508f70 builddebug: Use -fsanitize=address -fno-omit-frame-pointer
write_dvb_sub(): Test for out of bounds and report details when this happens. Still doesn't fix the underlying issue but will help figure it out.
ocr.c: Solve malloc()/delete[] combinations that happened when operating on tesseract output. Now a malloc()'ed copy is immediately made, tesseract's results are unallocated using tesseract's delete function, and we continue using our own copy which is later free()'d.
2018-02-22 01:31:30 +00:00
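The ocr.c change above follows a general pattern: copy the library-owned buffer right away, hand the original back to the library's own release function, and keep working on the private copy. A minimal sketch under stated assumptions: `take_private_copy` is a hypothetical helper, and `demo_foreign_text` / `demo_foreign_delete` are stand-ins for the library's allocator and delete function (here they simply use `malloc`/`free`; in the real case they would be `new[]`/`delete[]` inside the library).

```c
#include <stdlib.h>
#include <string.h>

typedef void (*release_fn)(char *);

/* Counts how often the stand-in release function ran (for demonstration). */
int demo_release_calls = 0;

/* Stand-in for a library call that returns a buffer the library owns. */
char *demo_foreign_text(const char *s)
{
    size_t len = strlen(s) + 1;
    char *p = malloc(len);
    if (p != NULL)
        memcpy(p, s, len);
    return p;
}

/* Stand-in for the library's matching delete function. */
void demo_foreign_delete(char *text)
{
    demo_release_calls++;
    free(text);
}

/* Hypothetical helper: make a malloc()'ed copy of `foreign`, then release
 * `foreign` through the allocator that created it. The caller later frees
 * the returned copy with free(), so allocators are never mixed. */
char *take_private_copy(char *foreign, release_fn release)
{
    if (foreign == NULL)
        return NULL;

    size_t len = strlen(foreign) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, foreign, len);

    if (release != NULL)
        release(foreign);
    return copy;
}
```

The design choice is that every buffer is released by the same side that allocated it, which is what removes the `malloc()`/`delete[]` mismatch the commit mentions.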
Carlos Fernandez
b1c00233b3 ccx_decoders_common.c: Removed trivial memory leak.
ccx_encoders_srt.c: Made sure a pointer is non-NULL before dereferencing.
dvb_subtitle_decoder.c: Initialize pointer members to NULL when creating a structure.
lib_ccx.c: Initialize (memset 0) structure cc_subtitle after memory allocation.
README.TXT: Removed reference to sourceforge.
2018-02-21 20:31:09 +00:00
Carlos Fernandez
f85e65ba32 Updated mailing list (changed sourceforge for google groups) 2018-02-21 20:21:24 +00:00
Carlos Fernandez
150d2e7404 Add -quant (OCR quantization function) 2018-02-16 16:43:07 -08:00
Carlos Fernandez
2b997135e5 Added verboseness to error/warnings in dvb_subtitle_decoder.c 2018-02-16 14:38:54 -08:00
Carlos Fernandez
3e815ed590 Merge branch 'pr/n926_BPYap' (teletext fix)
Added verboseness to error/warnings on dvb_subtitle_decoder.c
2018-02-16 14:38:03 -08:00
Carlos Fernandez
550d3207ad Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2018-02-15 22:44:11 +00:00
Carlos Fernandez
f78a1fdf95 dvb_subtitle_decoder.c: Start work on passing invalid streams errors upstream (plus some warning messages) so we can eventually recover from this situation instead of crashing.
general_loop.c: Display warning on DVB parse error. We will still crash though.
2018-02-15 22:42:23 +00:00
Yap Boon Peng
b7545e0092 Fix code indentation 2018-02-14 12:39:43 +08:00
mkver
679a69f25c Update telxcc.c (#930)
Currently setting a colour doesn't necessarily add a space even though the specifications mandate it. This commit is designed to change that.
2018-02-13 17:04:25 -08:00
Carlos Fernandez
20e439f9d8 dvb_subtitle_decoder.c: Fix null pointer dereference when region==NULL in write_dvb_sub 2018-02-13 16:43:45 -08:00
BPYap
5d3e2cdbb9 Fixing Bug #922 2018-02-14 01:16:14 +08:00
BPYap
93859297c1 Fix Bug #922
Check for characters 0xA and 0xB and set them to 0x20 while maintaining color information
2018-02-13 20:40:24 +08:00
BPYap
2258ab23ef Fix Bug #922
Check for characters 0xA and 0xB and set them to 0x20
2018-02-13 20:31:14 +08:00
BPYap
9b0c12a1c2 Fix Bug #922 2018-02-13 18:20:23 +08:00
BPYap
8ff8443b5e Fixing Bug #922 2018-02-13 17:37:30 +08:00
BPYap
6295496d15 replace all 0xA characters within startbox with 0x20 2018-02-12 16:42:27 +08:00
Shivam Kumar Jha
6e2ce11b26 Upgrade code to be compatible with Python 3 (#925) 2018-02-09 12:57:09 -08:00
BPYap
bcffe2abb9 solving [BUG] DVB Teletext subtitle incomplete #922
attempt to solve issue #922 by replacing 0xB and 0xA in the middle of row with space character
2018-02-08 19:29:36 +08:00
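The idea behind the #922 commits above, replacing the teletext box control codes 0x0A (end box) and 0x0B (start box) inside a row with a space while leaving other codes such as colors alone, can be sketched as follows. `blank_box_codes` is a hypothetical name and the plain byte buffer is an assumption; the real fix in telxcc.c operates on its own row representation and may restrict which positions are touched.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: replace every 0x0A / 0x0B in row[0..len) with a
 * space (0x20). Other control codes, including color codes, are left
 * untouched. Returns the number of bytes replaced. */
size_t blank_box_codes(uint8_t *row, size_t len)
{
    size_t replaced = 0;
    for (size_t i = 0; i < len; i++) {
        if (row[i] == 0x0A || row[i] == 0x0B) {
            row[i] = 0x20;
            replaced++;
        }
    }
    return replaced;
}
```

A caller that only wants to treat codes in the middle of a row would pass the slice that starts after the opening start-box sequence rather than the whole row.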
Aadi Bajpai
da132b379a [FIX] Prevent GitHub from caching the README badge (#921)
* Tried auto-updating badge

* Value in seconds making it update in 30m
2018-02-05 08:59:27 -08:00
Null
116656e62e Fix a minor spelling error in mp4.c (#924) 2018-02-05 08:58:39 -08:00
Carlos Fernandez
7be11b4e08 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2018-01-31 13:04:49 -08:00
Carlos Fernandez
8521819a46 Add missing return value to one of the returns in process_tx3g(). 2018-01-31 13:04:27 -08:00
lennonwoo
9a6529b17f Tidy CMakeLists & vcxproj (#920) 2018-01-25 08:25:00 -08:00
Carlos Fernandez Sanz
26e96f362a Update ISSUE_TEMPLATE.md 2018-01-24 18:19:59 -08:00
Carlos Fernandez
267abc2050 Corrected an issue with the Windows builds after the previous merge 2018-01-24 18:12:26 -08:00
Carlos Fernandez
dcc9d0c4af Merge branch 'pr/n916_lennonwoo' 2018-01-24 18:11:38 -08:00
Carlos Fernandez
4b5c01e3e7 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2018-01-24 17:41:33 -08:00
Carlos Fernandez
1ef7add534 - Added m2ts and -mxf to help screen
- Added MKV to demuxer_print_cfg (this was a bug)
- Added MXF to demuxer_print_cfg
 BugFix: "Out of order packets" error had wrong print() parameters
2018-01-24 17:41:14 -08:00
Aadi Bajpai
93ca53d489 Added SourceForge Downloads Badge (#913)
Review whenever convenient :)
2018-01-24 12:30:22 -08:00
Willem
604dd4d648 Merge pull request #917 from TheClashster/patch-2
[IMPROVEMENT] Display Sample-Platform Build Badges on README
2018-01-24 18:46:23 +01:00
Aadi Bajpai
4d5d0c9063 Add Sample-Platform Badges to Readme 2018-01-24 22:58:24 +05:30
Willem
7f2b20dc98 Merge pull request #906 from thealphadollar/master
[IMPROVEMENT] Update .gitignore
2018-01-24 11:22:23 +01:00
lennonwoo
7ad5c226e6 update python extension doc 2018-01-24 16:00:50 +08:00
lennonwoo
a3bb05242f update api_testing 2018-01-24 15:47:26 +08:00
lennonwoo
180da3ed5a update build scripts 2018-01-24 15:46:22 +08:00
lennonwoo
ddc7c197c8 refactor pass_cc_buffer_to_python 2018-01-24 15:43:43 +08:00
lennonwoo
da72afeb7c move asprintf to utility 2018-01-24 15:42:52 +08:00
lennonwoo
913432232d clear up which function as api 2018-01-24 15:42:23 +08:00
lennonwoo
84ce45b8f0 be quiet when in PYTHON_API mode 2018-01-24 15:34:02 +08:00
lennonwoo
b003ed7394 remove trivial signal_python_api & global array func 2018-01-24 15:31:27 +08:00
lennonwoo
941077c11c remove swig Auto-generated files 2018-01-24 15:25:04 +08:00
thealphadollar
2116c4a964 Modify git-ignore 2018-01-22 21:28:22 +05:30
Null
0a5111d7eb Fix incorrect path in XML (#904) 2018-01-18 21:51:56 -08:00
Carlos Fernandez
c4962114b6 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2018-01-16 16:21:28 -08:00
Carlos Fernandez
8050c34174 Updated CHANGES.TXT with "- New: mp4 tx3g & multitrack subtitles" 2018-01-16 16:20:59 -08:00
Null
f172c50d2b [IMPROVEMENT] Minor changes - Add default fonts in the --help (#899)
* Minor changes - Add default fonts in the --help

* Revise
2018-01-16 16:19:36 -08:00
Carlos Fernandez
12b438d05a VS solution: box_dump.c was listed twice, removed one of them.
Added minor comment in networking.c
2018-01-16 10:51:41 -08:00
Null
26215c258b Fix compilation error in CMake 2018-01-16 16:03:08 +08:00
Null
71ac0ad43a [FEATURE] Support mp4 tx3g & multitrack subtitles (#898)
* Support mp4 tx3g & multitrack subtitles

* Fix indentation

* Minor changes

* Add a comment
2018-01-13 16:42:14 -08:00
Null
ca026ecbaa [IMPROVEMENT] Fix some warnings (#896 ) (#897)
* Fix some warnings

* Fix more warnings

* Fix more warnings
2018-01-11 23:19:40 -08:00
Carlos Fernandez
5b124c0ce2 linux build script (non-debug): Don't hide warnings from compiler.
linux build script (debug): Display which step of the build script we're in.
2018-01-12 00:05:23 +00:00
Carlos Fernandez
355b57b26f Version push to 0.87
Moved history and people to AUTHORS.TXT from README.TXT
2018-01-11 12:42:18 -08:00
Matej Plavevski
eeccc74128 Add Google Code In 2017 Participants (#875) 2018-01-11 11:05:49 -08:00
Carlos Fernandez
8751363df8 Merge branch 'pr/n894_harrynull' 2018-01-11 10:51:59 -08:00
Willem
f729181262 Merge pull request #895 from harrynull/ci
[FIX] Fix Travis CI not executing `linux/build`
2018-01-11 11:51:49 +01:00
Null
a6f2f33ccf Fix compilation in MacOS 2018-01-11 18:31:27 +08:00
Null
bb18bdb932 Merge branch 'libgpac' of https://github.com/harrynull/ccextractor into libgpac 2018-01-11 17:42:24 +08:00
Null
4a7946ab7d Fix more compilation error 2018-01-11 17:41:56 +08:00
Null
3203ac14d3 Change to before_install 2018-01-11 16:48:36 +08:00
Null
98edef2233 Add more cd 2018-01-11 16:47:22 +08:00
Harry Null
732c1a3926 Fix compilation in Linux 2018-01-11 16:44:14 +08:00
Null
9c1e7c5c98 Fix travis 2018-01-11 16:26:48 +08:00
Null
19352fdd03 debug travis 2018-01-11 16:07:54 +08:00
Null
8b159cc64d Add GSOC files back 2018-01-11 15:01:10 +08:00
Null
8d9e54130d More build files 2018-01-11 15:01:02 +08:00
Null
a473ef2e92 merge 2018-01-11 14:45:32 +08:00
Null
dfb26f49a2 Modification to libgpac 0.7.1 2018-01-11 14:36:54 +08:00
Null
86a39802d3 Upgrade gpac to 0.7.1 2018-01-11 12:56:49 +08:00
Null
a5f17c318d Merge branch 'master' into libgpac 2018-01-10 19:59:52 +08:00
Null
90733963e5 Minor fix 2018-01-10 18:50:23 +08:00
Carlos Fernandez
5dc06c341c Corrected date on CHANGES.TXT for 0.86 - it's 2018 already! 2018-01-09 15:13:35 -08:00
Carlos Fernandez
03646de1a5 Added release date (today) to CHANGES.TXT
Edited docs/README.TXT a bit
2018-01-09 14:51:22 -08:00
Carlos Fernandez
811a9a4992 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2018-01-09 14:49:03 -08:00
Carlos Fernandez
fa545d2806 Added release date for 0.86 (today), and modified README.TXT a bit. 2018-01-09 14:48:34 -08:00
Saurabh Shrivastava
5079d57766 Move show specific dictionaries to separate repo. (#891)
Show-specific dictionaries were moved to https://github.com/CCExtractor/show_specific_dictionaries .
2018-01-09 12:14:18 -08:00
Carlos Fernandez
2d7c1718b4 Added FT2_BUILD_LIBRARY to the preprocessor definitions in debug_full, release and release_full configurations in the VS solution files - required for building. 2018-01-09 12:04:44 -08:00
Carlos Fernandez
b947601083 Minor change in params.c - specify that the value for -font is a full path. 2018-01-09 11:04:54 -08:00
Carlos Fernandez
fe8ef083aa Updated CHANGES.TXT 2018-01-09 10:58:28 -08:00
Null
930ca716ca [FEATURE] FreeType-based text renderer (-out=spupng with teletext/EIA608) (#877)
* Implementation of text renderer

* Fix some characters being cut

* Fix encoding and other bugs

* Add black background & fix bugs

* Fix more bugs

* Change to relative path

* Add a font option & Default font for MacOS & Fix anti-aliasing

* Document -font & enlarge default canvas
2018-01-09 10:24:26 -08:00
Null
8f0294b763 Add FreeType (#876) 2018-01-09 10:24:06 -08:00
Hori75
cde884faae Merge branch 'master' of https://github.com/Hori75/ccextractor 2018-01-04 09:14:52 +07:00
Willem
377dc2a48d Merge pull request #873 from AbhijithGanesan/development
Change contributor guide link
2018-01-02 19:58:50 +01:00
Hori75
7a4f4a8f79 Remove thread.h (it isn't needed) 2018-01-02 18:38:23 +07:00
Abhijith Ganesan
97cc3ee2a7 Change contributor guide link
Original link no longer exists; replaced with User link
2018-01-01 21:05:50 +00:00
DHSYongJun
12e38343f8 New README for GCI (#872)
Documentation: Write a better README for our GitHub page
2018-01-01 10:50:16 +01:00
Aadi Bajpai
151d04a870 [IMPROVEMENT] Minor typographical fix (#868)
Looks better now
2018-01-01 10:46:44 +01:00
Hori75
a5317799e8 Configuring the vs project solution 2017-12-31 19:40:38 +07:00
Hori75
d96b8e0e83 Modify makefile.am for linux and mac 2017-12-31 19:39:39 +07:00
Hori75
bbe2f33399 The real changes (This gonna be messy) 2017-12-31 19:36:08 +07:00
William
c8008b441a [IMPROVEMENT] Whitespace changes (Part of libGPAC update) (#870)
* Whitespace changes

* Update tools.h
2017-12-31 13:05:44 +01:00
Null
d4f3c9c6a1 Fix dvb subtitle not extracted if there's no display segment (#866) 2017-12-31 09:41:51 +01:00
Null
f7d16d846c Fix a bug that sometimes caused -out=spupng to crash (#864) 2017-12-31 09:39:02 +01:00
Null
dd4032e515 Fix a heap corruption in add_ocrtext2str (#865) 2017-12-31 09:20:30 +01:00
Null
84a9ea5572 Fix OCR issue caused by separated dvb subtitle regions (#857) 2017-12-28 09:19:04 +01:00
Alex Huang
5e8a5590ce reworks scanning newlines to look for content in a line (#858) 2017-12-28 08:33:59 +01:00
Null
fb55d6d6d3 [Fix] Put OCR specific code inside ifdef (#855)
* Fix failing travis build.

Removed debug code. No idea why it causes the Travis build to fail

* Fix debug code
2017-12-27 07:33:16 +01:00
Null
b0afb983c9 [FIX] Fix a crash while processing DVB subtitles (#850)
* Add more debug info

* Fix crash in dvb process
2017-12-26 10:08:26 +01:00
Null
e56bab67b8 [Fix] Fix DVB bug (Multiple-line subtitle; Missing last line) (#844)
* multiple line & trying to fix the missing last line

* Fix format; move code into loop

* Revert some format changes
2017-12-26 06:14:11 +01:00
Carlos Fernandez Sanz
f3fd6762c3 Update CHANGES.TXT 2017-12-24 06:11:15 +01:00
Chuck Wilson
59b8f81283 [FEATURE] Support for Source-Specific Multicast (#802)
* Support for Source-Specific Multicast

* fixing whitespace issues

* updating changelog
2017-12-24 06:10:14 +01:00
Theodore Fabian Rudy
44a9e8b2af Added Supports and replace some things (#843)
* Added Supports and replace some things

* Update README.md
2017-12-24 05:01:31 +01:00
Null
31c39eea55 [FIX] Fix crash when image passed into OCR is empty (#841)
* Fix crash when image passed into OCR is empty

* Avoid OCR
2017-12-23 01:20:41 +01:00
Manish Mahalwal
f9a0874e58 Fixed -sentencecap for teletext samples (#842) 2017-12-23 01:20:16 +01:00
Anshul Maheshwari
1858425944 Merge pull request #822 from MatejMecka/master
[IMPROVEMENT] Upgrade UTF8proc
2017-12-16 15:43:59 +05:30
Anshul Maheshwari
b04228f0fd Merge pull request #833 from harrynull/cmake-windows
[FIX] Make CMakeFiles work with Windows Visual Studio
2017-12-16 15:35:37 +05:30
Null
4263a341e1 Turn on optimization 2017-12-16 10:41:40 +08:00
Willem
8b9f7a929b Merge pull request #837 from MatejMecka/master
[IMPROVEMENT]Improve README a little
2017-12-14 21:10:13 +01:00
MatejMecka
4d7edfd687 Remove a part from README 2017-12-14 21:05:02 +01:00
MatejMecka
289e9ca02a Update something with the header 2017-12-14 20:17:34 +01:00
MatejMecka
d7ce96a5d0 Revert badge back 2017-12-14 19:59:55 +01:00
MatejMecka
bf13b5c1e0 Update README part 3 2017-12-14 17:11:41 +01:00
Saurabh Shrivastava
0b31e5d7a1 Add compilation section to readme and improve instructions. (#836)
Also moved remaining .md files in docs.
2017-12-13 13:01:10 -08:00
Willem
f78303abef Update README.md
Update links to relative files
2017-12-13 20:48:47 +01:00
Willem
06c735ba8a Merge pull request #835 from MatejMecka/master
[IMPROVEMENT] Update README and add Installation file.
2017-12-13 20:09:17 +01:00
MatejMecka
e7fab1da26 Fix Installation instructions. 2017-12-13 19:51:21 +01:00
MatejMecka
22ff01e2a5 Fix some stuff 2017-12-13 17:18:16 +01:00
Matej Plavevski
f8f0d91386 Update README and add Installation file.
Add INSTALLATION file and update README
2017-12-12 21:02:29 +01:00
Null
d70c81d1d1 Make CMake work with Windows 2017-12-12 17:45:04 +08:00
Null
8e53d91682 histogram (#832) 2017-12-11 20:53:42 -08:00
Carlos Fernandez Sanz
c32d350550 Added mxf demuxer to mac/Makefile.am 2017-12-07 13:31:53 -08:00
Carlos Fernandez Sanz
fe17cddec8 Fixed 8 warnings 2017-12-07 13:17:09 -08:00
Carlos Fernandez
33c4c5a021 Import teletext.h from lib_ccx.h for function definitions that are used from general_loop 2017-12-07 12:34:15 -08:00
Carlos Fernandez
04ea39073b Moved some declarations from telxcc.c to teletext.h - needed to avoid some warnings about implicit declarations.
gzguts.h: Moved UNIX-only include to #ifdef to prevent build errors on Windows.
2017-12-07 12:29:30 -08:00
Carlos Fernandez Sanz
d7aa1f1bf4 - Fix some warnings (wrong parameter type, implicit function declarations...) 2017-12-07 12:14:50 -08:00
Carlos Fernandez
2c0e21b28b Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-12-07 09:25:53 -08:00
Carlos Fernandez
487ac86d47 - Added MXF demuxer to Makefile 2017-12-07 09:25:26 -08:00
Matej Plavevski
726d87d15a Add leptonica and tesseract (#827) 2017-12-07 09:09:12 -08:00
Carlos Fernandez
d50e05315f The #ifdef PYTHON_API crap was preventing the write context from being deinitialized - meaning files not being closed, semaphores not deleted... 2017-12-06 18:25:22 -08:00
Null
3fb5bab343 Fix delay in DVB (#826) 2017-12-06 07:53:21 -08:00
Anshul Maheshwari
4bcd1edff5 Merge pull request #825 from anshul1912/master
Add MXF support
2017-12-06 14:14:46 +05:30
Anshul Maheshwari
42ab16405d Add MXF support 2017-12-06 11:36:41 +05:30
Null
c78db1dd24 Fix delay option (#824) 2017-12-05 08:45:47 -08:00
Willem
200b6a7eb9 Merge pull request #823 from MatejMecka/fixtravis
[FIX] Fix One of the Travis Builds
2017-12-04 23:50:20 +01:00
MatejMecka
410335c46d Fix One of the Travis Builds
Fix one of the builds for Travis

Revert "Upgrade UTF8proc"

This reverts commit 76dc969363.
2017-12-04 22:05:56 +01:00
MatejMecka
76dc969363 Upgrade UTF8proc 2017-12-04 21:41:53 +01:00
Willem
3772f83fe0 Update README.md 2017-12-04 21:25:43 +01:00
Willem
f2258e3eac Merge pull request #818 from MatejMecka/master
[IMPROVEMENT] Add Travis Ci
2017-12-04 11:19:06 +01:00
Matej Plavevski
a40c9e2ca1 Create .travis.yml
Update .travis.yml

Update .travis.yml
2017-12-03 21:36:14 +01:00
Null
241e2f5e14 Upgrade win_iconv (#815) 2017-12-02 20:52:29 -08:00
Matej Plavevski
afdf4e74be Upgrade zlib to 1.2.11 (#814) 2017-12-02 20:49:41 -08:00
govindbalaji-s
a9180719b6 Made fatal error messages clearer (#811) 2017-12-01 07:38:41 -08:00
Matej Plavevski
88844fea42 Update LibPNG from 1.6.27 to 1.6.34 (#809) 2017-11-30 12:24:01 -08:00
HemangRajvanshy
f5700d5304 Making error messages clearer and less ambiguous. (#808) 2017-11-30 08:40:53 -08:00
Carlos Fernandez
dacd05b9fb Merge branch 'master' of https://github.com/CCExtractor/ccextractor
# Conflicts:
#	src/lib_ccx/ccx_encoders_python.c
2017-11-07 12:42:27 -08:00
Carlos Fernandez
5db65fa3a1 #ifdef ENABLE_PYTHON in ccx_encoders_python.c. 2017-11-07 12:38:25 -08:00
Anshul Maheshwari
b7eb9e22d6 Make code compilable on windows 2017-11-05 18:52:58 +05:30
Saksham Gupta
b9de954690 Update dict_mr_robot.txt (#804) 2017-11-02 21:15:08 -07:00
Matej Plavevski
07c933e677 More Google Code In Participants (#801) 2017-11-01 15:48:51 -07:00
AlexBratosin2001
bdb8221213 Replace incorrect memset (#800) 2017-10-23 19:38:07 -07:00
Mayank Gupta
4573f5e8f6 [IMPROVEMENT]Modify Autoconf scripts to generate mac compatible tarball (#798)
* Modify Autoconf scripts to generate mac compatible tarball

* Update list of GSoC 2017 students
2017-10-18 17:33:58 -07:00
Evgeny Shulgin
a170f55a22 Merge pull request #799 from Izaron/something
Temporarily wrapped the Python API
2017-10-18 13:22:51 -07:00
Evgeny Shulgin
a6f0a07bf9 Merge branch 'master' into something 2017-10-18 13:22:23 -07:00
Evgeny Shulgin
1816894ccf Temporarily wrapped the Python API 2017-10-18 23:15:54 +03:00
Carlos Fernandez
07f289d1e0 Added missing function prototype.
Added ccextractor.h to solution.
2017-10-18 12:55:46 -07:00
Carlos Fernandez
b61918d516 Updated CHANGES.TXT 2017-10-13 15:26:16 -07:00
Vitor Massaru Iha
824dfeb166 [IMPROVEMENT] Function headers of the "get_more_data" functions were standardized. (#786)
* Function headers of the "get_more_data" functions were standardized.

* Unnecessary stream_mode check inside the while loop was removed.

* terminate_asap if condition was moved to while condition.

* Unnecessary condition was removed.
2017-10-13 14:53:37 -07:00
Vitor Massaru Iha
6aa90fc091 [gpacmp4/avilib.c]: Fix redefinition of VERSION and PACKAGE (#790)
compilation warnings before:

../src/gpacmp4/avilib.c:35:0: warning: "PACKAGE" redefined
 #define PACKAGE "GPAC/avilib"

<command-line>:0:0: note: this is the location of the previous definition
../src/gpacmp4/avilib.c:36:0: warning: "VERSION" redefined
 #define VERSION GPAC_FULL_VERSION
2017-10-13 14:47:27 -07:00
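One common way to silence a "redefined" warning like the one quoted above is to drop any earlier definition before the library-specific one. The sketch below is an assumption about the general technique, not a copy of the actual change; the first `#define` only simulates a value arriving from the command line (for example via `-DPACKAGE=...`), and `avilib_package` is a hypothetical accessor added for illustration.

```c
/* Stand-in for a definition that would normally come from the compiler
 * command line, e.g. -DPACKAGE="ccextractor". */
#define PACKAGE "ccextractor"

/* Remove the earlier definition so the redefinition below is clean. */
#ifdef PACKAGE
#undef PACKAGE
#endif
#define PACKAGE "GPAC/avilib"

/* Hypothetical accessor showing which definition is in effect. */
const char *avilib_package(void)
{
    return PACKAGE;
}
```

The same `#ifdef` / `#undef` pair would apply to `VERSION`. An alternative is to rename the library's macros so they never collide with the build system's.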
Saksham Gupta
5641c3116e Fixes Undeclared-Variable Warnings in extractor.c (#795)
Fixes https://github.com/CCExtractor/ccextractor/issues/780

Signed-off-by: Saksham Gupta <shucon01@gmail.com>
2017-10-13 14:42:29 -07:00
Mayank Gupta
e5c80a2c84 Add documentation for GUI (#796) 2017-10-13 14:42:03 -07:00
Mayank Gupta
fc3f505189 Add GUI for CCExtractor (GSoC 2017) (#794) 2017-10-11 14:11:39 -07:00
Mayank Gupta
ff950a035d Update Makefile.am for working Autotools build on mac (#791) 2017-10-09 12:30:42 -07:00
Jason Hancock
98114bd294 [IMPROVEMENT] Modify rpm.sh package script to preserve the source rpm (#788) 2017-10-06 16:35:35 -07:00
Jason Hancock
37d497d47a [IMPROVEMENT]: change rpm installation location from /usr/local/bin to /usr/bin (#789) 2017-10-06 16:35:07 -07:00
Jason Hancock
fae4957a71 Fix path to wrapper.h in Makefile.am (#787) 2017-10-06 16:34:56 -07:00
Saurabh Shrivastava
0d9872021d ¯\_(ツ)_/¯ Fix typo in name of Zlib directory in cmakefile. (#784) 2017-10-03 12:30:10 -07:00
Saurabh Shrivastava
30443a5b9a Use proper newlines while printing SRT from bitmaps. (#783)
Probably fixes #767 .
2017-10-03 08:24:34 -07:00
Saurabh Shrivastava
2eb5fd26de [FIX] Move files into appropriate directories & fix build scripts. (#781)
* Move wrappers and extractors inside src/ and update CMakeLists.

* Reflect change in path across build scripts.

* Remove redundant source file inclusion.

* Always use supplied libpng.
2017-10-02 12:16:04 -07:00
Hugh Mackworth
01852ef055 Compilation on the Mac (#777)
* Update README.md

* Delete README.MAC.TXT

No longer accurate given work done to integrate Mac into build processes.

* Change to use project's PNG/ZLIB libraries

* Fix Mac build command
Makes OCR an optional parameter
Adds python API file to build

* Update README.md
2017-10-02 11:59:00 -07:00
AlexBratosin2001
59c0de46e2 Fix Windows project files (#782) 2017-10-02 11:58:15 -07:00
Vinícius Lugão
f8d9e042bb Fix to output CC data when -out=raw is used (#775)
When the -out=raw option is used, the ccextractor jumped to spupng output
format, generating broken files in spupng format without CC data.
With this fix, now it generates CC data in McPoodle's Broadcast format.
2017-09-08 10:06:00 -07:00
Diptanshu Jamgade
0596d375b7 Python Extension Module (#773)
* Added self as contributor

* Added extension module documentation to docs/
2017-09-03 15:37:24 -07:00
Diptanshu Jamgade
47c5a6e73b Cleaning up the codebase and additional changes in Python SRT generator. (#771)
* Removed all extractors except the grid extractor.
Removed the call to transcript extractor in ccx_encoders_transcript.c

* Removed unnecessary array appending statements in python_grid_extractor.
WIP: switch in extractor.

* Added switch in g608 grid extractor.

* Deleted comments from wrappers.

* Refactored code in ccextractor.c and .h files.
Removed all the commented part.
Made proper changes according to the coding conventions.

* Removed calls to extractor from all the encoders.
The only call made to extractor is from ccx_encoders_python.c.

* Removed a comment from wrapper.c.
In init_write function of output.c added a call to free the output string returned by asprintf in case of
sending filename to callback function.

* Added calls to free the char* which is malloced by asprintf in
extractor.c
WIP: Free the global variable elements.

* Sample testing correctly for italics tag.
Also added a hack to print only 32 characters when unicode fails.
WIP: Font tag.

* Added support for handling font and italics in Python SRT generator.

* modified the font generator.
Also, added count method for checking blank strings in
python_srt_generator.

* Added free statements for avoiding memory leaks.

* added return code for failure of asprintf calls.

* Removing unnecessary code from api_testing.py

* Made modifications to Makefile and build script.

* Added recursive_tester.py
Autoconf builds successfully.

* BUG: Made change to get_line_encoded to encode the last \0 character in a
line. Otherwise the EOL character is absent causing garbage value to be
present in SRT.

* Exporting the encoding of the captions from CCExtractor to Python so
that the python SRT generator can generate proper SRT files.

* Modified the include statement in extractor.h
2017-08-25 11:03:00 -07:00
Carlos Fernandez
022463b9a2 Moved ccx_encoders_python to right filter in project. 2017-08-21 14:25:35 -07:00
Saurabh Shrivastava
d19f471352 Correctly handle return codes. (#763)
Return codes after parameter parsing were incorrectly handled, leading to errors such as `Error: Invalid option to CCextractor Library`.
2017-08-21 14:11:19 -07:00
Saurabh Shrivastava
8f2f38bf07 Fix builddebug to include Python API changes. (#770) 2017-08-21 13:21:48 -07:00
Saurabh Shrivastava
4fe82abbfc Get commit hash and compilation date when built using cmake. (#764)
Who knew I would have to read so much documentation for such trivial task 😒
2017-08-20 08:55:09 -07:00
Mayank Gupta
32710eff1d Fix failing build with autoconf due to ccextractor.h (#765) 2017-08-20 08:54:51 -07:00
Diptanshu Jamgade
21eaa3de04 Python bindings with extraction of CE608 grid and writing to a SRT output. (#768)
* added python_extract to encoders_srt and the captions are being
extracted in needed format. Search for an alternative to asprintf

* Checking if the alternative to asprintf generate proper srts

* CC captions accessible via python script

* Removing python caption code from __wrap_write function

* removing old cc_to_python functions

* Removing python_subs structure and all the changes done for that struct

* Removing filename functions from ccextractor.*

* Renaming make_message to time_wrapper

* Applying to python_extract codebase: SSA format

* Added python_extract_time_based and done validation for ssa

* Applying python_extract_time_based: Done validation for srt and webvtt

* led attempt for SAMI support of python_extract. Code is commented

* Applying python_extract_time_based: validate support for SMPTETT

* Added python_extract_transcript and made changes for time printing.

* added show_extracted_captions_wtih_timings function

* Added show_extracted_captions_with_timings to python script for testing
purpose.

* refactored extractors to api directory. commented out show captions in main()

* build and build library working for the extractors.

* made caption generator work with a 0.1 time sleep. Start refactoring

* added asprintf for windows.

* file being written in the running directory

* Auto-deletion of python temporary file

* Python captions printing status set to proper.

* termination of tail successful

* Writing successful for the sample

* Generating unalternating output

* adding api_support.py

* Adding bld_flags in build_api

* Added  to build_library

* Auto deletion of temporary file on SIGINT

* Discussing Seg fault with Izaron

* working for python and linux with samples. testing -out=pythonapi with stream

* Done adding bitmap support

* added -out=pythonapi support for bitmap

* Setting the messages_target to 0 for output = pythonapi

* Added wrapper for setting -out=pythonapi. Checking if -stdout value can be used in python.

* adding the cc_to_stdout=1 value for -out=pythonapi. Thus generation of output file has been avoided. May be needed to change in future.

* added extractor for g608 grid. removed sami extractor. need to work on overlap of -out=pythonapi and -out=g608

* Removed overlap of -out=pythonapi by adding -pythonapi and
signal_python_api global variable.

* added support for separate c608 grid catching. Need to test the output
via python.

* added support for separate printing of text font and color in CE608.
Need to make sure that the function is inbuilt.

* ADDED ce608 GRID SUPPORT FROM PYTHON
need to discuss whether to keep the print_cc_grid function specific to
the module or make it user accessible.
Mostly it would be better to make it user accessible.

* made changes in the call_from_python_api function such that only
api_options is needed to be passed.
An if statement before the call to g608_extractor has also been added.
Waiting for Carlos to comment on the output generated till this stage.

* added a signal_python_api check before calling every write function.
Thus basic writing output can be avoided.

* Commented all calls to python_extract_time_based.
making changes to python_extract_g608 to be called only from the point
when a g608 caption is detected.

* Added pass_cc_buffer_to_python in encoders_common.c temporarily
redefined get_*_encoded from static to normal
included the above functions in encoders_common.h

* Added if-else statement for switch in encode_sub function.
This is done mainly for making sure no output is generated in the api
call.

* Added ccx_encoders_python.c
Defined pass_cc_buffer_to_python in ccx_encoders_python.c
added if else statement in encode_sub's switch to make sure that the output is not generated in case of -pythonapi call

* Removed __wrap_write from the entire code base.
Its declaration and definition are only present in CCExtractor.*

* Commented out the /dev/null part in ccx_encoders_common.c.
Proceeding further on checking for file generation.

* Added output_filename in array global variable and is generated in
init_write function.
included ccextractor.h in output.c to access global variable
signal_python_api for avoiding output generation in init_write and
invalid free in dinit_write.

* Modified the definition of init_write function for accessing
signal_python_api.

* Deleted the commented part of /dev/null in ccx_encoders_common.c.

* Added target_message=0 in -pythonapi param parsing in param.c to avoid
the API from printing to STDOUT.
Deleted the commented part of -out=pythonapi.
Thinking of adding a different param for silencing the output when the
call is made from python api.

* Removed __wrap_write from ccextractor.c and ccextractor.h.

* Added ccx_to_python_g608 and modified api_support.py file.
added documentation in ccextractor.c.

* added the generate srt script. However, some random characters are
coming in first line. Need to talk about this.

* Added SRT generator for python.
Using string to remove the garbage value.
Add code for srt counter and also the start_time and end_time
conversion.

* removed the trash characters and added code to print the timings.
However, the last blank frame also results in a print. Need to take care
of this.

* rectified the mistake of writing only timings and not captions.
now next step is to just make the timings print properly

* some minor changes before diving into extracting srt_counter from the made codebase

* Added extraction of srt_counter in python_extract via fflush
srt_counter-value.
Need to modify the processing in python.

* Added the entire method to extract captions and generate srt files. The next step would be to define a concise function for writing the srt

* Processing into a srt working properly.
Next step is to add the information of font into the caption text.

* the data is getting generated for proper SRT counters.

* A turning point to the approach.
Added END OF FRAME line for printing the data for every particular
srt_counter.
Proceeding further with the generation of srt by data manipulation.

* some minor bugs but the output srt is being generated correctly. However, the font and colour encoding still needs to be done.

* Taken care of random characters. Need to discuss this with Carlos. Moving further to font/color processing.

* Taken care of random characters. Need to discuss this with Carlos. Moving further to font/color processing.

* Added fflush and cleaned up the python code of srt generation

* Added <i> tag for italics.
Proceeding further with other types.

* Added the code to check for underline.
However, need to check how CCExtractor generates srt when both italics
and underline are present. For now a new line is added if both are
present.

* Shifting for making changes in the I/O work.

* Stable output for samples with italics is being generated.

* Added the PYTHONAPI macro definition and testing for its existence in the set_python_api function.

* build script for linux is working correctly.
Build_library is showing error of invalid def of set_pythonapi.
Moreover, extractor has some memory seg fault.

* Added mod to set a MACRO as my_python_api to set the callback function.
Till now all calls to the reporter are commented.
Working on getting the reporter to print the lines.

* Changes have been implemented to bring reporter in working state.
For now a constant string is passed from extractor. Need to make the
proper parsing possible.

* Changed the code in extractor such that entire grid is returned to the
callback function.
Need to provide this grid to the write function and also cleanup the
codebase.

* Writing the outputted srt in a file called "temp.srt".
Need to modify init_write to push filename that is to be created in
python using callback.

* Added code to get start and end time simultaneously.
entire SRT is getting generated.

* removed ccx_python_encoders.c

* Compiling and executing on Windows

* Moved definitions get_line_encoded, get_color_encoded, get_font_encoded from ccx_encoders_g608.c to ccx_encoders_common.c.
Also, deleted the static definition of get_font_encoded from
ccx_encoders_webvtt.c

* added a write statement in write_cc_bitmap_as_srt

* Rectified transfer of get_line_encoded, get_color_encoded and
get_font_encoded from ccx_decoders_common.c to ccx_encoders_common.c.
2017-08-20 08:54:35 -07:00
Carlos Fernandez Sanz
773ddf8bc2 Merge pull request #769 from Izaron/patch-1
**[IMPROVEMENT]** Added gui mode reports for Matroska decoder
2017-08-20 08:54:05 -07:00
Evgeny Shulgin
14e0d86df8 Added gui mode reports for Matroska decoder 2017-08-20 15:08:20 +03:00
Saurabh Shrivastava
333fb6eb6e Cleanly format the compiling documentation and cmake instructions. 2017-07-26 04:24:19 +05:30
Saurabh Shrivastava
da0893fdb3 Fix CMakeLists for MacOS and Linux.
With #742 and this, CCExtractor can be built across all three platforms using CMake.
2017-07-26 04:23:48 +05:30
Carlos Fernandez
ce2b680a43 Merge branch 'pr/n759_Abhinav95' 2017-07-21 11:25:24 -07:00
Abhinav95
b1cc95d972 Adding grayscale conversion for better OCR 2017-07-21 12:12:50 +05:30
Diptanshu8
10eb52e651 pushing 4 wrapper codes 2017-07-20 02:50:53 +00:00
Diptanshu8
13b3dadb45 Wrapper for debugdvbsub and pesheader 2017-07-20 02:50:53 +00:00
Diptanshu8
cff69bef5e added wrapper code for setstdout and setautoprogram 2017-07-20 02:50:53 +00:00
Carlos Fernandez
536082ae6e Merge branch 'pr/n751_Diptanshu8' 2017-07-19 10:59:56 -07:00
Diptanshu8
3f069b84c9 fixed -out=dvdraw sample error. 2017-07-18 04:48:08 +00:00
Carlos Fernandez
ddca8001cc Merge branch 'pr/n755_Abhinav95' 2017-07-17 11:44:11 -07:00
Diptanshu8
02b4427260 making changes to write wrapper 2017-07-17 08:59:00 +00:00
Abhinav95
ec5618dd1f Fixing end timestamp in DVB transcripts + spelling/readme improvements 2017-07-17 04:23:34 +05:30
Carlos Fernandez
e8f742a627 Corrected function prototype 2017-07-14 13:01:39 -07:00
Saurabh Shrivastava
45946e3ac9 Initialise timing for MP4 webvtt.
Fixes #753 .
2017-07-14 18:59:02 +05:30
Diptanshu8
e3e5f8b36e Apply write wrapper across entire database. 2017-07-13 07:26:49 +00:00
Diptanshu8
1435411861 Commenting out the file name related functions. 2017-07-13 05:48:14 +00:00
Diptanshu8
86b7e7348e Added extension to python_subs 2017-07-11 21:34:05 +00:00
Diptanshu8
d2bd2d1397 added basefilename to python_subs 2017-07-11 21:21:18 +00:00
Diptanshu8
8895b27552 CC being shown in python script. 2017-07-11 21:21:18 +00:00
Diptanshu8
57424857b0 Working on PR 2017-07-11 21:21:18 +00:00
Diptanshu8
2ced408994 build and build_library working correctly 2017-07-11 21:21:18 +00:00
Diptanshu8
976f01cee1 CCs to python_subs extracted properly 2017-07-11 21:17:46 +00:00
Diptanshu8
4d5f80a01d Found wrapper for write. Check file_handle and start processing. 2017-07-11 21:17:46 +00:00
Carlos Fernandez
0327e676dd Merge branch 'pr/n747_Diptanshu8' 2017-07-11 11:40:10 -07:00
Diptanshu8
91ea65d2a3 Removed ccextractor.pyc 2017-07-11 11:18:20 +00:00
Diptanshu8
de5fcf27f3 adding .pyc to gitignore 2017-07-06 23:34:08 +00:00
Diptanshu8
fe6813736c segregating the code and changing myarguments and argument_count. Also, gsoc directory has been created. 2017-07-06 22:58:23 +00:00
Diptanshu8
dc35af0bc0 Modifications to the code. 2017-07-06 22:22:59 +00:00
Carlos Fernandez
0c0bf1aafd -Added -nospupngocr (don't OCR bitmaps when generating spupng, faster) 2017-07-06 13:37:20 -07:00
Carlos Fernandez
62dab0dde9 Merge branch 'pr/n746_Abhinav95' 2017-07-06 12:59:26 -07:00
=
31a2d46996 Forcing -noru to cause deduplication in ISDB 2017-07-07 01:22:11 +05:30
Carlos Fernandez
710a205f99 Add support for file split on keyframe (-segmentonkeyonly)
Segmenting no longer destroys the whole encoding context; it just closes and reopens the output file
Correct a wrong function prototype for process_hex()
OCR: Attempt to correctly deal with TessBaseAPIRecognize returning an error
Changed output for parse PMT to CCX_DMT_PMT instead of CCX_DMT_VERBOSE
2017-07-06 11:57:17 -07:00
Diptanshu8
69a956f3c2 removing api.so 2017-07-04 09:48:44 +00:00
Diptanshu8
7839403266 adding .so to .gitignore 2017-07-04 09:07:14 +00:00
Diptanshu8
6e50104da4 Cyclic rotation and python script argv passing solved 2017-06-28 21:35:32 +00:00
Diptanshu8
edb2431cf9 Cyclic rotation patch 2017-06-28 19:07:44 +00:00
Diptanshu8
67204d8e3c modifying the return value from main 2017-06-28 07:57:44 +00:00
Diptanshu8
a0047a9d3e changing the return status to EXIT_ON 2017-06-28 07:01:32 +00:00
Diptanshu8
cf4aa9021d Modified build_library script for generating the python module 2017-06-27 10:00:47 +00:00
Diptanshu8
3e99dc2955 Reflecting changes of library source code in the ccextractor source code 2017-06-27 10:00:47 +00:00
Diptanshu8
2ed2c27906 First phase evaluation outputs generated 2017-06-27 10:00:47 +00:00
Diptanshu8
a79bab670f modifying the test code for 5 samples 2017-06-27 10:00:47 +00:00
Diptanshu8
a7e2ac7686 Adding auto deletion of obj files to build_library 2017-06-27 10:00:47 +00:00
Diptanshu8
98e295e768 Committing the build_library file 2017-06-27 10:00:47 +00:00
Diptanshu8
751a22fe68 Resolving merge conflicts in .gitignore 2017-06-27 10:00:47 +00:00
Diptanshu8
dba1d7b6eb Debugging the core dumped error 2017-06-27 10:00:07 +00:00
Diptanshu8
9fa2e3ebb0 adding the ccextractorapi.py file 2017-06-27 10:00:07 +00:00
Diptanshu8
f70f34e009 pushing the api code 2017-06-27 10:00:07 +00:00
Diptanshu8
644d26546c Facing VERSION attribute error in generated module 2017-06-27 10:00:07 +00:00
Diptanshu8
b0a0c92e50 library development done till api_start. Implementation of stop and status function is left. 2017-06-27 10:00:07 +00:00
diptanshuj@gmail.com
d3540ccc0a Made temp changes to initiate library coding 2017-06-27 10:00:07 +00:00
Carlos Fernandez
735f4392dd Merge branch 'pr/n742_saurabhshri' 2017-06-05 13:22:21 -07:00
Carlos Fernandez
95cd2370fc Merge branch 'pr/n743_anshul1912' 2017-06-05 13:21:24 -07:00
Carlos Fernandez
a0bea44274 Corrected CHANGES.TXT 2017-06-05 13:20:23 -07:00
Carlos Fernandez
989adde3ef Merge branch 'pr/n738_techfreakworm'
# Conflicts:
#	docs/CHANGES.TXT
2017-06-05 13:20:00 -07:00
Carlos Fernandez
399f59981d Added --analyzevideo to help page. 2017-06-05 11:00:03 -07:00
Anshul Maheshwari
37956ea4ef Adding info for ocr 2017-06-04 18:15:48 +05:30
Saurabh Shrivastava
51d936bc90 Fix CMake build for windows.
Thank you linker flags for eating my 3+ hours.
2017-06-03 22:32:15 +05:30
Carlos Fernandez
d9796410bc Added --analyzevideo 2017-06-02 12:32:45 -07:00
AlexBratosin2001
a842e1f7db Fix PTS and Length in preview section (Teletext subtitles) 2017-06-02 19:12:31 +03:00
Mayank Gupta
bc361a2e86 Add ocr, hardsubx and autotools support for mac 2017-05-11 06:19:16 +03:00
Carlos Fernandez Sanz
4d6ae27518 Merge pull request #734 from techfreakworm/rpm
[FEATURE]Add .rpm package generation script
2017-05-09 11:11:18 -07:00
Carlos Fernandez
adacb6235e Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-04-24 12:07:00 -07:00
Carlos Fernandez
78c410cf50 Added -nolevdist to disable automatic typo fixing in teletext 2017-04-24 12:06:33 -07:00
Mayank Gupta
f2755ae5bb Add .rpm package generation script 2017-04-18 16:20:15 +05:30
Carlos Fernandez Sanz
fdd5b6bf9d Merge pull request #725 from Amey-Jain/master
Issue 699 solved
2017-04-11 09:23:03 -07:00
Carlos Fernandez Sanz
4438e6c780 Merge pull request #728 from fandango-/arch
[FEATURE] Arch Linux installation script (create .pkg.tar.xz archive or install directly)
2017-04-11 08:05:38 -07:00
Amey Jain
1513b7c42f Timing for sample #70 corrected. 2017-04-11 10:44:44 +05:30
Carlos Fernandez
76eea831ca When NAL decoding fails, don't dump the whole decoded thing, limit to 160 bytes. 2017-04-10 17:11:45 -07:00
Carlos Fernandez
1b6e05083d - TS: Skip NULL packets
- TS: If we don't have pinfo don't pay attention to the current_next_indicator bit.
(fixes problem with The Lion Guard_20170321_09301000.ts). Not sure this fix is the correct one but that's what VLC does.
2017-04-10 11:46:01 -07:00
Abhinav Baid
fddede57fd Update README.md 2017-04-10 17:47:54 +05:30
Abhinav Baid
58f7345b42 Update CHANGES.TXT 2017-04-10 17:37:59 +05:30
Abhinav Baid
83704e306f Update gitignore for pkg.tar.xz files 2017-04-10 17:33:29 +05:30
Abhinav Baid
27311f53e6 Use sudo when available; fallback to su 2017-04-10 17:30:34 +05:30
Abhinav Baid
d3946450eb Fix small grammatical mistake 2017-04-10 11:50:57 +05:30
Abhinav Baid
96ca325a25 Use bash as shell; su instead of sudo; don't proceed if a command fails; change default to not installing 2017-04-10 11:46:15 +05:30
Abhinav Baid
90f94d6053 Merge branch 'master' of https://github.com/CCExtractor/ccextractor into arch 2017-04-10 11:22:11 +05:30
Amey Jain
1b3598b2fe Timing mis-match corrected. 2017-04-08 08:34:18 +05:30
Carlos Fernandez Sanz
58e6bac74d Merge pull request #729 from LucasYoung/master
[FEATURE] Add WebVTT output from Matroska (Issue #724)
2017-03-31 11:42:08 -07:00
LucasYoung
5ffa442c5f Removed changes to ccextractor.vcxproj 2017-03-31 04:37:52 -07:00
LucasYoung
dd5c1ee243 Added WebVTT output from Matroska 2017-03-30 00:09:40 -07:00
Abhinav Baid
4c73649b9c Add --enable-ocr as a configure option by default 2017-03-30 11:42:15 +05:30
Abhinav Baid
8878aebe0b Allow option to build .pkg.tar.xz archive without installing package 2017-03-30 11:25:41 +05:30
Abhinav Baid
236840919d Use gcc as the C compiler during configure 2017-03-30 11:24:55 +05:30
Abhinav Baid
ab5544691a Initial commit 2017-03-29 19:52:22 +05:30
Carlos Fernandez Sanz
19557551fe Merge pull request #717 from techfreakworm/master
[IMPROVEMENT]Make CCExtractor more linux standard compliant #678
2017-03-22 08:53:56 -07:00
Mayank Gupta
8aa6aac2a7 Updated execution permissions for cleanup script 2017-03-22 11:17:38 +05:30
Mayank Gupta
23e6e44073 Updated cleanup script for complete cleanup after build 2017-03-22 11:16:26 +05:30
Mayank Gupta
5342b83345 Updated execute permissions for autogen.sh and tarball.sh 2017-03-22 10:49:38 +05:30
Mayank Gupta
b80e466533 Changed text in AC_MSG_ERROR to remove configure error 2017-03-21 23:06:08 +05:30
Mayank Gupta
77b54feeb6 Added check for pkg-config m4 macros 2017-03-21 19:33:21 +05:30
Mayank Gupta
6b08c123e2 Redirected error stderr stream to stdout stream 2017-03-21 14:59:01 +05:30
Mayank Gupta
c710b4c9b2 Corrected wrong flag order of LDADD 2017-03-21 12:07:58 +05:30
Amey Jain
094a8f295a Issue 699 solved. 2017-03-21 01:06:00 +05:30
Carlos Fernandez Sanz
753ece23d6 Merge pull request #722 from barun511/webvtt_another_fix
[FIX] # 680 Add missing line terminator in webvtt
2017-03-20 10:44:47 -07:00
Amey Jain
d2bea3802e In ref. to issue 699. 20 ms timing mis-match.
modified:   src/lib_ccx/general_loop.c
	modified:   src/lib_ccx/telxcc.c
2017-03-20 14:03:41 +05:30
Mayank Gupta
3ecacd3fa9 Updated .gitignore and configure.ac 2017-03-19 23:25:06 +05:30
Barun Parruck
4f692138fc Add missing line terminator in webvtt
Fix #680
2017-03-18 19:43:30 +05:30
Mayank Gupta
cb42a5fc2a gitignore packages created from package creation scripts 2017-03-17 23:58:47 +05:30
Mayank Gupta
316f959234 Corrected minor mistake in configure.ac 2017-03-17 23:48:27 +05:30
Mayank Gupta
6fdd4b75e8 Updated .gitignore, README.md, added checks in configure.ac, updated package creation scripts 2017-03-17 23:18:26 +05:30
Mayank Gupta
486d7ea002 Changed location of scripts and made tarball creation to work 2017-03-17 17:29:54 +05:30
Carlos Fernandez Sanz
ce9416e943 Merge pull request #721 from Diptanshu8/matroska
**[FIX]** Fixes issue #705
2017-03-16 11:40:27 -07:00
Diptanshu8
baa70a104a char pointer misconception for newline characters 2017-03-16 16:09:06 +05:30
Diptanshu8
7212d6848a Refactoring 2017-03-16 15:58:38 +05:30
Diptanshu8
b723378eb2 Quick fixes 2017-03-16 02:39:25 +05:30
Diptanshu8
5889d1edcd Quick fixes 2017-03-16 02:29:26 +05:30
Diptanshu8
c11ff21499 Done with mkvlang options basic checking 2017-03-16 01:45:31 +05:30
Mayank Gupta
31c8f85ba5 Merge branch 'autoconf' 2017-03-16 00:06:29 +05:30
Diptanshu8
b08c6285e6 Checking of the mkvlang for last option is left and checking for '-' in the code is left 2017-03-15 23:37:43 +05:30
Mayank Gupta
56f4c14318 Corrected package version in debian.sh 2017-03-15 23:31:53 +05:30
Mayank Gupta
c44e9ecfed Modifications for man pages to work 2017-03-15 22:12:12 +05:30
Diptanshu8
6aaaf6d9ae MKVlang support for a multi-language extraction has been added. 2017-03-15 16:57:52 +05:30
Diptanshu8
3b0031c251 MKVlang support for a single language has been added. 2017-03-15 16:47:26 +05:30
Diptanshu8
75a010fe77 Static option _eng_ for only english subtitles has been set 2017-03-15 15:14:06 +05:30
Diptanshu8
712e44e26c Checking the concatenated videos for subtitle errors and resolved 2017-03-15 14:36:43 +05:30
Mayank Gupta
44adf6427b Corrected debian.sh and Makefile.am for package creation 2017-03-15 13:14:34 +05:30
Mayank Gupta
0bc2dbac4a Merge remote-tracking branch 'upstream/master' 2017-03-15 00:50:34 +05:30
Diptanshu8
9bbab6b595 ASS/SSA newline character covered 2017-03-15 00:48:45 +05:30
Mayank Gupta
460f4e9866 Changes to create release tarball scripts 2017-03-15 00:46:36 +05:30
Diptanshu8
f944c93faa yet to cover ass/ssa subtitles' newline 2017-03-15 00:35:40 +05:30
Diptanshu8
2c7bfef0f0 ASS/SSA support added for avoiding newline chars at the beginning of sentence. 2017-03-15 00:35:40 +05:30
Diptanshu8
6f5b2aa360 Changes made to avoid breakline/newline at the beginning of sentence. 2017-03-15 00:35:40 +05:30
Diptanshu8
d08cad3642 Rebasing 2017-03-15 00:35:17 +05:30
Diptanshu8
d2d055dd37 Done with removal of MATROSKA_MAX_TRACKS 2017-03-15 00:33:58 +05:30
Diptanshu8
7feb705d73 Rebasing 2017-03-15 00:33:12 +05:30
Mayank Gupta
b0ec1a073b Added man pages for ccextractor 2017-03-15 00:30:42 +05:30
Carlos Fernandez Sanz
c0a40529a9 Merge pull request #714 from saurabhshri/Cleaning
[IMPROVEMENT] Cleaning and replacing redundant code with already existing functions.
2017-03-13 22:54:16 -07:00
Mayank Gupta
7bb9a5f783 Updated Makefile.am and configure.ac 2017-03-13 13:17:30 +05:30
Mayank Gupta
00522bc950 Updated README.md 2017-03-13 11:40:11 +05:30
Mayank Gupta
0632c3a023 Updated configure.ac 2017-03-13 10:54:00 +05:30
Mayank Gupta
6f9b8a6d42 Corrected autogen.sh 2017-03-13 10:18:13 +05:30
Mayank Gupta
72c797bf83 created autogen.sh 2017-03-13 10:11:51 +05:30
Mayank Gupta
5f12bd7538 Autoconf scripts added 2017-03-13 10:05:30 +05:30
Saurabh Shrivastava
7b4c3bb26d Removed deprecated function. 2017-03-12 00:39:57 +05:30
Saurabh Shrivastava
4bcc79fdb7 Use already available functions.
Removed code with redundant functionality.
-extracting filename without extension
-generating timestamp for srt format from milliseconds
2017-03-12 00:28:33 +05:30
Carlos Fernandez Sanz
7d4a6fb8d3 Merge pull request #712 from canihavesomecoffee/contributor-guide
[IMPROVEMENT] Add templates for issues and PR's
2017-03-06 12:33:24 -08:00
canihavesomecoffee
5194dea1b8 Add contributors guide, update readme 2017-03-06 21:29:21 +01:00
Carlos Fernandez
5f510cdfa2 filename_non_ext instead of filename_without_ext in matroska.c 2017-03-06 11:32:06 -08:00
Carlos Fernandez Sanz
564aff23a6 Merge pull request #711 from Izaron/platform-lld-spec
Added multiplatform LLD and LLU specs
2017-03-06 10:21:21 -08:00
Evgeny Shulgin
f057a7db05 Added multiplatform LLD and LLU specs 2017-03-06 16:48:54 +03:00
Carlos Fernandez Sanz
47c0bdcd47 Merge pull request #709 from kapilkd13/master
modified subtitle file name generated by matroska(mkv) - remove 'mkv'
2017-03-05 11:36:43 -08:00
kapil kumar
d80a4f4b3c modified subtitle file name generated by matroska(mkv) -some changes 2017-03-05 23:57:57 +05:30
kapil kumar
fac11ec5ed modified subtitle file name generated by matroska(mkv) 2017-03-05 23:45:43 +05:30
Carlos Fernandez Sanz
66e8b280de Merge pull request #707 from Izaron/matroska
Replaced %lld with %I64d
2017-03-05 10:05:05 -08:00
Carlos Fernandez Sanz
ceed0e42b3 Merge pull request #708 from Izaron/bytestream
Removed builtin code in Matroska
2017-03-05 10:03:54 -08:00
Evgeny Shulgin
564093e0ec Removed builtin code in Matroska 2017-03-05 20:18:13 +03:00
Evgeny
af3fff4034 Replaced %lld with %I64d 2017-03-05 15:05:07 +03:00
Carlos Fernandez
254ffb3f39 Updated GSoC status in README.md 2017-03-02 15:05:36 -08:00
Evgeny
a66f3c3973 Added "No captions" code support in Matroska 2017-03-02 16:46:10 +03:00
Evgeny
edaa3b828b Fixed bug with sub name in Windows 2017-03-02 16:38:47 +03:00
Evgeny
76cb7b91ee Added matroska.c to filters and fixed _MSC_VER 2017-03-02 16:18:03 +03:00
Evgeny Shulgin
2048827c45 Added time for the activity progress 2017-03-02 15:49:22 +03:00
Evgeny Shulgin
1f478cfb22 Added matroska warnings about "-out=" 2017-03-02 15:43:02 +03:00
Evgeny Shulgin
e74074ffd0 Removed matroska int and byte types 2017-03-02 14:44:43 +03:00
Evgeny Shulgin
20b557ff97 Matroska main part integrated 2017-03-01 21:50:20 +03:00
Evgeny Shulgin
28f84c768e Added Matroska decoder skeleton 2017-03-01 19:40:09 +03:00
Carlos Fernandez
1d281004e7 Merge branch 'pr/n700_saurabhshri' 2017-02-28 16:35:22 -08:00
Carlos Fernandez
f6f0f79954 Merge branch 'pr/n702_kapilkd13' 2017-02-28 16:34:32 -08:00
Carlos Fernandez
656bac1a6f Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-02-28 16:33:56 -08:00
Carlos Fernandez
66344c51fb Merge branch 'pr/n703_Izaron' 2017-02-28 16:33:44 -08:00
kapil kumar
23947beecc typo error 2017-02-28 21:14:22 +05:30
kapil kumar
876a442362 added usage for Levenshtein. fixed issue 701 2017-02-28 20:31:09 +05:30
Saurabh Shrivastava
4ef4a9b03e Teletext from .bin now honours -unixts
Also, fixed no caption found even when it was found from .bin for
teletext.
2017-02-26 23:28:17 +05:30
Saurabh Shrivastava
e9dd3d5845 Ignore CMake Build Stuff 2017-02-26 23:26:22 +05:30
Carlos Fernandez Sanz
eb24e34899 Merge pull request #697 from AlexBratosin2001/master
Sync Teletext to Private Stream 1 instead of all streams
2017-02-23 16:11:37 -08:00
Saurabh Shrivastava
b44fbefc97 Merge remote-tracking branch 'refs/remotes/CCExtractor/master' into ucla 2017-02-24 01:43:15 +05:30
AlexBratosin2001
cbffbb0358 Sync Teletext to Private Stream 1 instead of all streams 2017-02-23 16:33:07 +02:00
Evgeny Shulgin
52c7a8474f Added missing swscale library 2017-02-22 16:40:40 +03:00
Evgeny Shulgin
4f5f564b59 Automatically enable HARDSUBX when FFMPEG is used 2017-02-22 16:36:40 +03:00
Evgeny Shulgin
00ead85ab7 Fixed FFMPEG libs and set all libs non-static 2017-02-21 19:31:12 +03:00
Carlos Fernandez Sanz
9330b95147 Merge pull request #693 from Izaron/korean-fix
Rewritten output handler
2017-02-20 12:13:21 -08:00
Evgeny Shulgin
4ee9c847da Fixed CMake for OCR 2017-02-20 19:31:58 +03:00
Evgeny Shulgin
29180a95b1 Rewritten output handler 2017-02-20 17:39:43 +03:00
Carlos Fernandez Sanz
1124d82687 Merge pull request #685 from sidgairo18/fix_defects
Fix defects
2017-02-15 09:29:25 -08:00
Carlos Fernandez Sanz
795abf74ab Merge pull request #686 from barun511/webvtt_new_fix
Webvtt new fix
2017-02-15 09:27:35 -08:00
Carlos Fernandez Sanz
47cbdd9c1d Merge pull request #688 from Izaron/fedora-build
Updated "Compiling" section with Fedora
2017-02-15 09:26:38 -08:00
Evgeny Shulgin
9b527b7793 Updated "Compiling" section with Fedora 2017-02-15 20:20:42 +04:00
Barun Parruck
e919f1b9f5 Add comments, change array iteration 2017-02-15 01:49:58 +05:30
Barun Parruck
73f3c83940 Add -lf support | make line terminator consistent 2017-02-14 16:22:27 +05:30
Siddhartha Gairola
1425f426dc Update stbl_write.c 2017-02-14 12:28:17 +05:30
Siddhartha Gairola
3e37250d44 Update networking.c 2017-02-12 15:29:48 +05:30
sidgairo18
3be78775ca Fixing memory defects 2017-02-12 15:08:45 +05:30
Saurabh Shrivastava
ab1c7ab563 Fixed missing tpage number in UCLA from BIN. 2017-02-12 14:17:34 +05:30
Barun Parruck
7c2483d73e Fix webvtt formatting
The lack of CRLF after the header led to an invalid webvtt format.
2017-02-10 14:54:42 +05:30
Carlos Fernandez
1975848ecc Removed doku.php reference in README.md 2017-02-08 15:17:28 -08:00
Carlos Fernandez Sanz
e79506b303 Merge pull request #681 from Abhinav95/master
Template for upcoming additions to burned in extraction
2017-02-08 13:16:25 -08:00
Abhinav Shukla
9f32edad63 Merging change to Tesseract OEM mode with new hardsubx code 2017-02-09 02:40:01 +05:30
Abhinav Shukla
3278b31a8f Setting up tickertape parameter 2017-02-09 02:30:51 +05:30
Carlos Fernandez
ec6c7aede8 Typo in README.md 2017-02-07 15:09:46 -08:00
Carlos Fernandez
0516232bb3 Changes in README.md 2017-02-07 15:09:01 -08:00
Carlos Fernandez
44f97181ae Mentioned GSoC 2017 in README.md 2017-02-07 15:06:49 -08:00
Carlos Fernandez
cf762df972 Minor typo 2017-02-06 11:11:44 -08:00
Carlos Fernandez Sanz
6a90829744 Merge pull request #679 from saurabhshri/print_usage
Changes regarding output file name and minor print_usage corrections.
2017-02-06 11:09:37 -08:00
Saurabh Shrivastava
d37086a434 Removed deprecated params and added missing one.
-o1 and -o2 were deprecated in commit
0541a2fb62

--version was missing.
2017-02-06 03:21:34 +05:30
Saurabh Shrivastava
5ce5dc7fae Fix for NULL output filename.
If no -o is supplied with stdin/network etc., the output name generated
was NULL, leading to creation of files like `.srt`, which are treated
as hidden files.
2017-02-06 03:11:53 +05:30
Saurabh Shrivastava
851894dceb Fatal if unable to open output file. 2017-02-06 03:11:53 +05:30
Saurabh Shrivastava
6837a1070b Printing end message after fatal. 2017-02-06 02:11:53 +05:30
Carlos Fernandez
dbad5f4cda Started 0.86 changelist. 2017-01-30 15:54:32 -08:00
Carlos Fernandez
482a20430d ffmpeg_intgr.c: Wrong directory was being used in ffmpeg #include. 2017-01-30 15:54:05 -08:00
Carlos Fernandez
85751cee2b dvbsub_parse_clut_segment: Changed return on fail (from 0 to -1, as expected by the caller). 2017-01-30 15:14:16 -08:00
Carlos Fernandez
12467815ae Closes #675
Moved final text after dinit().
2017-01-30 13:06:48 -08:00
Carlos Fernandez
883d5ccc77 Added windows/enc_temp_folder to .gitignore 2017-01-30 12:59:39 -08:00
Carlos Fernandez
0d7b33c362 Merge branch 'pr/n673_AlexBratosin2001' 2017-01-30 12:58:25 -08:00
Carlos Fernandez
79a6d3d04a Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-30 12:57:01 -08:00
Carlos Fernandez
6bbb649ee6 Fix crash on -xmltv -out=null due to not having an encoder context. 2017-01-30 12:56:44 -08:00
AlexBratosin2001
919d3ec3b0 Fixed timing error caused by last commit 2017-01-28 23:24:50 +02:00
AlexBratosin2001
c5ea59aeb1 Fixed huge memory leak related to OCR init 2017-01-28 22:20:08 +02:00
Carlos Fernandez Sanz
b3010afba7 Merge pull request #672 from canihavesomecoffee/ds_store-fix
Delete .DS_Store files and add it to .gitignore file
2017-01-28 10:28:26 -08:00
Willem
de958c9383 Delete .DS_Store files and add it to .gitignore file 2017-01-28 18:03:22 +01:00
Carlos Fernandez
1bd3a43dbe Added 0.85b to CHANGES.TXT 2017-01-26 17:08:01 -08:00
Carlos Fernandez
5fa83394a0 Merge branch 'pr/n670_Izaron' 2017-01-26 10:20:13 -08:00
Carlos Fernandez
51537e8725 networking, multicast: In linux, bind to the specific IP address of the multicast source.
OCR: init some variables that didn't have a default value.
2017-01-26 10:16:52 -08:00
Evgeny
071386d552 Fixed DLL requiring in non-full version 2017-01-26 12:27:39 +03:00
Carlos Fernandez
ec9a0985ce Rework signals 2017-01-24 11:06:09 -08:00
Carlos Fernandez
71dffd6eb3 Use TessDeleteText to delete strings received from Tesseract 2017-01-23 16:05:05 -08:00
Carlos Fernandez
560a88b0b9 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-23 12:34:06 -08:00
Carlos Fernandez
626717cc28 Added release date for 0.85 and new changes to CHANGES.TXT 2017-01-23 12:33:53 -08:00
Carlos Fernandez Sanz
8dc1964f8c Merge pull request #660 from Izaron/708-adventures
Fixed ttxt 708 files segfault
2017-01-23 10:05:10 -08:00
Evgeny Shulgin
d72e946213 Fixed ttxt 708 files segfault 2017-01-23 19:29:28 +04:00
Carlos Fernandez Sanz
24edbff859 Merge pull request #655 from saurabhshri/patch-1
Ability to extract Chapters from MP4 files.
2017-01-20 13:01:43 -08:00
Saurabh Shrivastava
77da2dc873 Added -chapters parameter for chapter extraction from MP4. 2017-01-21 01:43:41 +05:30
Saurabh Shrivastava
d9414782b2 Function prototype for chapter extraction. 2017-01-21 01:38:12 +05:30
Saurabh Shrivastava
d1b127164e Option for chapter extraction. 2017-01-21 01:37:13 +05:30
Saurabh Shrivastava
db8d9c67b6 By default don't extract chapters. 2017-01-21 01:35:58 +05:30
Saurabh Shrivastava
c9a3a0c7f2 Extract chapters from mp4 file and write it in a file. 2017-01-21 01:33:59 +05:30
Saurabh Shrivastava
0e4d211eaf Extract chapter instead of subtitles if extract_chapters is True. 2017-01-21 01:29:04 +05:30
Carlos Fernandez
57daaf3e4d - Correct indenting in ccextractor.c
- Correct return code for multiprogram transport streams
2017-01-19 15:14:54 -08:00
Carlos Fernandez
bc1e309b13 Added "CCX_DTVCC_C0_NUL" (do nothing, but prevent the "unhandled" warning) 2017-01-19 14:53:15 -08:00
AlexBratosin2001
09778b2d14 Sped up min_pts calculation (avoided lots of unnecessary loop iterations) 2017-01-19 22:06:39 +02:00
Evgeny
89c00a7e21 Added OEM mode parameter 2017-01-19 20:57:35 +03:00
Carlos Fernandez
bb026a7318 Merge branch 'pr/n649_sidgairo18' 2017-01-17 11:53:13 -08:00
maxkoryukov
566d1284f2 Remove SBS stuff from decoder_init 2017-01-15 23:55:41 +05:00
maxkoryukov
b5b2a7d70d Probably fix maxkoryukov/ccextractor#1: split into sentences
This version returns reasonably readable subs, split into sentences
2017-01-15 22:37:51 +05:00
maxkoryukov
93e407f4a5 Improve SBS: fix for #639 and non-greedy similarity detection
* Use own SBS-context structure to store SBS-data (fix CCExtractor/ccextractor#639)
* Search for BEST match of new string and SBS-buffer (instead of first appropriate..)
* all tests are fixed and passed
2017-01-15 22:37:51 +05:00
maxkoryukov
ad7b141cc6 Tiny fixes 2017-01-15 22:37:51 +05:00
maxkoryukov
f23beab07e Fix error with uninitialized sbs_handled_len. Free sbs_buffer on dinit_encoder_context
* more debug for SBS
2017-01-15 22:37:51 +05:00
maxkoryukov
c582175d35 Wrap debug instructions in #ifdef 2017-01-15 22:37:51 +05:00
maxkoryukov
1b1a572f73 SBS: use Levenshtein distance to detect duplicates in subs
see maxkoryukov/ccextractor#1
2017-01-15 22:37:51 +05:00
maxkoryukov
7c9ffbbde9 Levenshtein for char * in utility.c
see maxkoryukov/ccextractor#1
2017-01-15 22:37:51 +05:00
maxkoryukov
5404108cc1 Some improvements for test-environment
see maxkoryukov/ccextractor#1
2017-01-15 22:37:51 +05:00
maxkoryukov
5c2d6956fd Fixed format specifiers for debug output 2017-01-15 22:37:51 +05:00
Siddhartha Gairola
fad623ed6a Update general_loop.c 2017-01-15 14:08:16 +05:30
Siddhartha Gairola
263dd2cb40 Update ccx_encoders_webvtt.c 2017-01-15 14:07:04 +05:30
sidgairo18
2d56d067e3 Fixed issue #648 2017-01-15 14:02:49 +05:30
Carlos Fernandez Sanz
759507f196 Merge pull request #645 from Izaron/708-adventures
708 fixes
2017-01-14 10:04:06 -08:00
Evgeny
e048c65cdb [CEA-708] Added BS command 2017-01-14 19:30:03 +03:00
Evgeny
41cd5f00bc Updated 708 window dump 2017-01-14 18:17:31 +03:00
AlexBratosin2001
7ab968c4a6 Merge remote-tracking branch 'upstream/master' 2017-01-13 23:35:59 +02:00
AlexBratosin2001
462f63a294 Fixed DVB multiprogram. 2017-01-13 23:35:34 +02:00
Carlos Fernandez
f75793c5e4 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-13 11:32:22 -08:00
Carlos Fernandez
521ee29ab8 Corrections in dvbcolor help and code comments. 2017-01-13 11:32:11 -08:00
Carlos Fernandez Sanz
91690f9453 Merge pull request #643 from Izaron/708-stuff
CEA-708 subtitle counter
2017-01-13 09:55:12 -08:00
Evgeny
aa0db3c528 Using correct CEA-708 subs counter 2017-01-13 20:34:50 +03:00
Evgeny
a727d2df26 Fixed hot bug with CEA-708 counter 2017-01-13 20:32:29 +03:00
Carlos Fernandez Sanz
8f818051b0 Merge pull request #642 from Izaron/708-stuff
CEA-708 improvements
2017-01-13 06:54:12 -08:00
Evgeny
1fb98118c6 Added CEA-708 counter for EXIT_NO_CAPTIONS 2017-01-13 16:40:08 +03:00
Evgeny
967e2bc695 decoder->tv refactoring 2017-01-13 16:15:43 +03:00
Evgeny
0855c0a41d Added support of SAMI and TTXT in CEA-708 2017-01-13 16:04:37 +03:00
Evgeny
bff384e677 Refactored CEA-708 symbol struct 2017-01-13 15:46:41 +03:00
Evgeny
3b2545cf82 Fixed timing bug catching in CEA-708 2017-01-13 14:06:49 +03:00
Evgeny
762ab7ce36 Merge remote-tracking branch 'refs/remotes/CCExtractor/master' 2017-01-13 10:30:35 +03:00
Naman Yadav
ab31e7b4d4 Check for language tesseract data in /usr/share/tessdata/ (#638)
Closes https://github.com/CCExtractor/ccextractor/issues/448
2017-01-12 15:07:28 -08:00
gonzaloUran
cd17aa3a53 make -ignoreptsjumps and -dvbcolor default (#637)
* default-arguments

* default-arguments
2017-01-12 14:44:56 -08:00
Carlos Fernandez
d99fda59a3 Merge branch 'pr/n640_Izaron' 2017-01-12 10:33:19 -08:00
Evgeny
ddbd03760b Added max macro for non-Visual Studio IDE 2017-01-12 21:27:36 +03:00
Evgeny
7078f10150 Fixed 708 pen handling from line by line to correct 2017-01-12 21:07:16 +03:00
Siddhartha Gairola
6c733e96c9 Fix defects (#630)
* Fixed memory leaks issue #615

* Fixed memory leaks issue #615

* Update lib_ccx.c

* Fixing issue #629

* Update networking.c
2017-01-11 13:06:56 -08:00
Saurabh Shrivastava
720008f9fb Update params.c 2017-01-12 02:03:08 +05:30
AlexBratosin2001
ade11eb80f Fixed DVB multiprogram (timing is still broken) 2017-01-11 22:30:39 +02:00
Saurabh Shrivastava
d8a6642d5f Exit if input source is stdin for MP4. 2017-01-12 01:58:25 +05:30
Saurabh Shrivastava
2464064226 Stop GPAC from analyzing if input source is stdin. 2017-01-12 01:54:31 +05:30
Carlos Fernandez
591d74d0c5 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-11 11:53:56 -08:00
Carlos Fernandez
14286a0025 Avoid calling fatal() on warning.
Minor indenting.
2017-01-11 11:53:43 -08:00
Evgeny
ddce5829d5 Musical note should replace 'Delete' in G0 Table 2017-01-11 19:18:32 +03:00
Evgeny
ebd9fc4bfe Minor mistake in bitstream fixed 2017-01-11 14:14:35 +03:00
gonzaloUran
6f8d99b39e pesheader-option (#628) 2017-01-10 16:56:30 -08:00
saurabhkapur
65634a18d1 Fixes #618 (#619) 2017-01-10 16:27:28 -08:00
Saurabh Shrivastava
827ace8dca SMPTE-TT : Removed appearance of garbage value in color code. (#625)
Also polished and improved existing code. More details : CCExtractor#620
2017-01-10 16:26:23 -08:00
AlexBratosin2001
baa5b0d14f Fixed error regarding last commit 2017-01-11 00:24:07 +02:00
AlexBratosin2001
b633491b91 Perfected DVB timing and cleaned up code 2017-01-11 00:00:06 +02:00
AlexBratosin2001
9f331b6a92 Merge remote-tracking branch 'upstream/master' 2017-01-10 23:59:16 +02:00
Carlos Fernandez
b5de22ff13 Solve crash caused by boxdestroy? 2017-01-10 13:06:34 -08:00
Carlos Fernandez
e4c9a95f7c Memory leaks in ocr.c 2017-01-10 12:59:23 -08:00
Carlos Fernandez
90001b6c23 Fix in memory free 2017-01-10 12:20:15 -08:00
Carlos Fernandez
19fec61902 Fix: Memory leak in ccx_encoders_smptett.c 2017-01-10 12:16:05 -08:00
Carlos Fernandez
acc63cc478 Merge branch 'pr/n623_Izaron' 2017-01-10 11:35:24 -08:00
Carlos Fernandez
7a4bcd3b52 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-10 11:18:41 -08:00
Carlos Fernandez
4017b59f97 Init saw_caption_block in decoder context. 2017-01-10 11:18:18 -08:00
Evgeny
407a40e32e Fixed italics and underline bit flags 2017-01-10 22:17:13 +06:00
Evgeny
714700f6b5 Fixed column count to correct values 2017-01-10 22:03:50 +06:00
Evgeny
d60baf1895 Added support of UTF16 2017-01-10 12:12:13 +06:00
Carlos Fernandez
14418d6fa1 Added +x to build-static.sh 2017-01-09 22:42:20 +00:00
Carlos Fernandez
2db16f09c7 Added static binary build script (linux). 2017-01-09 12:08:36 -08:00
Carlos Fernandez
178aa1de9c Typo in CHANGES.TXT 2017-01-09 11:41:48 -08:00
Carlos Fernandez
6716704dc3 Added more content to CHANGES.TXT 2017-01-09 11:08:27 -08:00
Carlos Fernandez
d7d7d62971 Updated CHANGES.TXT with the new stuff for 0.85. 2017-01-09 11:06:19 -08:00
AlexBratosin2001
737c0f4205 Added Alexandru Bratosin to README.TXT 2017-01-09 20:15:36 +02:00
Carlos Fernandez
766145275a Merge branch 'pr/n622_AlexBratosin2001'
Version change from 0.84 to 0.85.
2017-01-09 10:09:13 -08:00
Carlos Fernandez
7517a5448e Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2017-01-09 10:04:27 -08:00
Carlos Fernandez
80a0f1a1c1 Changed where to report bugs. 2017-01-09 10:04:10 -08:00
Abhinav Shukla
5933194570 Progress activity and more context 2017-01-09 21:14:52 +05:30
AlexBratosin2001
898ce5bf87 Fixed Teletext related issues (+DVB) and added other stuff 2017-01-09 16:14:05 +02:00
Abhinav Shukla
fbc7cb5452 Setting up skeleton for tickertext based burned in detection 2017-01-09 17:51:15 +05:30
Abhinav Shukla
5c646d214c Merge remote-tracking branch 'upstream/master' 2017-01-08 21:59:36 +05:30
Carlos Fernandez
6f11230a87 Merge branch 'master' of https://github.com/CCExtractor/ccextractor.git 2017-01-08 07:51:58 -08:00
Carlos Fernandez
a63acf4f45 Changed minor text in a debug function. 2017-01-08 07:51:44 -08:00
Evgeny
a41cd3f0c4 Added spaces as in 608 subs
Length of the string is equal to 32, and so subtitles are displayed in
VLC with spaces, not the text without spaces in the middle of screen.
2017-01-08 21:50:56 +06:00
Siddhartha Gairola
9fc0402b0f Fixed memory leaks. refer issue #615 (#617)
* Fixed memory leaks issue #615

* Fixed memory leaks issue #615

* Update lib_ccx.c
2017-01-08 07:29:15 -08:00
Anukul Sangwan
4852987aae return a non-zero return code if no subtitles were found (#553)
* return a non-zero return code if no subtitles were found

* fix for mp4
2017-01-08 01:08:45 -08:00
saurabhkapur
83632761a4 tesseract library file included in mac build command (#612)
* tesseract file added in mac build command

* tesseract file added in mac build command

* Delete .DS_Store
2017-01-07 11:52:39 -08:00
Evgeny Shulgin
2229a51b66 Added FFMPEG 3.0, which is compatible with XP (#610) 2017-01-07 11:52:13 -08:00
Siddhartha Gairola
86a826edaf GPAC double free issue resolved. Issues 608 and 609 also resolved. (#611)
* A pointer was freed twice in file

* The double free issue in GPAC is updated according to the latest source code. Issues 608 and 609 also resolved.

* Update wtv_functions.c
2017-01-07 11:51:20 -08:00
cfsmp3
b6dc6dd876 Merge branch 'pr/n605_brooss' 2017-01-06 09:28:01 +01:00
Brooss
4ed60073d2 Fix for bad WTV timings (#452) 2017-01-06 15:12:18 +11:00
cfsmp3
7d01c963b8 Merge branch 'pr/n600_grave-w-grave' 2017-01-05 10:26:12 +01:00
cfsmp3
80047e536d Merge branch 'pr/n601_AlexBratosin2001' 2017-01-05 10:25:17 +01:00
cfsmp3
4bcabff630 Merge branch 'pr/n602_Lord-AJ' 2017-01-05 10:24:41 +01:00
cfsmp3
be4ad97832 Merge branch 'pr/n603_Izaron' 2017-01-05 10:23:43 +01:00
Evgeny
f2b8b43bae Added PAC row positioning 2017-01-05 11:38:04 +03:00
Lord-AJ
00f1ec7906 Added names of characters from S4E01 2017-01-05 02:02:25 +05:30
cfsmp3
f8aae84bc4 Merge branch 'pr/n599_Izaron' 2017-01-04 17:59:06 +01:00
AlexBratosin2001
2a7df734de Added -pesheader dump support for teletext packets too 2017-01-04 18:57:39 +02:00
Evgeny
4308247624 Added .css styling both inline and outline 2017-01-04 17:44:27 +03:00
grave-w-grave
5f19a9f89d Skip the packet if the adaptation field length is broken 2017-01-04 12:39:02 +11:00
Evgeny
34a21a931d Enabled the support of raw WebVTT 2017-01-03 22:57:28 +03:00
Evgeny
2b0c8ba7a0 Added WebVTT color and font support 2017-01-03 22:48:50 +03:00
Evgeny
2c30f5eb5b Fixed file invalidation
The extra CRLF must not be added, because it makes the file invalid. The
cue has to be written right after the timestamp.
2017-01-03 20:55:15 +03:00
cfsmp3
dfb7d8472c Merge branch 'pr/n598_Izaron' 2017-01-03 13:45:29 +01:00
Evgeny
d87b269bae Fixed bug with multiple headers 2017-01-02 20:43:17 +03:00
cfsmp3
c84c7b5fa0 Commented out nanoseconds in PCR 2017-01-02 18:27:00 +01:00
cfsmp3
03de867572 Attempt at solving the PTS overflow in the first 2 PTS set (Hercules.ts problem) 2017-01-02 18:25:35 +01:00
Evgeny
483540488a Added webvtt-full parameter 2017-01-02 20:22:02 +03:00
Evgeny
f55876514f Updated libpng from 1.6.26 to 1.6.27 2017-01-02 18:18:15 +03:00
cfsmp3
bf7ec06957 Made rollover_bits parts of decoder context 2017-01-02 15:07:35 +01:00
cfsmp3
874a850087 Correctly write PTS on debug 2017-01-02 14:28:22 +01:00
cfsmp3
4b0a455147 Merge branch 'pr/n594_Izaron' 2017-01-02 13:43:42 +01:00
Evgeny Shulgin
624f1722b6 The proposal to update the "Compiling" section 2017-01-02 13:22:19 +03:00
Evgeny
72f12bbff5 Hot fix for non-full configuration 2017-01-02 12:59:43 +03:00
cfsmp3
57ef958250 Corrected header directories in non-full versions. 2017-01-01 22:12:35 +01:00
cfsmp3
2814599943 Merge branch 'pr/n592_Izaron' 2017-01-01 21:53:20 +01:00
Evgeny
2942e84a6f Solved the Windows dependency hell 2017-01-01 21:34:43 +03:00
AlexBratosin2001
148a70ccb8 Fixed SPUPNG possible subtitle overlapping 2017-01-01 19:44:38 +02:00
AlexBratosin2001
5f0c6cb961 Merge remote-tracking branch 'upstream/master' 2017-01-01 19:43:22 +02:00
Lia
bb3ae7eb88 align comments (#573) 2017-01-01 16:34:53 +01:00
cfsmp3
64d2805c72 Merge branch 'pr/n588_emquantum' 2017-01-01 16:29:05 +01:00
cfsmp3
533599e5d4 Merge branch 'pr/n589_emquantum' 2017-01-01 16:27:57 +01:00
cfsmp3
cec527453a Merge branch 'pr/n587_emquantum' 2017-01-01 16:27:38 +01:00
AlexBratosin2001
2d35bbb4da Fixed SAMI unnecessary empty subtitle when extracting DVB subs (continuity check). 2016-12-31 16:50:57 +02:00
cfsmp3
7091101b04 Merge branch 'pr/n586_AlexBratosin2001' 2016-12-31 15:11:05 +01:00
emquantum
848ad08efc Create dict_smash.txt 2016-12-30 15:48:42 -05:00
emquantum
2620373fb6 Create dict_glee.txt 2016-12-30 15:39:06 -05:00
Saurabh Shrivastava
afc8a3d764 Added support for font color.
Font color support as per SMPTE-TT Specification.
2016-12-31 02:00:54 +05:30
Saurabh Shrivastava
6c1ba95f89 Proper string termination.
Replaced `strncpy` with `strncat`, as `strncpy` does not null-terminate the destination when the source is at least as long as the count, which was causing undesirable effects.
2016-12-31 00:30:39 +05:30
emquantum
c5cd400eb8 Create dict_white_collar.txt 2016-12-30 13:45:19 -05:00
AlexBratosin2001
bd2868746d Fixed DVB to SAMI extraction (subtitle overlapping when subtitle had more than 1 line). 2016-12-30 20:43:42 +02:00
AlexBratosin2001
cc7692d7eb Fixed crash regarding memory leak 2016-12-30 19:11:33 +02:00
AlexBratosin2001
fd01232a26 Fixed SSA, SPUPNG and VTT timing and skipping of subtitles for SAMI and TTML. 2016-12-30 16:46:23 +02:00
AlexBratosin2001
5fa93764b2 Fixed SSA, SPUPNG and VTT timing and skipping of subtitles for SAMI and TTML. 2016-12-30 16:44:07 +02:00
cfsmp3
588ed3c77e Merge branch 'pr/n582_saurabhshri' 2016-12-30 08:50:57 +01:00
Saurabh Shrivastava
6e3f669c13 Corrected improper encoding.
It was present due to uninitialised variables. Also memory leak fixed. :)
2016-12-30 11:26:21 +05:30
Saurabh Shrivastava
23a9c5da4d Probable Memory Leak Fix. 2016-12-30 00:44:25 +05:30
Saurabh Shrivastava
344a3e633a Fixed minor mistake. 2016-12-29 23:33:42 +05:30
Saurabh Shrivastava
0a0881017a Revert Incorrect Commit. 2016-12-29 23:24:12 +05:30
Saurabh Shrivastava
76e0987c57 Adobe Premiere Pro Compatibility and Proper HTML formatting.
These changes were required for Adobe Premiere Pro compatibility. Previous ones were ignored.

a. ITALICS:

This:
      <p ...>
        <span>Hello<i> italic</i>!
        <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/></span>
      </p>
Premiere expects:
      <p ...>
        <style/>
        <span>Hello 
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
        <span>italics
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px" tts:fontStyle="italic"/>
        </span>
        <span>!
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
      </p>

2.b. UNDERLINE:

This:
      <p ...>
        <span>Hello<u> underline</u>!
        <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/></span>
      </p>
Premiere expects:
      <p ...>
        <style/>
        <span>Hello 
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
        <span>underline
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px" tts:textDecoration="underline"/>
        </span>
        <span>!
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
      </p>
2016-12-29 23:13:01 +05:30
Saurabh Shrivastava
e0b6ae275d Adobe Premiere Pro Compatibility and Proper HTML formatting.
These changes were required for Adobe Premiere Pro compatibility. Previous ones were ignored.

a. ITALICS:

This:
      <p ...>
        <span>Hello<i> italic</i>!
        <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/></span>
      </p>
Premiere expects:
      <p ...>
        <style/>
        <span>Hello 
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
        <span>italics
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px" tts:fontStyle="italic"/>
        </span>
        <span>!
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
      </p>

2.b. UNDERLINE:

This:
      <p ...>
        <span>Hello<u> underline</u>!
        <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/></span>
      </p>
Premiere expects:
      <p ...>
        <style/>
        <span>Hello 
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
        <span>underline
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px" tts:textDecoration="underline"/>
        </span>
        <span>!
          <style tts:backgroundColor="#000000FF" tts:fontSize="18px"/>
        </span>
      </p>
2016-12-29 23:08:45 +05:30
Saurabh Shrivastava
c9fe87014e Rebasing with current memory leak fix. 2016-12-29 22:53:40 +05:30
Saurabh Shrivastava
2b94b5e316 Updated SMPTE-TT header.
Updated header according to SMPTE-TT TTML guidelines.
2016-12-29 22:50:15 +05:30
AlexBratosin2001
0c4bf2a6b1 Added -ignoreptsjumps parameter to ignore pts jumps (needed for some formats). Updated help page (added documentation for latest parameters). Replaced some printf() calls with mprint(). 2016-12-29 17:18:42 +02:00
AlexBratosin2001
b9448026d7 Updated -debugdvbsub traces to get the most relevant info. 2016-12-29 15:36:26 +02:00
AlexBratosin2001
90745b07ac Updated -debugdvbsub traces to get the most relevant info. 2016-12-29 15:33:48 +02:00
AlexBratosin2001
40d97292d2 Fixed memory leak for DVB subs. 2016-12-29 13:16:09 +02:00
AlexBratosin2001
06fdd51104 Fixed memory leak for DVB subs. 2016-12-28 22:04:23 +02:00
AlexBratosin2001
a09a7e4930 Fixed DVB subtitle timing and added -debugdvbsub parameter for DVB sub debug traces. 2016-12-28 21:35:26 +02:00
AlexBratosin2001
121ac2bdfe Fixed DVB subtitle timing and added -debugdvbsub parameter for DVB sub debug traces 2016-12-28 21:06:55 +02:00
AlexBratosin2001
e9c088a86b Fixed DVB subtitle timing and added -debugdvbsub parameter for DVB sub debug traces 2016-12-28 20:55:10 +02:00
cfsmp3
d4e03ec759 Merge branch 'master' of github.com:CCExtractor/ccextractor 2016-12-27 04:29:28 +01:00
cfsmp3
2d767569c9 Removed ptr->bufferdatatype from call to pess_header_dump. 2016-12-27 04:28:53 +01:00
Evgeny Shulgin
eecec39725 Fix SubStation Alpha subtitles in bitmap (#571) 2016-12-26 21:32:06 +01:00
Evgeny Shulgin
99ec42bf35 Fixed lept msg severity in linux (#576) 2016-12-26 21:31:33 +01:00
Evgeny Shulgin
e2f6fce850 Added build script for .deb (#574)
* Added build script for .deb

* Added tip for users and automatically build
2016-12-26 21:31:00 +01:00
AlexBratosin2001
9f2cd33a82 Added -pesheader parameter for PES packet header dumping (#572)
* Added -pesheader parameter for PES packet header dumping

* Added -pesheader parameter for short PES Header dump to console

* Added -pesheader parameter for short PES Header dump to console

* Fixed DVB subs start time and added -pesheader parameter for PES Header dumping

* Fixed DVB sub start time and added -pesheader parameter for PES packet header dumping
2016-12-26 18:21:10 +01:00
ManveerBasra
556cc482d8 Added Manveer Basra to README.txt 2016-12-24 16:29:46 -08:00
Evgeny
fe9cd61d1d Fixed bad tesseract library 2016-12-24 22:19:23 +03:00
Evgeny
08b2bcb88b Added dependencies .dll-s and copy command 2016-12-24 18:26:06 +03:00
Evgeny
0072e98fad Fixed Tesseract OCR mode, removed Lept info msg 2016-12-24 18:23:23 +03:00
Evgeny
0befb3c5b1 Renamed tess version from 3.05 to 4.00 2016-12-24 10:59:52 +03:00
Evgeny
5e0f9a4898 Disabled info and warnings from Tesseract 2016-12-24 10:42:35 +03:00
Abhinav Shukla
588a8d6b7c Merge remote-tracking branch 'upstream/master' 2016-12-23 23:59:56 +05:30
Evgeny
331a64e387 Added working tesseract 4.00 2016-12-23 18:01:12 +03:00
Evgeny
f40781a9eb Updated zlib 2016-12-23 14:37:07 +03:00
Evgeny
490fedf463 Updated libpng from 1.6.10 to 1.6.26 2016-12-23 14:25:12 +03:00
Evgeny Shulgin
12466ef9cb Memory optimization in edit_distance (#564) 2016-12-23 09:31:17 +01:00
cfsmp3
c75fc7f1eb Readded the old windows directories to .gitignore to avoid crap being committed by mistake. 2016-12-22 19:59:55 +01:00
Evgeny
4c78e47404 Fixed mess in the filters 2016-12-22 18:37:02 +03:00
Evgeny
4b80441164 Renamed OCR to Full and copy ffmpeg DLLs to folder 2016-12-22 18:19:42 +03:00
cfsmp3
e2cc2f9fd7 ImageHasSafeExceptionHandlers>false 2016-12-22 09:21:24 +01:00
cfsmp3
b669733bd8 Added pre-build.bat to Release-OCR 2016-12-22 08:54:59 +01:00
Evgeny Shulgin
5a769fa22b Fixed bug with -udp address (#549) 2016-12-22 07:40:50 +01:00
cfsmp3
8bbf27526e Merge branch 'pr/n557_Izaron' 2016-12-22 07:29:29 +01:00
cfsmp3
4270450a03 Merge branch 'pr/n554_Izaron' 2016-12-22 07:18:17 +01:00
Evgeny
802360b008 Ported HARDSUBX to Windows 2016-12-18 19:42:23 +03:00
Evgeny
ccd11d38f8 Fixed -stdin option in Windows 2016-12-18 11:21:03 +03:00
Izaron
996c18fca0 Reopen: fixed memory leaks 2016-12-17 19:00:34 +03:00
Abhinav Shukla
0cb7cbcb49 Merge remote-tracking branch 'upstream/master' 2016-12-17 20:11:11 +05:30
Carlos Fernandez
19f12038cd Added new stuff to CHANGES.TXT 2016-12-16 10:50:04 -08:00
Carlos Fernandez
ebc7c070f3 Merge branch 'pr/n545_Izaron' 2016-12-16 10:44:14 -08:00
Carlos Fernandez
7acb3c3874 Version bump (to 0.84). Rename target name of the Windows OCR binaries. 2016-12-16 10:41:02 -08:00
Carlos Fernandez
e3ca2c3fbc Merge branch 'pr/n537_maxkoryukov' 2016-12-16 10:06:18 -08:00
Carlos Fernandez
ac1accce96 Merge branch 'pr/n544_Izaron' 2016-12-16 10:04:19 -08:00
Carlos Fernandez
c39185abab Merge branch 'pr/n541_saurabhshri' 2016-12-16 10:03:37 -08:00
Carlos Fernandez
6729c9c33f Merge branch 'pr/n524_DanilaFe' 2016-12-16 10:02:18 -08:00
Carlos Fernandez
4e4a7aa19e Merge branch 'pr/n522_DanilaFe' 2016-12-16 10:01:24 -08:00
Izaron
5d8ce8bfb8 Added configurable curl post url 2016-12-16 15:07:01 +03:00
Izaron
92b811b329 Added myself to credits 2016-12-16 14:02:37 +03:00
Saurabh Shrivastava
8c8a357aa7 Corrected filename. 2016-12-16 11:00:34 +05:30
Saurabh Shrivastava
f2770b4609 Corrected filename. 2016-12-16 10:59:20 +05:30
Danila Fedorin
7c21754b74 Update with upstream changes and fix merge conflict. 2016-12-15 14:05:08 -08:00
Carlos Fernandez
547a9e9fbf Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-12-15 13:53:57 -08:00
Carlos Fernandez
fd6a223876 Merge branch 'pr/n520_DanilaFe' 2016-12-15 13:53:37 -08:00
captions
2fd71f00d6 Minor grammatical error 2016-12-15 11:32:17 -08:00
Carlos Fernandez
0301a40560 Merge branch 'pr/n538_Izaron' 2016-12-15 11:31:38 -08:00
Carlos Fernandez
05da03a259 Changed dependency for OCR in release version - use non-debug version of tesseract 2016-12-15 10:24:34 -08:00
Saurabh Shrivastava
6e1763332a Added missing directory for protobuf-c.
Issue fixed : https://github.com/CCExtractor/ccextractor/issues/536
2016-12-15 23:00:40 +05:30
Evgeny Shulgin
5bbf27de9c Fixed linux/build script
Last PR #534
2016-12-15 21:07:29 +04:00
maxkoryukov
54f628f7a6 remove debug-files, increase verbosity for tests
fix maxkoryukov/ccextractor#1
2016-12-15 21:23:16 +05:00
maxkoryukov
4743eb46c1 Try to make changes compatible with upstream 2016-12-15 20:25:06 +05:00
maxkoryukov
66393a80f2 Break incoming subs into sentences (through a buffer), and remove duplicates 2016-12-15 20:21:43 +05:00
Danila Fedorin
c6c8de7357 Rename mstotime to millis_to_time in ccx_encoders_ssa. 2016-12-14 17:43:59 -08:00
Danila Fedorin
cbbbb20751 Merge remote-tracking branch 'upstream/master' into cleanup-ccx-common-h 2016-12-14 17:37:55 -08:00
Carlos Fernandez
604f4d8046 Ran dos2unix in all source file, 4 files had CRLF 2016-12-14 19:45:51 +00:00
Carlos Fernandez
579d5e0844 gpacmp4/isom_write.c: Added prototypes for two functions from error.c that aren't included from headers. 2016-12-14 19:11:37 +00:00
Carlos Fernandez
5c3e1d06c0 Merge branch 'pr/n530_Izaron' 2016-12-14 10:18:12 -08:00
Carlos Fernandez
d1663de1d7 Merge branch 'pr/n532_saurabhshri' 2016-12-14 10:17:26 -08:00
Saurabh Shrivastava
1571040eae Printing exact error for clean file.
Description and discussion here : https://github.com/CCExtractor/ccextractor/pull/529
2016-12-14 15:45:08 +05:30
Saurabh Shrivastava
8221a26c2d Fixed missing directory in Makefile for linux.
There was a missing directory /src/protobuf-c in makefile for linux installation.
2016-12-14 14:20:02 +05:30
Carlos Fernandez
9996836253 Removed LIBCURL in linux build script, since that stuff is not complete 2016-12-14 01:41:38 +00:00
Carlos Fernandez
7aaa1e3edb Corrected timing in iTunes
Added list of changes to CHANGES.TXT
2016-12-13 17:39:05 -08:00
Carlos Fernandez
532ecbd449 Fixed iTunes (well, MP4 generally really, maybe more) line repetition. 2016-12-13 13:32:28 -08:00
Carlos Fernandez
e66059f621 Merge branch 'pr/n525_DanilaFe' 2016-12-13 10:44:46 -08:00
Carlos Fernandez
3344040630 Merge branch 'pr/n521_DanilaFe' 2016-12-13 10:43:15 -08:00
Carlos Fernandez
6afe786cad Merge branch 'pr/n519_DanilaFe' 2016-12-13 10:42:29 -08:00
Carlos Fernandez
5f4307932a Merge branch 'pr/n518_DanilaFe' 2016-12-13 10:40:30 -08:00
Carlos Fernandez
4978c3e648 Merge branch 'pr/n517_DanilaFe' 2016-12-13 10:39:56 -08:00
Carlos Fernandez
f0223a2505 Merge branch 'pr/n516_DanilaFe' 2016-12-13 10:38:00 -08:00
Carlos Fernandez
727c407435 Merge branch 'pr/n515_DanilaFe' 2016-12-13 10:32:01 -08:00
Carlos Fernandez
c63a250f2b Merge branch 'pr/n526_Izaron' 2016-12-13 10:31:39 -08:00
Izaron
161b98ef43 Added Vagrantfile 2016-12-13 14:32:35 +03:00
Evgeny Shulgin
2417b74ea5 Fixed CMakeLists for build in cmake (#508) 2016-12-12 12:06:57 -08:00
Carlos Fernandez
620ed70246 Merge branch 'pr/n504_canihavesomecoffee' 2016-12-12 10:11:53 -08:00
Carlos Fernandez
847dc0080a Added 3 more TV show dictionaries. 2016-12-12 10:10:58 -08:00
Evgeny Shulgin
d12600e076 Added the line to check for broken subtitle
See this line https://github.com/CCExtractor/ccextractor/blob/master/src/lib_ccx/ccx_encoders_transcript.c#L276
2016-12-12 18:23:21 +04:00
Izaron
7d60f558d5 Fixed -unixts option 2016-12-12 17:18:24 +03:00
Danila Fedorin
5602e420fc Merge remote-tracking branch 'upstream/master' into cleanup-libccx-h 2016-12-11 14:24:55 -08:00
Danila Fedorin
7a25006fa5 Merge remote-tracking branch 'upstream/master' into cleanup-bistream-h 2016-12-11 14:22:31 -08:00
Danila Fedorin
5846093a7f Merge remote-tracking branch 'upstream/master' into refactor-msprint
# Conflicts:
#	src/lib_ccx/general_loop.c
2016-12-11 14:20:15 -08:00
Danila Fedorin
88bd1d9c16 Merge remote-tracking branch 'upstream/master' into cleanup-demuxer-h 2016-12-11 14:16:54 -08:00
Danila Fedorin
dc4797bb59 Merge remote-tracking branch 'upstream/master' into cleanup-getmoredata 2016-12-11 14:15:28 -08:00
Danila Fedorin
15b5bddab8 Merge remote-tracking branch 'upstream/master' into cleanup-string-cmp 2016-12-11 14:14:44 -08:00
Danila Fedorin
2b8444b78f Merge remote-tracking branch 'upstream/master' into cleanup-hardsubx 2016-12-11 14:13:49 -08:00
Danila Fedorin
f1d276c988 Merge upstream changes. 2016-12-11 14:13:27 -08:00
Danila Fedorin
b695a8c469 Merge upstream changes 2016-12-11 14:08:37 -08:00
Danila Fedorin
0d0cb83ed6 Merge upstream changes 2016-12-11 14:05:49 -08:00
Victoria Staada
943bb576a9 Added 3 TV series database as requested (#523)
* Create dict_how_to_get_away_with_murder.txt

* Create dict_stranger_things.txt

* Update dict_stranger_things.txt

* Create dict_sense8.txt
2016-12-11 11:03:25 -08:00
Danila Fedorin
08e9c3d596 Rename previously not renamed function. 2016-12-11 01:16:52 -08:00
Danila Fedorin
209e6ebb08 Rename more variables in avc_functions 2016-12-11 01:16:13 -08:00
Danila Fedorin
202e539cca Rename some variables in avc_functions 2016-12-11 01:12:29 -08:00
Danila Fedorin
edcd2df6bc Fix variable names in asf_functions.c 2016-12-11 01:11:25 -08:00
Danila Fedorin
3f679d72cf Fix typo in asf_functions.c. It looks like a typo from the surrounding
code.
2016-12-11 01:05:45 -08:00
Danila Fedorin
2e09541c84 Rename another function. 2016-12-11 01:05:35 -08:00
Danila Fedorin
6885f41b61 Add more underscores!!
I feel like my last several commits were just this.
2016-12-11 01:01:51 -08:00
Danila Fedorin
69d33e7483 Better name string_cmp2.
Seeing that it is passed as a function pointer, the _function keyword
was added to its name. Hopefully that helps.
2016-12-11 01:00:46 -08:00
Danila Fedorin
d175082520 Add more underscores to getmoredata functions. 2016-12-11 00:58:36 -08:00
Danila Fedorin
2ef31d2e08 Add underscores to function name. 2016-12-11 00:58:29 -08:00
Danila Fedorin
eab439d450 Add small changes to function names in ccx_demuxer.h. 2016-12-11 00:57:18 -08:00
Danila Fedorin
27417cda70 Rename more mstotime's to millis_to_time. 2016-12-10 23:22:10 -08:00
Danila Fedorin
2959d2a2ae Rename function to new name. 2016-12-10 23:22:03 -08:00
Danila Fedorin
289aecc1ed Refactor code surrounding print_mstime_static.
This should decrease code duplication - print_mstime2buf was doing
the same thing as mstime_sprintf, but with a hardcoded format and
a different return value. print_mstime (renamed to print_mstime_static)
could therefore be reworked into simply calling mstime_sprintf
 (renamed to print_mstime_buff).
2016-12-10 23:20:57 -08:00
Danila Fedorin
01d2ce36d1 Clarify usage function name (rename it to print_usage). 2016-12-10 23:17:18 -08:00
Danila Fedorin
cbea9ee045 Also add underscores to get_total_file_size. 2016-12-10 23:16:36 -08:00
Danila Fedorin
ea980a4150 Add underscores to debug_608toASC. 2016-12-10 23:12:37 -08:00
Danila Fedorin
2910b10f99 Rename mstotime to millis_to_time. 2016-12-10 23:11:01 -08:00
Danila Fedorin
d5f14d56f3 Add underscores to getfilesize. 2016-12-10 23:04:20 -08:00
Danila Fedorin
c3a80b086c Rename i, u, se, and ue functions. 2016-12-10 23:02:10 -08:00
Danila Fedorin
7616cef9cc Remove accidental code duplication. 2016-12-10 23:00:08 -08:00
Danila Fedorin
1f54d88867 Add hex_string_to_int function.
The only time I found hex2string being used was when it was
with two characters from char array. It therefore should be possible
to generalize the behavior of hex_to_int (which specifically takes two
characters) to take the array instead, in order to reduce code duplication.
I will refrain from moving this function into real application code in
case there is something I overlooked.
2016-12-10 23:00:04 -08:00
Danila Fedorin
12c388093f Rename hex2int to hex_to_int. 2016-12-10 23:00:00 -08:00
Danila Fedorin
190a4203eb Rename processhex to process_hex for consistency. 2016-12-10 22:59:52 -08:00
Evgeny Shulgin
0fdd608828 Fixed FPS switching (#510) 2016-12-10 21:30:54 -08:00
Abhinav Shukla
697132baf9 Fix #454 : Removed ugly debug statement with local path (#514) 2016-12-10 21:03:48 -08:00
Abhinav Shukla
b25a9f2ae4 Fix #454 : Removed ugly debug statement with local path 2016-12-11 10:01:55 +05:30
canihavesomecoffee
7b55f61396 Remove hardcoded references in project file, add relative ones instead 2016-12-10 08:38:58 +01:00
canihavesomecoffee
f3faaf06f8 Add VC.db file to gitignore
Also moves the folders to the correct place
2016-12-10 08:38:43 +01:00
Carlos Fernandez
6dc941d4e6 Changed platform target to v120_xp, fixed some missing dirs. 2016-12-09 14:02:10 -08:00
AlexBratosin2001
ce15155956 Updated GPAC library to v0.6.2 (#500)
Replaced GPAC.
2016-12-09 13:47:54 -08:00
Carlos Fernandez
8ab68b94ac Added detail in many error messages. 2016-12-09 11:07:29 -08:00
Carlos Fernandez
0753f23078 Merge branch 'pr/n490_saurabhshri' 2016-12-09 10:18:17 -08:00
Carlos Fernandez
859296885f Merge branch 'pr/n488_saurabhshri' 2016-12-09 10:16:42 -08:00
Saurabh Shrivastava
a37961ef9c Added more words. 2016-12-08 13:53:47 +05:30
Saurabh Shrivastava
8802c7ccc3 Minor correction in README.md
Removed a redundant line - "You can also".
2016-12-08 13:13:09 +05:30
Izaron
941d5544cf Fixed memory leak with XDS videos 2016-12-07 20:05:38 +03:00
Carlos Fernandez
d453d9327e Minor changes IN README.md 2016-12-05 12:44:57 -08:00
Carlos Fernandez
ac00131153 Merge branch 'pr/n456_TehTotalPwnage' 2016-12-05 12:38:45 -08:00
Carlos Fernandez
796cc6c1bb Merge branch 'pr/n478_Izaron' 2016-12-05 10:52:32 -08:00
Carlos Fernandez
8709e30841 Merge branch 'pr/n476_Izaron' 2016-12-05 10:43:24 -08:00
Carlos Fernandez
8f9f9a48a6 Merge branch 'pr/n475_Izaron' 2016-12-05 10:41:17 -08:00
Izaron
cfeb63b855 Fixed memory leaks 2016-12-04 22:11:08 +03:00
Izaron
904a3c26df Added two functions for ASS/SSA encoder 2016-12-03 20:23:22 +03:00
Izaron
764c890892 Fixed bug with -out=null 2016-12-03 11:01:50 +03:00
Juan Potato
3553b4da51 Fix Makefile compatibility issues with Raspberry pi
Along with adding utf8proc to actually be compiled
2016-12-02 21:14:25 -05:00
Carlos Fernandez
0f6a24a2fb Merge branch 'pr/n470_Izaron' 2016-12-02 12:26:24 -08:00
Carlos Fernandez
3704111fb0 Added new dictionaries, lots of corrections in documents. 2016-12-02 11:27:37 -08:00
Izaron
f8e378863e Added basic ASS/SSA encoding support 2016-12-02 20:24:12 +03:00
Sidhdharth
9526df1b84 Added 4 TV series database as requested.
Used a python script to generate all characters automatically just with the help of IMDB ID.
2016-12-02 19:59:56 +05:30
Deepraj Pandey
e9ea1ce659 Update ccextractor.c 2016-12-02 19:35:16 +05:30
Deepraj Pandey
9f4e6c3f06 Update ccextractor.c 2016-12-02 16:37:02 +05:30
Deepraj Pandey
4292ceb062 Update ccextractor.c 2016-12-02 14:10:25 +05:30
Deepraj Pandey
2609124346 Update using_cmake_build.txt 2016-12-02 14:02:45 +05:30
Deepraj Pandey
902da70ee3 Update OCR.txt 2016-12-02 14:01:53 +05:30
Deepraj Pandey
dd9243d459 Update MAILINGLIST.TXT 2016-12-02 13:53:36 +05:30
Deepraj Pandey
014181a653 Update HARDSUBX.txt 2016-12-02 13:46:57 +05:30
Deepraj Pandey
92bf709d65 Update G608.TXT 2016-12-02 13:33:36 +05:30
Deepraj Pandey
842ec2728e Update FRONTEND_COMMUNICATIONS.TXT 2016-12-02 11:04:09 +05:30
Deepraj Pandey
b8d0751509 Update FFMPEG.TXT 2016-12-02 11:02:24 +05:30
Deepraj Pandey
514909aa05 Update CHANGES.TXT 2016-12-02 11:00:10 +05:30
Deepraj Pandey
85922dbad7 Update dict_the.big.bang.theory.txt 2016-12-02 10:38:38 +05:30
Deepraj Pandey
a7b70df915 Update dict_mr_robot.txt 2016-12-02 10:34:03 +05:30
Deepraj Pandey
36237ef658 Update dict_greys.anatomy.txt 2016-12-02 10:30:28 +05:30
Michael Nguyen
743d3209e8 GCI: Updated README.md 2016-11-30 17:16:10 -08:00
canihavesomecoffee
814eaab300 Add utf8proc folder to the include directories
Regular debug & release have a missing folder
2016-11-29 22:38:04 +01:00
MatejMecka
40d277b666 Google Code In 2016 Modify README.MD 2016-11-29 20:57:25 +01:00
Carlos Fernandez
a1411968b8 Lots of corrections to text.
Added show-specific dictionaries for Grey's Anatomy, Mr. Robot and The Big Bang Theory (Code-in: Deborah Chan)
2016-11-29 10:24:16 -08:00
Xuhairong
2ba25a5922 Update ccextractor.cnf.sample
Corrected some spelling and grammar mistakes.
2016-11-29 17:05:35 +00:00
ManveerBasra
e54a5982e3 Changed Recognization to Recognition 2016-11-28 21:16:33 -08:00
ManveerBasra
203cf029d1 Fixed a minor mistake while correcting spelling 2016-11-28 20:38:07 -08:00
ManveerBasra
dac7e45651 Fixed multiple spelling mistakes 2016-11-28 19:53:09 -08:00
Juan Potato
0dc07384eb Various spelling corrections. 2016-11-28 17:04:59 -05:00
Carlos Fernandez
e58e166428 Removed two leftover files. 2016-11-11 12:46:33 -08:00
maxkoryukov
4ad45db655 Reordering definitions in the ccx_encoders_common.h to make it easier to read 2016-11-11 18:00:50 +05:00
Carlos Fernandez
2683896f41 Fixed missing separation between WebVTT header and body.
(unrelated to previous) added a bit more detail in an error message in the TS parser.
2016-11-04 11:00:06 -07:00
Carlos Fernandez
2e13d9f7a4 Fixes garbage in WebVTT (#438). 2016-11-03 13:28:55 -07:00
Carlos Fernandez
1f90648347 Fixes stupid bug in M2TS 2016-10-19 13:17:10 -07:00
yanwzh
6bf2185257 fixed: cmake build error for ubuntu 2016-10-18 17:55:42 +08:00
Carlos Fernandez
6f2becc42e Fixed OCR libraries dependencies for the release version in Windows. 2016-10-13 11:50:35 -07:00
Carlos Fernandez
cf21a1daee Fixed non-buffered reading from pipes. 2016-10-11 15:30:44 -07:00
Carlos Fernandez
79501bceec position_sanity_check: Don't fatal() on lseek error. 2016-10-11 13:26:39 -07:00
Carlos Fernandez
64d300d15e Fixed position_sanity_check for myth.c 2016-10-11 13:19:16 -07:00
Carlos Fernandez
eead25896f Fixed position_sanity_check 2016-10-11 13:17:39 -07:00
Hugh Mackworth
2f4fb88e07 Fix --stream option with Stdin 2016-10-11 11:48:43 -07:00
Hugh Mackworth
2ac200cdb2 Fix Mac build process
Remove unnecessary debug parameter
2016-10-11 11:46:22 -07:00
Hugh Mackworth
7f589ecade Fix Mac build process 2016-10-11 11:23:31 -07:00
Hugh Mackworth
0b4a368d03 Merge pull request #1 from CCExtractor/master
Updating to latest
2016-10-09 21:04:07 -07:00
Carlos Fernandez
e8da9bce72 Minor correction in params.c 2016-10-07 15:24:38 -07:00
Carlos Fernandez
e488c43eb7 Corrections in help screen 2016-10-07 15:22:59 -07:00
Carlos Fernandez
b1cb94ef71 Minor signal thing 2016-09-28 15:24:19 -07:00
Carlos Fernandez
56a6658bf1 Added terminate_asap to buffered_read_opt 2016-09-28 15:19:38 -07:00
Carlos Fernandez
4be8324767 Corrected signal handler setup 2016-09-28 15:17:31 -07:00
Carlos Fernandez
09a5593e2f Sigterm 2016-09-28 13:38:52 -07:00
Carlos Fernandez
c987b72033 Solves warning: format ‘%lld’ expects argument of type ‘long long int’, but argument 3 has
type ‘LLONG {aka long int}’
2016-09-28 13:37:45 -07:00
Carlos Fernandez
cbd894a634 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-09-28 13:30:44 -07:00
Carlos Fernandez
33273e3954 sigterm 2016-09-28 13:29:43 -07:00
Carlos Fernandez
f28698f71a Corrected linux/build 2016-09-28 13:05:09 -07:00
Carlos Fernandez
56719e7bcd Libcurl 2016-09-28 13:02:54 -07:00
Carlos Fernandez
febc9727bf Actual libcurl invocation 2016-09-28 12:56:15 -07:00
Carlos Fernandez
2ae3b8bca0 SIGTERM in general_loop 2016-09-28 12:16:09 -07:00
Carlos Fernandez
352283d0b6 SIGTERM stuff 2016-09-28 12:15:45 -07:00
Carlos Fernandez
f0696d396f SIGTERM stuff 2016-09-28 12:15:21 -07:00
Carlos Fernandez
8729ae1210 Disabling CURL in Windows 2016-09-28 12:13:50 -07:00
Carlos Fernandez
17c0d8125e - Adding SIGTERM 2016-09-28 12:13:22 -07:00
Carlos Fernandez
67a3ed3b57 Merging curl 2016-09-28 12:09:17 -07:00
Carlos Fernandez
342238325e - Start capture SIGTERM 2016-09-28 12:04:43 -07:00
Carlos Fernandez
080280c268 Added generates_files flag. 2016-09-26 15:22:09 -07:00
Carlos Fernandez
17dd6696df Initial libcurl integration work, linux only. Just groundwork, lots of dummy things yet. 2016-09-26 13:36:04 -07:00
Carlos Fernandez Sanz
d0a4851a67 Comment out enum 2016-09-21 15:30:42 -07:00
Carlos Fernandez
a09ba7c603 Work on 708: Changed DefineWindow behavior, only clear text of an existing window if style has changed.
Assigned presets in 708 attributes.
2016-09-21 13:05:28 -07:00
Carlos Fernandez Sanz
a734aa2b26 Fixed compilation in linux 2016-09-21 11:43:08 -07:00
Carlos Fernandez
1987951014 Fixed some more warnings 2016-09-21 11:35:50 -07:00
Carlos Fernandez
c8f6e8c084 Fixed some compilation warnings
Removed microutf8 files since we settled for a different UTF8 library
2016-09-21 11:30:42 -07:00
Carlos Fernandez
4101fe3880 Fixes #425 - the 708 decoder needs access the encoder. Reference was missing for .bin. 2016-09-20 16:04:33 -07:00
captions
9833090318 Removed leftover line that caused a (harmless) error message 2016-09-12 12:15:47 -07:00
Carlos Fernandez
8169d9863b Fixes #422 2016-09-12 11:50:15 -07:00
canihavesomecoffee
ecac87dc6a Fix regression introduced in 1a1e973
Due to an assignment instead of comparison, all types of codecs handled
after the DVD subtitles would now be processed as DVD subtitles...
2016-09-03 18:58:47 +02:00
Carlos Fernandez
4865649d0f Merge branch 'pr/n417_Abhinav95' 2016-08-26 17:34:39 -07:00
Abhinav Shukla
fe20494d30 Making output for one timestamp on one line 2016-08-25 17:24:18 -07:00
Carlos Fernandez
b356b1b0d6 Infrastructure for split-by-sentence 2016-08-25 16:56:11 -07:00
Abhinav Shukla
fc3c841d03 Updating HardsubX build script 2016-08-25 16:25:25 -07:00
Abhinav Shukla
3d0d1df324 Removing unused variable 2016-08-25 16:18:01 -07:00
Carlos Fernandez
789ae74e0a New dir added to include list 2016-08-25 16:16:16 -07:00
Abhinav Shukla
bcccc5d1d5 Merge remote-tracking branch 'upstream/master' 2016-08-25 16:15:56 -07:00
Abhinav Shukla
5c7a766658 Merge remote-tracking branch 'upstream/master' 2016-08-25 16:06:23 -07:00
Carlos Fernandez
abdf5423f5 Missing dir for proto. 2016-08-25 16:05:50 -07:00
Carlos Fernandez
8775436cdb Stupid typo 2016-08-25 16:03:18 -07:00
Abhinav Shukla
3557ea50c6 Merge remote-tracking branch 'upstream/master' 2016-08-25 16:03:13 -07:00
Abhinav Shukla
2c4ae3ea42 Merge remote-tracking branch 'upstream/master' 2016-08-25 16:00:47 -07:00
Carlos Fernandez
e5e62fd2d9 Added protobuf-c to build 2016-08-25 15:57:33 -07:00
Abhinav Shukla
f3b2b05169 Fixing missing second line in ttxt for DVB 2016-08-25 15:45:33 -07:00
Carlos Fernandez
b00f8e75f6 Added dvb_subtitle_decoder.c to the project 2016-08-22 16:17:17 -07:00
bigharshrag
b604fe74b5 Fixed warning 2016-08-23 04:07:27 +05:30
bigharshrag
3082fb7a15 Cleaned code 2016-08-23 02:49:46 +05:30
bigharshrag
41ff65162b Fixed according to latest OCR 2016-08-23 02:32:35 +05:30
bigharshrag
f778beaafd Merge branch 'master' into dvd_subtitles 2016-08-23 01:48:48 +05:30
Abhinav Shukla
a018b55038 Final improvements to HardsubX 2016-08-19 08:38:21 -07:00
Abhinav Shukla
03b600fb67 Merge remote-tracking branch 'upstream/master' 2016-08-18 03:11:07 -07:00
Carlos Fernandez
358b8ef579 Initial backport of Oleg Kisselef's WITH_SHARING options. Most likely it breaks stuff. 2016-08-17 17:40:11 -07:00
Carlos Fernandez
b2c11f5984 Merge branch 'pr/n414_cfsmp3'
# Conflicts:
#	src/lib_ccx/ccx_decoders_708.c
2016-08-17 13:21:07 -07:00
Carlos Fernandez
fc14c61ac1 Merge branch 'pr/n411_Abhinav95' 2016-08-16 10:33:50 -07:00
Carlos Fernandez
c4073d1813 leptonica/tesseract version upgrade in release build (VS) 2016-08-16 10:33:17 -07:00
Abhinav Shukla
5d19fe127c Merge remote-tracking branch 'upstream/master' 2016-08-16 05:36:25 -07:00
Abhinav Shukla
a78f7d2a05 Only processing the stream specified by dvblang 2016-08-16 05:23:44 -07:00
Carlos Fernandez
78fa5c92bc Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-15 16:40:23 -07:00
Carlos Fernandez
cfa8f5ac31 Merge branch 'pr/n410_Abhinav95' 2016-08-15 16:36:50 -07:00
Abhinav Shukla
4b239fd2ee Adding ocrlang and dvblang, removing special cases for languages 2016-08-15 16:24:06 -07:00
Carlos Fernandez
676539cf8c Updated Tesseract and leptonica versions, included the files in the repo because there's a royal pain to find and/or build. 2016-08-15 16:15:50 -07:00
Carlos Fernandez
9bd4f5acdb Merge branch 'pr/n408_Abhinav95' 2016-08-15 15:18:52 -07:00
bigharshrag
438ce2f33a Fixed data over multiple packets 2016-08-15 05:45:04 +05:30
bigharshrag
cd4dfa5f76 Data spread over packets 2016-08-15 04:41:43 +05:30
Abhinav Shukla
0ad7ae6f66 Merge remote-tracking branch 'upstream/master' 2016-08-13 05:32:22 -07:00
Abhinav Shukla
625b477c23 Remove misleading comment 2016-08-12 15:09:08 -07:00
Carlos Fernandez
a75ba7edd4 Merge branch 'pr/n408_Abhinav95' 2016-08-12 15:07:59 -07:00
Abhinav Shukla
3a24024aaf Setting default language file directory to TESSERACT_PREFIX/tessdata 2016-08-12 15:05:52 -07:00
Carlos Fernandez
e4769fd0e6 Merge branch 'pr/n408_Abhinav95' 2016-08-12 10:23:13 -07:00
Abhinav Shukla
cd706068c8 Fixing problem with windows paths for traineddata 2016-08-12 03:40:59 -07:00
bigharshrag
05767df99d Fixed timing 2016-08-12 15:52:00 +05:30
Carlos Fernandez
00a2dcfa77 Merge branch 'pr/n407_Abhinav95' 2016-08-11 12:54:34 -07:00
Carlos Fernandez
a27c8019a6 Minor typo correction 2016-08-11 12:53:49 -07:00
Abhinav Shukla
3f27fd7dc1 Fixing error with path of tesseract traineddata 2016-08-11 12:21:24 -07:00
Abhinav Shukla
a83467f595 Removing excessive language notices 2016-08-11 11:58:46 -07:00
Abhinav Shukla
45d8c63a45 Adding 99 possible languages to DVB subtitle OCR 2016-08-11 11:17:39 -07:00
Abhinav Shukla
0fd7967e7f Improving documentation readability 2016-08-10 12:54:34 -07:00
Abhinav Shukla
e2f850192f Italic Detection and improved documentation 2016-08-10 09:33:08 -07:00
Abhinav Shukla
5a6dfd0c18 Added correct progress display and hue image filter for colored subs 2016-08-10 06:43:58 -07:00
Abhinav Shukla
ef8141c8a9 User documentation 2016-08-10 03:39:01 -07:00
bigharshrag
ebd2afa43f Used correct time data 2016-08-09 06:21:38 +05:30
Carlos Fernandez
89e98a6797 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 14:41:35 -07:00
Carlos Fernandez
5a4aa337b3 Removed compile warnings. 2016-08-08 14:41:32 -07:00
Carlos Fernandez
7687576e1f Pushed version number to 0.82 2016-08-08 14:13:48 -07:00
Carlos Fernandez
8ca63e02eb Merge branch 'pr/n405_Abhinav95' 2016-08-08 14:05:23 -07:00
Carlos Fernandez
b924f7a323 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 12:14:49 -07:00
Carlos Fernandez
606a5a56bc Final changes for the max 80 characters correction in help screen. 2016-08-08 12:14:45 -07:00
Carlos Fernandez
3ee6aa61a7 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 12:07:03 -07:00
Carlos Fernandez
af8a79757e More changes in help screen 2016-08-08 12:05:30 -07:00
Carlos Fernandez
a047262eb6 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 12:03:12 -07:00
Carlos Fernandez
e920c06dc0 More help screen changes. 2016-08-08 12:02:27 -07:00
Carlos Fernandez
c208c2aed2 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 11:59:18 -07:00
Carlos Fernandez
ed8592f5f8 More changes in help screen 2016-08-08 11:59:12 -07:00
Carlos Fernandez
8c85870c25 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 11:57:19 -07:00
Carlos Fernandez
f5bb5fba27 More changes in help screen. 2016-08-08 11:56:56 -07:00
Carlos Fernandez
43acdce578 More help screen changes. 2016-08-08 11:55:48 -07:00
Abhinav Shukla
92b899dc2a Cleanliness is all. 2016-08-08 11:55:31 -07:00
Carlos Fernandez
6fa1dc7b30 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 11:53:48 -07:00
Carlos Fernandez
f4c60c3aed More help screen corrections. 2016-08-08 11:53:41 -07:00
Carlos Fernandez
35ae5599c1 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 11:50:58 -07:00
Carlos Fernandez
057a1d11b1 More help corrections 2016-08-08 11:42:31 -07:00
Carlos Fernandez
5fe2c8c8a7 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-08 11:37:35 -07:00
Carlos Fernandez
030635a86b help screen corrections 2016-08-08 11:36:23 -07:00
Carlos Fernandez
af248ffd24 Correction in help screen 2016-08-08 11:26:40 -07:00
bigharshrag
6b52cabbac Fixed incorrect bitmap allocation 2016-08-08 01:25:30 +05:30
Abhinav Shukla
fa60e2ad68 Fixing build error when HardsubX not enabled 2016-08-05 12:33:17 -07:00
Abhinav Shukla
b2784bd3da Dumping HardsubX parameters 2016-08-05 12:21:35 -07:00
Carlos Fernandez
8bfcb24d15 Added new lines to bitmap -> transcript 2016-08-05 11:02:46 -07:00
Abhinav Shukla
1aaf8b465f Fixing bug with time interval in packet processing 2016-08-04 00:34:58 -07:00
Abhinav Shukla
ffb85d90a3 Update Makefile to check if libav present 2016-08-03 20:09:19 -07:00
Carlos Fernandez
6a4f196940 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-08-01 11:31:50 -07:00
Abhinav Shukla
37d9f93eb5 Merge remote-tracking branch 'upstream/master' 2016-08-01 11:26:29 -07:00
Abhinav Shukla
39559413f5 Fix errors with classifiers, write debug display function 2016-08-01 11:19:48 -07:00
Carlos Fernandez
8c845f5135 Merge branch 'pr/n403_canihavesomecoffee' 2016-08-01 10:46:04 -07:00
Carlos Fernandez
ef7ad10fdb Merge branch 'pr/n404_Abhinav95' 2016-08-01 10:45:30 -07:00
Carlos Fernandez
e6ca4274ec Undo part of the clever capitalization stuff. Doesn't work well for TV stations that are all caps except for hard of hearing notes. 2016-08-01 10:44:44 -07:00
Abhinav Shukla
d97c51139f Freeing result iterator after a frame is processed 2016-08-01 00:02:05 -07:00
Abhinav Shukla
52dfc82054 Setting up different levels of subtitle classifiers 2016-07-31 23:54:48 -07:00
Abhinav Shukla
773fb63f92 Accurate time base conversion, changing file structure 2016-07-31 23:21:44 -07:00
Abhinav Shukla
7ce4c4008c Adding HardsubX parameter parsing 2016-07-31 22:11:13 -07:00
Abhinav Shukla
178db0a085 Merge remote-tracking branch 'upstream/master' 2016-07-31 13:51:20 -07:00
Abhinav Shukla
bc833c47dc Setting up other hardsubx parameters 2016-07-31 13:51:13 -07:00
Abhinav Shukla
617b4658b4 Creating new macro and updating Makefile, handling errors 2016-07-31 12:13:45 -07:00
canihavesomecoffee
98a79e0e55 Re-add the VERSION_FILE_PRESENT flag 2016-07-30 18:37:32 +02:00
Carlos Fernandez
e21f787125 Merge branch 'master' of https://github.com/CCExtractor/ccextractor 2016-07-29 16:49:01 -07:00
Carlos Fernandez
878106386c Corrected capitalization stuff: Don't touch lines that already have both upper and lower case (i.e. assume they are correct already) and don't force new sentence between frames as that's just not correct. Long sentences will take more than one frame for sure. 2016-07-29 16:41:06 -07:00
Abhinav Shukla
3c870eef7d Freeing color detection tables 2016-07-26 23:21:27 -07:00
Abhinav Shukla
4f3d61a7eb Fixing memory leaks in frame processing 2016-07-26 23:17:14 -07:00
Carlos Fernandez
470e80a194 Minor correction in help screen 2016-07-25 16:16:05 -07:00
bigharshrag
e44852a7e4 Added Palette 2016-07-26 03:04:19 +05:30
Abhinav Shukla
92885635ed Generating a basic, correctly timed output file with simple OCR 2016-07-23 18:18:32 -07:00
Abhinav Shukla
722cecf0d3 Resolving merge conflict with dvbcolor and syncing with upstream 2016-07-22 19:51:51 -07:00
Abhinav Shukla
6a9d7e7da2 Fixing segfault for empty OCR line 2016-07-22 17:54:32 -07:00
Abhinav Shukla
22ec6b96b2 Fixing compilation warnings for image_copy struct 2016-07-22 07:01:05 -07:00
Abhinav Shukla
0c8471438a Fixing error with invalid next size and correctly formatting SRT/VTT color output 2016-07-22 06:46:11 -07:00
Abhinav Shukla
3a74381318 Modifying API call not present in tesseract 3.03 2016-07-21 16:13:59 -07:00
Abhinav Shukla
0fcf9298bb Merge remote-tracking branch 'upstream/master' into dvbcolor 2016-07-21 15:29:12 -07:00
Carlos Fernandez
c8bd4e22a5 Added missing library for OCR 2016-07-21 14:59:56 -07:00
Carlos Fernandez
a918bc2798 Added OCR to linux build script
Skip another to the SCTE 57 2003 sections in PMT
2016-07-21 14:56:17 -07:00
Abhinav Shukla
9d81498bd9 Moving tesseract handle to hardsubx ctx 2016-07-21 14:54:33 -07:00
Carlos Fernandez
45a37f390b Added OCR stuff 2016-07-21 14:44:23 -07:00
Abhinav Shukla
a2eced9201 Color output in <font> tags for SRT and WebVTT 2016-07-20 12:08:42 -07:00
Abhinav Shukla
36b08c5d74 Working color detection and output for SRT from DVB 2016-07-19 19:25:46 -07:00
bigharshrag
5c0ee9ca53 Use different structs 2016-07-18 18:49:22 +05:30
Abhinav Shukla
f280feff65 Passing DVB region background color 2016-07-16 06:16:42 -07:00
Abhinav Shukla
f37e944ac4 Fixed errors in word color recognition 2016-07-09 20:39:38 -07:00
Abhinav Shukla
cf281c22ce Detecting color of every word 2016-07-07 19:42:10 -07:00
Abhinav Shukla
4d8cb4d4d4 Creating copy of image before quantization 2016-07-07 15:15:59 -07:00
Abhinav Shukla
7426841b5e Basic structure for color detection in DVB 2016-07-07 04:00:20 -07:00
Abhinav Shukla
89efdb6178 Sauvola Binarization, color independent thresholding 2016-07-06 19:15:44 -07:00
Abhinav Shukla
a37a08ef70 Merge remote-tracking branch 'upstream/master' 2016-07-06 19:08:34 -07:00
Carlos Fernandez
db80caf74a - WebVTT from DVB
- Added X-TIMESTAMP-MAP to the WebVTT header
2016-07-06 12:18:40 -07:00
Carlos Fernandez
672010711d spupng: Embedded OCR comments inside the spupng tags (instead of below), much more readable. 2016-07-05 15:20:28 -07:00
Carlos Fernandez
a4cea54db8 Passed CRLF to OCR 2016-07-05 14:48:54 -07:00
Carlos Fernandez
ddbf8124a8 Merge branch 'pr/n399_canihavesomecoffee' 2016-07-05 10:34:22 -07:00
canihavesomecoffee
5655db6cd2 Update Linux & Mac build
Update the Linux & Mac build to reflect the changes made for the version
file.
2016-07-05 18:10:19 +02:00
canihavesomecoffee
c764ced536 Update compile info 2016-07-05 18:05:44 +02:00
canihavesomecoffee
9f4bff884f Update build script for windows
-
2016-07-05 18:01:47 +02:00
canihavesomecoffee
b002d58259 Updates git ignore for new file
Add a git ignore for the "real" dynamic header file, remove the ignore
for the other one.
2016-07-05 17:57:47 +02:00
Anshul Maheshwari
bdf3f6d833 close #394 Illegal .srt being created from DVBs 2016-07-03 23:59:19 +05:30
Anshul Maheshwari
7f183193d5 resolve Tesseract fail cause spngpng to fail #391
Signed-off-by: Anshul Maheshwari <er.anshul.maheshwari@gmail.com>
2016-07-03 23:03:23 +05:30
Anshul Maheshwari
d3ae186f6b Merge remote-tracking branch 'origin/master' 2016-07-03 13:05:09 +05:30
Anshul Maheshwari
18cd92e5a8 close #397 Incorrect progress display 2016-07-03 13:04:17 +05:30
Abhinav Shukla
0e4281d4a2 Merge remote-tracking branch 'upstream/master' 2016-06-30 14:29:04 -07:00
Abhinav Shukla
7d19fa971d Setting up encoder 2016-06-30 14:19:50 -07:00
Anshul Maheshwari
a66c5aae06 Correcting y offset 2016-06-28 23:25:20 +05:30
Anshul Maheshwari
eea1792f0e Fix memory leakage in spupng
Signed-off-by: Anshul Maheshwari <er.anshul.maheshwari@gmail.com>
2016-06-27 00:54:44 +05:30
Anshul Maheshwari
e6bd773762 Fixing missing lines in spupng
Signed-off-by: Anshul Maheshwari <er.anshul.maheshwari@gmail.com>
2016-06-27 00:33:06 +05:30
bigharshrag
2cc8ecf20a More restructuring 2016-06-22 23:47:32 +05:30
bigharshrag
9ca3dc91ee Structure changes 2016-06-22 15:56:54 +05:30
Abhinav Shukla
98f1e15666 Limiting subtitle region to bottom 25% of the frame 2016-06-21 07:31:01 -07:00
Abhinav Shukla
c61e787bca Midterm evaluation - initial code 2016-06-21 06:48:10 -07:00
bigharshrag
89eba5e3e8 Removed extra debugging statements 2016-06-21 01:14:08 +05:30
bigharshrag
8bc1782589 Fixes for Major bugs in decoding RLE 2016-06-20 20:03:21 +05:30
bigharshrag
93a55bff8c Create bitmap 2016-06-20 03:44:50 +05:30
bigharshrag
39dec02dc4 Getting 4 bits data 2016-06-20 02:50:33 +05:30
bigharshrag
24dc763c4f Decode Run length encoding 2016-06-20 00:47:15 +05:30
bigharshrag
38f9f65ad8 small restructuring 2016-06-18 19:01:27 +05:30
bigharshrag
eaed758aa0 Critical bug fix 2016-06-18 00:34:24 +05:30
bigharshrag
705766a5ee Removed extra debugging statements 2016-06-17 23:39:28 +05:30
bigharshrag
23a99c04e9 Added processing of control packets 2016-06-17 23:02:37 +05:30
bigharshrag
1a1e9732b9 Init DVD sub decoder 2016-06-16 23:48:03 +05:30
Abhinav Shukla
2dfa3778cb Added seeking to a frame at a particular time instead of linear iteration 2016-06-16 04:57:21 -07:00
Abhinav Shukla
d99dc4c6f8 Setting up binary neighbourhood search workflow and other helpers 2016-06-15 17:01:44 -07:00
Abhinav Shukla
8507a842be Added HSV colorspace conversion 2016-06-15 16:25:39 -07:00
bigharshrag
4cafcc053e Fixes to reading PES header 2016-06-16 02:29:13 +05:30
bigharshrag
30e2c7117c Process PES header for subtitles 2016-06-15 23:58:09 +05:30
Abhinav Shukla
8e5b9b2655 Merge remote-tracking branch 'upstream/master' 2016-06-14 12:02:08 -07:00
Abhinav Shukla
954724e12a Added vertical edge detection and morphology to get subtitle ROI 2016-06-14 11:59:10 -07:00
Carlos Fernandez
97dd511452 Merge branch 'pr/n390_rkuchumov' 2016-06-14 10:57:09 -07:00
Kuchumov Ruslan
93b1e64896 skipping redundant bytes at the end of tx3g atom 2016-06-14 09:44:35 +03:00
Abhinav Shukla
13db1dfbfa Fixing error with frame numbers of parsed packets, now parsing video only 2016-06-09 16:31:25 -07:00
Abhinav Shukla
bc40119b72 Basic text output 2016-06-03 12:55:56 -07:00
Abhinav Shukla
65587815ff Basic video frame processing with ffmpeg 2016-06-02 13:57:08 -07:00
Abhinav Shukla
c3eabcfd96 Setting up ffmpeg frame processing (-s in the Makefile to reduce executable size) 2016-06-01 17:37:58 -07:00
Abhinav Shukla
204543af9a Setting up preliminary HardsubX context 2016-05-30 14:38:50 -07:00
Abhinav Shukla
c8345643c6 Adding HardsubX workflow 2016-05-30 10:44:55 -07:00
kisselef
7190af4e79 implemented all predefined window styles 2015-09-28 22:42:14 +03:00
kisselef
2898584ee3 Merge branch 'master' into feature-cea-708 2015-09-28 22:35:24 +03:00
1097 changed files with 362738 additions and 99400 deletions

7
.clang-format Normal file

@@ -0,0 +1,7 @@
BreakBeforeBraces: Allman
ColumnLimit: 0
IndentCaseLabels: true
IndentWidth: 8
TabWidth: 8
UseTab: Always
SortIncludes: false

37
.dockerignore Normal file

@@ -0,0 +1,37 @@
# Build artifacts
linux/ccextractor
linux/rust/
linux/*.o
linux/*.a
mac/ccextractor
mac/rust/
build/
build_*/
# Git
.git/
.github/
# IDE
.vscode/
.idea/
*.swp
*.swo
# Docker
docker/
# Documentation (not needed for build)
docs/
*.md
!README.md
# Test files
*.ts
*.mp4
*.mkv
*.srt
*.vtt
# Plans
plans/

36
.github/CONTRIBUTING.md vendored Normal file

@@ -0,0 +1,36 @@
# Contributors Guide
Please read and understand the contribution guide before creating an issue or pull request. We would like to thank [Nishad TR](https://github.com/nishad) for their contributor's guide, upon which we based ours.
## Etiquette
This project is open source, and as such, we (the maintainers) give our **free time** to build, maintain and **provide user support** for the CCExtractor program. We make the code freely available in the hope that it will be of use to other developers and users. It would be extremely unfair for us to suffer abuse or anger for our hard work.
Please be considerate towards the developers and other users when raising issues or presenting pull requests.
It's the duty of the maintainer to ensure that all submissions to the project are of sufficient quality to benefit the project. Many developers have different skillsets, strengths, and weaknesses. Respect the decision of the maintainers, and do not be upset or abusive if your submission is not used.
## Viability
When requesting or submitting new features, first consider whether it might be useful to others. Open source projects are used by many developers, who may have entirely different needs to your own. Think about whether or not your feature is likely to be used by other users of the project.
## Procedure
**Before filing an issue**:
- Attempt to replicate the problem, to ensure that it wasn't a coincidental incident.
- Check to make sure your feature suggestion isn't already present within the project.
- Check the pull requests tab to ensure that the bug doesn't have a fix in progress.
- Check the pull requests tab to ensure that the feature isn't already in progress.
**Before submitting a pull request**:
- Ensure that your submission is [viable](#viability) for the project.
- Check the codebase to ensure that your feature doesn't already exist.
- Check the pull requests to ensure that another person hasn't already submitted the feature or fix.
## Technical requirements
- Before Submitting your Pull Request, merge `master` with your new branch and fix any conflicts. (Make sure you don't break anything in development!)
- Commit Unix line endings.
- Make sure to reasonably test your code. We have a sample platform that runs a test-suite for you, but it only covers a general set of tests.

47
.github/ISSUE_TEMPLATE.md vendored Normal file

@@ -0,0 +1,47 @@
Please prefix your issue with one of the following: [BUG], [PROPOSAL], [QUESTION].
To get the version of CCExtractor, you can use `--version`.
If this issue is related to the Flutter GUI, please open the issue on the GUI repo [here](https://github.com/CCExtractor/ccextractorfluttergui/issues/new)
Please check all that apply and **remove the ones that do not**.
In the necessary information section, if this is a regression (something that used to work does not work anymore), make sure to specify the last known working version.
Only specify the minimum number of arguments needed to reproduce the issue.
In the additional information section, describe your problem.
Please make the affected input file available for us (no screenshots, those don't help!). Public links to Dropbox, Google Drive, etc, are all fine. If it is not possible to make it available publicly, send us a private invitation (both Dropbox and Google Drive allow that). In this case we will download the file and upload it to the private developer repository. Methods to send the private invitation to us can be found [here](https://ccextractor.org/public:general:support#email).
Do **not** upload your file to any location that will require us to sign up or endure a wait list, slow downloads, etc. If your upload expires make sure you keep it active somehow (replace links if needed). Keep in mind that while we go over all tickets some may take a few days, and it's important we have the file available when we actually need it.
Make sure to enable notifications in GitHub so you get notifications about your ticket. We may need to ask questions and we do everything inside GitHub's system.
Once you have read all of the instructions **delete all the text from here to the top**.
CCExtractor version: {replace with the version}
# In raising this issue, I confirm the following:
- [ ] I have read and understood the [contributors guide](https://github.com/CCExtractor/ccextractor/blob/master/.github/CONTRIBUTING.md).
- [ ] I have checked that the bug I am reporting can be replicated, or that the feature I am suggesting isn't already present.
- [ ] I have checked that the issue I'm posting isn't already reported.
- [ ] I have checked that the issue I'm posting isn't already solved and no duplicates exist in [closed issues](https://github.com/CCExtractor/ccextractor/issues?q=is%3Aissue+is%3Aclosed) and in [open issues](https://github.com/CCExtractor/ccextractor/issues)
- [ ] I have checked the pull requests tab for existing solutions/implementations to my issue/suggestion.
- [ ] I have used the latest available version of CCExtractor to verify this issue exists.
- [ ] I have ticked all the boxes in this section and to prove it I'm deleting the section completely to remove boilerplate text.
# Necessary information
- Is this a regression (i.e. did it work before)? {YES/NO}
- What platform did you use? {Windows/Linux/Mac}
- What were the used arguments? `{replace with the arguments}`
# Video links
* {Replace with a link to a video file}
# Additional information
{issue content here, replace this line with your issue content}

21
.github/PULL_REQUEST_TEMPLATE.md vendored Normal file

@@ -0,0 +1,21 @@
<!-- Please prefix your pull request with one of the following: **[FEATURE]** **[FIX]** **[IMPROVEMENT]**. -->
**In raising this pull request, I confirm the following (please check boxes):**
- [ ] I have read and understood the [contributors guide](https://github.com/CCExtractor/ccextractor/blob/master/.github/CONTRIBUTING.md).
- [ ] I have checked that another pull request for this purpose does not exist.
- [ ] I have considered, and confirmed that this submission will be valuable to others.
- [ ] I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
- [ ] I give this submission freely, and claim no ownership to its content.
- [ ] **I have mentioned this change in the [changelog](https://github.com/CCExtractor/ccextractor/blob/master/docs/CHANGES.TXT).**
**My familiarity with the project is as follows (check one):**
- [ ] I have never used CCExtractor.
- [ ] I have used CCExtractor just a couple of times.
- [ ] I absolutely love CCExtractor, but have not contributed previously.
- [ ] I am an active contributor to CCExtractor.
---
{pull request content here}

9
.github/dependabot.yml vendored Normal file

@@ -0,0 +1,9 @@
version: 2
updates:
  - package-ecosystem: github-actions
    directory: "/"
    schedule:
      interval: daily
      time: "10:00"
      timezone: America/Los_Angeles
    open-pull-requests-limit: 10

157
.github/workflows/build_appimage.yml vendored Normal file

@@ -0,0 +1,157 @@
name: Build Linux AppImage
on:
# Build on releases
release:
types: [published]
# Allow manual trigger
workflow_dispatch:
inputs:
build_type:
description: 'Build type (all, minimal, ocr, hardsubx)'
required: false
default: 'all'
# Build on pushes to workflow file for testing
push:
paths:
- '.github/workflows/build_appimage.yml'
- 'linux/build_appimage.sh'
jobs:
build-appimage:
runs-on: ubuntu-22.04
strategy:
fail-fast: false
matrix:
build_type: [minimal, ocr, hardsubx]
steps:
- name: Check if should build this variant
id: should_build
run: |
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
INPUT_TYPE="${{ github.event.inputs.build_type }}"
if [ "$INPUT_TYPE" = "all" ] || [ "$INPUT_TYPE" = "${{ matrix.build_type }}" ]; then
echo "should_build=true" >> $GITHUB_OUTPUT
else
echo "should_build=false" >> $GITHUB_OUTPUT
fi
else
echo "should_build=true" >> $GITHUB_OUTPUT
fi
- name: Checkout repository
if: steps.should_build.outputs.should_build == 'true'
uses: actions/checkout@v6
- name: Install base dependencies
if: steps.should_build.outputs.should_build == 'true'
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
build-essential \
cmake \
pkg-config \
wget \
file \
libfuse2 \
zlib1g-dev \
libpng-dev \
libjpeg-dev \
libfreetype-dev \
libxml2-dev \
libcurl4-gnutls-dev \
libssl-dev \
clang \
libclang-dev
- name: Install OCR dependencies
if: steps.should_build.outputs.should_build == 'true' && (matrix.build_type == 'ocr' || matrix.build_type == 'hardsubx')
run: |
sudo apt-get install -y --no-install-recommends \
tesseract-ocr \
libtesseract-dev \
libleptonica-dev \
tesseract-ocr-eng
- name: Install FFmpeg dependencies (HardSubX)
if: steps.should_build.outputs.should_build == 'true' && matrix.build_type == 'hardsubx'
run: |
sudo apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libswscale-dev \
libswresample-dev \
libavfilter-dev \
libavdevice-dev
- name: Install Rust toolchain
if: steps.should_build.outputs.should_build == 'true'
uses: dtolnay/rust-toolchain@stable
- name: Cache GPAC build
if: steps.should_build.outputs.should_build == 'true'
id: cache-gpac
uses: actions/cache@v5
with:
path: /usr/local/lib/libgpac*
key: gpac-v2.4.0-ubuntu22
- name: Build and install GPAC
if: steps.should_build.outputs.should_build == 'true' && steps.cache-gpac.outputs.cache-hit != 'true'
run: |
git clone -b v2.4.0 --depth 1 https://github.com/gpac/gpac
cd gpac
./configure
make -j$(nproc) lib
sudo make install-lib
sudo ldconfig
- name: Update library cache
if: steps.should_build.outputs.should_build == 'true'
run: sudo ldconfig
- name: Build AppImage
if: steps.should_build.outputs.should_build == 'true'
run: |
cd linux
chmod +x build_appimage.sh
BUILD_TYPE=${{ matrix.build_type }} ./build_appimage.sh
- name: Get AppImage name
if: steps.should_build.outputs.should_build == 'true'
id: appimage_name
run: |
case "${{ matrix.build_type }}" in
minimal)
echo "name=ccextractor-minimal-x86_64.AppImage" >> $GITHUB_OUTPUT
;;
ocr)
echo "name=ccextractor-x86_64.AppImage" >> $GITHUB_OUTPUT
;;
hardsubx)
echo "name=ccextractor-hardsubx-x86_64.AppImage" >> $GITHUB_OUTPUT
;;
esac
- name: Test AppImage
if: steps.should_build.outputs.should_build == 'true'
run: |
chmod +x linux/${{ steps.appimage_name.outputs.name }}
linux/${{ steps.appimage_name.outputs.name }} --version
- name: Upload AppImage artifact
if: steps.should_build.outputs.should_build == 'true'
uses: actions/upload-artifact@v6
with:
name: ${{ steps.appimage_name.outputs.name }}
path: linux/${{ steps.appimage_name.outputs.name }}
- name: Upload to Release
if: steps.should_build.outputs.should_build == 'true' && github.event_name == 'release'
uses: softprops/action-gh-release@v2
with:
files: linux/${{ steps.appimage_name.outputs.name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
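
The "Check if should build this variant" step above can be read as a small decision function. The sketch below restates that logic as a standalone shell function so it can be tried outside of GitHub Actions. The function name `should_build` and its positional arguments are illustrative assumptions; the workflow itself reads these values from `github.event_name`, `github.event.inputs.build_type` and `matrix.build_type`.

```shell
# should_build EVENT INPUT_TYPE VARIANT
# Illustrative restatement of the workflow step, not code from the repository.
# Prints "true" when the matrix variant should be built, "false" otherwise.
should_build() {
    event=$1
    input_type=$2
    variant=$3
    if [ "$event" = "workflow_dispatch" ]; then
        # Manual runs build either everything or only the requested variant.
        if [ "$input_type" = "all" ] || [ "$input_type" = "$variant" ]; then
            echo "true"
        else
            echo "false"
        fi
    else
        # Releases and pushes always build every variant.
        echo "true"
    fi
}

should_build workflow_dispatch all ocr        # prints "true"
should_build workflow_dispatch minimal ocr    # prints "false"
should_build release "" hardsubx              # prints "true"
```

Because every later step is guarded by this one output, a manual run that requests a single variant still starts all matrix jobs; the unselected jobs simply skip their remaining steps.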

283
.github/workflows/build_deb.yml vendored Normal file

@@ -0,0 +1,283 @@
name: Build Linux .deb Package
on:
# Build on releases
release:
types: [published]
# Allow manual trigger
workflow_dispatch:
inputs:
build_type:
description: 'Build type (all, basic, hardsubx)'
required: false
default: 'all'
# Build on pushes to workflow file for testing
push:
paths:
- '.github/workflows/build_deb.yml'
jobs:
build-deb:
runs-on: ubuntu-24.04
strategy:
fail-fast: false
matrix:
build_type: [basic, hardsubx]
steps:
- name: Check if should build this variant
id: should_build
run: |
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
INPUT_TYPE="${{ github.event.inputs.build_type }}"
if [ "$INPUT_TYPE" = "all" ] || [ "$INPUT_TYPE" = "${{ matrix.build_type }}" ]; then
echo "should_build=true" >> $GITHUB_OUTPUT
else
echo "should_build=false" >> $GITHUB_OUTPUT
fi
else
echo "should_build=true" >> $GITHUB_OUTPUT
fi
- name: Checkout repository
if: steps.should_build.outputs.should_build == 'true'
uses: actions/checkout@v6
- name: Get version
if: steps.should_build.outputs.should_build == 'true'
id: version
run: |
# Extract version from source or use tag
if [ "${{ github.event_name }}" = "release" ]; then
VERSION="${{ github.event.release.tag_name }}"
VERSION="${VERSION#v}" # Remove 'v' prefix if present
else
# Extract version from lib_ccx.h (e.g., #define VERSION "0.96.5")
VERSION=$(grep -oP '#define VERSION "\K[^"]+' src/lib_ccx/lib_ccx.h || echo "0.96")
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "Building version: $VERSION"
- name: Install base dependencies
if: steps.should_build.outputs.should_build == 'true'
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
build-essential \
cmake \
pkg-config \
zlib1g-dev \
libpng-dev \
libjpeg-dev \
libfreetype-dev \
libxml2-dev \
libcurl4-gnutls-dev \
libssl-dev \
clang \
libclang-dev \
tesseract-ocr \
libtesseract-dev \
libleptonica-dev \
patchelf
- name: Install FFmpeg dependencies (HardSubX)
if: steps.should_build.outputs.should_build == 'true' && matrix.build_type == 'hardsubx'
run: |
sudo apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libswscale-dev \
libswresample-dev \
libavfilter-dev \
libavdevice-dev
- name: Install Rust toolchain
if: steps.should_build.outputs.should_build == 'true'
uses: dtolnay/rust-toolchain@stable
- name: Cache GPAC build
if: steps.should_build.outputs.should_build == 'true'
id: cache-gpac
uses: actions/cache@v5
with:
path: ~/gpac-install
key: gpac-abi-16.4-ubuntu24-deb
- name: Build GPAC
if: steps.should_build.outputs.should_build == 'true' && steps.cache-gpac.outputs.cache-hit != 'true'
run: |
git clone -b abi-16.4 --depth 1 https://github.com/gpac/gpac
cd gpac
./configure --prefix=/usr
make -j$(nproc)
make DESTDIR=$HOME/gpac-install install-lib
- name: Install GPAC to system
if: steps.should_build.outputs.should_build == 'true'
run: |
sudo cp -r $HOME/gpac-install/usr/lib/* /usr/lib/
sudo cp -r $HOME/gpac-install/usr/include/* /usr/include/
sudo ldconfig
- name: Build CCExtractor
if: steps.should_build.outputs.should_build == 'true'
run: |
mkdir build && cd build
if [ "${{ matrix.build_type }}" = "hardsubx" ]; then
cmake ../src -DCMAKE_BUILD_TYPE=Release -DWITH_OCR=ON -DWITH_HARDSUBX=ON
else
cmake ../src -DCMAKE_BUILD_TYPE=Release -DWITH_OCR=ON
fi
make -j$(nproc)
- name: Test build
if: steps.should_build.outputs.should_build == 'true'
run: ./build/ccextractor --version
- name: Create .deb package structure
if: steps.should_build.outputs.should_build == 'true'
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
PKG_NAME="ccextractor_${VERSION}_amd64"
else
PKG_NAME="ccextractor-${VARIANT}_${VERSION}_amd64"
fi
mkdir -p ${PKG_NAME}/DEBIAN
mkdir -p ${PKG_NAME}/usr/bin
mkdir -p ${PKG_NAME}/usr/lib/ccextractor
mkdir -p ${PKG_NAME}/usr/share/doc/ccextractor
mkdir -p ${PKG_NAME}/usr/share/man/man1
# Copy binary
cp build/ccextractor ${PKG_NAME}/usr/bin/
# Copy GPAC library
cp $HOME/gpac-install/usr/lib/libgpac.so* ${PKG_NAME}/usr/lib/ccextractor/
# Set rpath so ccextractor finds bundled libgpac
patchelf --set-rpath '/usr/lib/ccextractor:$ORIGIN/../lib/ccextractor' ${PKG_NAME}/usr/bin/ccextractor
# Copy documentation
cp docs/CHANGES.TXT ${PKG_NAME}/usr/share/doc/ccextractor/changelog
cp LICENSE.txt ${PKG_NAME}/usr/share/doc/ccextractor/copyright
gzip -9 -n ${PKG_NAME}/usr/share/doc/ccextractor/changelog
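# Generate man page (skipped when help2man is unavailable or fails)
# The redirection creates the output file even when help2man fails, so the
# command's exit status is tested instead of the file's existence.
if help2man --no-info --name="closed captions and teletext subtitle extractor" \
./build/ccextractor > ${PKG_NAME}/usr/share/man/man1/ccextractor.1 2>/dev/null; then
gzip -9 -n ${PKG_NAME}/usr/share/man/man1/ccextractor.1
else
rm -f ${PKG_NAME}/usr/share/man/man1/ccextractor.1
fi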
# Create control file
if [ "$VARIANT" = "basic" ]; then
PKG_DESCRIPTION="CCExtractor - closed captions and teletext subtitle extractor"
else
PKG_DESCRIPTION="CCExtractor (with HardSubX) - closed captions and teletext subtitle extractor"
fi
INSTALLED_SIZE=$(du -sk ${PKG_NAME}/usr | cut -f1)
# Determine dependencies based on build variant (Ubuntu 24.04)
if [ "$VARIANT" = "hardsubx" ]; then
DEPENDS="libc6, libtesseract5, liblept5, libcurl3t64-gnutls, libavcodec60, libavformat60, libavutil58, libswscale7, libavdevice60, libswresample4, libavfilter9"
else
DEPENDS="libc6, libtesseract5, liblept5, libcurl3t64-gnutls"
fi
cat > ${PKG_NAME}/DEBIAN/control << CTRL
Package: ccextractor
Version: ${VERSION}
Section: utils
Priority: optional
Architecture: amd64
Installed-Size: ${INSTALLED_SIZE}
Depends: ${DEPENDS}
Maintainer: CCExtractor Development Team <carlos@ccextractor.org>
Homepage: https://www.ccextractor.org
Description: ${PKG_DESCRIPTION}
CCExtractor is a tool that extracts closed captions and teletext subtitles
from video files and streams. It supports a wide variety of input formats
including MPEG, H.264/AVC, H.265/HEVC, MP4, MKV, WTV, and transport streams.
.
This package includes a bundled GPAC library for MP4 support.
CTRL
# Remove leading spaces from control file
sed -i 's/^ //' ${PKG_NAME}/DEBIAN/control
# Create postinst to update library cache
cat > ${PKG_NAME}/DEBIAN/postinst << 'POSTINST'
#!/bin/sh
set -e
ldconfig
POSTINST
chmod 755 ${PKG_NAME}/DEBIAN/postinst
# Create postrm to update library cache
cat > ${PKG_NAME}/DEBIAN/postrm << 'POSTRM'
#!/bin/sh
set -e
ldconfig
POSTRM
chmod 755 ${PKG_NAME}/DEBIAN/postrm
# Set permissions
chmod 755 ${PKG_NAME}/usr/bin/ccextractor
chmod 755 ${PKG_NAME}/usr/lib/ccextractor
find ${PKG_NAME}/usr/lib/ccextractor -name "*.so*" -exec chmod 644 {} \;
# Build the .deb
dpkg-deb --build --root-owner-group ${PKG_NAME}
echo "deb_name=${PKG_NAME}.deb" >> $GITHUB_OUTPUT
- name: Test .deb package
if: steps.should_build.outputs.should_build == 'true'
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
PKG_NAME="ccextractor_${VERSION}_amd64"
else
PKG_NAME="ccextractor-${VARIANT}_${VERSION}_amd64"
fi
# Install and test (apt handles dependencies automatically)
sudo apt-get update
sudo apt-get install -y ./${PKG_NAME}.deb
ccextractor --version
- name: Get .deb filename
if: steps.should_build.outputs.should_build == 'true'
id: deb_name
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
echo "name=ccextractor_${VERSION}_amd64.deb" >> $GITHUB_OUTPUT
else
echo "name=ccextractor-${VARIANT}_${VERSION}_amd64.deb" >> $GITHUB_OUTPUT
fi
- name: Upload .deb artifact
if: steps.should_build.outputs.should_build == 'true'
uses: actions/upload-artifact@v6
with:
name: ${{ steps.deb_name.outputs.name }}
path: ${{ steps.deb_name.outputs.name }}
- name: Upload to Release
if: steps.should_build.outputs.should_build == 'true' && github.event_name == 'release'
uses: softprops/action-gh-release@v2
with:
files: ${{ steps.deb_name.outputs.name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
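
The "Get version" step above picks the version from the release tag when there is one and from the `VERSION` define in the header otherwise, falling back to `0.96`. A minimal sketch of that rule follows. The function name `resolve_version` and its arguments are assumptions made for illustration, and the sketch deliberately swaps the workflow's `grep -oP` for a plain `sed` expression so that it also runs where PCRE support in `grep` is missing.

```shell
# resolve_version EVENT TAG HEADER_FILE
# Illustrative only; the workflow inlines this logic and uses grep -oP.
resolve_version() {
    event=$1
    tag=$2
    header=$3
    if [ "$event" = "release" ]; then
        # Strip a leading "v" from the tag, if present.
        echo "${tag#v}"
    else
        # Take the first VERSION define from the header, if any.
        v=$(sed -n 's/.*#define VERSION "\([^"]*\)".*/\1/p' "$header" 2>/dev/null | head -n 1)
        # Fall back to 0.96 when the header is missing or has no define.
        echo "${v:-0.96}"
    fi
}

# Demonstration with a temporary header file.
hdr=$(mktemp)
printf '#define VERSION "0.96.5"\n' > "$hdr"
resolve_version release v0.96.5 "$hdr"    # prints "0.96.5"
resolve_version push "" "$hdr"            # prints "0.96.5"
resolve_version push "" /nonexistent      # prints "0.96"
rm -f "$hdr"
```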

275
.github/workflows/build_deb_debian13.yml vendored Normal file

@@ -0,0 +1,275 @@
name: Build Debian 13 .deb Package
on:
# Build on releases
release:
types: [published]
# Allow manual trigger
workflow_dispatch:
inputs:
build_type:
description: 'Build type (all, basic, hardsubx)'
required: false
default: 'all'
# Build on pushes to workflow file for testing
push:
paths:
- '.github/workflows/build_deb_debian13.yml'
jobs:
build-deb:
runs-on: ubuntu-latest
container:
image: debian:trixie
strategy:
fail-fast: false
matrix:
build_type: [basic, hardsubx]
steps:
- name: Check if should build this variant
id: should_build
run: |
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
INPUT_TYPE="${{ github.event.inputs.build_type }}"
if [ "$INPUT_TYPE" = "all" ] || [ "$INPUT_TYPE" = "${{ matrix.build_type }}" ]; then
echo "should_build=true" >> $GITHUB_OUTPUT
else
echo "should_build=false" >> $GITHUB_OUTPUT
fi
else
echo "should_build=true" >> $GITHUB_OUTPUT
fi
- name: Install git and dependencies for checkout
if: steps.should_build.outputs.should_build == 'true'
run: |
apt-get update
apt-get install -y git ca-certificates
- name: Checkout repository
if: steps.should_build.outputs.should_build == 'true'
uses: actions/checkout@v6
- name: Get version
if: steps.should_build.outputs.should_build == 'true'
id: version
run: |
# Extract version from source or use tag
if [ "${{ github.event_name }}" = "release" ]; then
VERSION="${{ github.event.release.tag_name }}"
VERSION="${VERSION#v}" # Remove 'v' prefix if present
else
# Extract version from lib_ccx.h (e.g., #define VERSION "0.96.5")
VERSION=$(grep -oP '#define VERSION "\K[^"]+' src/lib_ccx/lib_ccx.h || echo "0.96")
fi
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "Building version: $VERSION"
- name: Install base dependencies
if: steps.should_build.outputs.should_build == 'true'
run: |
apt-get install -y --no-install-recommends \
build-essential \
cmake \
pkg-config \
zlib1g-dev \
libpng-dev \
libjpeg-dev \
libfreetype-dev \
libxml2-dev \
libcurl4-gnutls-dev \
libssl-dev \
clang \
libclang-dev \
tesseract-ocr \
libtesseract-dev \
libleptonica-dev \
patchelf \
curl
- name: Install FFmpeg dependencies (HardSubX)
if: steps.should_build.outputs.should_build == 'true' && matrix.build_type == 'hardsubx'
run: |
apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libswscale-dev \
libswresample-dev \
libavfilter-dev \
libavdevice-dev
- name: Install Rust toolchain
if: steps.should_build.outputs.should_build == 'true'
run: |
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
echo "$HOME/.cargo/bin" >> $GITHUB_PATH
- name: Build GPAC
if: steps.should_build.outputs.should_build == 'true'
run: |
git clone -b abi-16.4 --depth 1 https://github.com/gpac/gpac
cd gpac
./configure --prefix=/usr
make -j$(nproc)
make install-lib
ldconfig
- name: Build CCExtractor
if: steps.should_build.outputs.should_build == 'true'
run: |
export PATH="$HOME/.cargo/bin:$PATH"
mkdir build && cd build
if [ "${{ matrix.build_type }}" = "hardsubx" ]; then
cmake ../src -DCMAKE_BUILD_TYPE=Release -DWITH_OCR=ON -DWITH_HARDSUBX=ON
else
cmake ../src -DCMAKE_BUILD_TYPE=Release -DWITH_OCR=ON
fi
make -j$(nproc)
- name: Test build
if: steps.should_build.outputs.should_build == 'true'
run: ./build/ccextractor --version
- name: Create .deb package structure
if: steps.should_build.outputs.should_build == 'true'
id: create_deb
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
PKG_NAME="ccextractor_${VERSION}_debian13_amd64"
else
PKG_NAME="ccextractor-${VARIANT}_${VERSION}_debian13_amd64"
fi
mkdir -p ${PKG_NAME}/DEBIAN
mkdir -p ${PKG_NAME}/usr/bin
mkdir -p ${PKG_NAME}/usr/lib/ccextractor
mkdir -p ${PKG_NAME}/usr/share/doc/ccextractor
mkdir -p ${PKG_NAME}/usr/share/man/man1
# Copy binary
cp build/ccextractor ${PKG_NAME}/usr/bin/
# Copy GPAC library
cp /usr/lib/libgpac.so* ${PKG_NAME}/usr/lib/ccextractor/
# Set rpath so ccextractor finds bundled libgpac
patchelf --set-rpath '/usr/lib/ccextractor:$ORIGIN/../lib/ccextractor' ${PKG_NAME}/usr/bin/ccextractor
# Copy documentation
cp docs/CHANGES.TXT ${PKG_NAME}/usr/share/doc/ccextractor/changelog
cp LICENSE.txt ${PKG_NAME}/usr/share/doc/ccextractor/copyright
gzip -9 -n ${PKG_NAME}/usr/share/doc/ccextractor/changelog
# Create control file
if [ "$VARIANT" = "basic" ]; then
PKG_DESCRIPTION="CCExtractor - closed captions and teletext subtitle extractor"
else
PKG_DESCRIPTION="CCExtractor (with HardSubX) - closed captions and teletext subtitle extractor"
fi
INSTALLED_SIZE=$(du -sk ${PKG_NAME}/usr | cut -f1)
# Determine dependencies based on build variant (Debian 13 Trixie)
if [ "$VARIANT" = "hardsubx" ]; then
DEPENDS="libc6, libtesseract5, libleptonica6, libcurl3t64-gnutls, libavcodec61, libavformat61, libavutil59, libswscale8, libavdevice61, libswresample5, libavfilter10"
else
DEPENDS="libc6, libtesseract5, libleptonica6, libcurl3t64-gnutls"
fi
cat > ${PKG_NAME}/DEBIAN/control << CTRL
Package: ccextractor
Version: ${VERSION}
Section: utils
Priority: optional
Architecture: amd64
Installed-Size: ${INSTALLED_SIZE}
Depends: ${DEPENDS}
Maintainer: CCExtractor Development Team <carlos@ccextractor.org>
Homepage: https://www.ccextractor.org
Description: ${PKG_DESCRIPTION}
CCExtractor is a tool that extracts closed captions and teletext subtitles
from video files and streams. It supports a wide variety of input formats
including MPEG, H.264/AVC, H.265/HEVC, MP4, MKV, WTV, and transport streams.
.
This package includes a bundled GPAC library for MP4 support.
Built for Debian 13 (Trixie).
CTRL
# Remove leading spaces from control file
sed -i 's/^ //' ${PKG_NAME}/DEBIAN/control
# Create postinst to update library cache
cat > ${PKG_NAME}/DEBIAN/postinst << 'POSTINST'
#!/bin/sh
set -e
ldconfig
POSTINST
chmod 755 ${PKG_NAME}/DEBIAN/postinst
# Create postrm to update library cache
cat > ${PKG_NAME}/DEBIAN/postrm << 'POSTRM'
#!/bin/sh
set -e
ldconfig
POSTRM
chmod 755 ${PKG_NAME}/DEBIAN/postrm
# Set permissions
chmod 755 ${PKG_NAME}/usr/bin/ccextractor
chmod 755 ${PKG_NAME}/usr/lib/ccextractor
find ${PKG_NAME}/usr/lib/ccextractor -name "*.so*" -exec chmod 644 {} \;
# Build the .deb
dpkg-deb --build --root-owner-group ${PKG_NAME}
echo "deb_name=${PKG_NAME}.deb" >> $GITHUB_OUTPUT
- name: Test .deb package
if: steps.should_build.outputs.should_build == 'true'
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
PKG_NAME="ccextractor_${VERSION}_debian13_amd64"
else
PKG_NAME="ccextractor-${VARIANT}_${VERSION}_debian13_amd64"
fi
# Install and test (apt handles dependencies automatically)
apt-get update
apt-get install -y ./${PKG_NAME}.deb
ccextractor --version
- name: Get .deb filename
if: steps.should_build.outputs.should_build == 'true'
id: deb_name
run: |
VERSION="${{ steps.version.outputs.version }}"
VARIANT="${{ matrix.build_type }}"
if [ "$VARIANT" = "basic" ]; then
echo "name=ccextractor_${VERSION}_debian13_amd64.deb" >> $GITHUB_OUTPUT
else
echo "name=ccextractor-${VARIANT}_${VERSION}_debian13_amd64.deb" >> $GITHUB_OUTPUT
fi
- name: Upload .deb artifact
if: steps.should_build.outputs.should_build == 'true'
uses: actions/upload-artifact@v6
with:
name: ${{ steps.deb_name.outputs.name }}
path: ${{ steps.deb_name.outputs.name }}
- name: Upload to Release
if: steps.should_build.outputs.should_build == 'true' && github.event_name == 'release'
uses: softprops/action-gh-release@v2
with:
files: ${{ steps.deb_name.outputs.name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
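
Both .deb workflows derive the package name from the variant and version in three separate steps (create, test, and "Get .deb filename"). The sketch below shows one way the naming rule could be expressed once. The function `deb_basename` and its optional distribution suffix argument are hypothetical and do not exist in the repository.

```shell
# deb_basename VARIANT VERSION [SUFFIX]
# Hypothetical helper mirroring the naming rule used in the workflows above.
deb_basename() {
    variant=$1
    version=$2
    # Expands to "_SUFFIX" when a third argument is given, otherwise to nothing.
    suffix=${3:+_$3}
    if [ "$variant" = "basic" ]; then
        echo "ccextractor_${version}${suffix}_amd64"
    else
        echo "ccextractor-${variant}_${version}${suffix}_amd64"
    fi
}

deb_basename basic 0.96.5                 # prints "ccextractor_0.96.5_amd64"
deb_basename hardsubx 0.96.5 debian13     # prints "ccextractor-hardsubx_0.96.5_debian13_amd64"
```

Keeping the rule in one place would avoid the three copies drifting apart if the naming scheme ever changes.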

96
.github/workflows/build_docker.yml vendored Normal file

@@ -0,0 +1,96 @@
name: Build CCExtractor Docker Images
on:
workflow_dispatch:
push:
paths:
- '.github/workflows/build_docker.yml'
- 'docker/**'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- 'src/rust/**'
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.github/workflows/build_docker.yml'
- 'docker/**'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- 'src/rust/**'
jobs:
build_minimal:
name: Docker build (minimal)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build minimal image
uses: docker/build-push-action@v6
with:
context: .
file: docker/Dockerfile
build-args: |
BUILD_TYPE=minimal
USE_LOCAL_SOURCE=1
tags: ccextractor:minimal
load: true
cache-from: type=gha,scope=docker-minimal
cache-to: type=gha,mode=max,scope=docker-minimal
- name: Test minimal image
run: |
docker run --rm ccextractor:minimal --version
echo "Minimal build successful"
build_ocr:
name: Docker build (ocr)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build OCR image
uses: docker/build-push-action@v6
with:
context: .
file: docker/Dockerfile
build-args: |
BUILD_TYPE=ocr
USE_LOCAL_SOURCE=1
tags: ccextractor:ocr
load: true
cache-from: type=gha,scope=docker-ocr
cache-to: type=gha,mode=max,scope=docker-ocr
- name: Test OCR image
run: |
docker run --rm ccextractor:ocr --version
echo "OCR build successful"
build_hardsubx:
name: Docker build (hardsubx)
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v3
- name: Build HardSubX image
uses: docker/build-push-action@v6
with:
context: .
file: docker/Dockerfile
build-args: |
BUILD_TYPE=hardsubx
USE_LOCAL_SOURCE=1
tags: ccextractor:hardsubx
load: true
cache-from: type=gha,scope=docker-hardsubx
cache-to: type=gha,mode=max,scope=docker-hardsubx
- name: Test HardSubX image
run: |
docker run --rm ccextractor:hardsubx --version
echo "HardSubX build successful"

117
.github/workflows/build_linux.yml vendored Normal file

@@ -0,0 +1,117 @@
name: Build CCExtractor on Linux
on:
workflow_dispatch:
push:
paths:
- '.github/workflows/build_linux.yml'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- '**Makefile**'
- 'linux/**'
- 'package_creators/**'
- 'src/rust/**'
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.github/workflows/build_linux.yml'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- '**Makefile**'
- 'linux/**'
- 'package_creators/**'
- 'src/rust/**'
jobs:
build_shell:
runs-on: ubuntu-latest
steps:
- name: Install dependencies
run: sudo apt update && sudo apt-get install libgpac-dev libtesseract-dev libavcodec-dev libavdevice-dev libx11-dev libxcb1-dev libxcb-shm0-dev
- uses: actions/checkout@v6
- name: build
run: ./build -hardsubx
working-directory: ./linux
- name: Display version information
run: ./ccextractor --version
working-directory: ./linux
- name: Prepare artifacts
run: mkdir ./linux/artifacts
- name: Copy release artifact
run: cp ./linux/ccextractor ./linux/artifacts/
- uses: actions/upload-artifact@v6
with:
name: CCExtractor Linux build
path: ./linux/artifacts
build_autoconf:
runs-on: ubuntu-latest
steps:
- name: Install dependencies
run: sudo apt update && sudo apt-get install libgpac-dev
- uses: actions/checkout@v6
- name: run autogen
run: ./autogen.sh
working-directory: ./linux
- name: configure
run: ./configure --enable-debug
working-directory: ./linux
- name: make
run: make
working-directory: ./linux
- name: Display version information
run: ./ccextractor --version
working-directory: ./linux
cmake:
runs-on: ubuntu-latest
steps:
- name: Install dependencies
run: sudo apt update && sudo apt-get install libgpac-dev
- uses: actions/checkout@v6
- name: cmake
run: mkdir build && cd build && cmake ../src
- name: build
run: make -j$(nproc)
working-directory: build
- name: Display version information
run: ./build/ccextractor --version
cmake_ocr_hardsubx:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Install dependencies
run: sudo apt update && sudo apt install libgpac-dev libtesseract-dev libavformat-dev libavdevice-dev libswscale-dev yasm
- name: cmake
run: |
mkdir build && cd build
cmake -DWITH_OCR=ON -DWITH_HARDSUBX=ON ../src
- name: build
run: |
make -j$(nproc)
working-directory: build
- name: Display version information
run: ./build/ccextractor --version
build_rust:
runs-on: ubuntu-latest
steps:
- name: Install dependencies
run: sudo apt update && sudo apt-get install libgpac-dev
- uses: actions/checkout@v6
- name: cache
uses: actions/cache@v5
with:
path: |
src/rust/.cargo/registry
src/rust/.cargo/git
src/rust/target
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: ${{ runner.os }}-cargo-
- uses: actions-rs/toolchain@v1
with:
toolchain: stable
override: true
- name: build
run: cargo build
working-directory: ./src/rust


@@ -0,0 +1,154 @@
name: Build Linux (System Libs)
on:
# Build on releases
release:
types: [published]
# Allow manual trigger
workflow_dispatch:
inputs:
build_type:
description: 'Build type (all, basic, hardsubx)'
required: false
default: 'all'
# Build on pushes to workflow file for testing
push:
paths:
- '.github/workflows/build_linux_systemlibs.yml'
- 'linux/build'
permissions:
contents: write
jobs:
build-systemlibs:
runs-on: ubuntu-22.04
strategy:
fail-fast: false
matrix:
build_type: [basic, hardsubx]
steps:
- name: Check if should build this variant
id: should_build
run: |
if [ "${{ github.event_name }}" = "workflow_dispatch" ]; then
INPUT_TYPE="${{ github.event.inputs.build_type }}"
if [ "$INPUT_TYPE" = "all" ] || [ "$INPUT_TYPE" = "${{ matrix.build_type }}" ]; then
echo "should_build=true" >> $GITHUB_OUTPUT
else
echo "should_build=false" >> $GITHUB_OUTPUT
fi
else
echo "should_build=true" >> $GITHUB_OUTPUT
fi
- name: Checkout repository
if: steps.should_build.outputs.should_build == 'true'
uses: actions/checkout@v6
- name: Install base dependencies
if: steps.should_build.outputs.should_build == 'true'
run: |
sudo apt-get update
sudo apt-get install -y --no-install-recommends \
build-essential \
pkg-config \
zlib1g-dev \
libpng-dev \
libfreetype-dev \
libutf8proc-dev \
libgpac-dev \
libtesseract-dev \
libleptonica-dev \
tesseract-ocr-eng \
clang \
libclang-dev
- name: Install FFmpeg dependencies (HardSubX)
if: steps.should_build.outputs.should_build == 'true' && matrix.build_type == 'hardsubx'
run: |
sudo apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libswscale-dev \
libswresample-dev \
libavfilter-dev \
libavdevice-dev \
libxcb1-dev \
libxcb-shm0-dev \
libx11-dev \
liblzma-dev
- name: Install Rust toolchain
if: steps.should_build.outputs.should_build == 'true'
uses: dtolnay/rust-toolchain@stable
- name: Build with system libraries
if: steps.should_build.outputs.should_build == 'true'
run: |
cd linux
if [ "${{ matrix.build_type }}" = "hardsubx" ]; then
./build -system-libs -hardsubx
else
./build -system-libs
fi
- name: Verify build
if: steps.should_build.outputs.should_build == 'true'
run: |
./linux/ccextractor --version
echo "=== Library dependencies ==="
ldd ./linux/ccextractor | grep -E 'freetype|png|utf8proc|tesseract|leptonica' || true
- name: Get output name
if: steps.should_build.outputs.should_build == 'true'
id: output_name
run: |
case "${{ matrix.build_type }}" in
basic)
echo "name=ccextractor-linux-systemlibs-x86_64" >> $GITHUB_OUTPUT
;;
hardsubx)
echo "name=ccextractor-linux-systemlibs-hardsubx-x86_64" >> $GITHUB_OUTPUT
;;
esac
- name: Package binary
if: steps.should_build.outputs.should_build == 'true'
run: |
mkdir -p package
cp linux/ccextractor package/
# Create a simple README for the package
cat > package/README.txt << 'EOF'
CCExtractor - System Libraries Build
=====================================
This build uses system libraries (dynamic linking).
Required system packages (Debian/Ubuntu):
sudo apt install libgpac12 libtesseract5 libleptonica6 \
libpng16-16 libfreetype6 libutf8proc3
For HardSubX builds, also install:
sudo apt install libavcodec60 libavformat60 libswscale7 libavfilter9
Run with: ./ccextractor --help
EOF
tar -czvf ${{ steps.output_name.outputs.name }}.tar.gz -C package .
- name: Upload artifact
if: steps.should_build.outputs.should_build == 'true'
uses: actions/upload-artifact@v6
with:
name: ${{ steps.output_name.outputs.name }}
path: ${{ steps.output_name.outputs.name }}.tar.gz
- name: Upload to Release
if: steps.should_build.outputs.should_build == 'true' && github.event_name == 'release'
uses: softprops/action-gh-release@v2
with:
files: ${{ steps.output_name.outputs.name }}.tar.gz
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

177
.github/workflows/build_mac.yml vendored Normal file

@@ -0,0 +1,177 @@
name: Build CCExtractor on Mac
on:
workflow_dispatch:
push:
paths:
- '.github/workflows/build_mac.yml'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- '**Makefile**'
- 'mac/**'
- 'package_creators/**'
- 'src/rust/**'
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.github/workflows/build_mac.yml'
- '**.c'
- '**.h'
- '**CMakeLists.txt'
- '**.cmake'
- '**Makefile**'
- 'mac/**'
- 'package_creators/**'
- 'src/rust/**'
jobs:
build_shell:
runs-on: macos-latest
steps:
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac
- uses: actions/checkout@v6
- name: build
run: ./build.command
working-directory: ./mac
- name: Display version information
run: ./ccextractor --version
working-directory: ./mac
- name: Prepare artifacts
run: mkdir ./mac/artifacts
- name: Copy release artifact
run: cp ./mac/ccextractor ./mac/artifacts/
- uses: actions/upload-artifact@v6
with:
name: CCExtractor mac build
path: ./mac/artifacts
build_shell_system_libs:
# Test building with system libraries via pkg-config (for Homebrew formula compatibility)
runs-on: macos-latest
steps:
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac freetype libpng protobuf-c utf8proc zlib
- uses: actions/checkout@v6
- name: build with system libs
run: ./build.command -system-libs
working-directory: ./mac
- name: Display version information
run: ./ccextractor --version
working-directory: ./mac
build_autoconf:
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool gpac
- name: run autogen
run: ./autogen.sh
working-directory: ./mac
- name: configure
run: ./configure --enable-debug
working-directory: ./mac
- name: make
run: make
working-directory: ./mac
- name: Display version information
run: ./ccextractor --version
working-directory: ./mac
cmake:
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- name: dependencies
run: brew install gpac
- name: cmake
run: mkdir build && cd build && cmake ../src
- name: build
run: make -j"$(sysctl -n hw.ncpu)"
working-directory: build
- name: Display version information
run: ./build/ccextractor --version
cmake_ocr_hardsubx:
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac ffmpeg
- name: cmake
run: |
mkdir build && cd build
cmake -DWITH_OCR=ON -DWITH_HARDSUBX=ON ../src
- name: build
run: |
make -j"$(sysctl -n hw.ncpu)"
working-directory: build
- name: Display version information
run: ./build/ccextractor --version
build_shell_hardsubx:
# Test build.command with -hardsubx flag (burned-in subtitle extraction)
runs-on: macos-latest
steps:
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac ffmpeg
- uses: actions/checkout@v6
- name: build with hardsubx
run: ./build.command -hardsubx
working-directory: ./mac
- name: Display version information
run: ./ccextractor --version
working-directory: ./mac
- name: Verify hardsubx support
run: |
# Print the help output with -hardsubx set; '|| true' keeps this step informational, so it never fails the job
./ccextractor -hardsubx --help 2>&1 | head -20 || true
working-directory: ./mac
build_autoconf_hardsubx:
# Test autoconf build with HARDSUBX enabled (fixes issue #1173)
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- name: Install dependencies
run: brew install pkg-config autoconf automake libtool tesseract leptonica gpac ffmpeg
- name: run autogen
run: ./autogen.sh
working-directory: ./mac
- name: configure with hardsubx
run: |
# Set Homebrew paths for configure to find libraries
export HOMEBREW_PREFIX="$(brew --prefix)"
export LDFLAGS="-L${HOMEBREW_PREFIX}/lib"
export CPPFLAGS="-I${HOMEBREW_PREFIX}/include"
export PKG_CONFIG_PATH="${HOMEBREW_PREFIX}/lib/pkgconfig"
./configure --enable-hardsubx --enable-ocr
working-directory: ./mac
- name: make
run: make
working-directory: ./mac
- name: Display version information
run: ./ccextractor --version
working-directory: ./mac
- name: Verify hardsubx support
run: |
# Check that -hardsubx is recognized
./ccextractor -hardsubx --help 2>&1 | head -20 || true
working-directory: ./mac
build_rust:
runs-on: macos-latest
steps:
- uses: actions/checkout@v6
- name: cache
uses: actions/cache@v5
with:
path: |
src/rust/.cargo/registry
src/rust/.cargo/git
src/rust/target
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: ${{ runner.os }}-cargo-
- uses: dtolnay/rust-toolchain@stable
- name: build
run: cargo build
working-directory: ./src/rust

51
.github/workflows/build_snap.yml vendored Normal file

@@ -0,0 +1,51 @@
name: Build CCExtractor Snap
on:
workflow_dispatch:
release:
types: [published]
jobs:
build_snap:
name: Build Snap package
runs-on: ubuntu-22.04
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Install snapd
run: |
sudo apt update
sudo apt install -y snapd
- name: Start snapd
run: |
sudo systemctl start snapd.socket
sudo systemctl start snapd
- name: Install Snapcraft
run: |
sudo snap install core22
sudo snap install snapcraft --classic
- name: Show Snapcraft version
run: snapcraft --version
- name: Build snap
run: sudo snapcraft --destructive-mode
- name: List generated snap
run: ls -lh *.snap
- name: Upload snap as workflow artifact
uses: actions/upload-artifact@v6
with:
name: CCExtractor Snap
path: "*.snap"
- name: Upload snap to GitHub Release
if: github.event_name == 'release'
uses: softprops/action-gh-release@v2
with:
files: "*.snap"

142
.github/workflows/build_windows.yml vendored Normal file

@@ -0,0 +1,142 @@
name: Build CCExtractor on Windows
env:
RUSTFLAGS: -Ctarget-feature=+crt-static
VCPKG_DEFAULT_TRIPLET: x64-windows-static
VCPKG_COMMIT: ab2977be50c702126336e5088f4836060733c899
on:
workflow_dispatch:
push:
paths:
- ".github/workflows/build_windows.yml"
- "**.c"
- "**.h"
- "**CMakeLists.txt"
- "**.cmake"
- "windows/**"
- "src/rust/**"
pull_request:
types: [opened, synchronize, reopened]
paths:
- ".github/workflows/build_windows.yml"
- "**.c"
- "**.h"
- "**CMakeLists.txt"
- "**.cmake"
- "windows/**"
- "src/rust/**"
jobs:
build:
runs-on: windows-2022
steps:
- name: Check out repository
uses: actions/checkout@v6
- name: Setup MSBuild.exe
uses: microsoft/setup-msbuild@v2.0.0
with:
msbuild-architecture: x64
# Install GPAC (fast, ~30s, not worth caching complexity)
- name: Install gpac
run: choco install gpac --version 2.4.0 --no-progress
# Use lukka/run-vcpkg for better caching
- name: Setup vcpkg
uses: lukka/run-vcpkg@v11
id: runvcpkg
with:
vcpkgGitCommitId: ${{ env.VCPKG_COMMIT }}
vcpkgDirectory: ${{ github.workspace }}/vcpkg
vcpkgJsonGlob: 'windows/vcpkg.json'
# Cache vcpkg installed packages separately for faster restores
- name: Cache vcpkg installed packages
id: vcpkg-installed-cache
uses: actions/cache@v5
with:
path: ${{ github.workspace }}/vcpkg/installed
key: vcpkg-installed-${{ runner.os }}-${{ env.VCPKG_COMMIT }}-${{ hashFiles('windows/vcpkg.json') }}
restore-keys: |
vcpkg-installed-${{ runner.os }}-${{ env.VCPKG_COMMIT }}-
- name: Install vcpkg dependencies
if: steps.vcpkg-installed-cache.outputs.cache-hit != 'true'
run: ${{ github.workspace }}/vcpkg/vcpkg.exe install --x-install-root ${{ github.workspace }}/vcpkg/installed/
working-directory: windows
# Cache Rust/Cargo artifacts
- name: Cache Cargo registry
uses: actions/cache@v5
with:
path: |
~/.cargo/registry
~/.cargo/git
key: ${{ runner.os }}-cargo-registry-${{ hashFiles('**/Cargo.lock') }}
restore-keys: |
${{ runner.os }}-cargo-registry-
# Cache Cargo build artifacts - rust.bat sets CARGO_TARGET_DIR to windows/
# which results in artifacts at windows/x86_64-pc-windows-msvc/
- name: Cache Cargo build artifacts
uses: actions/cache@v5
with:
path: ${{ github.workspace }}/windows/x86_64-pc-windows-msvc
key: ${{ runner.os }}-cargo-build-${{ hashFiles('**/Cargo.lock') }}-${{ hashFiles('src/rust/**/*.rs') }}
restore-keys: |
${{ runner.os }}-cargo-build-${{ hashFiles('**/Cargo.lock') }}-
${{ runner.os }}-cargo-build-
- name: Setup Rust toolchain
uses: dtolnay/rust-toolchain@stable
- name: Install Win 10 SDK
uses: ilammy/msvc-dev-cmd@v1
# Build Release-Full
- name: Build Release-Full
env:
LIBCLANG_PATH: "C:\\Program Files\\LLVM\\lib"
LLVM_CONFIG_PATH: "C:\\Program Files\\LLVM\\bin\\llvm-config"
BINDGEN_EXTRA_CLANG_ARGS: -fmsc-version=0
VCPKG_ROOT: ${{ github.workspace }}/vcpkg
run: msbuild ccextractor.sln /p:Configuration=Release-Full /p:Platform=x64
working-directory: ./windows
- name: Display Release version information
run: ./ccextractorwinfull.exe --version
working-directory: ./windows/x64/Release-Full
- name: Upload Release artifact
uses: actions/upload-artifact@v6
with:
name: CCExtractor Windows Release build
path: |
./windows/x64/Release-Full/ccextractorwinfull.exe
./windows/x64/Release-Full/*.dll
# Build Debug-Full (reuses cached Cargo artifacts)
- name: Build Debug-Full
env:
LIBCLANG_PATH: "C:\\Program Files\\LLVM\\lib"
LLVM_CONFIG_PATH: "C:\\Program Files\\LLVM\\bin\\llvm-config"
BINDGEN_EXTRA_CLANG_ARGS: -fmsc-version=0
VCPKG_ROOT: ${{ github.workspace }}/vcpkg
run: msbuild ccextractor.sln /p:Configuration=Debug-Full /p:Platform=x64
working-directory: ./windows
- name: Display Debug version information
continue-on-error: true
run: ./ccextractorwinfull.exe --version
working-directory: ./windows/x64/Debug-Full
- name: Upload Debug artifact
uses: actions/upload-artifact@v6
with:
name: CCExtractor Windows Debug build
path: |
./windows/x64/Debug-Full/ccextractorwinfull.exe
./windows/x64/Debug-Full/ccextractorwinfull.pdb
./windows/x64/Debug-Full/*.dll

57
.github/workflows/format.yml vendored Normal file

@@ -0,0 +1,57 @@
name: Format sourcecode
on:
push:
paths:
- '.github/workflows/format.yml'
- 'src/**.c'
- 'src/**.h'
- 'src/rust/**'
tags-ignore: # ignore push via new tag
- '*.*'
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.github/workflows/format.yml'
- 'src/**.c'
- 'src/**.h'
- 'src/rust/**'
jobs:
format:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
- name: Format code
run: |
find src/ -type f -not -path "src/thirdparty/*" -not -path "src/lib_ccx/zvbi/*" -name '*.c' -not -path "src/GUI/icon_data.c" | xargs clang-format -i
git diff-index --quiet HEAD -- || (git diff && exit 1)
format_rust:
runs-on: ubuntu-latest
strategy:
matrix:
workdir: ['./src/rust', './src/rust/lib_ccxr']
defaults:
run:
working-directory: ${{ matrix.workdir }}
steps:
- uses: actions/checkout@v6
- name: cache
uses: actions/cache@v5
with:
path: |
${{ matrix.workdir }}/.cargo/registry
${{ matrix.workdir }}/.cargo/git
${{ matrix.workdir }}/target
key: ${{ runner.os }}-cargo-${{ hashFiles(format('{0}/Cargo.lock', matrix.workdir)) }}
restore-keys: ${{ runner.os }}-cargo-
- uses: dtolnay/rust-toolchain@stable
with:
components: rustfmt, clippy
- name: dependencies
run: sudo apt update && sudo apt install libtesseract-dev libavformat-dev libavdevice-dev libswscale-dev yasm
- name: rustfmt
run: cargo fmt --all -- --check
- name: clippy
run: |
cargo clippy -- -D warnings

15
.github/workflows/homebrew.yml vendored Normal file

@@ -0,0 +1,15 @@
name: Bump Homebrew Formula
on:
release:
types: [published]
jobs:
homebrew:
runs-on: ubuntu-latest
steps:
- name: Update Homebrew formula
uses: dawidd6/action-homebrew-bump-formula@v7
with:
token: ${{ secrets.HOMEBREW_GITHUB_API_TOKEN }}
formula: ccextractor

136
.github/workflows/publish_chocolatey.yml vendored Normal file

@@ -0,0 +1,136 @@
# Publish to Chocolatey Community Repository
#
# PREREQUISITES:
# 1. Create a Chocolatey account at https://community.chocolatey.org/account/Register
# 2. Get your API key from https://community.chocolatey.org/account
# 3. Add the API key as repository secret: CHOCOLATEY_API_KEY
#
# Reference: https://docs.chocolatey.org/en-us/create/create-packages-quick-start
name: Publish to Chocolatey
on:
release:
types: [released]
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag to publish (e.g., v0.96.1)'
required: true
type: string
jobs:
publish:
runs-on: windows-latest
steps:
- name: Checkout repository
uses: actions/checkout@v6
- name: Get version from tag
id: version
shell: bash
run: |
TAG="${{ github.event.inputs.release_tag || github.event.release.tag_name }}"
# Strip 'v' prefix if present
VERSION="${TAG#v}"
echo "version=$VERSION" >> $GITHUB_OUTPUT
echo "tag=$TAG" >> $GITHUB_OUTPUT
- name: Download MSI from release
shell: pwsh
run: |
$version = "${{ steps.version.outputs.version }}"
$tag = "${{ steps.version.outputs.tag }}"
$msiUrl = "https://github.com/CCExtractor/ccextractor/releases/download/$tag/CCExtractor.$version.msi"
Write-Host "Downloading MSI from: $msiUrl"
Invoke-WebRequest -Uri $msiUrl -OutFile "CCExtractor.msi"
# Calculate SHA256 checksum
$hash = (Get-FileHash -Path "CCExtractor.msi" -Algorithm SHA256).Hash
Write-Host "SHA256: $hash"
echo "MSI_CHECKSUM=$hash" >> $env:GITHUB_ENV
- name: Update nuspec version
shell: pwsh
run: |
$version = "${{ steps.version.outputs.version }}"
$nuspecPath = "packaging/chocolatey/ccextractor.nuspec"
$content = Get-Content $nuspecPath -Raw
$content = $content -replace '<version>.*</version>', "<version>$version</version>"
Set-Content -Path $nuspecPath -Value $content
Write-Host "Updated nuspec to version $version"
- name: Update install script
shell: pwsh
run: |
$version = "${{ steps.version.outputs.version }}"
$tag = "${{ steps.version.outputs.tag }}"
$checksum = $env:MSI_CHECKSUM
$installScript = "packaging/chocolatey/tools/chocolateyInstall.ps1"
$content = Get-Content $installScript -Raw
# Update URL
$newUrl = "https://github.com/CCExtractor/ccextractor/releases/download/$tag/CCExtractor.$version.msi"
$content = $content -replace "url64bit\s*=\s*'[^']*'", "url64bit = '$newUrl'"
# Update checksum
$content = $content -replace "checksum64\s*=\s*'[^']*'", "checksum64 = '$checksum'"
Set-Content -Path $installScript -Value $content
Write-Host "Updated install script with URL and checksum"
- name: Build Chocolatey package
shell: pwsh
run: |
cd packaging/chocolatey
choco pack ccextractor.nuspec
# List the generated package
Get-ChildItem *.nupkg
- name: Test package locally
shell: pwsh
run: |
cd packaging/chocolatey
$nupkg = Get-ChildItem *.nupkg | Select-Object -First 1
Write-Host "Testing package: $($nupkg.Name)"
# Install from local package
choco install ccextractor --source="'.;https://community.chocolatey.org/api/v2/'" --yes --force
# Verify installation
$ccx = Get-Command ccextractor -ErrorAction SilentlyContinue
if ($ccx) {
Write-Host "CCExtractor found at: $($ccx.Source)"
& ccextractor --version
} else {
Write-Host "CCExtractor not found in PATH, checking Program Files..."
$exePath = Join-Path $env:ProgramFiles "CCExtractor\ccextractor.exe"
if (Test-Path $exePath) {
& $exePath --version
}
}
- name: Push to Chocolatey
shell: pwsh
env:
CHOCOLATEY_API_KEY: ${{ secrets.CHOCOLATEY_API_KEY }}
run: |
cd packaging/chocolatey
$nupkg = Get-ChildItem *.nupkg | Select-Object -First 1
Write-Host "Pushing $($nupkg.Name) to Chocolatey..."
choco push $nupkg.Name --source="https://push.chocolatey.org/" --api-key="$env:CHOCOLATEY_API_KEY"
Write-Host "Package submitted to Chocolatey! It may take some time to be moderated and published."
- name: Upload package artifact
uses: actions/upload-artifact@v6
with:
name: chocolatey-package
path: packaging/chocolatey/*.nupkg

38
.github/workflows/publish_winget.yml vendored Normal file

@@ -0,0 +1,38 @@
# Publish to Windows Package Manager (winget)
#
# PREREQUISITES:
# 1. CCExtractor must already have ONE version in winget-pkgs before this works
# - Submit the initial manifest manually from packaging/winget/
# - PR to: https://github.com/microsoft/winget-pkgs
#
# 2. Create a fork of microsoft/winget-pkgs under the CCExtractor organization
# - https://github.com/CCExtractor/winget-pkgs (needs to be created)
#
# 3. Create a GitHub Personal Access Token (classic) with 'public_repo' scope
# - Add as repository secret: WINGET_TOKEN
#
# Reference: https://github.com/vedantmgoyal9/winget-releaser
name: Publish to WinGet
on:
release:
types: [released]
workflow_dispatch:
inputs:
release_tag:
description: 'Release tag to publish (e.g., v0.96.1)'
required: true
type: string
jobs:
publish:
runs-on: windows-latest
steps:
- name: Publish to WinGet
uses: vedantmgoyal9/winget-releaser@v2
with:
identifier: CCExtractor.CCExtractor
installers-regex: '\.msi$' # Only use the MSI installer
token: ${{ secrets.WINGET_TOKEN }}
release-tag: ${{ github.event.inputs.release_tag || github.event.release.tag_name }}

137
.github/workflows/release.yml vendored Normal file

@@ -0,0 +1,137 @@
name: Upload releases
on:
release:
types:
- created
permissions:
contents: write
env:
RUSTFLAGS: -Ctarget-feature=+crt-static
VCPKG_DEFAULT_TRIPLET: x64-windows-static
VCPKG_DEFAULT_BINARY_CACHE: C:\vcpkg\.cache
VCPKG_COMMIT: ab2977be50c702126336e5088f4836060733c899
jobs:
build_windows:
runs-on: windows-2022
steps:
- name: Check out repository
uses: actions/checkout@v6
- name: Get the version
id: get_version
run: |
# Extract version from tag, strip 'v' prefix and everything after first dash
VERSION=${GITHUB_REF/refs\/tags\/v/}
VERSION=${VERSION%%-*}
# Save display version for filenames (e.g., 0.96.1)
echo "DISPLAY_VERSION=$VERSION" >> "$GITHUB_OUTPUT"
# Count dots to determine version format
DOTS="${VERSION//[^.]}"
PART_COUNT=$((${#DOTS} + 1))
# MSI requires 4-part version (major.minor.build.revision)
if [ "$PART_COUNT" -eq 2 ]; then
MSI_VERSION="${VERSION}.0.0"
elif [ "$PART_COUNT" -eq 3 ]; then
MSI_VERSION="${VERSION}.0"
else
MSI_VERSION="${VERSION}"
fi
echo "VERSION=$MSI_VERSION" >> "$GITHUB_OUTPUT"
shell: bash
- name: Setup MSBuild.exe
uses: microsoft/setup-msbuild@v2.0.0
with:
msbuild-architecture: x64
- name: Install gpac
run: choco install gpac --version 2.4.0
- name: Setup vcpkg
run: mkdir C:\vcpkg\.cache
- name: Cache vcpkg
id: cache
uses: actions/cache@v5
with:
path: |
C:\vcpkg\.cache
key: vcpkg-${{ runner.os }}-${{ env.VCPKG_COMMIT }}
- name: Build vcpkg
run: |
git clone https://github.com/microsoft/vcpkg
./vcpkg/bootstrap-vcpkg.bat
- name: Install dependencies
run: ${{ github.workspace }}/vcpkg/vcpkg.exe install --x-install-root ${{ github.workspace }}/vcpkg/installed/
working-directory: windows
- uses: dtolnay/rust-toolchain@stable
- name: Install Win 10 SDK
uses: ilammy/msvc-dev-cmd@v1
- name: build Release-Full
env:
LIBCLANG_PATH: "C:\\Program Files\\LLVM\\lib"
LLVM_CONFIG_PATH: "C:\\Program Files\\LLVM\\bin\\llvm-config"
CARGO_TARGET_DIR: "..\\..\\windows"
BINDGEN_EXTRA_CLANG_ARGS: -fmsc-version=0
VCPKG_ROOT: ${{ github.workspace }}/vcpkg
run: msbuild ccextractor.sln /p:Configuration=Release-Full /p:Platform=x64
working-directory: ./windows
- name: Copy files to directory for installer
run: mkdir installer; cp ./x64/Release-Full/ccextractorwinfull.exe ./installer; cp ./x64/Release-Full/*.dll ./installer
working-directory: ./windows
- name: Download tessdata for OCR support
run: |
mkdir -p ./installer/tessdata
# Download English traineddata from tessdata_fast (smaller, faster, good for most use cases)
Invoke-WebRequest -Uri "https://github.com/tesseract-ocr/tessdata_fast/raw/main/eng.traineddata" -OutFile "./installer/tessdata/eng.traineddata"
# Download OSD (Orientation and Script Detection) for automatic script detection
Invoke-WebRequest -Uri "https://github.com/tesseract-ocr/tessdata_fast/raw/main/osd.traineddata" -OutFile "./installer/tessdata/osd.traineddata"
working-directory: ./windows
- name: install WiX
run: dotnet tool uninstall --global wix; dotnet tool install --global wix --version 6.0.2 && wix extension add -g WixToolset.UI.wixext/6.0.2
- name: Make sure WiX works
run: wix --version && wix extension list -g
- name: Download Flutter GUI
run: ((Invoke-WebRequest -UseBasicParsing https://api.github.com/repos/CCExtractor/ccextractorfluttergui/releases/latest).Content | ConvertFrom-Json).assets | ForEach-Object {if ($_.name -eq "windows.zip") { Invoke-WebRequest -UseBasicParsing -Uri $_.browser_download_url -OutFile windows.zip}}
working-directory: ./windows
- name: Display contents of dir
run: ls
working-directory: ./windows
- name: Unzip Flutter GUI
run: Expand-Archive -Path ./windows.zip -DestinationPath ./installer -Force
working-directory: ./windows
- name: Display installer folder contents
run: Get-ChildItem -Recurse ./installer
working-directory: ./windows
- name: Create portable zip
run: Compress-Archive -Path ./installer/* -DestinationPath ./CCExtractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}_win_portable.zip
working-directory: ./windows
- name: Build installer
run: wix build -arch x64 -ext WixToolset.UI.wixext -d "AppVersion=${{ steps.get_version.outputs.VERSION }}" -o CCExtractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}.msi installer.wxs CustomUI.wxs
working-directory: ./windows
- name: Upload as asset
uses: AButler/upload-release-assets@v3.0
with:
files: './windows/CCExtractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}.msi;./windows/CCExtractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}_win_portable.zip'
repo-token: ${{ secrets.GITHUB_TOKEN }}
create_linux_package:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v6
with:
path: ./ccextractor
- name: Get the version
id: get_version
run: |
VERSION=${GITHUB_REF/refs\/tags\/v/}
VERSION=${VERSION%%-*}
echo "DISPLAY_VERSION=$VERSION" >> "$GITHUB_OUTPUT"
- name: Create .tar.gz without git and windows folders
run: tar -pczf ./ccextractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}.tar.gz --exclude "ccextractor/windows" --exclude "ccextractor/.git" ccextractor
- name: Upload as asset
uses: AButler/upload-release-assets@v3.0
with:
files: './ccextractor.${{ steps.get_version.outputs.DISPLAY_VERSION }}.tar.gz'
repo-token: ${{ secrets.GITHUB_TOKEN }}

41
.github/workflows/test_rust.yml vendored Normal file

@@ -0,0 +1,41 @@
name: Unit Test Rust
on:
push:
paths:
- ".github/workflows/test_rust.yml"
- "src/rust/**"
tags-ignore:
- "*.*"
pull_request:
types: [opened, synchronize, reopened]
paths:
- ".github/workflows/test_rust.yml"
- "src/rust/**"
jobs:
test_rust:
runs-on: ubuntu-latest
defaults:
run:
working-directory: ./src/rust
steps:
- uses: actions/checkout@v6
- name: cache
uses: actions/cache@v5
with:
path: |
src/rust/.cargo/registry
src/rust/.cargo/git
src/rust/target
src/rust/lib_ccxr/target
key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
restore-keys: ${{ runner.os }}-cargo-
- uses: dtolnay/rust-toolchain@stable
- name: Test main module
run: cargo test
working-directory: ./src/rust
- name: Test lib_ccxr module
run: cargo test
working-directory: ./src/rust/lib_ccxr

141
.gitignore vendored

@@ -1,3 +1,9 @@
####
# Ignore tests tmp files and results
tests/runtest
tests/**/*.gcda
tests/**/*.gcno
####
# Ignore CVS related files
@@ -7,27 +13,156 @@ CVS
####
# Linux Ignored binary and build folder
*.o
*.so
mac/ccextractor
linux/ccextractor
linux/depend
linux/build_scan/
windows/x86_64-pc-windows-msvc/**
windows/Debug/**
windows/Debug-OCR/**
windows/release-with-debug/**
windows/Release/**
windows/Release-Full/**
windows/Release-OCR/**
windows/Debug-Full/**
windows/x64/**
windows/ccextractor.VC.db
build/
build_*/
####
# Python
*.pyc
####
# Visual Studio project Ignored files
.vs/**
windows/.vs/**
!windows/.vs/config/applicationhost.config
*.suo
*.sdf
*.opensdf
*.user
*.opendb
*.db
*.vscode
####
# Ignore the header file that is updated upon build
src/lib_ccx/compile_info.h
src/lib_ccx/compile_info_real.h
#### Ignore windows OCR libraries and folders
windows/libs/leptonica/**
windows/libs/tesseract/**
windows/Release-OCR/**
windows/Debug-OCR/**
# Ctags
*.tags*
tags
# Vagrant
.vagrant/
# Eclipse stuff
.cproject
.project
.settings/
# Mac
.DS_Store
windows/enc_temp_folder/*
#CMake
src/cmake-build-debug/
src/.idea/
#Autotools
linux/config.h
linux/config.log
linux/config.status
linux/Makefile
linux/autom4te.cache
linux/aclocal.m4
linux/*.in
linux/configure
linux/build-conf/
mac/rust/
mac/config.h
mac/config.log
mac/config.status
mac/Makefile
mac/autom4te.cache
mac/aclocal.m4
mac/*.in
mac/configure
mac/build-conf/
package_creators/*tar.gz
package_creators/build/*.deb
src/.deps/
src/.dirstamp
src/lib_ccx/.deps/
src/lib_ccx/.dirstamp
src/lib_hash/.deps/
src/lib_hash/.dirstamp
src/libpng/.deps/
src/libpng/.dirstamp
src/utf8proc/.deps/
src/utf8proc/.dirstamp
src/zlib/.deps/
src/zlib/.dirstamp
src/zvbi/.deps/
src/zvbi/.dirstamp
# Arch
package_creators/*.pkg.tar.xz
#RPMs
package_creators/*.rpm
src/lib_ccx/ccx.pc
windows/combase.pdb/
src/**/.deps
src/**/.dirstamp
mac/ccextractorGUI
linux/ccextractorGUI
linux/ccxGUI.ini
linux/CMakeCache.txt
linux/CMakeFiles/
linux/cmake_install.cmake
linux/install_manifest.txt
linux/lib_ccx/
mac/lib_ccx/
mac/install_manifest.txt
mac/cmake_install.cmake
mac/CMakeFiles/
mac/CMakeCache.txt
*.py.bak
# Bazel
bazel*
#Intellij IDEs
.idea/
# Plans (local only)
plans/
# Rust build and MakeFiles (and CMake files)
src/rust/CMakeFiles/
src/rust/CMakeCache.txt
src/rust/Makefile
src/rust/cmake_install.cmake
src/rust/target/
src/rust/lib_ccxr/target/
windows/ccx_rust.lib
windows/*/debug/*
windows/*/CACHEDIR.TAG
windows/.rustc_info.json
linux/configure~
# Temporary files
tess.log
**/tess.log
ut=srt*

101
.travis.yml Normal file

@@ -0,0 +1,101 @@
language: c
matrix:
include:
- os: osx
osx_image: xcode10.1
compiler: gcc
addons:
homebrew:
packages:
- autoconf
- libtool
- tesseract
- leptonica
script:
- cd mac
- ./build.command
- ./ccextractor --version
- os: osx
osx_image: xcode10.1
compiler: clang
addons:
homebrew:
packages:
- autoconf
- libtool
- tesseract
- leptonica
script:
- cd mac
- ./build.command
- ./ccextractor --version
- os: osx
osx_image: xcode10.1
compiler: gcc
addons:
homebrew:
packages:
- autoconf
- libtool
- tesseract
- leptonica
script:
- cd mac
- ./autogen.sh
- ./configure
- make
- ./ccextractor --version
- os: osx
osx_image: xcode10.1
compiler: clang
addons:
homebrew:
packages:
autoconf
libtool
tesseract
leptonica
script:
- cd mac
- ./autogen.sh
- ./configure
- make
- ./ccextractor --version
- os: osx
osx_image: xcode10.1
compiler: gcc
addons:
homebrew:
packages:
autoconf
libtool
tesseract
leptonica
script:
- mkdir build
- cd build
- cmake ../src/
- make
- ./ccextractor --version
- os: osx
osx_image: xcode10.1
compiler: clang
addons:
homebrew:
packages:
autoconf
libtool
tesseract
leptonica
script:
- mkdir build
- cd build
- cmake ../src/
- make
- ./ccextractor --version

339
LICENSE.txt Normal file

@@ -0,0 +1,339 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Lesser General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License.


@@ -3,8 +3,8 @@
MAINTAINER = Marc Espie <espie@openbsd.org>
CATEGORIES = multimedia
COMMENT = closed caption subtitles extractor
HOMEPAGE = http://ccextractor.sourceforge.net/
V = 0.77
HOMEPAGE = https://ccextractor.org
V = 0.96.5
DISTFILES = ccextractor.${V:S/.//}-src.zip
MASTER_SITES = ${MASTER_SITE_SOURCEFORGE:=ccextractor/}
DISTNAME = ccextractor-$V

README.md

@@ -1,12 +1,119 @@
ccextractor
===========
<img src ="https://github.com/CCExtractor/ccextractor-org-media/blob/master/static/ccx_logo_transparent_800x600.png" width="200px" alt="logo" />
Carlos' version (mainstream) is the most stable branch.
# CCExtractor
Extracting subtitles has never been so easy. Just type the following command:
ccextractor "name of input"
[![Sample-Platform Build Status Windows](https://sampleplatform.ccextractor.org/static/img/status/build-windows.svg?maxAge=1800)](https://sampleplatform.ccextractor.org/test/master/windows)
[![Sample-Platform Build Status Linux](https://sampleplatform.ccextractor.org/static/img/status/build-linux.svg?maxAge=1800)](https://sampleplatform.ccextractor.org/test/master/linux)
[![SourceForge](https://img.shields.io/badge/SourceForge%20downloads-213k%2Ftotal-brightgreen.svg)](https://sourceforge.net/projects/ccextractor/)
[![GitHub All Releases](https://img.shields.io/github/downloads/CCExtractor/CCExtractor/total.svg)](https://github.com/CCExtractor/ccextractor/releases/latest)
GUI lovers should download the SourceForge version of CCExtractor; the Git version is not your cup of tea.
http://ccextractor.sourceforge.net/download-ccextractor.html
CCExtractor is a tool used to produce subtitles for TV recordings from almost anywhere in the world. We intend to keep up with all sources and formats.
For news about releases, see CHANGES.TXT
Subtitles are important for many people. If you're learning a new language, subtitles are a great way to learn it from movies or TV shows. If you are hard of hearing, subtitles can help you better understand what's happening on the screen. We aim to make it easy to generate subtitles by using the command line tool or Windows GUI.
The official repository is [CCExtractor/ccextractor](https://github.com/CCExtractor/ccextractor), with master being the most stable branch.
### **Features**
- Extract subtitles in real-time
- Translate subtitles
- Extract closed captions from DVDs
- Convert closed captions to subtitles
### Programming Languages & Technologies
The core functionality is written in C. Other languages used include C++ and Python.
## Installation and Usage
Downloads for precompiled binaries and source code can be found [on our website](https://ccextractor.org/public/general/downloads/).
### Windows Package Managers
**WinGet:**
```powershell
winget install CCExtractor.CCExtractor
```
**Chocolatey:**
```powershell
choco install ccextractor
```
**Scoop:**
```powershell
scoop bucket add extras
scoop install ccextractor
```
Extracting subtitles is relatively simple. Just run the following command:
`ccextractor <input>`
This will extract the subtitles.
More usage information can be found on our website:
- [Using the command line tool](https://ccextractor.org/public/general/command_line_usage/)
- [Using the Flutter GUI](https://ccextractor.org/public/general/flutter_gui/)
You can also find the list of parameters and their brief description by running `ccextractor` without any arguments.
You can find sample files on [our website](https://ccextractor.org/public/general/tvsamples/) to test the software.
### Building from Source
- [Building on Windows using WSL](docs/build-wsl.md)
#### Linux (Autotools) build notes
CCExtractor also supports an autotools-based build system under the `linux/`
directory.
Important notes:
- The autotools workflow lives inside `linux/`. The `configure` script is
generated there and should be run from that directory.
- Typical build steps are:
```
cd linux
./autogen.sh
./configure
make
```
- Rust support is enabled automatically if `cargo` and `rustc` are available
on the system. In that case, Rust components are built and linked during
`make`.
- If you encounter unexpected build or linking issues, a clean rebuild
(`make clean` or a fresh clone) is recommended, especially when Rust is
involved.
This build flow has been tested on Linux and WSL.
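As a rough pre-flight check for the note about automatic Rust support, the sketch below tests whether both `cargo` and `rustc` are on the `PATH` before you run `make`. The helper name `have_rust` is hypothetical and not part of the repository, and the printed messages only restate what the notes above suggest; the real detection logic lives in the autotools scripts.

```shell
#!/bin/sh
# Sketch only: reports whether both Rust tools are visible on the PATH.
# have_rust is a hypothetical helper name, not something the repository provides.
have_rust() {
    if command -v cargo >/dev/null 2>&1 && command -v rustc >/dev/null 2>&1; then
        return 0
    fi
    return 1
}

if have_rust; then
    echo "cargo and rustc found: Rust components should be built during make"
else
    echo "cargo or rustc missing: automatic Rust support is not expected"
fi
```

Running this from the `linux/` directory before `./configure` can help tell a missing toolchain apart from a genuine link error.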
## Compiling CCExtractor
To learn more about how to compile and build CCExtractor for your platform check the [compilation guide](https://github.com/CCExtractor/ccextractor/blob/master/docs/COMPILATION.MD).
## Support
By far the best way to get support is by opening an issue at our [issue tracker](https://github.com/CCExtractor/ccextractor/issues).
When you create a new issue, please fill in the needed details in the provided template. That makes it easier for us to help you more efficiently.
If you have a question or a problem you can also [contact us by email or chat with the team in Slack](https://ccextractor.org/public/general/support/).
If you want to contribute to CCExtractor but can't submit code patches, issues, or video samples, you can also [donate to us](https://sourceforge.net/donate/index.php?group_id=190832).
## Contributing
You can contribute to the project by reporting issues, forking it, modifying the code and making a pull request to the repository. We have some rules, outlined in the [contributor's guide](.github/CONTRIBUTING.md).
## News & Other Information
News about releases and modifications to the code can be found in the [CHANGES.TXT](docs/CHANGES.TXT) file.
For more information visit the CCExtractor website: [https://www.ccextractor.org](https://www.ccextractor.org)
## License
GNU General Public License version 2.0 (GPL-2.0)

Vagrantfile vendored Normal file

@@ -0,0 +1,16 @@
Vagrant.configure(2) do |config|
config.vm.box = "ubuntu/xenial64"
# Uncomment this line if you want to sync other folders
# config.vm.synced_folder "/home/user/video", "/video"
config.vm.provision "shell", inline: <<-SHELL
sudo apt-get install -y gcc
sudo apt-get install -y libcurl4-gnutls-dev
sudo apt-get install -y tesseract-ocr
sudo apt-get install -y tesseract-ocr-dev
sudo apt-get install -y libleptonica-dev
SHELL
end

WORKSPACE Normal file

docker/Dockerfile Normal file

@@ -0,0 +1,239 @@
# CCExtractor Docker Build
#
# Build variants via BUILD_TYPE argument:
# - minimal: Basic CCExtractor without OCR
# - ocr: CCExtractor with OCR support (default)
# - hardsubx: CCExtractor with burned-in subtitle extraction (requires FFmpeg)
#
# Source options via USE_LOCAL_SOURCE argument:
# - 0 (default): Clone from GitHub (standalone Dockerfile usage)
# - 1: Use local source (when building from cloned repo)
#
# Build examples:
#
# # Standalone (just the Dockerfile, clones from GitHub):
# docker build -t ccextractor docker/
# docker build --build-arg BUILD_TYPE=hardsubx -t ccextractor docker/
#
# # From cloned repository (faster, uses local source):
# docker build --build-arg USE_LOCAL_SOURCE=1 -f docker/Dockerfile -t ccextractor .
# docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=minimal -f docker/Dockerfile -t ccextractor .
ARG DEBIAN_VERSION=bookworm-slim
FROM debian:${DEBIAN_VERSION} AS base
FROM base AS builder
# Build arguments
ARG BUILD_TYPE=ocr
ARG USE_LOCAL_SOURCE=0
# BUILD_TYPE: minimal, ocr, hardsubx
# USE_LOCAL_SOURCE: 0 = git clone, 1 = copy local source
# Avoid interactive prompts during package installation
ENV DEBIAN_FRONTEND=noninteractive
# Install base build dependencies
RUN apt-get update && apt-get install -y --no-install-recommends \
git \
curl \
ca-certificates \
gcc \
g++ \
cmake \
make \
pkg-config \
bash \
zlib1g-dev \
libpng-dev \
libjpeg-dev \
libssl-dev \
libfreetype-dev \
libxml2-dev \
libcurl4-gnutls-dev \
clang \
libclang-dev \
&& rm -rf /var/lib/apt/lists/*
# Install Rust toolchain
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain stable
ENV PATH="/root/.cargo/bin:${PATH}"
# Install OCR dependencies (for ocr and hardsubx builds)
RUN if [ "$BUILD_TYPE" = "ocr" ] || [ "$BUILD_TYPE" = "hardsubx" ]; then \
apt-get update && apt-get install -y --no-install-recommends \
tesseract-ocr \
libtesseract-dev \
libleptonica-dev \
&& rm -rf /var/lib/apt/lists/*; \
fi
# Install FFmpeg dependencies (for hardsubx build)
RUN if [ "$BUILD_TYPE" = "hardsubx" ]; then \
apt-get update && apt-get install -y --no-install-recommends \
libavcodec-dev \
libavformat-dev \
libavutil-dev \
libswscale-dev \
libswresample-dev \
libavfilter-dev \
libavdevice-dev \
&& rm -rf /var/lib/apt/lists/*; \
fi
# Build and install GPAC library
WORKDIR /root
RUN git clone -b v2.4.0 --depth 1 https://github.com/gpac/gpac
WORKDIR /root/gpac
RUN ./configure && make -j$(nproc) lib && make install-lib && ldconfig
WORKDIR /root
RUN rm -rf /root/gpac
# Get CCExtractor source (either clone or copy based on USE_LOCAL_SOURCE)
WORKDIR /root
# First, copy local source if provided (will be empty dir if building standalone)
COPY . /root/ccextractor-local/
# Then get source: use local copy if USE_LOCAL_SOURCE=1 and source exists,
# otherwise clone from GitHub
RUN if [ "$USE_LOCAL_SOURCE" = "1" ] && [ -f /root/ccextractor-local/src/ccextractor.c ]; then \
echo "Using local source"; \
mv /root/ccextractor-local /root/ccextractor; \
else \
echo "Cloning from GitHub"; \
rm -rf /root/ccextractor-local; \
git clone --depth 1 https://github.com/CCExtractor/ccextractor.git /root/ccextractor; \
fi
WORKDIR /root/ccextractor/linux
# Generate build info
RUN ./pre-build.sh
# Build Rust library with appropriate features
RUN if [ "$BUILD_TYPE" = "hardsubx" ]; then \
cd ../src/rust && \
CARGO_TARGET_DIR=../../linux/rust cargo build --release --features hardsubx_ocr; \
else \
cd ../src/rust && \
CARGO_TARGET_DIR=../../linux/rust cargo build --release; \
fi
RUN cp rust/release/libccx_rust.a ./libccx_rust.a
# Compile CCExtractor
RUN if [ "$BUILD_TYPE" = "minimal" ]; then \
BLD_FLAGS="-std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP -DGPAC_64_BITS"; \
BLD_INCLUDE="-I../src -I../src/lib_ccx/ -I /usr/include/gpac/ -I../src/thirdparty/libpng -I../src/thirdparty/zlib -I../src/lib_ccx/zvbi -I../src/thirdparty/lib_hash -I../src/thirdparty -I../src/thirdparty/freetype/include"; \
BLD_LINKER="-lm -Wl,--allow-multiple-definition -lpthread -ldl -lgpac ./libccx_rust.a"; \
elif [ "$BUILD_TYPE" = "hardsubx" ]; then \
BLD_FLAGS="-std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR -DENABLE_HARDSUBX -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP -DGPAC_64_BITS"; \
BLD_INCLUDE="-I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I /usr/include/gpac/ -I../src/thirdparty/libpng -I../src/thirdparty/zlib -I../src/lib_ccx/zvbi -I../src/thirdparty/lib_hash -I../src/thirdparty -I../src/thirdparty/freetype/include"; \
BLD_LINKER="-lm -Wl,--allow-multiple-definition -ltesseract -lleptonica -lpthread -ldl -lgpac -lswscale -lavutil -lavformat -lavcodec -lavfilter -lswresample ./libccx_rust.a"; \
else \
BLD_FLAGS="-std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP -DGPAC_64_BITS"; \
BLD_INCLUDE="-I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I /usr/include/gpac/ -I../src/thirdparty/libpng -I../src/thirdparty/zlib -I../src/lib_ccx/zvbi -I../src/thirdparty/lib_hash -I../src/thirdparty -I../src/thirdparty/freetype/include"; \
BLD_LINKER="-lm -Wl,--allow-multiple-definition -ltesseract -lleptonica -lpthread -ldl -lgpac ./libccx_rust.a"; \
fi && \
SRC_LIBPNG="$(find ../src/thirdparty/libpng/ -name '*.c')" && \
SRC_ZLIB="$(find ../src/thirdparty/zlib/ -name '*.c')" && \
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')" && \
SRC_GPAC="$(find /usr/include/gpac/ -name '*.c' 2>/dev/null || true)" && \
SRC_HASH="$(find ../src/thirdparty/lib_hash/ -name '*.c')" && \
SRC_UTF8PROC="../src/thirdparty/utf8proc/utf8proc.c" && \
SRC_FREETYPE="../src/thirdparty/freetype/autofit/autofit.c \
../src/thirdparty/freetype/base/ftbase.c \
../src/thirdparty/freetype/base/ftbbox.c \
../src/thirdparty/freetype/base/ftbdf.c \
../src/thirdparty/freetype/base/ftbitmap.c \
../src/thirdparty/freetype/base/ftcid.c \
../src/thirdparty/freetype/base/ftfntfmt.c \
../src/thirdparty/freetype/base/ftfstype.c \
../src/thirdparty/freetype/base/ftgasp.c \
../src/thirdparty/freetype/base/ftglyph.c \
../src/thirdparty/freetype/base/ftgxval.c \
../src/thirdparty/freetype/base/ftinit.c \
../src/thirdparty/freetype/base/ftlcdfil.c \
../src/thirdparty/freetype/base/ftmm.c \
../src/thirdparty/freetype/base/ftotval.c \
../src/thirdparty/freetype/base/ftpatent.c \
../src/thirdparty/freetype/base/ftpfr.c \
../src/thirdparty/freetype/base/ftstroke.c \
../src/thirdparty/freetype/base/ftsynth.c \
../src/thirdparty/freetype/base/ftsystem.c \
../src/thirdparty/freetype/base/fttype1.c \
../src/thirdparty/freetype/base/ftwinfnt.c \
../src/thirdparty/freetype/bdf/bdf.c \
../src/thirdparty/freetype/bzip2/ftbzip2.c \
../src/thirdparty/freetype/cache/ftcache.c \
../src/thirdparty/freetype/cff/cff.c \
../src/thirdparty/freetype/cid/type1cid.c \
../src/thirdparty/freetype/gzip/ftgzip.c \
../src/thirdparty/freetype/lzw/ftlzw.c \
../src/thirdparty/freetype/pcf/pcf.c \
../src/thirdparty/freetype/pfr/pfr.c \
../src/thirdparty/freetype/psaux/psaux.c \
../src/thirdparty/freetype/pshinter/pshinter.c \
../src/thirdparty/freetype/psnames/psnames.c \
../src/thirdparty/freetype/raster/raster.c \
../src/thirdparty/freetype/sfnt/sfnt.c \
../src/thirdparty/freetype/smooth/smooth.c \
../src/thirdparty/freetype/truetype/truetype.c \
../src/thirdparty/freetype/type1/type1.c \
../src/thirdparty/freetype/type42/type42.c \
../src/thirdparty/freetype/winfonts/winfnt.c" && \
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_LIBPNG $SRC_HASH $SRC_UTF8PROC $SRC_FREETYPE" && \
gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER
# Copy binary to known location
RUN cp /root/ccextractor/linux/ccextractor /ccextractor
# Final minimal image
FROM base AS final
ARG BUILD_TYPE=ocr
# Avoid interactive prompts
ENV DEBIAN_FRONTEND=noninteractive
# Install runtime dependencies based on build type
RUN apt-get update && apt-get install -y --no-install-recommends \
libpng16-16 \
libjpeg62-turbo \
zlib1g \
libssl3 \
libcurl4 \
&& rm -rf /var/lib/apt/lists/*
# OCR runtime dependencies
RUN if [ "$BUILD_TYPE" = "ocr" ] || [ "$BUILD_TYPE" = "hardsubx" ]; then \
apt-get update && apt-get install -y --no-install-recommends \
tesseract-ocr \
liblept5 \
&& rm -rf /var/lib/apt/lists/*; \
fi
# HardSubX runtime dependencies
RUN if [ "$BUILD_TYPE" = "hardsubx" ]; then \
apt-get update && apt-get install -y --no-install-recommends \
libavcodec59 \
libavformat59 \
libavutil57 \
libswscale6 \
libswresample4 \
libavfilter8 \
libavdevice59 \
&& rm -rf /var/lib/apt/lists/*; \
fi
# Copy GPAC library from builder
COPY --from=builder /usr/local/lib/libgpac.so* /usr/local/lib/
# Update library cache
RUN ldconfig
# Copy CCExtractor binary
COPY --from=builder /ccextractor /ccextractor
ENTRYPOINT ["/ccextractor"]

docker/README.md Normal file

@@ -0,0 +1,91 @@
# CCExtractor Docker Image
This Dockerfile builds CCExtractor with support for multiple build variants.
## Build Variants
| Variant | Description | Features |
|---------|-------------|----------|
| `minimal` | Basic CCExtractor | No OCR support |
| `ocr` | With OCR support (default) | Tesseract OCR for bitmap subtitles |
| `hardsubx` | With burned-in subtitle extraction | OCR + FFmpeg for hardcoded subtitles |
## Building
### Standalone Build (from Dockerfile only)
You can build CCExtractor using just the Dockerfile - it will clone the source from GitHub:
```bash
# Default build (OCR enabled)
docker build -t ccextractor docker/
# Minimal build (no OCR)
docker build --build-arg BUILD_TYPE=minimal -t ccextractor docker/
# HardSubX build (OCR + FFmpeg for burned-in subtitles)
docker build --build-arg BUILD_TYPE=hardsubx -t ccextractor docker/
```
### Build from Cloned Repository (faster)
If you have already cloned the repository, you can use local source for faster builds:
```bash
git clone https://github.com/CCExtractor/ccextractor.git
cd ccextractor
# Default build (OCR enabled)
docker build --build-arg USE_LOCAL_SOURCE=1 -f docker/Dockerfile -t ccextractor .
# Minimal build
docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=minimal -f docker/Dockerfile -t ccextractor .
# HardSubX build
docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=hardsubx -f docker/Dockerfile -t ccextractor .
```
## Build Arguments
| Argument | Default | Description |
|----------|---------|-------------|
| `BUILD_TYPE` | `ocr` | Build variant: `minimal`, `ocr`, or `hardsubx` |
| `USE_LOCAL_SOURCE` | `0` | Set to `1` to use local source instead of cloning |
| `DEBIAN_VERSION` | `bookworm-slim` | Debian version to use as base |
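
Because the Dockerfile falls through to the `ocr` branch for any `BUILD_TYPE` value it does not recognise, a small wrapper can catch typos before a long build starts. The following is a minimal sketch under that assumption; `build_cmd` is a hypothetical helper, not part of the repository, and it only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Prints the docker build command for a given variant and rejects unknown ones.
build_cmd() {
    build_type="${1:-ocr}"   # default mirrors the Dockerfile's ARG default
    case "$build_type" in
        minimal|ocr|hardsubx) ;;
        *)
            echo "unknown BUILD_TYPE: $build_type" >&2
            return 1
            ;;
    esac
    echo "docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=$build_type -f docker/Dockerfile -t ccextractor ."
}
```

For example, `build_cmd hardsubx` prints the command for the HardSubX variant, while `build_cmd hardsub` fails with a message on stderr instead of silently building the default.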
## Usage
### Basic Usage
```bash
# Show version
docker run --rm ccextractor --version
# Show help
docker run --rm ccextractor --help
```
### Processing Local Files
Mount your local directory to process files:
```bash
# Process a video file with output file
# (paths are quoted so directories containing spaces still work)
docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" ccextractor input.mp4 -o output.srt
# Process using stdout
docker run --rm -v "$(pwd):$(pwd)" -w "$(pwd)" ccextractor input.mp4 --stdout > output.srt
```
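
If you process files this way often, the mount-and-run pattern can be wrapped in a shell function. This is a sketch rather than a supported interface: `srt_name` and `ccx_extract` are hypothetical names, and the image is assumed to be tagged `ccextractor` as in the build examples.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the repository.

# Derives "<name>.srt" from an input file name by dropping the last extension.
srt_name() {
    printf '%s.srt\n' "${1%.*}"
}

# Runs the containerised ccextractor on one file in the current directory,
# mounting that directory at the same path inside the container.
ccx_extract() {
    docker run --rm -v "$PWD:$PWD" -w "$PWD" ccextractor "$1" -o "$(srt_name "$1")"
}
```

With these defined, `ccx_extract input.mp4` would write `input.srt` next to the input file.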
### Interactive Mode
```bash
docker run --rm -it --entrypoint=/bin/bash ccextractor
```
## Image Size
The multi-stage build produces runtime images:
- `minimal`: ~130MB
- `ocr`: ~215MB (includes Tesseract)
- `hardsubx`: ~610MB (includes Tesseract + FFmpeg)


@@ -29,7 +29,7 @@ To do:
though. No samples, no support.
- A few commands are not yet supported, specifically those related
to delay.
- Detect and extract captions from MP4 (MOV) files, handled by gpacmp4
- Detect and extract captions from MP4 (MOV) files, handled by gpac
Done (18.08.2015):

docs/AUTHORS.TXT Normal file

@@ -0,0 +1,92 @@
ccextractor was originally a mildly optimized C port of McPoodle's excellent
but painfully slow Perl script SCC_RIP. That port (ccextractor 0.01) was
written by Carlos Fernández (cfsmp3).
After a number of versions that did something semi-useful, Volker Quetschke
joined the effort, and together Carlos and Volker took CCExtractor to a point
at which it was actually really usable, at least for the cases that interested
them.
Unfortunately Volker moved on once CCExtractor did what he needed it to do.
At some point David Liontooth from UCLA started to use CCExtractor as a
replacement for libzvbi because libzvbi wasn't working for some specific
streams. UCLA became the primary key user, as they were using CCExtractor
24x7 to process a huge number of streams from several countries, and were
therefore able to provide samples, proper bug reports, etc.
At that time CCExtractor was still US-centric, because it was originally
written so Carlos could get subtitles for US TV shows. But UCLA wanted
European subtitles too, and they already had recording nodes in Denmark
(which use teletext) and Spain (which uses DVB).
For teletext a good solution existed already: Petr Kutalek's telxcc.
We contacted Petr and asked for permission to integrate his code into
CCExtractor. Petr's absolutely brilliantly clean code was easy to
integrate and build upon - and with it, we added support for the first
kind of European subtitles.
Around that time, we decided to apply for Google Summer of Code. That
was also a game changer, with Willem, Ruslan and Anshul being the first
3 students. They are still around, now as mentors and year round
contributors.
Since then, many more people have been involved: more than 10 as
Google Summer of Code students, Code-In students, companies that
sponsored development by hiring team members to do custom development
(Comcast was the first one, and we'll always be grateful for the
opportunity).
List of students is below (if they added themselves). For a complete
list, just check the pull requests at GitHub.
Home: https://www.ccextractor.org
Google Summer of Code 2014 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Anshul Maheshwari
Google Summer of Code 2015 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Anshul Maheshwari
- Nurendra Choudhary
- Oleg Kiselev
- Vasanth Kalingeri
Google Summer of Code 2016 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Abhishek Vinjamoori
- Abhinav Shukla
- Rishabh Garg
Google Code-in 2016 students
- Evgeny Shulgin
- Manveer Basra
- Alexandru Bratosin
- Matej Plavevski
- Danila Fedorin
Google Code-in 2017 students
- Matej Plavevski
- Harry Yu
- Theodore Fabian
- Nikunj Taneja
- John Chew
- Aadi Bajpai
- Wiliam(Hori75)
Google Summer of Code 2017 students
- Diptanshu Jamgade
- Mayank Gupta
Google Code-in 2018 students
- Matej Plavevski
- Ivan Makarov
- Albert (alufers)
- Brian M
- John Chew
- T1duS


@@ -0,0 +1,157 @@
# Building CCExtractor on macOS using System Libraries (-system-libs)
## Overview
This document explains how to build CCExtractor on macOS using system-installed libraries instead of bundled third-party libraries.
This build mode is required for Homebrew compatibility and is enabled via the `-system-libs` flag introduced in PR #1862.
## Why is -system-libs needed?
### Background
CCExtractor was removed from Homebrew (homebrew-core) because:
- Homebrew does not allow bundling third-party libraries
- The default CCExtractor build compiles libraries from `src/thirdparty/`
- This violates Homebrew packaging policies
### What -system-libs fixes
The `-system-libs` flag allows CCExtractor to:
- Use system-installed libraries via Homebrew
- Resolve headers and linker flags using `pkg-config`
- Skip compiling bundled copies of common libraries
This makes CCExtractor acceptable for Homebrew packaging.
## Build Modes Explained
### 1. Default Build (Bundled Libraries)
**Command:**
```bash
./mac/build.command
```
**Behavior:**
- Compiles bundled libraries:
  - `freetype`
  - `libpng`
  - `zlib`
  - `utf8proc`
- Self-contained binary
- Larger size
- Suitable for standalone builds
### 2. System Libraries Build (Homebrew-compatible)
**Command:**
```bash
./mac/build.command -system-libs
```
**Behavior:**
- Uses system libraries via `pkg-config`
- Does not compile bundled libraries
- Smaller binary
- Faster build
- Required for Homebrew
## Required Homebrew Dependencies
Install required dependencies:
```bash
brew install pkg-config autoconf automake libtool \
gpac freetype libpng protobuf-c utf8proc zlib
```
**Optional** (OCR / HARDSUBX support):
```bash
brew install tesseract leptonica ffmpeg
```
## How to Build
```bash
cd mac
./build.command -system-libs
```
**Verify:**
```bash
./ccextractor --version
```
## What Changes Internally with -system-libs
### Libraries NOT compiled (system-provided)
- **FreeType**
- **libpng**
- **zlib**
- **utf8proc**
### Libraries STILL bundled
- **lib_hash** (Custom SHA-256 implementation, no system equivalent)
## CI Coverage
A new CI job was added:
- `build_shell_system_libs`
**What it does:**
- Installs Homebrew dependencies
- Runs `./build.command -system-libs`
- Verifies the binary runs correctly
This ensures Homebrew-compatible builds stay working.
## Verification (Local)
You can confirm system libraries are used:
```bash
otool -L mac/ccextractor
```
**Expected output includes paths like:**
```
/opt/homebrew/opt/gpac/lib/libgpac.dylib
```
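To pick the Homebrew-provided libraries out of a long `otool -L` listing, a small filter can help. This is a sketch under the assumption that Homebrew lives under `/opt/homebrew` (Apple Silicon) or `/usr/local` (Intel); the function name `ccx_homebrew_libs` is hypothetical.

```bash
# Hypothetical filter: reads `otool -L` output on stdin and prints only the
# library paths that sit under a Homebrew prefix.
ccx_homebrew_libs() {
    # $1 is the library path; the header line ("binary:") never matches.
    awk '$1 ~ "^(/opt/homebrew|/usr/local)/" { print $1 }'
}

# Example:
#   otool -L mac/ccextractor | ccx_homebrew_libs
```

An empty result would suggest that the binary was not linked against Homebrew libraries.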
## Homebrew Formula Usage (Future)
Example formula snippet:
```ruby
def install
system "./mac/build.command", "-system-libs"
bin.install "mac/ccextractor"
end
```
## Summary
- `-system-libs` is opt-in
- Default build remains unchanged
- Enables CCExtractor to return to Homebrew
- Fully tested in CI and locally
## Related
- **PR #1862** — Add `-system-libs` flag
- **Issue #1580** — Homebrew compatibility
- **Issue #1534** — System library support

File diff suppressed because it is too large

341
docs/COMPILATION.MD Normal file

@@ -0,0 +1,341 @@
# Installation
## Homebrew
The easiest way to install CCExtractor for Mac and Linux is through Homebrew:
```bash
brew install ccextractor
```
Note: If you don't have Homebrew installed, see [brew.sh](https://brew.sh/)
for installation instructions.
---
# Compiling CCExtractor
You can compile CCExtractor on all major platforms using the `CMakeLists.txt` stored under the `ccextractor/src/` directory. Autoconf and custom build scripts are also available. See the platform-specific instructions in the sections below.
Downloads for precompiled binaries and source code can be found [on our website](https://www.ccextractor.org?id=public:general:downloads).
Clone the latest source from GitHub:
```bash
git clone https://github.com/CCExtractor/ccextractor.git
```
### Hardsubx (Burned-in Subtitles) and FFmpeg Versions
CCExtractor's hardsubx feature extracts burned-in subtitles from videos using OCR. It requires FFmpeg libraries. The build system automatically selects appropriate FFmpeg versions for each platform:
- **Linux**: FFmpeg 6.x (default)
- **Windows**: FFmpeg 6.x (default)
- **macOS**: FFmpeg 8.x (default)
You can override the default by setting the `FFMPEG_VERSION` environment variable to `ffmpeg6`, `ffmpeg7`, or `ffmpeg8` before building. This flexibility ensures compatibility with different FFmpeg installations across platforms.
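The selection rule described above can be sketched as a small shell function. This only mirrors the documented defaults and is not the build system's actual logic; the function name `ccx_ffmpeg_version` and the rejection of other values are assumptions of this sketch.

```bash
# Sketch only: mirrors the documented defaults, not the real build logic.
# Usage: ccx_ffmpeg_version <linux|windows|macos>
ccx_ffmpeg_version() {
    # An explicit FFMPEG_VERSION always wins over the platform default.
    if [ -n "${FFMPEG_VERSION:-}" ]; then
        case "$FFMPEG_VERSION" in
            ffmpeg6|ffmpeg7|ffmpeg8) echo "$FFMPEG_VERSION"; return 0 ;;
            *) echo "unsupported FFMPEG_VERSION: $FFMPEG_VERSION" >&2; return 1 ;;
        esac
    fi

    # Otherwise fall back to the per-platform default.
    case "$1" in
        linux|windows) echo ffmpeg6 ;;
        macos)         echo ffmpeg8 ;;
        *) echo "unknown platform: $1" >&2; return 1 ;;
    esac
}
```

For example, `FFMPEG_VERSION=ffmpeg7 ccx_ffmpeg_version linux` prints `ffmpeg7`, while `ccx_ffmpeg_version macos` with the variable unset prints `ffmpeg8`.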
## Docker
You can use a Docker image to build the latest CCExtractor source without any environment hassle. Follow these [instructions](https://github.com/CCExtractor/ccextractor/tree/master/docker/README.md) to build and use the image.
## Linux
1. Make sure all the dependencies are met.
Debian:
```bash
sudo apt-get install -y libgpac-dev libglew-dev libglfw3-dev cmake gcc libcurl4-gnutls-dev tesseract-ocr libtesseract-dev libleptonica-dev clang libclang-dev
```
RHEL/Fedora:
```bash
yum install -y glew-devel glfw-devel cmake gcc libcurl-devel tesseract-devel leptonica-devel clang gpac-devel
```
Arch:
```bash
paru -S glew glfw curl tesseract leptonica cmake gcc clang gpac
```
or
```bash
sudo pacman -S glew glfw curl tesseract leptonica cmake gcc clang gpac
```
Rust 1.54 or above is also required. [Install Rust](https://www.rust-lang.org/tools/install). See the specific compilation methods below for how to compile without Rust.
**Note:** On Ubuntu 23.10 (Mantic) and later, `libgpac-dev` is not available; build gpac from source by following the build instructions [here](https://github.com/gpac/gpac/wiki/GPAC-Build-Guide-for-Linux).
**Note:** On Ubuntu 18.04 (Bionic) and later, install `libtesseract-dev` rather than `tesseract-ocr-dev`, which no longer exists.
**Note:** On Ubuntu 14.04 (Trusty) and earlier, you should build leptonica and tesseract from source.
2. Compiling
### Using the build script
By default, the build script does not include debugging information, so you cannot debug the resulting executable (i.e. `./ccextractor`) in a debugger. To include debugging information, use the `builddebug` script.
```bash
# navigate to linux directory and call the build script
cd ccextractor/linux
# compile without debug flags
./build
# compile with debug info
./build -debug # same as ./builddebug
# compile with hardsubx (burned-in subtitle extraction)
# Hardsubx requires FFmpeg libraries. Different FFmpeg versions are used by default:
# - Linux: FFmpeg 6.x (automatic)
# - Windows: FFmpeg 6.x (automatic)
# - macOS: FFmpeg 8.x (automatic)
./build -hardsubx # uses platform-specific FFmpeg version
# To override the default FFmpeg version, set FFMPEG_VERSION:
FFMPEG_VERSION=ffmpeg8 ./build -hardsubx # force FFmpeg 8 on any platform
FFMPEG_VERSION=ffmpeg6 ./build -hardsubx # force FFmpeg 6 on any platform
FFMPEG_VERSION=ffmpeg7 ./build -hardsubx # force FFmpeg 7 on any platform
# [Optional] For custom FFmpeg installations, export these environment variables before building:
export FFMPEG_INCLUDE_DIR=/usr/include
export FFMPEG_PKG_CONFIG_PATH=/usr/lib/pkgconfig
# test your build
./ccextractor
```
### Standard Linux compilation through Autoconf scripts
```bash
sudo apt-get install autoconf # dependency to generate configuration script
cd ccextractor/linux
./autogen.sh
./configure
make
# test your build
./ccextractor
# make build systemwide
sudo make install
```
### Using CMake
```bash
# create and navigate to directory where you want to store built files
cd ccextractor/
mkdir build
cd build
# generate makefile using cmake and then compile
cmake ../src/ # options here
make
# test your build
./ccextractor
# make build systemwide
sudo make install
```
`cmake` also accepts the options:
- `-DWITH_OCR=ON` to enable OCR
- `-DWITH_HARDSUBX=ON` to enable burned-in subtitles (requires FFmpeg)

For hardsubx with specific FFmpeg versions:
- Set `FFMPEG_VERSION=ffmpeg6` for FFmpeg 6.x (default on Linux and Windows)
- Set `FFMPEG_VERSION=ffmpeg7` for FFmpeg 7.x
- Set `FFMPEG_VERSION=ffmpeg8` for FFmpeg 8.x (default on macOS)

[Optional] For custom FFmpeg installations, set these environment variables:
- `FFMPEG_INCLUDE_DIR=/usr/include`
- `FFMPEG_PKG_CONFIG_PATH=/usr/lib/pkgconfig`
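When scripting the CMake build, the two options above can be assembled from a list of feature names. The helper below is a sketch; the function name `ccx_cmake_flags` and the feature names `ocr` and `hardsubx` are illustrative choices, not something the repository provides.

```bash
# Hypothetical helper: turn feature names into the cmake options listed above.
# Usage: ccx_cmake_flags [ocr] [hardsubx]
ccx_cmake_flags() {
    _flags=""
    for feature in "$@"; do
        case "$feature" in
            ocr)      _flags="$_flags -DWITH_OCR=ON" ;;
            hardsubx) _flags="$_flags -DWITH_HARDSUBX=ON" ;;
            *) echo "unknown feature: $feature" >&2; return 1 ;;
        esac
    done
    echo "${_flags# }"   # drop the leading space
}

# Example (run from the build directory; the expansion is left unquoted on
# purpose so each option becomes its own argument):
#   cmake $(ccx_cmake_flags ocr hardsubx) ../src/
```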
### Compiling with GUI
The GUI for CCExtractor has been moved to a separate repository ([https://github.com/CCExtractor/ccextractorfluttergui](https://github.com/CCExtractor/ccextractorfluttergui)).
## macOS
1. Make sure all the dependencies are met. Decide if you want OCR; if so, you'll need to install tesseract and leptonica.
Dependencies can be installed via Homebrew as:
```bash
brew install pkg-config
brew install autoconf automake libtool
brew install cmake gpac
# optional if you want OCR:
brew install tesseract
brew install leptonica
# optional if you want hardsubx (burned-in subtitle extraction):
brew install ffmpeg
```
If configuring OCR, use pkg-config to verify tesseract and leptonica dependencies, e.g.
```bash
pkg-config --exists --print-errors tesseract
pkg-config --exists --print-errors lept
```
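The two checks above can be folded into a loop that reports only what is missing. This is a minimal sketch; the function name `ccx_missing_pkgs` is hypothetical, and the `PKG_CONFIG` override follows the common convention of naming an alternative `pkg-config` binary in that variable.

```bash
# Hypothetical helper: print each pkg-config module that cannot be found.
# Honors PKG_CONFIG if set, otherwise calls plain pkg-config.
ccx_missing_pkgs() {
    for mod in "$@"; do
        "${PKG_CONFIG:-pkg-config}" --exists "$mod" || echo "$mod"
    done
}

# Example:
#   ccx_missing_pkgs tesseract lept
```

No output means every listed module was found.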
### Compiling
#### Using build.command script:
```bash
cd ccextractor/mac
./build.command # basic build
./build.command -ocr # build with OCR support
./build.command -hardsubx # build with hardsubx (uses FFmpeg 8 by default on macOS)
# Override FFmpeg version if needed:
FFMPEG_VERSION=ffmpeg7 ./build.command -hardsubx
# test your build
./ccextractor
```
#### Using CMake
```bash
# create and navigate to directory where you want to store built files
cd ccextractor/
mkdir build
cd build
# generate makefile using cmake and then compile
cmake ../src/ # options here
make
# test your build
./ccextractor
```
`cmake` also accepts the options:
- `-DWITH_OCR=ON` to enable OCR
- `-DWITH_HARDSUBX=ON` to enable burned-in subtitles
#### Standard compilation through Autoconf scripts:
```bash
cd ccextractor/mac
./autogen.sh
./configure
make
# test your build
./ccextractor
```
#### Compiling with GUI:
The GUI for CCExtractor has been moved to a separate repository ([https://github.com/CCExtractor/ccextractorfluttergui](https://github.com/CCExtractor/ccextractorfluttergui)).
## Windows
The dependencies are clang and Rust. To enable OCR, the Rust `x86_64-pc-windows-msvc` or `i686-pc-windows-msvc` target must be installed.
GPAC is also required; you can install it through Chocolatey:
```
choco install gpac
```
Other dependencies are installed through vcpkg. Follow the steps below:
1. Download vcpkg (prefer version `2023.02.24` as it is supported)
2. Integrate vcpkg into your system by running the command below in the downloaded vcpkg folder:
```
vcpkg integrate install
```
3. Set the environment variable for the vcpkg triplet; choose x86 or x64 based on your system.
```
setx VCPKG_DEFAULT_TRIPLET "x64-windows-static"
setx RUSTFLAGS "-Ctarget-feature=+crt-static"
```
4. Install dependencies from vcpkg
This step uses the `x64-windows-static` triplet; substitute the triplet you set in Step 3.
If building Debug-Full or Release-Full (HardSubx):
```
vcpkg install ffmpeg leptonica tesseract --triplet x64-windows-static
```
Note: Windows builds use FFmpeg 6 by default. To override:
```
set FFMPEG_VERSION=ffmpeg8
msbuild ccextractor.sln /p:Configuration=Debug-Full /p:Platform=x64
```
Otherwise, for the Debug and Release configurations:
```
vcpkg install libpng --triplet x64-windows-static
```
Note: The following screenshots and steps are based on Visual Studio 2017, but they should be more or less the same for other versions.
1. Open the `windows/` directory to locate `ccextractor.vcxproj` and `ccextractor.sln` (red arrow).
![Project Files](img/projectFiles.png)
2. Accept the security prompt (if any) to proceed with compilation.
![A warning you can receive](img/Warning.png)
3. Using Visual Studio (2015 or above), open `ccextractor.sln`. This will build both CCExtractor and its GUI. To build them separately, open the respective `.vcxproj` file.
4. In Solution Explorer, you'll see two projects with the VS version and Windows release version in parentheses. Change them to the values that apply to you by right-clicking the project and selecting Properties.
![Project Section](img/ProjectSection.png)
![Properties, that you have to change](img/Properties.png)
5. Right-click and select `build` to compile the project and generate the executable file.
![Building button](img/Building.png)
6. Find the executable file in the `Debug` or `Release` folder, based on the selected configuration.
![Path to Binaries](img/Binaries.png)
Configuration options are: `(Debug|Release)-Full`
These configurations include the dependent libraries used for OCR.
### Using CMake
You may also generate `.sln` files for Visual Studio and build using build tools, or open `.sln` files using Visual Studio.
```bash
cmake ../src/ -G "Visual Studio 14 2015"
cmake --build . --config Release --target ccextractor
```
### Using MSBuild
Run the following command in `windows/` directory
```bash
msbuild ccextractor.sln /p:Configuration=Release /p:Platform=x64
```
The available configurations are:
| Configuration | Platform | Rust target required |
| ------------- |:-------------:| -----:|
| Release | x64 | default |
| Debug | x64 | default |
| Release-Full(OCR) | Win32 | i686-pc-windows-msvc |
| Debug-Full(OCR) | Win32 | i686-pc-windows-msvc |
## Building Installation Packages
### Arch Linux
Go to the `package_creators` folder using `cd` and run `./arch.sh`.
### Red Hat Package Manager (rpm) based Linux Distributions
Go to the `package_creators` folder using `cd` and run `./rpm.sh`.


@@ -1,58 +0,0 @@
Overview
========
FFmpeg Intigration was done to support multiple encapsulator.
Dependecy
=========
FFmpeg library's
Download and Install FFmpeg on your linux pc.
---------------------------------------------
Download latest source code from following link
https://ffmpeg.org/download.html
then following command to install ffmpeg
./configure && make && make install
Note:If you installed ffmpeg on non standurd location, please change/update your
enviorment variable $PATH and $LD_LIBRARY_PATH
Download and Install FFmpeg on your Windows pc.
----------------------------------------------
Download prebuild library from following link
http://ffmpeg.zeranoe.com/builds/
You need to download Shared Versions to run the program and Dev Versions to compile.
How to compile ccextractor
==========================
In Linux
--------
make ENABLE_FFMPEG=yes
On Windows
----------
put the path of libs/include of ffmpeg library in library paths.
step 1) In visual studio 2013 right click <Project> and select property.
step 2) Select Configuration properties in left panel(column) of property.
step 3) Select VC++ Directory.
step 4) In the right pane, in the right-hand column of the VC++ Directory property,
open the drop-down menu and choose Edit.
Step 5) Add path of Directory where you have kept uncompressed library of FFmpeg.
Set preprocessor flag ENABLE_FFMPEG=1
Step 1)In visual studio 2013 right click <Project> and select property.
Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4)In the Preprocessor Definitions dialog box, add ENABLE_FFMPEG=1. Choose OK to save your changes.
Add library in linker
step 1)Open property of project
Step 2)Select Configuration properties
Step 3)Select Linker in left panel(column)
Step 4)Select Input
Step 5)Select Additional dependencies in right panel
Step 6)Add all FFmpeg's lib in new line

48
docs/FFMPEG.md Normal file

@@ -0,0 +1,48 @@
# Overview
FFmpeg Integration was done to support multiple encapsulations.
## Dependencies
FFmpeg libraries
### Download and Install FFmpeg on your Linux PC:
Download the latest source code from the following link:
https://ffmpeg.org/download.html
Then run the following command to install FFmpeg:
`./configure && make && make install`
Note: If you installed FFmpeg in a non-standard location, update your
environment variables `$PATH` and `$LD_LIBRARY_PATH` accordingly.
### Download and Install FFmpeg on your Windows PC:
1. Download vcpkg (prefer version `2023.02.24` as it is supported)
2. Integrate vcpkg into your system by running the command below in the downloaded vcpkg folder:
```
vcpkg integrate install
```
3. Set the environment variable for the vcpkg triplet; choose x86 or x64 based on your system.
```
setx VCPKG_DEFAULT_TRIPLET "x64-windows-static"
setx RUSTFLAGS "-Ctarget-feature=+crt-static"
```
4. Install ffmpeg from vcpkg
This step uses the `x64-windows-static` triplet; substitute the triplet you set in Step 3.
```
vcpkg install ffmpeg --triplet x64-windows-static
```
## How to compile ccextractor
### On Linux:
`make ENABLE_FFMPEG=yes`
### On Windows:
#### Set preprocessor flag `ENABLE_FFMPEG=1`
1. In Visual Studio 2022, right-click <Project> and select Properties.
2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
4. In the Preprocessor Definitions dialog box, add `ENABLE_FFMPEG=1`. Choose OK to save your changes.


@@ -1,4 +1,4 @@
Starting with version 0.51, ccextractor has a mode
Starting with version 0.51, CCExtractor has a mode
that allows frontends and other programs know what
the current progress is as well as get information
on interesting events, such as a file being open
@@ -80,7 +80,7 @@ VIDEOINFO - New video information found
Horizontal resolution
Vertical resolution
Aspect ratio
Framerate
Frame rate
Example: ###VIDEOINFO#1980#1080#16:9#29.97


@@ -3,7 +3,7 @@ G608
G608 (for grid 608) is generated by CCExtractor by using -out=g608.
This is a verbose format that exports the contents of the 608 grid verbatim
so there's no loss of positioning or colors due the limitations or complexy
so there's no loss of positioning or colors due the limitations or complexity
or other output formats.
G608 is a text file with a structure based on .srt and looks like this:
@@ -46,13 +46,13 @@ The possible color values are:
And the possible font values are:
R => Regular
I => Italic
I => Italics
U => Underlined
B => Underlined + italic
B => Underlined + Italics
If a 'E' is found in ether color or font that means a bug in CCExtractor. Should you ever get
If a 'E' is found in either color or font that means a bug in CCExtractor. Should you ever get
an E please send us a .bin file that causes it.
This format is intended for post processing tools that need to represent the output of a 608
decoder accurately but that don't want to deal with the madness of other more generic subtitle
formats.
formats.

86
docs/HARDSUBX.txt Normal file

@@ -0,0 +1,86 @@
Overview
========
Subtitles which are burned into the video (or hard subbed) can be extracted using the -hardsubx flag.
The system works by processing video frames and extracting only the subtitles from them, followed
by OCR using Tesseract.
Dependencies
============
Tesseract (OCR library by Google)
Leptonica (C Image processing library)
FFmpeg (Video Processing Library)
Compilation
===========
Linux
-----
Make sure Tesseract, Leptonica and FFmpeg are installed, and that their libraries can be found using pkg-config.
Refer to OCR.txt for installation details.
FFmpeg from packages (on Debian) plus a couple of other dependencies you will need:
sudo apt-get install libavcodec-dev libavformat-dev libavutil-dev libswscale-dev libxcb-shm0-dev liblzma-dev
FFmpeg from source:
To install FFmpeg (libav), follow the steps at:-
https://trac.ffmpeg.org/wiki/CompilationGuide/Ubuntu - For Ubuntu, Debian and Linux Mint
https://trac.ffmpeg.org/wiki/CompilationGuide/Generic - For generic Linux compilation
To validate your FFmpeg installation, make sure you can run the following commands on your terminal:-
pkg-config --cflags libavcodec
pkg-config --cflags libavformat
pkg-config --cflags libavutil
pkg-config --cflags libswscale
pkg-config --libs libavcodec
pkg-config --libs libavformat
pkg-config --libs libavutil
pkg-config --libs libswscale
On success, you should see the correct include directory path and the linker flags.
To build the program with hardsubx support,
== from the Linux directory run:-
./configure --enable-hardsubx
make ENABLE_HARDSUBX=yes
== using cmake from root directory
mkdir build
cd build
cmake -DWITH_OCR=on -DWITH_HARDSUBX=on ../src/
make
NOTE: The build has been tested with FFmpeg version 3.1.0, and Tesseract 3.04.
macOS
-----
Install the required dependencies using Homebrew:
brew install tesseract leptonica ffmpeg
To build the program with hardsubx support, use one of these methods:
== Using build.command (Recommended):
cd ccextractor/mac
./build.command -hardsubx
== Using autoconf:
cd ccextractor/mac
./autogen.sh
./configure --enable-hardsubx --enable-ocr
make
== Using cmake:
cd ccextractor
mkdir build && cd build
cmake -DWITH_OCR=ON -DWITH_HARDSUBX=ON ../src/
make
NOTE: The -hardsubx parameter uses a single dash (not --hardsubx).
Windows
-------
Coming Soon


@@ -1,17 +1,9 @@
A mailing list is now available from sourceforge:
A mailing list is now available from google groups:
https://groups.google.com/forum/#!forum/ccextractor-dev
The old one, hosted in sourceforge, is discontinued, but here is the link just in case:
https://lists.sourceforge.net/lists/listinfo/ccextractor-users
I expect it to be very low traffic (right now there's around 10
people actively helping with ccextractor in one way or
another), so almost everything goes here:
- Bug reports
- Feature requests
- Announcements
NOT here:
- Samples

123
docs/OCR.md Normal file

@@ -0,0 +1,123 @@
# Overview
OCR (Optical Character Recognition) is a technique used to
extract text from images. In the world of subtitles, subtitles stored
in bitmap format are common and even necessary. OCR is used to convert
subtitles in bitmap format to subtitles in text format.
# Dependencies
1. Tesseract (OCR library by Google)
2. Leptonica (Image processing library)
# How to compile CCExtractor on Linux with OCR
## Install Dependencies
### Using package manager
#### Ubuntu, Debian
```
sudo apt-get install libleptonica-dev libtesseract-dev tesseract-ocr-eng
```
#### Suse
```
zypper install leptonica-devel
```
### Downloading source code and compiling it
#### Leptonica
This package is usually available in your distribution; you need the liblept-devel library.
If Leptonica isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.
You can download the Leptonica source code from http://www.leptonica.com/download.html
#### Tesseract
Tesseract is available directly from many Linux distributions. The package is generally
called 'tesseract' or 'tesseract-ocr' - search your distribution's repositories to
find it. Packages are also generally available for language training data (search the
repositories), but if not, you will need to download the appropriate training data,
unpack it, and copy the .traineddata file into the 'tessdata' directory, probably
/usr/share/tesseract-ocr/tessdata or /usr/share/tessdata.
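Since the `tessdata` location varies, a small lookup can save some guessing. The sketch below checks the two directories named above by default; the function name `ccx_find_tessdata` is hypothetical, and other systems may use different paths, which can be passed as arguments.

```bash
# Hypothetical helper: print the first existing tessdata directory.
# With no arguments it checks the two locations mentioned above.
ccx_find_tessdata() {
    if [ "$#" -eq 0 ]; then
        set -- /usr/share/tesseract-ocr/tessdata /usr/share/tessdata
    fi
    for dir in "$@"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    echo "no tessdata directory found" >&2
    return 1
}

# Example:
#   sudo cp eng.traineddata "$(ccx_find_tessdata)/"
```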
If Tesseract isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.
If you compile Tesseract yourself, the following commands, run in its source directory, are enough:
```
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
```
Note:
1. CCExtractor is tested with Tesseract 3.04, but it also works with older versions.
2. Useful download links:
   1. *Tesseract* https://github.com/tesseract-ocr/tesseract/archive/3.04.00.tar.gz
   2. *Tesseract training data* https://github.com/tesseract-ocr/tessdata/archive/3.04.00.tar.gz
## Compilation
### Using build script
```
cd ccextractor/linux
./build
```
### Passing flags to configure
```
cd ccextractor/linux
./autogen.sh
./configure --with-gui --enable-ocr
make
```
### Passing flags to cmake
```
cd <CCExtractor cloned code>
mkdir build
cd build
cmake -DWITH_OCR=ON ../src
make
```
# How to compile CCExtractor on Windows with OCR
Download the prebuilt libraries of Leptonica and Tesseract from the following link:
https://drive.google.com/file/d/0B2ou7ZfB-2nZOTRtc3hJMHBtUFk/view?usp=sharing
Put the paths of the libs/include directories of Leptonica and Tesseract in the library paths.
1. In Visual Studio 2022, right-click <Project> and select Properties.
2. Select Configuration Properties in the left panel (column) of the Properties dialog.
3. Select VC++ Directories.
4. In the right pane, in the right-hand column of the VC++ Directories property, open the drop-down menu and choose Edit.
5. Add the path of the directory where you have kept the uncompressed Leptonica and Tesseract libraries.
Set preprocessor flag ENABLE_OCR=1
1. In visual studio 2022 right click <Project> and select property.
2. In the left panel, select Configuration Properties, C/C++, Preprocessor.
3. In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
4. In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
Add library in linker
1. Open property of project
2. Select Configuration properties
3. Select Linker in left panel(column)
4. Select Input
5. Select Additional dependencies in right panel
6. Add libtesseract304d.lib in new line
7. Add liblept172.lib in new line
Download language data from the following link:
https://code.google.com/p/tesseract-ocr/downloads/list
After downloading tesseract-ocr-3.02.eng.tar.gz, extract the tar file and put the
tessdata folder where you have kept the CCExtractor executable.
Copy the Tesseract and Leptonica DLLs from the lib folder downloaded from the above link to the folder of the executable or to system32.


@@ -1,94 +0,0 @@
Overview
========
OCR (Optical Character Recognisation ) is an technique used to
extract text from images. In the World of Subtile, subtitle stored
in bitmap format are common and even neccassary. for converting subtile
in bitmap format to subtilte in text format ocr is used.
Dependency
==========
Tesseract (OCR library by google)
Leptonica (image processing library)
How to compile ccextractor on linux with OCR
=============================================
Download and Install Leptonnica.
-------------------------------
This package is available, you need liblept-devel library.
If Leptonica isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.
you can download lib leptonica from http://www.leptonica.com/download.html
Download and Install Tesseract.
-------------------------------
Tesseract is available directly from many Linux distributions. The package is generally
called 'tesseract' or 'tesseract-ocr' - search your distribution's repositories to
find it. Packages are also generally available for language training data (search the
repositories,) but if not you will need to download the appropriate training data,
unpack it, and copy the .traineddata file into the 'tessdata' directory, probably
/usr/share/tesseract-ocr/tessdata or /usr/share/tessdata.
If Tesseract isn't available for your distribution, or you want to use a newer version
than they offer, you can compile your own.
If you compile Tesseract then following command in its source code are enough
./autogen.sh
./configure
make
sudo make install
sudo ldconfig
Note:
1) CCExtractor is tested with Tesseract 3.04 version but it works with older versions.
you can download tesseract from https://github.com/tesseract-ocr/tesseract/archive/3.04.00.tar.gz
you can download tesseract training data from https://github.com/tesseract-ocr/tessdata/archive/3.04.00.tar.gz
Compile CCextractor passing flags like following
-------------------------------------------------
make ENABLE_OCR=yes
How to compile ccextractor on Windows with OCR
===============================================
Download prebuild library of leptonica and tesseract from following link
https://drive.google.com/file/d/0B2ou7ZfB-2nZOTRtc3hJMHBtUFk/view?usp=sharing
put the path of libs/include of leptonica and tesseract in library paths.
step 1) In visual studio 2013 right click <Project> and select property.
step 2) Select Configuration properties in left panel(column) of property.
step 3) Select VC++ Directory.
step 4) In the right pane, in the right-hand column of the VC++ Directory property,
open the drop-down menu and choose Edit.
Step 5) Add path of Directory where you have kept uncompressed library of leptonica
and tesseract.
Set preprocessor flag ENABLE_OCR=1
Step 1)In visual studio 2013 right click <Project> and select property.
Step 2)In the left panel, select Configuration Properties, C/C++, Preprocessor.
Step 3)In the right panel, in the right-hand column of the Preprocessor Definitions property, open the drop-down menu and choose Edit.
Step 4)In the Preprocessor Definitions dialog box, add ENABLE_OCR=1. Choose OK to save your changes.
Add library in linker
step 1)Open property of project
Step 2)Select Configuration properties
Step 3)Select Linker in left panel(column)
Step 4)Select Input
Step 5)Select Additional dependencies in right panel
Step 6)Add libtesseract304d.lib in new line
Step 7)Add liblept172.lib in new line
Download language data from following link
https://code.google.com/p/tesseract-ocr/downloads/list
after downloading the tesseract-ocr-3.02.eng.tar.gz extract the tar file and put
tessdata folder where you have kept ccextractor executable
Copy the tesseract and leptonica dll from lib folder downloaded from above link to folder of executable or in system32.


@@ -1,49 +1,16 @@
ccextractor, 0.81
-----------------
Authors: Carlos Fernández (cfsmp3), Volker Quetschke.
Maintainer: cfsmp3
## CCExtractor
check AUTHORS.TXT for history and developers
Lots of credit goes to other people, though:
McPoodle (author of the original SCC_RIP), Neuron2, and others (see source
code).
Home: http://www.ccextractor.org
You can subscribe to new releases notifications at freshmeat:
http://freshmeat.net/projects/ccextractor
Google Summer of Code 2014 students
- Willem Van Iseghem
- Ruslan KuchumoV
- Anshul Maheshwari
Google Summer of Code 2015 students
- Willem Van Iseghem
- Ruslan Kuchumov
- Anshul Maheshwari
- Nurendra Choudhary
- Oleg Kiselev
- Vasanth Kalingeri
License
-------
## License
GPL 2.0.
Description
-----------
ccextractor was originally a mildly optimized C port of McPoodle's excellent
but painfully slow Perl script SCC_RIP. It lets you rip the raw closed
captions (read: subtitles) data from a number of sources, such as DVD or
ATSC (digital TV) streams.
Since the original port, lots of changes have been made, such as HDTV
support, analog captures support (via bttv cards), direct .srt/.smi
generation, time adjusting, and more.
## Description
Since the original port, the whole code has been rewritten (more than once,
one might add) and support for most subtitle formats around the world has
been added (teletext, DVB, CEA-708, ISDB...)
Basic Usage
-----------
## Basic Usage
(please run ccextractor with no parameters for the complete manual -
this is for your convenience, really).
@@ -59,9 +26,16 @@ Running ccextractor without parameters shows the help screen. Usage is
trivial - you just need to pass the input file and (optionally) some
details about the input and output files.
Example:
Languages
---------
ccextractor input_video.ts
This command extracts subtitles from the input video file and generates a subtitle output file
(such as .srt) in the same directory.
## Languages
Usually English captions are transmitted in line 21 field 1 data,
using channel 1, so the default values are correct so you don't
need to do anything and you don't need to understand what it all
@@ -79,20 +53,17 @@ So try adding these parameter combinations to your other parameters.
If there are Spanish subtitles, one of them should work.
McPoodle's page
---------------
## McPoodle's page
http://www.theneitherworld.com/mcpoodle/SCC_TOOLS/DOCS/SCC_TOOLS.HTML
Essential CC related information and free (with source) tools.
Encoding
--------
## Encoding
This version, in both its Linux and Windows builds, generates Unicode
files by default. You can use -latin1 and -utf8 if you prefer
these encodings (usually it just depends on what your specific
player likes).
Future work
-----------
## Future work
- Please check www.ccextractor.org for news and future work.


@@ -0,0 +1,71 @@
# C to Rust Migration Guide
## Porting C Functions to Rust
This guide outlines the process of migrating C functions to Rust while maintaining compatibility with existing C code.
### Step 1: Identify the C Function
First, identify the C function you want to port. For example, let's consider a function named `net_send_cc()` in a file called `networking.c`:
```c
void net_send_cc() {
    // Some C code
}
```
### Step 2: Create a Pure Rust Equivalent
Write an equivalent function in pure Rust within the `lib_ccxr` crate:
```rust
fn net_send_cc() {
    // Rust equivalent code to `net_send_cc` function in `networking.c`
}
```
### Step 3: Create a C-Compatible Rust Function
In the `libccxr_exports` module, create a new function that will be callable from C:
```rust
#[no_mangle]
pub extern "C" fn ccxr_net_send_cc() {
    net_send_cc() // Call the pure Rust function
}
```
### Step 4: Declare the Rust Function in C
In the original C file (`networking.c`), declare the Rust function as an external function:
```c
extern void ccxr_net_send_cc();
```
### Step 5: Modify the Original C Function
Update the original C function to use the Rust implementation when available:
```c
void net_send_cc() {
#ifndef DISABLE_RUST
    ccxr_net_send_cc(); // Use the Rust implementation
#else
    // Original C code
#endif
}
```
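The `net_send_cc()` example takes no arguments. When a ported function receives a buffer from C, the C-compatible layer usually also has to validate the raw pointer and length before handing a slice to the idiomatic layer. The following is a minimal sketch of that pattern, not code from CCExtractor: `packet_checksum` and `ccxr_packet_checksum` are hypothetical names invented for illustration, and the `-1`/`0` return convention is an assumption.

```rust
use std::os::raw::{c_int, c_uchar};

/// Idiomatic layer (would live in the `lib_ccxr` crate).
/// Hypothetical example: sums all bytes with wrapping arithmetic.
pub fn packet_checksum(data: &[u8]) -> u32 {
    data.iter()
        .fold(0u32, |acc, &b| acc.wrapping_add(u32::from(b)))
}

/// C-like layer (would live in the `libccxr_exports` module).
/// Returns 0 on success and -1 on invalid arguments (assumed convention).
///
/// # Safety
/// `buf` must point to at least `len` readable bytes and `out` must point
/// to a writable `u32`, unless they are null (null is rejected).
#[no_mangle]
pub unsafe extern "C" fn ccxr_packet_checksum(
    buf: *const c_uchar,
    len: c_int,
    out: *mut u32,
) -> c_int {
    // Reject null pointers and negative lengths before touching memory.
    if buf.is_null() || out.is_null() || len < 0 {
        return -1;
    }
    // Build a borrowed slice over the caller's buffer; no copy is made.
    let data = unsafe { std::slice::from_raw_parts(buf, len as usize) };
    // Delegate the actual work to the pure Rust function.
    unsafe { *out = packet_checksum(data) };
    0
}
```

On the C side, the matching declaration under these assumptions would be `extern int ccxr_packet_checksum(const unsigned char *buf, int len, unsigned int *out);`. Keeping the checks in the wrapper lets the idiomatic function stay free of `unsafe`.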
## Rust module system
- `lib_ccxr` crate -> **The Idiomatic Rust layer**
  - Path: `src/rust/lib_ccxr`
  - This layer will contain the migrated idiomatic Rust. It will have complete documentation and tests.
- `libccxr_exports` module -> **The C-like Rust layer**
  - Path: `src/rust/src/libccxr_exports`
  - This layer will have function names the same as defined in C but with the prefix `ccxr_`. These are the functions defined in the `lib_ccx` crate under appropriate modules. And these functions will be provided to the C library.
  - Ex: `extern "C" fn ccxr_<function_name>(<args>) {}`


@@ -0,0 +1,27 @@
A guide to how dependencies should be updated in CCExtractor.
Author: thealphadollar
======================
CCExtractor relies on multiple dependencies, and they are updated from time to time. On every major revision of a dependency, the changes need to be incorporated into our repository.
This is not straightforward, since we make minor (or sometimes major) changes to each library in order to use it, and these changes are lost if the files are replaced directly. To overcome this issue, we should follow the pathway below.
*) Create a duplicate copy of CCExtractor's folder of the library to be updated (we will call this folder lib(copy) in the steps below and the original one lib).
*) Download the latest files of the library from the official source (this folder is called lib(orig) in further steps).
*) Look for files with the same name in lib and lib(orig). It can be done manually in case of small libraries (libpng), otherwise a script can be written utilising the grep command to find out files from the library which we use.
*) In lib, replace all the files (found in previous step) with their updated versions from lib(orig). A copy command can be used in the script written for the previous step to accomplish this step.
Now, the files in our repository have been updated. In steps to follow, we will try to grab lost changes using lib(copy).
*) Run the diff command between lib(copy) and lib for all files and store the output in a text document. Files from lib(copy) should be given as the first argument so that deletions are clearly visible.
*) Look for deletions in each updated file and manually inspect (or ask a mentor) whether that part is to be restored or not. In most cases it is to be restored, but it's better to ask than to break.
Once the changes have been restored, try to compile CCExtractor. It is very likely that the compilation will fail. The most probable reason for this is the inclusion of unnecessary lines of code and their accompanying dependencies.
e.g. "X is not defined" can be an error when we neither include the file in which X is defined nor remove the unnecessary line using X.
CCExtractor doesn't use a library fully; we use only the code and files necessary. This requires manual removal of extra lines and dependencies.
*) Output the compilation errors to a text document while compiling.
*) Use inspection and comparison with lib(copy) to decide whether the line causing an error is to be removed.
Compile again, debug, and push the change for the Continuous Integration tests on samples.

129
docs/VOBSUB.md Normal file

@@ -0,0 +1,129 @@
# VOBSUB Subtitle Extraction from MKV Files
CCExtractor supports extracting VOBSUB (S_VOBSUB) subtitles from Matroska (MKV) containers. VOBSUB is an image-based subtitle format originally from DVD video.
## Overview
VOBSUB subtitles consist of two files:
- `.idx` - Index file containing metadata, palette, and timestamp/position entries
- `.sub` - Binary file containing the actual subtitle bitmap data in MPEG Program Stream format
## Basic Usage
```bash
ccextractor movie.mkv
```
This will extract all VOBSUB tracks and create paired `.idx` and `.sub` files:
- `movie_eng.idx` + `movie_eng.sub` (first English track)
- `movie_eng_1.idx` + `movie_eng_1.sub` (second English track, if present)
- etc.
## Converting VOBSUB to SRT (Text)
Since VOBSUB subtitles are images, you need OCR (Optical Character Recognition) to convert them to text-based formats like SRT.
### Using subtile-ocr (Recommended)
[subtile-ocr](https://github.com/gwen-lg/subtile-ocr) is an actively maintained Rust tool that provides accurate OCR conversion.
#### Option 1: Docker (Easiest)
We provide a Dockerfile that builds subtile-ocr with all dependencies:
```bash
# Build the Docker image (one-time)
cd tools/vobsubocr
docker build -t subtile-ocr .
# Extract VOBSUB from MKV
ccextractor movie.mkv
# Convert to SRT using OCR
docker run --rm -v "$(pwd)":/data subtile-ocr -l eng -o /data/movie_eng.srt /data/movie_eng.idx
```
#### Option 2: Install subtile-ocr Natively
If you have Rust and Tesseract development libraries installed:
```bash
# Install dependencies (Ubuntu/Debian)
sudo apt-get install libleptonica-dev libtesseract-dev tesseract-ocr tesseract-ocr-eng
# Install subtile-ocr
cargo install --git https://github.com/gwen-lg/subtile-ocr
# Convert
subtile-ocr -l eng -o movie_eng.srt movie_eng.idx
```
### subtile-ocr Options
| Option | Description |
|--------|-------------|
| `-l, --lang <LANG>` | Tesseract language code (required). Examples: `eng`, `fra`, `deu`, `chi_sim` |
| `-o, --output <FILE>` | Output SRT file (stdout if not specified) |
| `-t, --threshold <0.0-1.0>` | Binarization threshold (default: 0.6) |
| `-d, --dpi <DPI>` | Image DPI for OCR (default: 150) |
| `--dump` | Save processed subtitle images as PNG files |
### Language Codes
Install additional Tesseract language packs as needed:
```bash
# Examples
sudo apt-get install tesseract-ocr-fra # French
sudo apt-get install tesseract-ocr-deu # German
sudo apt-get install tesseract-ocr-spa # Spanish
sudo apt-get install tesseract-ocr-chi-sim # Simplified Chinese
```
## Technical Details
### .idx File Format
The index file contains:
1. Header with metadata (size, palette, alignment settings)
2. Language identifier line
3. Timestamp entries with file positions
Example:
```
# VobSub index file, v7 (do not modify this line!)
size: 720x576
palette: 000000, 828282, ...
id: eng, index: 0
timestamp: 00:01:12:920, filepos: 000000000
timestamp: 00:01:18:640, filepos: 000000800
...
```
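To make the entry layout concrete, here is a small sketch of parsing one `timestamp: ..., filepos: ...` line. It is not CCExtractor's parser: `IdxEntry` and `parse_idx_entry` are hypothetical names, and the sketch assumes the timestamp fields are `HH:MM:SS:mmm` and that `filepos` is hexadecimal (under that reading, `000000800` is 2048, which agrees with the 2048-byte alignment described for the `.sub` file).

```rust
/// One timestamp entry from a `.idx` file (hypothetical type for illustration).
#[derive(Debug, PartialEq)]
pub struct IdxEntry {
    /// Presentation time in milliseconds.
    pub time_ms: u64,
    /// Byte offset of the subtitle packet inside the `.sub` file.
    pub filepos: u64,
}

/// Parses a line such as `timestamp: 00:01:18:640, filepos: 000000800`.
/// Returns `None` for any line that does not match that shape.
pub fn parse_idx_entry(line: &str) -> Option<IdxEntry> {
    let rest = line.trim().strip_prefix("timestamp:")?;

    // Split into the time part and the `filepos:` part at the first comma.
    let mut parts = rest.splitn(2, ',');
    let time = parts.next()?.trim();
    let pos = parts.next()?.trim().strip_prefix("filepos:")?.trim();

    // Assumed layout: hours, minutes, seconds, milliseconds.
    let mut fields = time.split(':');
    let h: u64 = fields.next()?.parse().ok()?;
    let m: u64 = fields.next()?.parse().ok()?;
    let s: u64 = fields.next()?.parse().ok()?;
    let ms: u64 = fields.next()?.parse().ok()?;
    if fields.next().is_some() {
        return None; // more than four fields: not a timestamp we understand
    }

    Some(IdxEntry {
        time_ms: ((h * 60 + m) * 60 + s) * 1000 + ms,
        // Assumption: the offset is written in hexadecimal.
        filepos: u64::from_str_radix(pos, 16).ok()?,
    })
}
```

Header lines such as `size:` or `id:` simply yield `None`, so a caller can run every line of the file through the same function and keep only the entries.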
### .sub File Format
The binary file contains MPEG Program Stream packets:
- Each subtitle is wrapped in a PS Pack header (14 bytes) + PES header (15 bytes)
- Subtitles are aligned to 2048-byte boundaries
- Contains raw SPU (SubPicture Unit) bitmap data
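The alignment rule can be illustrated with a little arithmetic. The sketch below is an assumption-laden illustration rather than CCExtractor's writer: it supposes that each subtitle fits in a single packet with exactly one 14-byte pack header and one 15-byte PES header, and `align_up` and `next_subtitle_offset` are hypothetical helper names.

```rust
/// Alignment boundary for subtitle packets, in bytes.
const SECTOR: u64 = 2048;
/// Size of the PS Pack header, in bytes.
const PS_PACK_HEADER: u64 = 14;
/// Size of the PES header, in bytes.
const PES_HEADER: u64 = 15;

/// Rounds `offset` up to the next multiple of 2048 (unchanged if already aligned).
pub fn align_up(offset: u64) -> u64 {
    (offset + SECTOR - 1) / SECTOR * SECTOR
}

/// Given the aligned start of one subtitle and the length of its SPU data,
/// returns the offset at which the next subtitle would begin.
/// Assumes the whole SPU is carried in a single packet.
pub fn next_subtitle_offset(start: u64, spu_len: u64) -> u64 {
    align_up(start + PS_PACK_HEADER + PES_HEADER + spu_len)
}
```

For example, a 1000-byte SPU starting at offset 0 occupies 14 + 15 + 1000 = 1029 bytes, so under these assumptions the next subtitle starts at offset 2048, which is `filepos: 000000800` when written in hexadecimal.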
## Troubleshooting
### Empty output files
- Ensure the MKV file actually contains VOBSUB tracks (check with `mediainfo` or `ffprobe`)
- CCExtractor will report "No VOBSUB subtitles to write" if the track is empty
### OCR quality issues
- Try adjusting the `-t` threshold parameter
- Ensure the correct language pack is installed
- Use `--dump` to inspect the processed images
### Docker permission issues
- The output files may be owned by root; use `sudo chown` to fix ownership
- Or run Docker with `--user $(id -u):$(id -g)`
## See Also
- [OCR.md](OCR.md) - General OCR support in CCExtractor
- [subtile-ocr GitHub](https://github.com/gwen-lg/subtile-ocr) - OCR tool documentation

137
docs/build-wsl.md Normal file

@@ -0,0 +1,137 @@
# Building CCExtractor on Windows using WSL
This guide explains how to build CCExtractor on Windows using WSL (Ubuntu).
It is based on a fresh setup and includes all required dependencies and
common build issues encountered during compilation.
---
## Prerequisites
- Windows 10 or Windows 11
- WSL enabled
- Ubuntu installed via Microsoft Store
---
## Install WSL and Ubuntu
From PowerShell (run as Administrator):
```powershell
wsl --install -d Ubuntu
```
Restart the system if prompted, then launch Ubuntu from the Start menu.
---
## Update system packages
```bash
sudo apt update
```
---
## Install basic build tools
```bash
sudo apt install -y build-essential git pkg-config
```
---
## Install Rust (required)
CCExtractor includes Rust components, so Rust and Cargo are required.
```bash
curl https://sh.rustup.rs -sSf | sh
source ~/.cargo/env
```
Verify installation:
```bash
cargo --version
rustc --version
```
---
## Install required libraries
```bash
sudo apt install -y \
libclang-dev clang \
libtesseract-dev tesseract-ocr \
libgpac-dev
```
---
## Clone the repository
```bash
git clone https://github.com/CCExtractor/ccextractor.git
cd ccextractor
```
---
## Build CCExtractor
```bash
cd linux
./build
```
After a successful build, verify by running:
```bash
./ccextractor
```
You should see the help/usage output.
---
## Common build issues
### cargo: command not found
```bash
source ~/.cargo/env
```
---
### Unable to find libclang
```bash
sudo apt install libclang-dev clang
```
---
### gpac/isomedia.h: No such file or directory
```bash
sudo apt install libgpac-dev
```
---
### please install tesseract development library
```bash
sudo apt install libtesseract-dev tesseract-ocr
```
---
## Notes
- Compiler warnings during the build process are expected and do not indicate failure.
- This guide was tested on Ubuntu (WSL) running on Windows 11.


@@ -1,27 +1,30 @@
#######################################################
# Version 0.01
# Version 0.02
#
# To enable required option please uncommnent option
# To enable required option please uncomment option
#
# The Input Source tag option give ability to user
# to take imput from file, standurd input or network
# This tag take number in its input and there meanings
# The Input Source tag option gives ability to user
# to take input from file, standard input or network
# This tag takes number in its input and their meanings
# are following
# 0 = file
# 1 = stdin
# 2 = network
# 3 = tcp
INPUT_SOURCE=0
# The Buffer Input tag
# This tag take number in its input.
# This tag takes number in its input.
# Is it ccx_bufferdata_type ?
#BUFFER_INPUT=0
# The Direct Rollup tag
# This tag take number in its input and there meanings
# This tag takes number in its input and their meanings
# are following
# 0 = no
# 1 = yes
@@ -29,7 +32,7 @@ INPUT_SOURCE=0
#DIRECT_ROLLUP=
#The No font Color Tag
# This tag take number in its input and there meanings
# This tag takes number in its input and their meanings
# are following
# 0 = no
# 1 = yes
@@ -37,46 +40,58 @@ INPUT_SOURCE=0
#NOFONT_COLOR=
#The No type Setting Tag
# This tag take number in its input and there meanings
# This tag takes number in its input and their meanings
# are following
# 0 = no
# 1 = yes
#NOTYPE_SETTING=
# The Codec Tag take the preference of codec
# tag CCX_CODEC_ANY is by default
# This tag take number in its input and there meaning
# The Codec Tag takes the preference of codec
# tag CCX_CODEC_ANY by default
# This tag takes number in its input and their meanings
# are following
# 0 = CCX_CODEC_ANY (default)
# 1 = CCX_CODEC_TELETEXT
# 2 = CCX_CODEC_DVB
# 3 = CCX_CODEC_ISDB_CC
# 4 = CCX_CODEC_ATSC_CC
# 5 = CCX_CODEC_NONE
#CODEC=
# The NO Codec Tag uses codec specified
# tag CCX_CODEC_NONE by default
# This tag takes number in its input and their meanings
# are following
# 0 = CCX_CODEC_ANY
# 1 = CCX_CODEC_TELETEXT
# 2 = CCX_CODEC_DVB
#CODEC=
# The NO Codec Tag do not use codec specified
# tag CCX_CODEC_NONE is by default
# This tag take number in its input and there meaning
# are following
# 1 = CCX_CODEC_TELETEXT
# 2 = CCX_CODEC_DVB
# 3 = CCX_CODEC_NONE
# 3 = CCX_CODEC_ISDB_CC
# 4 = CCX_CODEC_ATSC_CC
# 5 = CCX_CODEC_NONE (default)
#NOCODEC=
# OUTPUT_FORMAT tag specify format of output
# by default output format is srt
# This tag take number in its input and there meaning
# This tag takes number in its input and their meanings
# are following
# 0 = CCX_OF_RAW
# 1 = CCX_OF_SRT (default)
# 2 = CCX_OF_SAMI
# 3 = CCX_OF_TRANSCRIPT
# 4 = CCX_OF_RCWT
# 5 = CCX_OF_NULL
# 6 = CCX_OF_SMPTETT
# 7 = CCX_OF_SPUPNG
# 8 = CCX_OF_DVDRAW
# 0 = CCX_OF_RAW
# 1 = CCX_OF_SRT (default)
# 2 = CCX_OF_SAMI
# 3 = CCX_OF_TRANSCRIPT
# 4 = CCX_OF_RCWT
# 5 = CCX_OF_NULL
# 6 = CCX_OF_SMPTETT
# 7 = CCX_OF_SPUPNG
# 8 = CCX_OF_DVDRAW
# 9 = CCX_OF_WEBVTT
# 10 = CCX_OF_SIMPLE_XML
# 11 = CCX_OF_G608
# 12 = CCX_OF_CURL
# 13 = CCX_OF_SSA
# 14 = CCX_OF_MCC
#OUTPUT_FORMAT=
@@ -87,7 +102,7 @@ INPUT_SOURCE=0
#START_CREDIT_TEXT=
# Start credit do not start before apecified time in tag
# Start credit do not start before specified time in tag
# this tag only accepts SS, MM:SS or HH:MM:SS
#START_CREDIT_NOT_BEFORE=
@@ -124,7 +139,7 @@ INPUT_SOURCE=0
#END_CREDITS_FOR_ATMOST
# Is Video edited or splitted by tool
# Is Video edited or split by tool
# By default its 1, ccextractor will process input files in
# sequence as if they were all one large file i.e
# split by a generic, non video-aware tool. If you
@@ -140,7 +155,7 @@ INPUT_SOURCE=0
# overrides the default PTS timing,GOP timing is always
# used for Elementary Streams.
#
# This tag take number in its input and there meaning
# This tag takes number in its input and their meanings
# are following
# 0 = use pts for time (when reasonable)
# 1 = use gop for time (when reasonable)
@@ -164,7 +179,7 @@ INPUT_SOURCE=0
# emulation, you can have ccextractor write only one
# line at a time, getting rid of these repeated lines.
#
# This tag take number in its input and there meaning
# This tag take number in its input and their meanings
# are following
# 0 = no (default)
# 1 = yes

340
docs/freetype.TXT Normal file

@@ -0,0 +1,340 @@
GNU GENERAL PUBLIC LICENSE
Version 2, June 1991
Copyright (C) 1989, 1991 Free Software Foundation, Inc.
51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The licenses for most software are designed to take away your
freedom to share and change it. By contrast, the GNU General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users. This
General Public License applies to most of the Free Software
Foundation's software and to any other program whose authors commit to
using it. (Some other Free Software Foundation software is covered by
the GNU Library General Public License instead.) You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
this service if you wish), that you receive source code or can get it
if you want it, that you can change the software or use pieces of it
in new free programs; and that you know you can do these things.
To protect your rights, we need to make restrictions that forbid
anyone to deny you these rights or to ask you to surrender the rights.
These restrictions translate to certain responsibilities for you if you
distribute copies of the software, or if you modify it.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must give the recipients all the rights that
you have. You must make sure that they, too, receive or can get the
source code. And you must show them these terms so they know their
rights.
We protect your rights with two steps: (1) copyright the software, and
(2) offer you this license which gives you legal permission to copy,
distribute and/or modify the software.
Also, for each author's protection and ours, we want to make certain
that everyone understands that there is no warranty for this free
software. If the software is modified by someone else and passed on, we
want its recipients to know that what they have is not the original, so
that any problems introduced by others will not reflect on the original
authors' reputations.
Finally, any free program is threatened constantly by software
patents. We wish to avoid the danger that redistributors of a free
program will individually obtain patent licenses, in effect making the
program proprietary. To prevent this, we have made it clear that any
patent must be licensed for everyone's free use or not licensed at all.
The precise terms and conditions for copying, distribution and
modification follow.
GNU GENERAL PUBLIC LICENSE
TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
0. This License applies to any program or other work which contains
a notice placed by the copyright holder saying it may be distributed
under the terms of this General Public License. The "Program", below,
refers to any such program or work, and a "work based on the Program"
means either the Program or any derivative work under copyright law:
that is to say, a work containing the Program or a portion of it,
either verbatim or with modifications and/or translated into another
language. (Hereinafter, translation is included without limitation in
the term "modification".) Each licensee is addressed as "you".
Activities other than copying, distribution and modification are not
covered by this License; they are outside its scope. The act of
running the Program is not restricted, and the output from the Program
is covered only if its contents constitute a work based on the
Program (independent of having been made by running the Program).
Whether that is true depends on what the Program does.
1. You may copy and distribute verbatim copies of the Program's
source code as you receive it, in any medium, provided that you
conspicuously and appropriately publish on each copy an appropriate
copyright notice and disclaimer of warranty; keep intact all the
notices that refer to this License and to the absence of any warranty;
and give any other recipients of the Program a copy of this License
along with the Program.
You may charge a fee for the physical act of transferring a copy, and
you may at your option offer warranty protection in exchange for a fee.
2. You may modify your copy or copies of the Program or any portion
of it, thus forming a work based on the Program, and copy and
distribute such modifications or work under the terms of Section 1
above, provided that you also meet all of these conditions:
a) You must cause the modified files to carry prominent notices
stating that you changed the files and the date of any change.
b) You must cause any work that you distribute or publish, that in
whole or in part contains or is derived from the Program or any
part thereof, to be licensed as a whole at no charge to all third
parties under the terms of this License.
c) If the modified program normally reads commands interactively
when run, you must cause it, when started running for such
interactive use in the most ordinary way, to print or display an
announcement including an appropriate copyright notice and a
notice that there is no warranty (or else, saying that you provide
a warranty) and that users may redistribute the program under
these conditions, and telling the user how to view a copy of this
License. (Exception: if the Program itself is interactive but
does not normally print such an announcement, your work based on
the Program is not required to print an announcement.)
These requirements apply to the modified work as a whole. If
identifiable sections of that work are not derived from the Program,
and can be reasonably considered independent and separate works in
themselves, then this License, and its terms, do not apply to those
sections when you distribute them as separate works. But when you
distribute the same sections as part of a whole which is a work based
on the Program, the distribution of the whole must be on the terms of
this License, whose permissions for other licensees extend to the
entire whole, and thus to each and every part regardless of who wrote it.
Thus, it is not the intent of this section to claim rights or contest
your rights to work written entirely by you; rather, the intent is to
exercise the right to control the distribution of derivative or
collective works based on the Program.
In addition, mere aggregation of another work not based on the Program
with the Program (or with a work based on the Program) on a volume of
a storage or distribution medium does not bring the other work under
the scope of this License.
3. You may copy and distribute the Program (or a work based on it,
under Section 2) in object code or executable form under the terms of
Sections 1 and 2 above provided that you also do one of the following:
a) Accompany it with the complete corresponding machine-readable
source code, which must be distributed under the terms of Sections
1 and 2 above on a medium customarily used for software interchange; or,
b) Accompany it with a written offer, valid for at least three
years, to give any third party, for a charge no more than your
cost of physically performing source distribution, a complete
machine-readable copy of the corresponding source code, to be
distributed under the terms of Sections 1 and 2 above on a medium
customarily used for software interchange; or,
c) Accompany it with the information you received as to the offer
to distribute corresponding source code. (This alternative is
allowed only for noncommercial distribution and only if you
received the program in object code or executable form with such
an offer, in accord with Subsection b above.)
The source code for a work means the preferred form of the work for
making modifications to it. For an executable work, complete source
code means all the source code for all modules it contains, plus any
associated interface definition files, plus the scripts used to
control compilation and installation of the executable. However, as a
special exception, the source code distributed need not include
anything that is normally distributed (in either source or binary
form) with the major components (compiler, kernel, and so on) of the
operating system on which the executable runs, unless that component
itself accompanies the executable.
If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code.
4. You may not copy, modify, sublicense, or distribute the Program
except as expressly provided under this License. Any attempt
otherwise to copy, modify, sublicense or distribute the Program is
void, and will automatically terminate your rights under this License.
However, parties who have received copies, or rights, from you under
this License will not have their licenses terminated so long as such
parties remain in full compliance.
5. You are not required to accept this License, since you have not
signed it. However, nothing else grants you permission to modify or
distribute the Program or its derivative works. These actions are
prohibited by law if you do not accept this License. Therefore, by
modifying or distributing the Program (or any work based on the
Program), you indicate your acceptance of this License to do so, and
all its terms and conditions for copying, distributing or modifying
the Program or works based on it.
6. Each time you redistribute the Program (or any work based on the
Program), the recipient automatically receives a license from the
original licensor to copy, distribute or modify the Program subject to
these terms and conditions. You may not impose any further
restrictions on the recipients' exercise of the rights granted herein.
You are not responsible for enforcing compliance by third parties to
this License.
7. If, as a consequence of a court judgment or allegation of patent
infringement or for any other reason (not limited to patent issues),
conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot
distribute so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you
may not distribute the Program at all. For example, if a patent
license would not permit royalty-free redistribution of the Program by
all those who receive copies directly or indirectly through you, then
the only way you could satisfy both it and this License would be to
refrain entirely from distribution of the Program.
If any portion of this section is held invalid or unenforceable under
any particular circumstance, the balance of the section is intended to
apply and the section as a whole is intended to apply in other
circumstances.
It is not the purpose of this section to induce you to infringe any
patents or other property right claims or to contest validity of any
such claims; this section has the sole purpose of protecting the
integrity of the free software distribution system, which is
implemented by public license practices. Many people have made
generous contributions to the wide range of software distributed
through that system in reliance on consistent application of that
system; it is up to the author/donor to decide if he or she is willing
to distribute software through any other system and a licensee cannot
impose that choice.
This section is intended to make thoroughly clear what is believed to
be a consequence of the rest of this License.
8. If the distribution and/or use of the Program is restricted in
certain countries either by patents or by copyrighted interfaces, the
original copyright holder who places the Program under this License
may add an explicit geographical distribution limitation excluding
those countries, so that distribution is permitted only in or among
countries not thus excluded. In such case, this License incorporates
the limitation as if written in the body of this License.
9. The Free Software Foundation may publish revised and/or new versions
of the General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the Program
specifies a version number of this License which applies to it and "any
later version", you have the option of following the terms and conditions
either of that version or of any later version published by the Free
Software Foundation. If the Program does not specify a version number of
this License, you may choose any version ever published by the Free Software
Foundation.
10. If you wish to incorporate parts of the Program into other free
programs whose distribution conditions are different, write to the author
to ask for permission. For software which is copyrighted by the Free
Software Foundation, write to the Free Software Foundation; we sometimes
make exceptions for this. Our decision will be guided by the two goals
of preserving the free status of all derivatives of our free software and
of promoting the sharing and reuse of software generally.
NO WARRANTY
11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
REPAIR OR CORRECTION.
12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
POSSIBILITY OF SUCH DAMAGES.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
convey the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
<one line to give the program's name and a brief idea of what it does.>
Copyright (C) <year> <name of author>
This program is free software; you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation; either version 2 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA
Also add information on how to contact you by electronic and paper mail.
If the program is interactive, make it output a short notice like this
when it starts in an interactive mode:
Gnomovision version 69, Copyright (C) year name of author
Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, the commands you use may
be called something other than `show w' and `show c'; they could even be
mouse-clicks or menu items--whatever suits your program.
You should also get your employer (if you work as a programmer) or your
school, if any, to sign a "copyright disclaimer" for the program, if
necessary. Here is a sample; alter the names:
Yoyodyne, Inc., hereby disclaims all copyright interest in the program
`Gnomovision' (which makes passes at compilers) written by James Hacker.
<signature of Ty Coon>, 1 April 1989
Ty Coon, President of Vice
This General Public License does not permit incorporating your program into
proprietary programs. If your program is a subroutine library, you may
consider it more useful to permit linking proprietary applications with the
library. If this is what you want to do, use the GNU Library General
Public License instead of this License.

BIN
docs/img/Binaries.png Normal file

Binary file not shown. Size: 131 KiB

BIN
docs/img/Building.png Normal file

Binary file not shown. Size: 78 KiB

BIN
docs/img/ProjectSection.png Normal file

Binary file not shown. Size: 77 KiB

BIN
docs/img/Properties.png Normal file

Binary file not shown. Size: 99 KiB

BIN
docs/img/Warning.png Normal file

Binary file not shown. Size: 69 KiB

BIN
docs/img/projectFiles.png Normal file

Binary file not shown. Size: 101 KiB

12
docs/raspberrypi.md Normal file

@@ -0,0 +1,12 @@
# Installing on a Raspberry Pi
Dependencies for OCR mode:
* libleptonica-dev
* libtesseract-dev
```bash
sudo apt-get install libleptonica-dev libtesseract-dev
```
Other than this, you just need to `cd` into the `linux` directory and run `make`, or `make ENABLE_OCR=yes` if you want OCR enabled.
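
As a rough sketch of the choice described above, the snippet below only prints the `make` invocation for each case rather than running it. `pi_build_command` is a hypothetical helper name used for illustration; it is not part of CCExtractor.

```bash
#!/usr/bin/env bash
# Sketch only: pi_build_command is a hypothetical helper, not part of CCExtractor.
# It prints the make invocation for the requested mode instead of running it.
pi_build_command() {
    local enable_ocr="${1:-no}"
    if [ "$enable_ocr" = "yes" ]; then
        echo "make ENABLE_OCR=yes"
    else
        echo "make"
    fi
}

# Typical sequence on the Pi (left as comments so the sketch has no side effects):
#   sudo apt-get install libleptonica-dev libtesseract-dev
#   cd linux && $(pi_build_command yes)
pi_build_command yes
```

Printing the command first makes it easy to check which variant will run before starting a long build on the Pi.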


@@ -1,4 +1,4 @@
For building ccextractor using cmake folllow below steps..
For building CCExtractor using cmake follow steps below..
Step 1) Check you have right version of cmake installed. ( version >= 3.0.2 )
We are using CMP0037 policy of cmake which was introduced in 3.0.0
@@ -6,13 +6,15 @@ Step 1) Check you have right version of cmake installed. ( version >= 3.0.2 )
suggest to use 3.0.2 or higher version.
Step 2) create a seprate directory where you want to build the target.
In Unix you can do it using follwing commands.
Step 2) create a separate directory where you want to build the target.
In Unix you can do it using following commands.
~> cd ccextractor
~> mkdir build
Step 3) make the build sytem using cmake
~> cmake ../src/
Step 3) make the build system using cmake. Params in [] are optional and have
been explained later in the document.
~> cmake [-DWITH_FFMPEG=ON] [-DWITH_OCR=ON]
[-DWITH_HARDSUBX=ON] ../src/
Step 4) Compile the code.
~> make
@@ -27,5 +29,8 @@ cmake -DWITH_FFMPEG=ON ../src/
If you want to build CCExtractor with OCR you need to pass
cmake -DWITH_OCR=ON ../src/
If you want to build CCExtractor with HARDSUBX support
cmake -DWITH_HARDSUBX=ON ../src/
Hint: to list all the options you can set from outside
cmake -LAH ../src/
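
The optional flags documented above can be combined in one configure call. The following is a minimal sketch, assuming a hypothetical helper named `cmake_args` that maps feature names to the `-DWITH_*` options from this document and prints the resulting command line; it does not invoke cmake itself.

```bash
#!/usr/bin/env bash
# Sketch only: cmake_args is a hypothetical helper; the -DWITH_* options it
# emits are the ones documented in the build steps above.
cmake_args() {
    local args=""
    local opt
    for opt in "$@"; do
        case "$opt" in
            ffmpeg)   args="$args -DWITH_FFMPEG=ON" ;;
            ocr)      args="$args -DWITH_OCR=ON" ;;
            hardsubx) args="$args -DWITH_HARDSUBX=ON" ;;
            *) echo "unknown feature: $opt" >&2; return 1 ;;
        esac
    done
    # Append the source directory used throughout the steps above.
    echo "cmake${args} ../src/"
}

cmake_args ocr hardsubx
```

Run from the build directory, the printed line can be pasted directly or passed to `eval` once it looks right.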

BIN
fonts/Cousine-Regular.ttf Normal file

Binary file not shown.

BIN
fonts/DroidSans.ttf Normal file

Binary file not shown.

BIN
fonts/Karla-Regular.ttf Normal file

Binary file not shown.

BIN
fonts/ProggyClean.ttf Normal file

Binary file not shown.

BIN
fonts/ProggyTiny.ttf Normal file

Binary file not shown.

BIN
fonts/Raleway-Bold.ttf Normal file

Binary file not shown.

BIN
fonts/Roboto-Bold.ttf Normal file

Binary file not shown.

BIN
fonts/Roboto-Light.ttf Normal file

Binary file not shown.

BIN
fonts/Roboto-Regular.ttf Normal file

Binary file not shown.

BIN
fonts/kenvector_future.ttf Normal file

Binary file not shown.

BIN
icon/computer.png Normal file

Binary file not shown. Size: 620 B

BIN
icon/default.png Normal file

Binary file not shown. Size: 32 KiB

BIN
icon/desktop.png Normal file

Binary file not shown. Size: 583 B

BIN
icon/directory.png Normal file

Binary file not shown. Size: 4.2 KiB

BIN
icon/drive.png Normal file

Binary file not shown. Size: 13 KiB

BIN
icon/font.png Normal file

Binary file not shown. Size: 17 KiB

BIN
icon/home.png Normal file

Binary file not shown. Size: 819 B

BIN
icon/img.png Normal file

Binary file not shown. Size: 11 KiB

BIN
icon/movie.png Normal file

Binary file not shown. Size: 24 KiB

BIN
icon/music.png Normal file

Binary file not shown. Size: 20 KiB

BIN
icon/text.png Normal file

Binary file not shown. Size: 5.1 KiB

2
linux/.gitignore vendored Normal file

@@ -0,0 +1,2 @@
libccx_rust.a
rust


@@ -1,133 +0,0 @@
SHELL = /bin/sh
CC = gcc
SYS := $(shell gcc -dumpmachine)
CFLAGS = -O3 -std=gnu99
INCLUDE = -I../src/gpacmp4/ -I../src/libpng -I../src/lib_hash -I../src/zlib -I../src/lib_ccx -I../src/.
INCLUDE += -I../src/zvbi
ALL_FLAGS = -Wno-write-strings -D_FILE_OFFSET_BITS=64
LDFLAGS = -lm
ifneq (, $(findstring linux, $(SYS)))
CFLAGS +=-DGPAC_CONFIG_LINUX
endif
TARGET = ccextractor
OBJS_DIR = objs
VPATH = ../src:../src/gpacmp4:../src/libpng:../src/zlib:../src/lib_ccx:../src/zvbi:../src/lib_hash
SRCS_DIR = ../src
SRCS_C = $(wildcard $(SRCS_DIR)/*.c)
OBJS = $(SRCS_C:$(SRCS_DIR)/%.c=$(OBJS_DIR)/%.o)
SRCS_CCX_DIR = $(SRCS_DIR)/lib_ccx
SRCS_CCX = $(wildcard $(SRCS_CCX_DIR)/*.c)
OBJS_CCX = $(SRCS_CCX:$(SRCS_CCX_DIR)/%.c=$(OBJS_DIR)/%.o)
SRCS_PNG_DIR = $(SRCS_DIR)/libpng
SRCS_PNG = $(wildcard $(SRCS_PNG_DIR)/*.c)
OBJS_PNG = $(SRCS_PNG:$(SRCS_PNG_DIR)/%.c=$(OBJS_DIR)/%.o)
SRCS_ZVBI_DIR = $(SRCS_DIR)/zvbi
SRCS_ZVBI = $(wildcard $(SRCS_ZVBI_DIR)/*.c)
OBJS_ZVBI = $(SRCS_ZVBI:$(SRCS_ZVBI_DIR)/%.c=$(OBJS_DIR)/%.o)
SRCS_GPACMP4_DIR = $(SRCS_DIR)/gpacmp4
SRCS_GPACMP4_C = $(wildcard $(SRCS_GPACMP4_DIR)/*.c)
SRCS_GPACMP4_CPP = $(wildcard $(SRCS_GPACMP4_DIR)/*.cpp)
OBJS_GPACMP4 = $(SRCS_GPACMP4_C:$(SRCS_GPACMP4_DIR)/%.c=$(OBJS_DIR)/%.o) \
$(SRCS_GPACMP4_CPP:$(SRCS_GPACMP4_DIR)/%.cpp=$(OBJS_DIR)/%.o)
SRCS_ZLIB_DIR = $(SRCS_DIR)/zlib
SRCS_ZLIB = $(wildcard $(SRCS_ZLIB_DIR)/*.c)
OBJS_ZLIB = $(SRCS_ZLIB:$(SRCS_ZLIB_DIR)/%.c=$(OBJS_DIR)/%.o)
SRCS_HASH_DIR = $(SRCS_DIR)/lib_hash
SRCS_HASH = $(wildcard $(SRCS_HASH_DIR)/*.c)
OBJS_HASH = $(SRCS_HASH:$(SRCS_HASH_DIR)/%.c=$(OBJS_DIR)/%.o)
INSTLALL = cp -f -p
INSTLALL_PROGRAM = $(INSTLALL)
DESTDIR = /usr/bin
ifeq ($(ENABLE_OCR),yes)
CFLAGS+=-DENABLE_OCR -DPNG_NO_CONFIG_H
TESS_LDFLAGS+= $(shell pkg-config --libs tesseract)
LEPT_LDFLAGS+= $(shell pkg-config --libs lept)
#error checking of library are there or not
ifeq ($(TESS_LDFLAGS),$(EMPTY))
$(error **ERROR** "tesseract not found")
else
#TODO print the version of library found
$(info "tesseract found")
endif
ifeq ($(LEPT_LDFLAGS),$(EMPTY))
$(error **ERROR** "leptonica not found")
else
#TODO print the version of library found
$(info "Leptonica found")
endif
CFLAGS += $(shell pkg-config --cflags tesseract)
CFLAGS += $(shell pkg-config --cflags lept)
LDFLAGS += $(TESS_LDFLAGS)
LDFLAGS += $(LEPT_LDFLAGS)
endif
ifeq ($(ENABLE_FFMPEG),yes)
CFLAGS+=-DENABLE_FFMPEG
CFLAGS+= $(shell pkg-config --cflags libavcodec)
CFLAGS+= $(shell pkg-config --cflags libavformat)
CFLAGS+= $(shell pkg-config --cflags libavutil)
LDFLAGS+= $(shell pkg-config --libs libavcodec )
LDFLAGS+= $(shell pkg-config --libs libavformat )
LDFLAGS+= $(shell pkg-config --libs libavutil )
endif
.PHONY: all
all: pre-build objs_dir $(TARGET)
.PHONY: objs_dir
objs_dir:
mkdir -p $(OBJS_DIR)
$(TARGET): $(OBJS) $(OBJS_PNG) $(OBJS_GPACMP4) $(OBJS_ZVBI) $(OBJS_ZLIB) $(OBJS_HASH) $(OBJS_CCX)
$(CC) $(ALL_FLAGS) $(CFLAGS) $(OBJS) $(OBJS_CCX) $(OBJS_PNG) $(OBJS_ZVBI) $(OBJS_GPACMP4) $(OBJS_ZLIB) $(OBJS_HASH) $(LDFLAGS) -o $@
$(OBJS_DIR)/%.o: %.c
$(CC) -c $(ALL_FLAGS) $(INCLUDE) $(CFLAGS) $< -o $@
$(OBJS_DIR)/%.o: %.cpp
$(CC) -c $(ALL_FLAGS) $(INCLUDE) $(CFLAGS) $< -o $@ -I../src/gpacmp4
$(OBJS_DIR)/ccextractor.o: ccextractor.c
$(CC) -c $(ALL_FLAGS) $(INCLUDE) $(CFLAGS) -O0 $< -o $@
.PHONY: clean
clean:
rm -rf $(TARGET) 2>/dev/null || true
rm -rf $(OBJS_CCX) $(OBJS_PNG) $(OBJS_ZLIB) $(OBJS_GPACMP4) $(OBJS_HASH) $(OBJS) 2>/dev/null || true
rm -rdf $(OBJS_DIR) 2>/dev/null || true
rm -rf .depend 2>/dev/null || true
.PHONY: install
install: $(TARGET)
$(INSTLALL_PROGRAM) $(TARGET) $(DESTDIR)
.PHONY: uninstall
uninstall:
rm -iv $(DESTDIR)/$(TARGET)
.PHONY: depend dep
depend dep:
$(CC) $(CFLAGS) $(INCLUDE) -E -MM $(SRCS_C) $(SRCS_PNG) $(SRCS_ZVBI) $(SRCS_ZLIB) $(SRCS_HASH) $(SRCS_CCX) \
$(SRCS_GPACMP4_C) $(SRCS_GPACMP4_CPP) |\
sed 's/^[a-zA-Z_0-9]*.o/$(OBJS_DIR)\/&/' > .depend
.PHONY: pre-build
pre-build:
./pre-build.sh
-include .depend

351
linux/Makefile.am Normal file

@@ -0,0 +1,351 @@
AUTOMAKE_OPTIONS = foreign
ACLOCAL_AMFLAGS = -I m4/
bin_PROGRAMS = ccextractor
ccextractor_SOURCES = \
../src/ccextractor.c \
../src/ccextractor.h \
/usr/include/gpac/avparse.h \
/usr/include/gpac/base_coding.h \
/usr/include/gpac/bitstream.h \
/usr/include/gpac/color.h \
/usr/include/gpac/config_file.h \
/usr/include/gpac/configuration.h \
/usr/include/gpac/constants.h \
/usr/include/gpac/events_constants.h \
/usr/include/gpac/ietf.h \
/usr/include/gpac/isomedia.h \
/usr/include/gpac/list.h \
/usr/include/gpac/maths.h \
/usr/include/gpac/media_tools.h \
/usr/include/gpac/mpeg4_odf.h \
/usr/include/gpac/network.h \
/usr/include/gpac/revision.h \
/usr/include/gpac/setup.h \
/usr/include/gpac/tools.h \
/usr/include/gpac/utf.h \
/usr/include/gpac/version.h \
/usr/include/gpac/iso639.h \
/usr/include/gpac/internal/avilib.h \
/usr/include/gpac/internal/isomedia_dev.h \
/usr/include/gpac/internal/media_dev.h \
/usr/include/gpac/internal/odf_dev.h \
/usr/include/gpac/internal/odf_parse_common.h \
/usr/include/gpac/internal/ogg.h \
../src/thirdparty/libpng/pngstruct.h \
../src/thirdparty/libpng/pngpriv.h \
../src/thirdparty/libpng/pnginfo.h \
../src/thirdparty/libpng/pnglibconf.h \
../src/thirdparty/libpng/pngconf.h \
../src/thirdparty/libpng/pngdebug.h \
../src/thirdparty/libpng/png.h \
../src/thirdparty/libpng/png.c \
../src/thirdparty/libpng/pngerror.c \
../src/thirdparty/libpng/pngget.c \
../src/thirdparty/libpng/pngmem.c \
../src/thirdparty/libpng/pngpread.c \
../src/thirdparty/libpng/pngread.c \
../src/thirdparty/libpng/pngrio.c \
../src/thirdparty/libpng/pngrtran.c \
../src/thirdparty/libpng/pngrutil.c \
../src/thirdparty/libpng/pngset.c \
../src/thirdparty/libpng/pngtrans.c \
../src/thirdparty/libpng/pngwio.c \
../src/thirdparty/libpng/pngwrite.c \
../src/thirdparty/libpng/pngwtran.c \
../src/thirdparty/libpng/pngwutil.c \
../src/lib_ccx/ccx_common_common.h \
../src/lib_ccx/ccx_common_option.h \
../src/lib_ccx/utility.h \
../src/lib_ccx/activity.h \
../src/lib_ccx/asf_constants.h \
../src/lib_ccx/avc_functions.h \
../src/lib_ccx/cc_bitstream.h \
../src/lib_ccx/ccx_common_option.c \
../src/lib_ccx/ccx_common_common.c \
../src/lib_ccx/compile_info_real.h \
../src/lib_ccx/utility.c \
../src/lib_ccx/activity.c \
../src/lib_ccx/asf_functions.c \
../src/lib_ccx/avc_functions.c \
../src/lib_ccx/cc_bitstream.c \
../src/lib_ccx/ccx_common_char_encoding.c \
../src/lib_ccx/ccx_common_char_encoding.h \
../src/lib_ccx/ccx_common_constants.c \
../src/lib_ccx/ccx_common_constants.h \
../src/lib_ccx/ccx_common_platform.h \
../src/lib_ccx/ccx_common_structs.h \
../src/lib_ccx/ccx_common_timing.c \
../src/lib_ccx/ccx_common_timing.h \
../src/lib_ccx/ccx_decoders_608.c \
../src/lib_ccx/ccx_decoders_608.h \
../src/lib_ccx/ccx_decoders_708.c \
../src/lib_ccx/ccx_decoders_708_encoding.c \
../src/lib_ccx/ccx_decoders_708_encoding.h \
../src/lib_ccx/ccx_decoders_708.h \
../src/lib_ccx/ccx_decoders_708_output.c \
../src/lib_ccx/ccx_decoders_708_output.h \
../src/lib_ccx/ccx_decoders_common.c \
../src/lib_ccx/ccx_decoders_common.h \
../src/lib_ccx/ccx_decoders_isdb.c \
../src/lib_ccx/ccx_decoders_isdb.h \
../src/lib_ccx/ccx_decoders_structs.h \
../src/lib_ccx/ccx_decoders_vbi.c \
../src/lib_ccx/ccx_decoders_vbi.h \
../src/lib_ccx/ccx_decoders_xds.c \
../src/lib_ccx/ccx_decoders_xds.h \
../src/lib_ccx/ccx_demuxer.c \
../src/lib_ccx/ccx_demuxer.h \
../src/lib_ccx/ccx_demuxer_mxf.c \
../src/lib_ccx/ccx_demuxer_mxf.h \
../src/lib_ccx/ccx_dtvcc.c \
../src/lib_ccx/ccx_dtvcc.h \
../src/lib_ccx/ccx_encoders_common.c \
../src/lib_ccx/ccx_encoders_common.h \
../src/lib_ccx/ccx_encoders_curl.c \
../src/lib_ccx/ccx_encoders_g608.c \
../src/lib_ccx/ccx_encoders_helpers.c \
../src/lib_ccx/ccx_encoders_helpers.h \
../src/lib_ccx/ccx_encoders_mcc.c \
../src/lib_ccx/ccx_encoders_mcc.h \
../src/lib_ccx/ccx_encoders_sami.c \
../src/lib_ccx/ccx_encoders_scc.c \
../src/lib_ccx/ccx_encoders_smptett.c \
../src/lib_ccx/ccx_encoders_splitbysentence.c \
../src/lib_ccx/ccx_encoders_spupng.c \
../src/lib_ccx/ccx_encoders_srt.c \
../src/lib_ccx/ccx_encoders_ssa.c \
../src/lib_ccx/ccx_encoders_structs.h \
../src/lib_ccx/ccx_encoders_transcript.c \
../src/lib_ccx/ccx_encoders_webvtt.c \
../src/lib_ccx/ccx_encoders_xds.c \
../src/lib_ccx/ccx_encoders_xds.h \
../src/lib_ccx/ccx_gxf.c \
../src/lib_ccx/ccx_gxf.h \
../src/lib_ccx/ccx_mp4.h \
../src/lib_ccx/compile_info.h \
../src/lib_ccx/compile_info_real.h \
../src/lib_ccx/configuration.c \
../src/lib_ccx/configuration.h \
../src/lib_ccx/disable_warnings.h \
../src/lib_ccx/dvb_subtitle_decoder.c \
../src/lib_ccx/dvb_subtitle_decoder.h \
../src/lib_ccx/dvd_subtitle_decoder.c \
../src/lib_ccx/dvd_subtitle_decoder.h \
../src/lib_ccx/es_functions.c \
../src/lib_ccx/es_userdata.c \
../src/lib_ccx/ffmpeg_intgr.c \
../src/lib_ccx/ffmpeg_intgr.h \
../src/lib_ccx/file_buffer.h \
../src/lib_ccx/file_functions.c \
../src/lib_ccx/general_loop.c \
../src/lib_ccx/hamming.h \
../src/lib_ccx/hardsubx.c \
../src/lib_ccx/hardsubx_classifier.c \
../src/lib_ccx/hardsubx_decoder.c \
../src/lib_ccx/hardsubx.h \
../src/lib_ccx/hardsubx_imgops.c \
../src/lib_ccx/hardsubx_utility.c \
../src/lib_ccx/lib_ccx.c \
../src/lib_ccx/lib_ccx.h \
../src/lib_ccx/list.h \
../src/lib_ccx/matroska.c \
../src/lib_ccx/matroska.h \
../src/lib_ccx/vobsub_decoder.c \
../src/lib_ccx/vobsub_decoder.h \
../src/lib_ccx/mp4.c \
../src/lib_ccx/myth.c \
../src/lib_ccx/networking.c \
../src/lib_ccx/networking.h \
../src/lib_ccx/ocr.c \
../src/lib_ccx/ocr.h \
../src/lib_ccx/output.c \
../src/lib_ccx/params.c \
../src/lib_ccx/params_dump.c \
../src/lib_ccx/sequencing.c \
../src/lib_ccx/stdintmsc.h \
../src/lib_ccx/stream_functions.c \
../src/lib_ccx/teletext.h \
../src/lib_ccx/telxcc.c \
../src/lib_ccx/ts_functions.c \
../src/lib_ccx/ts_functions.h \
../src/lib_ccx/ts_info.c \
../src/lib_ccx/ts_tables.c \
../src/lib_ccx/ts_tables_epg.c \
../src/lib_ccx/wtv_constants.h \
../src/lib_ccx/wtv_functions.c \
../src/thirdparty/zlib/adler32.c \
../src/thirdparty/zlib/compress.c \
../src/thirdparty/zlib/crc32.c \
../src/thirdparty/zlib/crc32.h \
../src/thirdparty/zlib/deflate.c \
../src/thirdparty/zlib/deflate.h \
../src/thirdparty/zlib/gzclose.c \
../src/thirdparty/zlib/gzguts.h \
../src/thirdparty/zlib/gzlib.c \
../src/thirdparty/zlib/gzread.c \
../src/thirdparty/zlib/gzwrite.c \
../src/thirdparty/zlib/infback.c \
../src/thirdparty/zlib/inffast.c \
../src/thirdparty/zlib/inffast.h \
../src/thirdparty/zlib/inffixed.h \
../src/thirdparty/zlib/inflate.c \
../src/thirdparty/zlib/inflate.h \
../src/thirdparty/zlib/inftrees.c \
../src/thirdparty/zlib/inftrees.h \
../src/thirdparty/zlib/trees.c \
../src/thirdparty/zlib/trees.h \
../src/thirdparty/zlib/uncompr.c \
../src/thirdparty/zlib/zconf.h \
../src/thirdparty/zlib/zlib.h \
../src/thirdparty/zlib/zutil.c \
../src/thirdparty/zlib/zutil.h \
../src/thirdparty/utf8proc/utf8proc.c \
../src/thirdparty/utf8proc/utf8proc.h \
../src/thirdparty/lib_hash/sha2.c \
../src/thirdparty/lib_hash/sha2.h \
../src/lib_ccx/zvbi/bcd.h \
../src/lib_ccx/zvbi/bit_slicer.c \
../src/lib_ccx/zvbi/bit_slicer.h \
../src/lib_ccx/zvbi/decoder.c \
../src/lib_ccx/zvbi/macros.h \
../src/lib_ccx/zvbi/misc.h \
../src/lib_ccx/zvbi/raw_decoder.c \
../src/lib_ccx/zvbi/raw_decoder.h \
../src/lib_ccx/zvbi/sampling_par.c \
../src/lib_ccx/zvbi/sampling_par.h \
../src/lib_ccx/zvbi/sliced.h \
../src/lib_ccx/zvbi/zvbi_decoder.h \
../src/freetype/* \
../src/thirdparty/freetype/autofit/autofit.c \
../src/thirdparty/freetype/base/ftbase.c \
../src/thirdparty/freetype/base/ftbbox.c \
../src/thirdparty/freetype/base/ftbdf.c \
../src/thirdparty/freetype/base/ftbitmap.c \
../src/thirdparty/freetype/base/ftcid.c \
../src/thirdparty/freetype/base/ftfntfmt.c \
../src/thirdparty/freetype/base/ftfstype.c \
../src/thirdparty/freetype/base/ftgasp.c \
../src/thirdparty/freetype/base/ftglyph.c \
../src/thirdparty/freetype/base/ftgxval.c \
../src/thirdparty/freetype/base/ftinit.c \
../src/thirdparty/freetype/base/ftlcdfil.c \
../src/thirdparty/freetype/base/ftmm.c \
../src/thirdparty/freetype/base/ftotval.c \
../src/thirdparty/freetype/base/ftpatent.c \
../src/thirdparty/freetype/base/ftpfr.c \
../src/thirdparty/freetype/base/ftstroke.c \
../src/thirdparty/freetype/base/ftsynth.c \
../src/thirdparty/freetype/base/ftsystem.c \
../src/thirdparty/freetype/base/fttype1.c \
../src/thirdparty/freetype/base/ftwinfnt.c \
../src/thirdparty/freetype/bdf/bdf.c \
../src/thirdparty/freetype/bzip2/ftbzip2.c \
../src/thirdparty/freetype/cache/ftcache.c \
../src/thirdparty/freetype/cff/cff.c \
../src/thirdparty/freetype/cid/type1cid.c \
../src/thirdparty/freetype/gzip/ftgzip.c \
../src/thirdparty/freetype/include/ft2build.h \
../src/thirdparty/freetype/lzw/ftlzw.c \
../src/thirdparty/freetype/pcf/pcf.c \
../src/thirdparty/freetype/pfr/pfr.c \
../src/thirdparty/freetype/psaux/psaux.c \
../src/thirdparty/freetype/pshinter/pshinter.c \
../src/thirdparty/freetype/psnames/psnames.c \
../src/thirdparty/freetype/raster/raster.c \
../src/thirdparty/freetype/sfnt/sfnt.c \
../src/thirdparty/freetype/smooth/smooth.c \
../src/thirdparty/freetype/truetype/truetype.c \
../src/thirdparty/freetype/type1/type1.c \
../src/thirdparty/freetype/type42/type42.c \
../src/thirdparty/freetype/winfonts/winfnt.c
if SYS_IS_APPLE_SILICON
ccextractor_SOURCES += ../src/thirdparty/libpng/arm/arm_init.c \
../src/thirdparty/libpng/arm/filter_neon_intrinsics.c \
../src/thirdparty/libpng/arm/palette_neon_intrinsics.c
endif
ccextractor_CFLAGS = -std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP
ccextractor_CPPFLAGS =-I../src/lib_ccx/ -I/usr/include/ -I../src/thirdparty/libpng/ -I../src/thirdparty/zlib/ -I../src/lib_ccx/zvbi/ -I../src/thirdparty/lib_hash/ -I../src/thirdparty -I../src/ -I../src/thirdparty/freetype/include/
ccextractor_LDADD=-lm -lpthread -ldl -lgpac
if SYS_IS_LINUX
ccextractor_CFLAGS += -O3 -s
endif
if SYS_IS_MAC
ccextractor_CFLAGS += -DGPAC_CONFIG_DARWIN -Dfopen64=fopen -Dopen64=open -Dlseek64=lseek
ccextractor_LDADD += -liconv -lz
endif
if SYS_IS_64_BIT
ccextractor_CFLAGS += -DGPAC_64_BITS
endif
HARDSUBX_FEATURE_RUST=
if HARDSUBX_IS_ENABLED
ccextractor_CFLAGS += -DENABLE_HARDSUBX
ccextractor_CPPFLAGS+= ${libavcodec_CFLAGS}
ccextractor_CPPFLAGS+= ${libavformat_CFLAGS}
ccextractor_CPPFLAGS+= ${libavfilter_CFLAGS}
ccextractor_CPPFLAGS+= ${libavutil_CFLAGS}
ccextractor_CPPFLAGS+= ${libswscale_CFLAGS}
# HARDSUBX requires tesseract/leptonica for OCR (same as OCR feature)
ccextractor_CPPFLAGS+= ${tesseract_CFLAGS}
ccextractor_CPPFLAGS+= ${lept_CFLAGS}
AV_LIB = ${libavcodec_LIBS}
AV_LIB += ${libavformat_LIBS}
AV_LIB += ${libavfilter_LIBS}
AV_LIB += ${libavutil_LIBS}
AV_LIB += ${libswscale_LIBS}
ccextractor_LDADD += $(AV_LIB)
# HARDSUBX requires tesseract/leptonica libs for OCR
ccextractor_LDADD += ${tesseract_LIBS}
ccextractor_LDADD += ${lept_LIBS}
HARDSUBX_FEATURE_RUST += --features "hardsubx_ocr"
endif
if OCR_IS_ENABLED
ccextractor_CFLAGS += -DENABLE_OCR -DPNG_NO_CONFIG_H
LEPT_LIB = ${lept_LIBS}
LEPT_CPPFLAG = ${lept_CFLAGS}
if TESSERACT_PRESENT
TESS_LIB = ${tesseract_LIBS}
TESS_CPPFLAG = ${tesseract_CFLAGS}
else
#fix for raspberry pi not having a pkgconfig file for tesseract
if TESSERACT_PRESENT_RPI
TESS_LIB = -ltesseract
TESS_CPPFLAG = -I/usr/include/tesseract
endif
endif
ccextractor_CPPFLAGS += $(TESS_CPPFLAG)
ccextractor_CPPFLAGS += $(LEPT_CPPFLAG)
ccextractor_LDADD += $(TESS_LIB)
ccextractor_LDADD += $(LEPT_LIB)
endif
ccextractor_LDADD += ./rust/@RUST_TARGET_SUBDIR@/libccx_rust.a
if DEBUG_RELEASE
CARGO_RELEASE_ARGS=
else
CARGO_RELEASE_ARGS=--release
endif
./rust/@RUST_TARGET_SUBDIR@/libccx_rust.a:
cd ../src/rust && \
CARGO_TARGET_DIR=../../linux/rust $(CARGO) build $(HARDSUBX_FEATURE_RUST) $(CARGO_RELEASE_ARGS);
EXTRA_DIST = /usr/include/gpac/sync_layer.h ../src/lib_ccx/ccfont2.xbm ../src/thirdparty/utf8proc/utf8proc_data.c fonts/ icon/

4
linux/autogen.sh Executable file

@@ -0,0 +1,4 @@
#!/usr/bin/env bash
./pre-build.sh
autoreconf -i


@@ -1,14 +1,213 @@
#!/bin/bash
BLD_FLAGS="-std=gnu99 -Wno-write-strings -DGPAC_CONFIG_LINUX -D_FILE_OFFSET_BITS=64"
BLD_INCLUDE="-I../src/lib_ccx/ -I../src/gpacmp4/ -I../src/libpng/ -I../src/zlib/ -I../src/zvbi -I../src/lib_hash"
SRC_LIBPNG="$(find ../src/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/zlib/ -name '*.c')"
SRC_ZVBI="$(find ../src/zvbi/ -name '*.c')"
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')"
SRC_GPAC="$(find ../src/gpacmp4/ -name '*.c')"
SRC_HASH="$(find ../src/lib_hash/ -name '*.c')"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH"
BLD_LINKER="-lm -zmuldefs"
#!/usr/bin/env bash
RUST_LIB="rust/release/libccx_rust.a"
RUST_PROFILE="--release"
USE_SYSTEM_LIBS=false
while [[ $# -gt 0 ]]; do
case $1 in
-debug)
DEBUG=true
BLD_FLAGS="$BLD_FLAGS -g -fsanitize=address"
RUST_PROFILE=""
RUST_LIB="rust/debug/libccx_rust.a"
shift
;;
-hardsubx)
HARDSUBX=true
# Allow overriding FFmpeg version via environment variable
if [ -n "$FFMPEG_VERSION" ]; then
RUST_FEATURES="--features hardsubx_ocr,$FFMPEG_VERSION"
else
RUST_FEATURES="--features hardsubx_ocr"
fi
BLD_FLAGS="$BLD_FLAGS -DENABLE_HARDSUBX"
BLD_LINKER="$BLD_LINKER -lswscale -lavutil -pthread -lavformat -lavcodec -lavfilter -lxcb-shm -lxcb -lX11 -llzma -lswresample"
shift
;;
-system-libs)
USE_SYSTEM_LIBS=true
shift
;;
-*)
echo "Unknown option $1"
exit 1
;;
esac
done
if [ "$USE_SYSTEM_LIBS" = true ]; then
command -v pkg-config >/dev/null || {
echo "Error: pkg-config is required for -system-libs mode"
exit 1
}
MISSING=""
for lib in libpng zlib freetype2 libutf8proc; do
if ! pkg-config --exists "$lib" 2>/dev/null; then
MISSING="$MISSING $lib"
fi
done
if [ -n "$MISSING" ]; then
echo "Error: Missing required system libraries:$MISSING"
echo ""
echo "On Debian/Ubuntu: sudo apt install libpng-dev zlib1g-dev libfreetype-dev libutf8proc-dev"
exit 1
fi
for hdr in leptonica/allheaders.h tesseract/capi.h; do
if ! echo "#include <$hdr>" | gcc -E - >/dev/null 2>&1; then
echo "Error: Missing headers for <$hdr>"
echo "On Debian/Ubuntu: sudo apt install libleptonica-dev libtesseract-dev"
exit 1
fi
done
PKG_CFLAGS="$(pkg-config --cflags libpng zlib freetype2 libutf8proc)"
PKG_LIBS="$(pkg-config --libs libpng zlib freetype2 libutf8proc)"
fi
BLD_FLAGS="$BLD_FLAGS -std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DENABLE_OCR -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP"
if [ "$USE_SYSTEM_LIBS" != true ]; then
BLD_FLAGS="$BLD_FLAGS -DFT2_BUILD_LIBRARY"
fi
bit_os=$(getconf LONG_BIT)
if [ "$bit_os" == "64" ]
then
BLD_FLAGS="$BLD_FLAGS -DGPAC_64_BITS"
fi
BLD_INCLUDE="-I../src -I /usr/include/leptonica/ -I /usr/include/tesseract/ -I../src/lib_ccx/ -I /usr/include/gpac/ -I../src/thirdparty/libpng -I../src/thirdparty/zlib -I../src/lib_ccx/zvbi -I../src/thirdparty/lib_hash -I../src/thirdparty -I../src/thirdparty/freetype/include"
SRC_LIBPNG="$(find ../src/thirdparty/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/thirdparty/zlib/ -name '*.c')"
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')"
SRC_GPAC="$(find /usr/include/gpac/ -name '*.c' 2>/dev/null)"
SRC_HASH="$(find ../src/thirdparty/lib_hash/ -name '*.c')"
SRC_UTF8PROC="../src/thirdparty/utf8proc/utf8proc.c"
SRC_FREETYPE="../src/thirdparty/freetype/autofit/autofit.c
../src/thirdparty/freetype/base/ftbase.c
../src/thirdparty/freetype/base/ftbbox.c
../src/thirdparty/freetype/base/ftbdf.c
../src/thirdparty/freetype/base/ftbitmap.c
../src/thirdparty/freetype/base/ftcid.c
../src/thirdparty/freetype/base/ftfntfmt.c
../src/thirdparty/freetype/base/ftfstype.c
../src/thirdparty/freetype/base/ftgasp.c
../src/thirdparty/freetype/base/ftglyph.c
../src/thirdparty/freetype/base/ftgxval.c
../src/thirdparty/freetype/base/ftinit.c
../src/thirdparty/freetype/base/ftlcdfil.c
../src/thirdparty/freetype/base/ftmm.c
../src/thirdparty/freetype/base/ftotval.c
../src/thirdparty/freetype/base/ftpatent.c
../src/thirdparty/freetype/base/ftpfr.c
../src/thirdparty/freetype/base/ftstroke.c
../src/thirdparty/freetype/base/ftsynth.c
../src/thirdparty/freetype/base/ftsystem.c
../src/thirdparty/freetype/base/fttype1.c
../src/thirdparty/freetype/base/ftwinfnt.c
../src/thirdparty/freetype/bdf/bdf.c
../src/thirdparty/freetype/bzip2/ftbzip2.c
../src/thirdparty/freetype/cache/ftcache.c
../src/thirdparty/freetype/cff/cff.c
../src/thirdparty/freetype/cid/type1cid.c
../src/thirdparty/freetype/gzip/ftgzip.c
../src/thirdparty/freetype/lzw/ftlzw.c
../src/thirdparty/freetype/pcf/pcf.c
../src/thirdparty/freetype/pfr/pfr.c
../src/thirdparty/freetype/psaux/psaux.c
../src/thirdparty/freetype/pshinter/pshinter.c
../src/thirdparty/freetype/psnames/psnames.c
../src/thirdparty/freetype/raster/raster.c
../src/thirdparty/freetype/sfnt/sfnt.c
../src/thirdparty/freetype/smooth/smooth.c
../src/thirdparty/freetype/truetype/truetype.c
../src/thirdparty/freetype/type1/type1.c
../src/thirdparty/freetype/type42/type42.c
../src/thirdparty/freetype/winfonts/winfnt.c"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_LIBPNG $SRC_HASH $SRC_UTF8PROC $SRC_FREETYPE"
BLD_LINKER="$BLD_LINKER -lm -zmuldefs -l tesseract -l leptonica -lpthread -ldl -lgpac"
if [ "$USE_SYSTEM_LIBS" = true ]; then
LEPTONICA_CFLAGS="$(pkg-config --cflags --silence-errors lept)"
TESSERACT_CFLAGS="$(pkg-config --cflags --silence-errors tesseract)"
GPAC_CFLAGS="$(pkg-config --cflags --silence-errors gpac)"
BLD_INCLUDE="-I../src -I../src/lib_ccx -I../src/lib_ccx/zvbi -I../src/thirdparty/lib_hash \
$PKG_CFLAGS $LEPTONICA_CFLAGS $TESSERACT_CFLAGS $GPAC_CFLAGS"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_HASH"
# Preserve FFmpeg libraries if -hardsubx was specified
FFMPEG_LIBS=""
if [ "$HARDSUBX" = true ]; then
FFMPEG_LIBS="-lswscale -lavutil -pthread -lavformat -lavcodec -lavfilter -lxcb-shm -lxcb -lX11 -llzma -lswresample"
fi
BLD_LINKER="$PKG_LIBS -ltesseract -lleptonica -lgpac -lpthread -ldl -lm $FFMPEG_LIBS"
fi
echo "Running pre-build script..."
./pre-build.sh
gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER
echo "Trying to compile..."
BLD_LINKER="$BLD_LINKER ./libccx_rust.a"
echo "Checking for cargo..."
if ! [ -x "$(command -v cargo)" ]; then
echo 'Error: cargo is not installed.' >&2
exit 1
fi
rustc_version="$(rustc --version)"
semver=( ${rustc_version//./ } )
version="${semver[1]}.${semver[2]}.${semver[3]}"
MSRV="1.87.0"
if [ "$(printf '%s\n' "$MSRV" "$version" | sort -V | head -n1)" = "$MSRV" ]; then
echo "rustc >= MSRV(${MSRV})"
else
echo "Minimum supported rust version(MSRV) is ${MSRV}, please upgrade rust"
exit 1
fi
echo "Building rust files..."
(cd ../src/rust && CARGO_TARGET_DIR=../../linux/rust cargo build $RUST_PROFILE $RUST_FEATURES) || { echo "Failed. " ; exit 1; }
cp $RUST_LIB ./libccx_rust.a
echo "Building ccextractor"
out=$((LC_ALL=C gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER)2>&1)
res=$?
if [[ $out == *"gcc: command not found"* ]]
then
echo "Error: please install gcc";
exit 1
fi
if [[ $out == *"curl.h: No such file or directory"* ]]
then
echo "Error: please install curl development library (libcurl4-gnutls-dev for Debian/Ubuntu)";
exit 2
fi
if [[ $out == *"capi.h: No such file or directory"* ]]
then
echo "Error: please install tesseract development library (libtesseract-dev for Debian/Ubuntu)";
exit 3
fi
if [[ $out == *"allheaders.h: No such file or directory"* ]]
then
echo "Error: please install leptonica development library (libleptonica-dev for Debian/Ubuntu)";
exit 4
fi
if [[ $res -ne 0 ]] # Unknown error
then
echo "Compiled with errors"
>&2 echo "$out"
exit 5
fi
if [[ "$out" != "" ]] ; then
echo "$out"
echo "Compilation successful, compiler message shown in previous lines"
else
echo "Compilation successful, no compiler messages."
fi
if [ -d ./utf8proc_compat ]; then
rm -rf ./utf8proc_compat
fi
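
The MSRV check in the script above relies on `sort -V` to compare version strings. A minimal standalone sketch of that comparison follows; `version_ge` is a hypothetical helper name, not something the build script defines, and it assumes a `sort` that supports `-V` (GNU coreutils does).

```bash
#!/usr/bin/env bash
# Sketch only: version_ge is a hypothetical helper, not part of the build script.
# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    local have="$1" need="$2"
    # sort -V orders version strings numerically per component; if the
    # required version sorts first (or both are equal), $have is new enough.
    [ "$(printf '%s\n' "$need" "$have" | sort -V | head -n1)" = "$need" ]
}

if version_ge "1.93.0" "1.87.0"; then
    echo "rustc >= MSRV"
fi
```

Comparing with `sort -V` rather than plain string comparison matters for cases such as `1.9.0` versus `1.87.0`, where lexical ordering would give the wrong answer.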

230
linux/build_appimage.sh Executable file

@@ -0,0 +1,230 @@
#!/bin/bash
#
# CCExtractor AppImage Build Script
#
# Build variants via BUILD_TYPE environment variable:
# - minimal: Basic CCExtractor without OCR (smallest size)
# - ocr: CCExtractor with OCR support (default)
# - hardsubx: CCExtractor with burned-in subtitle extraction (requires FFmpeg)
#
# Usage:
# ./build_appimage.sh # Builds 'ocr' variant (default)
# BUILD_TYPE=minimal ./build_appimage.sh
# BUILD_TYPE=hardsubx ./build_appimage.sh
#
# Requirements:
# - CMake, GCC, pkg-config, Rust toolchain
# - For OCR: tesseract-ocr, libtesseract-dev, libleptonica-dev
# - For HardSubX: libavcodec-dev, libavformat-dev, libswscale-dev, etc.
# - wget for downloading linuxdeploy
#
set -e
# Build type: minimal, ocr, hardsubx (default: ocr)
BUILD_TYPE="${BUILD_TYPE:-ocr}"
echo "=========================================="
echo "CCExtractor AppImage Builder"
echo "Build type: $BUILD_TYPE"
echo "=========================================="
# Validate build type
case "$BUILD_TYPE" in
minimal|ocr|hardsubx)
;;
*)
echo "Error: Invalid BUILD_TYPE '$BUILD_TYPE'"
echo "Valid options: minimal, ocr, hardsubx"
exit 1
;;
esac
# Store paths
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
REPO_ROOT="$(dirname "$SCRIPT_DIR")"
BUILD_DIR="$SCRIPT_DIR/appimage_build"
# Clean up function
cleanup() {
if [ -d "$BUILD_DIR" ]; then
echo "Cleaning up build directory..."
rm -rf "$BUILD_DIR"
fi
}
# Cleanup on exit (comment out for debugging)
trap cleanup EXIT
# Create fresh build directory
rm -rf "$BUILD_DIR" 2>/dev/null || true
mkdir -p "$BUILD_DIR"
cd "$BUILD_DIR"
# Determine CMake options based on build type
CMAKE_OPTIONS=""
case "$BUILD_TYPE" in
minimal)
CMAKE_OPTIONS=""
;;
ocr)
CMAKE_OPTIONS="-DWITH_OCR=ON"
;;
hardsubx)
CMAKE_OPTIONS="-DWITH_OCR=ON -DWITH_HARDSUBX=ON -DWITH_FFMPEG=ON"
;;
esac
echo "CMake options: $CMAKE_OPTIONS"
# Configure with CMake
echo "Configuring with CMake..."
cmake $CMAKE_OPTIONS "$REPO_ROOT/src"
# Build
echo "Building CCExtractor..."
make -j$(nproc)
# Verify binary was built
if [ ! -f "$BUILD_DIR/ccextractor" ]; then
echo "Error: ccextractor binary not found after build"
exit 1
fi
echo "Build successful!"
"$BUILD_DIR/ccextractor" --version
# Download linuxdeploy
echo "Downloading linuxdeploy..."
LINUXDEPLOY_URL="https://github.com/linuxdeploy/linuxdeploy/releases/download/continuous/linuxdeploy-x86_64.AppImage"
wget -q --show-progress "$LINUXDEPLOY_URL" -O linuxdeploy-x86_64.AppImage
chmod +x linuxdeploy-x86_64.AppImage
# Create AppDir structure
echo "Creating AppDir structure..."
mkdir -p AppDir/usr/bin
mkdir -p AppDir/usr/share/icons/hicolor/256x256/apps
mkdir -p AppDir/usr/share/applications
mkdir -p AppDir/usr/share/tessdata
# Copy binary
cp "$BUILD_DIR/ccextractor" AppDir/usr/bin/
# Download icon
echo "Downloading icon..."
PNG_URL="https://ccextractor.org/images/ccextractor.png"
if wget -q "$PNG_URL" -O AppDir/usr/share/icons/hicolor/256x256/apps/ccextractor.png 2>/dev/null; then
echo "Icon downloaded successfully"
else
# Create a simple placeholder icon if download fails
echo "Warning: Could not download icon, creating placeholder"
convert -size 256x256 xc:navy -fill white -gravity center -pointsize 40 -annotate 0 "CCX" \
AppDir/usr/share/icons/hicolor/256x256/apps/ccextractor.png 2>/dev/null || \
echo "P3 256 256 255" > AppDir/usr/share/icons/hicolor/256x256/apps/ccextractor.ppm
fi
# Create desktop file
cat > AppDir/usr/share/applications/ccextractor.desktop << 'EOF'
[Desktop Entry]
Type=Application
Name=CCExtractor
Comment=Extract closed captions and subtitles from video files
Exec=ccextractor
Icon=ccextractor
Categories=AudioVideo;Video;
Terminal=true
NoDisplay=true
EOF
# Copy desktop file to AppDir root (required by linuxdeploy)
cp AppDir/usr/share/applications/ccextractor.desktop AppDir/
# Copy icon to AppDir root
cp AppDir/usr/share/icons/hicolor/256x256/apps/ccextractor.png AppDir/ 2>/dev/null || true
# For OCR builds, bundle tessdata
if [ "$BUILD_TYPE" = "ocr" ] || [ "$BUILD_TYPE" = "hardsubx" ]; then
echo "Bundling tessdata for OCR support..."
# Try to find system tessdata
TESSDATA_PATHS=(
"/usr/share/tesseract-ocr/5/tessdata"
"/usr/share/tesseract-ocr/4.00/tessdata"
"/usr/share/tessdata"
"/usr/local/share/tessdata"
)
TESSDATA_SRC=""
for path in "${TESSDATA_PATHS[@]}"; do
if [ -d "$path" ] && [ -f "$path/eng.traineddata" ]; then
TESSDATA_SRC="$path"
break
fi
done
if [ -n "$TESSDATA_SRC" ]; then
echo "Found tessdata at: $TESSDATA_SRC"
# Copy English language data (most common)
cp "$TESSDATA_SRC/eng.traineddata" AppDir/usr/share/tessdata/ 2>/dev/null || true
# Copy OSD (orientation and script detection) if available
cp "$TESSDATA_SRC/osd.traineddata" AppDir/usr/share/tessdata/ 2>/dev/null || true
else
echo "Warning: tessdata not found, downloading English language data..."
wget -q "https://github.com/tesseract-ocr/tessdata/raw/main/eng.traineddata" \
-O AppDir/usr/share/tessdata/eng.traineddata || true
fi
# Create wrapper script that sets TESSDATA_PREFIX
mv AppDir/usr/bin/ccextractor AppDir/usr/bin/ccextractor.bin
cat > AppDir/usr/bin/ccextractor << 'WRAPPER'
#!/bin/bash
SELF_DIR="$(dirname "$(readlink -f "$0")")"
export TESSDATA_PREFIX="${SELF_DIR}/../share/tessdata"
exec "${SELF_DIR}/ccextractor.bin" "$@"
WRAPPER
chmod +x AppDir/usr/bin/ccextractor
fi
# Determine output name based on build type
ARCH="x86_64"
case "$BUILD_TYPE" in
minimal)
OUTPUT_NAME="ccextractor-minimal-${ARCH}.AppImage"
;;
ocr)
OUTPUT_NAME="ccextractor-${ARCH}.AppImage"
;;
hardsubx)
OUTPUT_NAME="ccextractor-hardsubx-${ARCH}.AppImage"
;;
esac
# Build AppImage
echo "Building AppImage..."
export OUTPUT="$OUTPUT_NAME"
# Determine which executable to pass to linuxdeploy
# For OCR builds, we have a wrapper script, so pass the actual binary (.bin)
if [ -f "AppDir/usr/bin/ccextractor.bin" ]; then
LINUXDEPLOY_EXEC="AppDir/usr/bin/ccextractor.bin"
else
LINUXDEPLOY_EXEC="AppDir/usr/bin/ccextractor"
fi
./linuxdeploy-x86_64.AppImage \
--appdir=AppDir \
--executable="$LINUXDEPLOY_EXEC" \
--desktop-file=AppDir/ccextractor.desktop \
--icon-file=AppDir/ccextractor.png \
--output=appimage
# Move to output directory
mv "$OUTPUT_NAME" "$SCRIPT_DIR/"
echo "=========================================="
echo "AppImage built successfully!"
echo "Output: $SCRIPT_DIR/$OUTPUT_NAME"
echo ""
echo "Test with: $SCRIPT_DIR/$OUTPUT_NAME --version"
echo "=========================================="
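Two decisions in the script above lend themselves to small functions: choosing CMake options from `BUILD_TYPE`, and locating a directory that holds `eng.traineddata`. The sketch below mirrors that logic under the assumption that it is run on its own; the function names `cmake_options_for` and `find_tessdata` are hypothetical and do not appear in the script.

```shell
#!/usr/bin/env bash
# Sketch only: mirrors the BUILD_TYPE handling and tessdata lookup above.

# Print the CMake options for a build type; return 1 for unknown types.
cmake_options_for() {
    case "$1" in
        minimal)  echo "" ;;
        ocr)      echo "-DWITH_OCR=ON" ;;
        hardsubx) echo "-DWITH_OCR=ON -DWITH_HARDSUBX=ON -DWITH_FFMPEG=ON" ;;
        *)        echo "Error: Invalid BUILD_TYPE '$1'" >&2; return 1 ;;
    esac
}

# Print the first candidate directory that contains eng.traineddata.
find_tessdata() {
    for dir in "$@"; do
        if [ -f "$dir/eng.traineddata" ]; then
            echo "$dir"
            return 0
        fi
    done
    return 1
}

cmake_options_for "${BUILD_TYPE:-ocr}"
find_tessdata /usr/share/tesseract-ocr/5/tessdata /usr/share/tessdata \
    || echo "tessdata not found"
```

Passing the candidate directories as arguments keeps the lookup easy to exercise against a temporary directory.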

3
linux/build_hardsubx Executable file

@@ -0,0 +1,3 @@
#!/usr/bin/env bash
./build -hardsubx


@@ -1,14 +1,3 @@
#!/bin/bash
BLD_FLAGS="-g -std=gnu99 -Wno-write-strings -DGPAC_CONFIG_LINUX -D_FILE_OFFSET_BITS=64"
BLD_INCLUDE="-I../src/lib_ccx/ -I../src/gpacmp4/ -I../src/libpng/ -I../src/zlib/ -I../src/zvbi -I../src/lib_hash"
SRC_LIBPNG="$(find ../src/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/zlib/ -name '*.c')"
SRC_ZVBI="$(find ../src/zvbi/ -name '*.c')"
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')"
SRC_GPAC="$(find ../src/gpacmp4/ -name '*.c')"
SRC_HASH="$(find ../src/lib_hash/ -name '*.c')"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_GPAC $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH"
BLD_LINKER="-lm -zmuldefs"
#!/usr/bin/env bash
./pre-build.sh
gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER
./build -debug

4
linux/cleanup Executable file

@@ -0,0 +1,4 @@
#!/usr/bin/env bash
make distclean > /dev/null 2>&1 || true
rm -rf Makefile configure *.in config.status config.log aclocal.m4 build-conf autom4te.cache

165
linux/configure.ac Normal file

@@ -0,0 +1,165 @@
# -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.71])
AC_INIT([CCExtractor], [0.96.5], [carlos@ccextractor.org])
AC_CONFIG_AUX_DIR([build-conf])
AC_CONFIG_SRCDIR([../src/ccextractor.c])
AM_INIT_AUTOMAKE([foreign subdir-objects])
AC_CONFIG_MACRO_DIRS([m4])
# Checks for programs.
AC_PROG_CC
AC_PROG_INSTALL
AC_PROG_MAKE_SET
#Checks for "pkg-config" utility
AC_MSG_CHECKING([pkg-config m4 macros])
if test m4_ifdef([PKG_CHECK_MODULES], [yes], [no]) = yes; then
AC_MSG_RESULT([yes]);
else
AC_MSG_RESULT([no]);
AC_MSG_ERROR([
pkg-config is required.])
fi
# Checks for libraries.
AC_CHECK_LIB([m], [sin], [], [AC_MSG_ERROR(Math library not installed. Install it before proceeding.)])
AC_CHECK_LIB([leptonica], [getLeptonicaVersion], [HAS_LEPT=1 && PKG_CHECK_MODULES([lept], [lept])], [HAS_LEPT=0])
AC_CHECK_LIB([tesseract], [TessVersion], [HAS_TESSERACT=1 && PKG_CHECK_MODULES([tesseract], [tesseract])], [HAS_TESSERACT=0])
AC_CHECK_LIB([avcodec], [avcodec_version], [HAS_AVCODEC=1 && PKG_CHECK_MODULES([libavcodec], [libavcodec])], [HAS_AVCODEC=0])
AC_CHECK_LIB([avformat], [avformat_version], [HAS_AVFORMAT=1 && PKG_CHECK_MODULES([libavformat], [libavformat])], [HAS_AVFORMAT=0])
AC_CHECK_LIB([avutil], [avutil_version], [HAS_AVUTIL=1 && PKG_CHECK_MODULES([libavutil], [libavutil])], [HAS_AVUTIL=0])
AC_CHECK_LIB([swscale], [swscale_version], [HAS_SWSCALE=1 && PKG_CHECK_MODULES([libswscale], [libswscale])], [HAS_SWSCALE=0])
# Check for GPAC library (required for MP4 support)
PKG_CHECK_MODULES([gpac], [gpac], [HAS_GPAC=1], [HAS_GPAC=0])
AS_IF([test $HAS_GPAC -eq 0],
[AC_MSG_ERROR([GPAC library not found. Install gpac-devel (Fedora/RHEL), libgpac-dev (Debian/Ubuntu), or gpac (Arch) before proceeding.])])
# Checks for header files.
AC_CHECK_HEADERS([arpa/inet.h fcntl.h float.h inttypes.h limits.h locale.h malloc.h netdb.h netinet/in.h stddef.h stdint.h stdlib.h string.h sys/socket.h sys/time.h sys/timeb.h termios.h unistd.h wchar.h])
# Checks for typedefs, structures, and compiler characteristics.
AC_CHECK_HEADER_STDBOOL
AC_C_INLINE
AC_TYPE_INT16_T
AC_TYPE_INT32_T
AC_TYPE_INT64_T
AC_TYPE_INT8_T
AC_TYPE_OFF_T
AC_TYPE_PID_T
AC_TYPE_SIZE_T
AC_TYPE_SSIZE_T
AC_TYPE_UINT16_T
AC_TYPE_UINT32_T
AC_TYPE_UINT64_T
AC_TYPE_UINT8_T
AC_CHECK_TYPES([ptrdiff_t])
# Checks for library functions.
AC_FUNC_ERROR_AT_LINE
AC_FUNC_FSEEKO
AC_FUNC_MALLOC
AC_FUNC_MKTIME
AC_FUNC_REALLOC
AC_FUNC_STRERROR_R
AC_CHECK_FUNCS([floor ftruncate gethostbyname gettimeofday inet_ntoa mblen memchr memmove memset mkdir modf pow realpath rmdir select setlocale socket sqrt strcasecmp strchr strdup strerror strndup strrchr strstr strtol])
# Checks for arguments with configure
AC_ARG_ENABLE([hardsubx],
AS_HELP_STRING([--enable-hardsubx], [Enables extraction of burnt subtitles (hard subtitles)]),
[case "${enableval}" in
yes) hardsubx=true ;;
no) hardsubx=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-hardsubx]) ;;
esac],[hardsubx=false])
AC_ARG_ENABLE([ocr],
AS_HELP_STRING([--enable-ocr], [Enables Optical Character Recognition]),
[case "${enableval}" in
yes) ocr=true ;;
no) ocr=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-ocr]) ;;
esac],[ocr=false])
AC_ARG_ENABLE([ffmpeg],
AS_HELP_STRING([--enable-ffmpeg], [Enable FFmpeg integration]),
[case "${enableval}" in
yes) ffmpeg=true ;;
no) ffmpeg=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-ffmpeg]) ;;
esac],[ffmpeg=false])
#Add argument for rust
AC_ARG_WITH([rust],
AS_HELP_STRING([--with-rust], [Builds CCExtractor with rust library]),
[with_rust=$withval],
[with_rust=yes])
AC_MSG_CHECKING(whether to build with rust library)
if test "x$with_rust" = "xyes" ; then
AC_MSG_RESULT(yes)
#Check if cargo and rust is installed
AC_PATH_PROG([CARGO], [cargo], [notfound])
AS_IF([test "$CARGO" = "notfound"], [AC_MSG_ERROR([cargo is required])])
AC_PATH_PROG([RUSTC], [rustc], [notfound])
AS_IF([test "$RUSTC" = "notfound"], [AC_MSG_ERROR([rustc is required])])
rustc_version=$(rustc --version)
MSRV="1.87.0"
AX_COMPARE_VERSION($rustc_version, [ge], [$MSRV],
[AC_MSG_RESULT(rustc >= $MSRV)],
[AC_MSG_ERROR([Minimum supported rust version(MSRV) is $MSRV, please upgrade rust])])
else
AC_MSG_RESULT(no)
fi
AM_CONDITIONAL([WITH_RUST], [test "x$with_rust" = "xyes"])
AC_ARG_ENABLE(debug,
AS_HELP_STRING([--enable-debug],
[Build Rust code with debugging information [default=no]]),
[debug_release=$enableval],
[debug_release=no])
AC_MSG_CHECKING(whether to build Rust code with debugging information)
if test "x$debug_release" = "xyes" ; then
AC_MSG_RESULT(yes)
RUST_TARGET_SUBDIR=debug
else
AC_MSG_RESULT(no)
RUST_TARGET_SUBDIR=release
fi
AM_CONDITIONAL([DEBUG_RELEASE], [test "x$debug_release" = "xyes"])
AC_SUBST([RUST_TARGET_SUBDIR])
#Checks and prompts if libraries found/not found to avoid failure while building
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVCODEC -gt 0 ], [AC_MSG_NOTICE(avcodec library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVCODEC -gt 0 ], [AC_MSG_ERROR(avcodec library not found. Please install the avcodec library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVFORMAT -gt 0 ], [AC_MSG_NOTICE(avformat library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVFORMAT -gt 0 ], [AC_MSG_ERROR(avformat library not found. Please install the avformat library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVUTIL -gt 0 ], [AC_MSG_NOTICE(avutil library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVUTIL -gt 0 ], [AC_MSG_ERROR(avutil library not found. Please install the avutil library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_SWSCALE -gt 0 ], [AC_MSG_NOTICE(swscale library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_SWSCALE -gt 0 ], [AC_MSG_ERROR(swscale library not found. Please install the swscale library before proceeding)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test $HAS_TESSERACT -gt 0 ], [TESS_VERSION=$(tesseract --version 2>&1 | grep tesseract) && AC_MSG_NOTICE(tesseract library found... $TESS_VERSION)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test ! $HAS_TESSERACT -gt 0 ], [AC_MSG_ERROR(tesseract library not found. Please install the tesseract library before proceeding)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test $HAS_LEPT -gt 0 ], [LEPT_VERSION=$(tesseract --version 2>&1 | grep leptonica) && AC_MSG_NOTICE(leptonica library found... $LEPT_VERSION)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test ! $HAS_LEPT -gt 0 ], [AC_MSG_ERROR(leptonica library not found. Please install the leptonica library before proceeding)])
#AM_CONDITIONAL(s) for setting values to enable/disable flags in Makefile.am
AM_CONDITIONAL(HARDSUBX_IS_ENABLED, [ test x$hardsubx = xtrue ])
AM_CONDITIONAL(OCR_IS_ENABLED, [ test x$ocr = xtrue || test x$hardsubx = xtrue ])
AM_CONDITIONAL(FFMPEG_IS_ENABLED, [ test x$ffmpeg = xtrue ])
AM_CONDITIONAL(TESSERACT_PRESENT, [ test ! -z "$(pkg-config --libs-only-l --silence-errors tesseract)" ])
AM_CONDITIONAL(TESSERACT_PRESENT_RPI, [ test -d "/usr/include/tesseract" && test $(ls -A /usr/include/tesseract | wc -l) -gt 0 ])
AM_CONDITIONAL(SYS_IS_LINUX, [ test $(uname -s) = "Linux"])
AM_CONDITIONAL(SYS_IS_MAC, [ test $(uname -s) = "Darwin"])
AM_CONDITIONAL(SYS_IS_APPLE_SILICON, [ test $(uname -a | awk '{print $NF}') = "arm64" ])
AM_CONDITIONAL(SYS_IS_64_BIT,[test $(getconf LONG_BIT) = "64"])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
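The MSRV check in this file compares the output of `rustc --version` against `1.87.0`. The same comparison can be tried outside of configure with a plain shell sketch. This is an illustration and not what the configure script runs: it assumes a `sort` that supports `-V` (GNU coreutils), `version_ge` and `rustc_version_of` are hypothetical helper names, and the sample version line is made up.

```shell
#!/usr/bin/env bash
# Sketch only: a sort -V based stand-in for the AX_COMPARE_VERSION check.
# Assumes a sort that supports -V (GNU coreutils).

# Succeeds when version $1 is greater than or equal to version $2.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Extract the bare version number from text shaped like `rustc --version` output.
rustc_version_of() {
    printf '%s\n' "$1" | awk '{ print $2 }'
}

MSRV="1.87.0"
sample="rustc 1.93.0 (0000000 2026-01-01)"   # made-up sample line
if version_ge "$(rustc_version_of "$sample")" "$MSRV"; then
    echo "rustc >= $MSRV"
else
    echo "Minimum supported rust version (MSRV) is $MSRV, please upgrade rust"
fi
```

Isolating the version number first keeps the commit hash and date in the `rustc --version` output out of the comparison.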

1
linux/description-pak Normal file

@@ -0,0 +1 @@
Best open source tool for a subtitled world


@@ -0,0 +1,177 @@
# ===========================================================================
# https://www.gnu.org/software/autoconf-archive/ax_compare_version.html
# ===========================================================================
#
# SYNOPSIS
#
# AX_COMPARE_VERSION(VERSION_A, OP, VERSION_B, [ACTION-IF-TRUE], [ACTION-IF-FALSE])
#
# DESCRIPTION
#
# This macro compares two version strings. Due to the various number of
# minor-version numbers that can exist, and the fact that string
# comparisons are not compatible with numeric comparisons, this is not
# necessarily trivial to do in a autoconf script. This macro makes doing
# these comparisons easy.
#
# The six basic comparisons are available, as well as checking equality
# limited to a certain number of minor-version levels.
#
# The operator OP determines what type of comparison to do, and can be one
# of:
#
# eq - equal (test A == B)
# ne - not equal (test A != B)
# le - less than or equal (test A <= B)
# ge - greater than or equal (test A >= B)
# lt - less than (test A < B)
# gt - greater than (test A > B)
#
# Additionally, the eq and ne operator can have a number after it to limit
# the test to that number of minor versions.
#
# eq0 - equal up to the length of the shorter version
# ne0 - not equal up to the length of the shorter version
# eqN - equal up to N sub-version levels
# neN - not equal up to N sub-version levels
#
# When the condition is true, shell commands ACTION-IF-TRUE are run,
# otherwise shell commands ACTION-IF-FALSE are run. The environment
# variable 'ax_compare_version' is always set to either 'true' or 'false'
# as well.
#
# Examples:
#
# AX_COMPARE_VERSION([3.15.7],[lt],[3.15.8])
# AX_COMPARE_VERSION([3.15],[lt],[3.15.8])
#
# would both be true.
#
# AX_COMPARE_VERSION([3.15.7],[eq],[3.15.8])
# AX_COMPARE_VERSION([3.15],[gt],[3.15.8])
#
# would both be false.
#
# AX_COMPARE_VERSION([3.15.7],[eq2],[3.15.8])
#
# would be true because it is only comparing two minor versions.
#
# AX_COMPARE_VERSION([3.15.7],[eq0],[3.15])
#
# would be true because it is only comparing the lesser number of minor
# versions of the two values.
#
# Note: The characters that separate the version numbers do not matter. An
# empty string is the same as version 0. OP is evaluated by autoconf, not
# configure, so must be a string, not a variable.
#
# The author would like to acknowledge Guido Draheim whose advice about
# the m4_case and m4_ifvaln functions make this macro only include the
# portions necessary to perform the specific comparison specified by the
# OP argument in the final configure script.
#
# LICENSE
#
# Copyright (c) 2008 Tim Toolan <toolan@ele.uri.edu>
#
# Copying and distribution of this file, with or without modification, are
# permitted in any medium without royalty provided the copyright notice
# and this notice are preserved. This file is offered as-is, without any
# warranty.
#serial 13
dnl #########################################################################
AC_DEFUN([AX_COMPARE_VERSION], [
AC_REQUIRE([AC_PROG_AWK])
# Used to indicate true or false condition
ax_compare_version=false
# Convert the two version strings to be compared into a format that
# allows a simple string comparison. The end result is that a version
# string of the form 1.12.5-r617 will be converted to the form
# 0001001200050617. In other words, each number is zero padded to four
# digits, and non digits are removed.
AS_VAR_PUSHDEF([A],[ax_compare_version_A])
A=`echo "$1" | sed -e 's/\([[0-9]]*\)/Z\1Z/g' \
-e 's/Z\([[0-9]]\)Z/Z0\1Z/g' \
-e 's/Z\([[0-9]][[0-9]]\)Z/Z0\1Z/g' \
-e 's/Z\([[0-9]][[0-9]][[0-9]]\)Z/Z0\1Z/g' \
-e 's/[[^0-9]]//g'`
AS_VAR_PUSHDEF([B],[ax_compare_version_B])
B=`echo "$3" | sed -e 's/\([[0-9]]*\)/Z\1Z/g' \
-e 's/Z\([[0-9]]\)Z/Z0\1Z/g' \
-e 's/Z\([[0-9]][[0-9]]\)Z/Z0\1Z/g' \
-e 's/Z\([[0-9]][[0-9]][[0-9]]\)Z/Z0\1Z/g' \
-e 's/[[^0-9]]//g'`
dnl # In the case of le, ge, lt, and gt, the strings are sorted as necessary
dnl # then the first line is used to determine if the condition is true.
dnl # The sed right after the echo is to remove any indented white space.
m4_case(m4_tolower($2),
[lt],[
ax_compare_version=`echo "x$A
x$B" | sed 's/^ *//' | sort -r | sed "s/x${A}/false/;s/x${B}/true/;1q"`
],
[gt],[
ax_compare_version=`echo "x$A
x$B" | sed 's/^ *//' | sort | sed "s/x${A}/false/;s/x${B}/true/;1q"`
],
[le],[
ax_compare_version=`echo "x$A
x$B" | sed 's/^ *//' | sort | sed "s/x${A}/true/;s/x${B}/false/;1q"`
],
[ge],[
ax_compare_version=`echo "x$A
x$B" | sed 's/^ *//' | sort -r | sed "s/x${A}/true/;s/x${B}/false/;1q"`
],[
dnl Split the operator from the subversion count if present.
m4_bmatch(m4_substr($2,2),
[0],[
# A count of zero means use the length of the shorter version.
# Determine the number of characters in A and B.
ax_compare_version_len_A=`echo "$A" | $AWK '{print(length)}'`
ax_compare_version_len_B=`echo "$B" | $AWK '{print(length)}'`
# Set A to no more than B's length and B to no more than A's length.
A=`echo "$A" | sed "s/\(.\{$ax_compare_version_len_B\}\).*/\1/"`
B=`echo "$B" | sed "s/\(.\{$ax_compare_version_len_A\}\).*/\1/"`
],
[[0-9]+],[
# A count greater than zero means use only that many subversions
A=`echo "$A" | sed "s/\(\([[0-9]]\{4\}\)\{m4_substr($2,2)\}\).*/\1/"`
B=`echo "$B" | sed "s/\(\([[0-9]]\{4\}\)\{m4_substr($2,2)\}\).*/\1/"`
],
[.+],[
AC_WARNING(
[invalid OP numeric parameter: $2])
],[])
# Pad zeros at end of numbers to make same length.
ax_compare_version_tmp_A="$A`echo $B | sed 's/./0/g'`"
B="$B`echo $A | sed 's/./0/g'`"
A="$ax_compare_version_tmp_A"
# Check for equality or inequality as necessary.
m4_case(m4_tolower(m4_substr($2,0,2)),
[eq],[
test "x$A" = "x$B" && ax_compare_version=true
],
[ne],[
test "x$A" != "x$B" && ax_compare_version=true
],[
AC_WARNING([invalid OP parameter: $2])
])
])
AS_VAR_POPDEF([A])dnl
AS_VAR_POPDEF([B])dnl
dnl # Execute ACTION-IF-TRUE / ACTION-IF-FALSE.
if test "$ax_compare_version" = "true" ; then
m4_ifvaln([$4],[$4],[:])dnl
m4_ifvaln([$5],[else $5])dnl
fi
]) dnl AX_COMPARE_VERSION
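The macro's comments describe converting a version such as `1.12.5-r617` into `0001001200050617` so that versions can be compared as strings. A minimal sketch of that normalization follows, written with awk rather than the macro's sed chain; `normalize_version` is a hypothetical name, and the sketch does not reproduce the macro's operators or its handling of unusual input.

```shell
#!/usr/bin/env bash
# Sketch only: reproduces the zero-padding idea with awk instead of the
# sed chain used inside AX_COMPARE_VERSION.

# Pad every numeric field to four digits and drop everything else.
normalize_version() {
    printf '%s\n' "$1" | awk -F'[^0-9]+' '{
        out = ""
        for (i = 1; i <= NF; i++)
            if ($i != "") out = out sprintf("%04d", $i)
        print out
    }'
}

normalize_version "1.12.5-r617"   # prints 0001001200050617
```

Once both versions are padded this way, an ordinary string sort orders them the same way a numeric field-by-field comparison would, which is what the `lt`, `gt`, `le` and `ge` branches rely on.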

10
linux/module_generator Executable file

@@ -0,0 +1,10 @@
#!/usr/bin/env bash
SRC_LIBPNG="$(find ../src/thirdparty/libpng/ -name '*.c')"
SRC_ZLIB="$(find ../src/thirdparty/zlib/ -name '*.c')"
SRC_ZVBI="$(find ../src/thirdparty/zvbi/ -name '*.c')"
SRC_CCX="$(find ../src/lib_ccx/ -name '*.c')"
SRC_HASH="$(find ../src/thirdparty/lib_hash/ -name '*.c')"
SRC_UTF8PROC="../src/utf8proc/utf8proc.c"
BLD_SOURCES="../src/ccextractor.c ../src/ccextractorapi_wrap.c $SRC_CCX $SRC_ZLIB $SRC_ZVBI $SRC_LIBPNG $SRC_HASH $SRC_UTF8PROC"
python setup.py $BLD_SOURCES


@@ -1,4 +1,4 @@
#!/bin/bash
#!/usr/bin/env bash
echo "Obtaining Git commit"
commit=(`git rev-parse HEAD 2>/dev/null`)
if [ -z "$commit" ]; then
@@ -21,14 +21,14 @@ fi
if [ -z "$commit" ]; then
commit="Unknown"
fi
builddate=`date +%Y-%m-%d`
builddate=`date --utc --date="@${SOURCE_DATE_EPOCH:-$(date +%s)}" +%Y-%m-%d`
echo "Storing variables in file"
echo "Commit: $commit"
echo "Date: $builddate"
echo "#ifndef CCX_CCEXTRACTOR_COMPILE_H" > ../src/lib_ccx/compile_info.h
echo "#define CCX_CCEXTRACTOR_COMPILE_H" >> ../src/lib_ccx/compile_info.h
echo "#define GIT_COMMIT \"$commit\"" >> ../src/lib_ccx/compile_info.h
echo "#define COMPILE_DATE \"$builddate\"" >> ../src/lib_ccx/compile_info.h
echo "#endif" >> ../src/lib_ccx/compile_info.h
echo "Stored all in compile.h"
echo "#ifndef CCX_CCEXTRACTOR_COMPILE_REAL_H" > ../src/lib_ccx/compile_info_real.h
echo "#define CCX_CCEXTRACTOR_COMPILE_REAL_H" >> ../src/lib_ccx/compile_info_real.h
echo "#define GIT_COMMIT \"$commit\"" >> ../src/lib_ccx/compile_info_real.h
echo "#define COMPILE_DATE \"$builddate\"" >> ../src/lib_ccx/compile_info_real.h
echo "#endif" >> ../src/lib_ccx/compile_info_real.h
echo "Stored all in compile_info_real.h"
echo "Done."
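The new `builddate` line derives the date from `SOURCE_DATE_EPOCH` when that variable is set, so that rebuilding the same source yields the same `COMPILE_DATE`. The sketch below isolates that expression. It assumes GNU `date`, since the `--utc` and `--date=@N` forms may not be available in other implementations, and `build_date` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Sketch only: assumes GNU date, as the expression in the script does.

# Print the build date, honouring SOURCE_DATE_EPOCH when it is set and
# falling back to the current time otherwise.
build_date() {
    date --utc --date="@${SOURCE_DATE_EPOCH:-$(date +%s)}" +%Y-%m-%d
}

# One day after the Unix epoch: prints 1970-01-02.
(SOURCE_DATE_EPOCH=86400; build_date)

# Uses SOURCE_DATE_EPOCH from the environment if present, else today's UTC date.
build_date
```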

319
mac/Makefile.am Normal file

@@ -0,0 +1,319 @@
AUTOMAKE_OPTIONS = foreign
ACLOCAL_AMFLAGS = -I m4/
bin_PROGRAMS = ccextractor
ccextractor_SOURCES = \
../src/ccextractor.c \
../src/ccextractor.h \
../src/thirdparty/libpng/pngstruct.h \
../src/thirdparty/libpng/pngpriv.h \
../src/thirdparty/libpng/pnginfo.h \
../src/thirdparty/libpng/pnglibconf.h \
../src/thirdparty/libpng/pngconf.h \
../src/thirdparty/libpng/pngdebug.h \
../src/thirdparty/libpng/png.h \
../src/thirdparty/libpng/png.c \
../src/thirdparty/libpng/pngerror.c \
../src/thirdparty/libpng/pngget.c \
../src/thirdparty/libpng/pngmem.c \
../src/thirdparty/libpng/pngpread.c \
../src/thirdparty/libpng/pngread.c \
../src/thirdparty/libpng/pngrio.c \
../src/thirdparty/libpng/pngrtran.c \
../src/thirdparty/libpng/pngrutil.c \
../src/thirdparty/libpng/pngset.c \
../src/thirdparty/libpng/pngtrans.c \
../src/thirdparty/libpng/pngwio.c \
../src/thirdparty/libpng/pngwrite.c \
../src/thirdparty/libpng/pngwtran.c \
../src/thirdparty/libpng/pngwutil.c \
../src/lib_ccx/ccx_common_common.h \
../src/lib_ccx/ccx_common_option.h \
../src/lib_ccx/utility.h \
../src/lib_ccx/activity.h \
../src/lib_ccx/asf_constants.h \
../src/lib_ccx/avc_functions.h \
../src/lib_ccx/cc_bitstream.h \
../src/lib_ccx/ccx_common_option.c \
../src/lib_ccx/ccx_common_common.c \
../src/lib_ccx/utility.c \
../src/lib_ccx/activity.c \
../src/lib_ccx/asf_functions.c \
../src/lib_ccx/avc_functions.c \
../src/lib_ccx/cc_bitstream.c \
../src/lib_ccx/ccx_common_char_encoding.c \
../src/lib_ccx/ccx_common_char_encoding.h \
../src/lib_ccx/ccx_common_constants.c \
../src/lib_ccx/ccx_common_constants.h \
../src/lib_ccx/ccx_common_platform.h \
../src/lib_ccx/ccx_common_structs.h \
../src/lib_ccx/ccx_common_timing.c \
../src/lib_ccx/ccx_common_timing.h \
../src/lib_ccx/ccx_decoders_608.c \
../src/lib_ccx/ccx_decoders_608.h \
../src/lib_ccx/ccx_decoders_708.c \
../src/lib_ccx/ccx_decoders_708_encoding.c \
../src/lib_ccx/ccx_decoders_708_encoding.h \
../src/lib_ccx/ccx_decoders_708.h \
../src/lib_ccx/ccx_decoders_708_output.c \
../src/lib_ccx/ccx_decoders_708_output.h \
../src/lib_ccx/ccx_decoders_common.c \
../src/lib_ccx/ccx_decoders_common.h \
../src/lib_ccx/ccx_decoders_isdb.c \
../src/lib_ccx/ccx_decoders_isdb.h \
../src/lib_ccx/ccx_decoders_structs.h \
../src/lib_ccx/ccx_decoders_vbi.c \
../src/lib_ccx/ccx_decoders_vbi.h \
../src/lib_ccx/ccx_decoders_xds.c \
../src/lib_ccx/ccx_decoders_xds.h \
../src/lib_ccx/ccx_demuxer.c \
../src/lib_ccx/ccx_demuxer.h \
../src/lib_ccx/ccx_demuxer_mxf.c \
../src/lib_ccx/ccx_demuxer_mxf.h \
../src/lib_ccx/ccx_dtvcc.c \
../src/lib_ccx/ccx_dtvcc.h \
../src/lib_ccx/ccx_encoders_common.c \
../src/lib_ccx/ccx_encoders_common.h \
../src/lib_ccx/ccx_encoders_curl.c \
../src/lib_ccx/ccx_encoders_g608.c \
../src/lib_ccx/ccx_encoders_helpers.c \
../src/lib_ccx/ccx_encoders_helpers.h \
../src/lib_ccx/ccx_encoders_mcc.c \
../src/lib_ccx/ccx_encoders_mcc.h \
../src/lib_ccx/ccx_encoders_sami.c \
../src/lib_ccx/ccx_encoders_scc.c \
../src/lib_ccx/ccx_encoders_smptett.c \
../src/lib_ccx/ccx_encoders_splitbysentence.c \
../src/lib_ccx/ccx_encoders_spupng.c \
../src/lib_ccx/ccx_encoders_srt.c \
../src/lib_ccx/ccx_encoders_ssa.c \
../src/lib_ccx/ccx_encoders_structs.h \
../src/lib_ccx/ccx_encoders_transcript.c \
../src/lib_ccx/ccx_encoders_webvtt.c \
../src/lib_ccx/ccx_encoders_xds.c \
../src/lib_ccx/ccx_encoders_xds.h \
../src/lib_ccx/ccx_gxf.c \
../src/lib_ccx/ccx_gxf.h \
../src/lib_ccx/ccx_mp4.h \
../src/lib_ccx/compile_info.h \
../src/lib_ccx/compile_info_real.h \
../src/lib_ccx/configuration.c \
../src/lib_ccx/configuration.h \
../src/lib_ccx/disable_warnings.h \
../src/lib_ccx/dvb_subtitle_decoder.c \
../src/lib_ccx/dvb_subtitle_decoder.h \
../src/lib_ccx/dvd_subtitle_decoder.c \
../src/lib_ccx/dvd_subtitle_decoder.h \
../src/lib_ccx/es_functions.c \
../src/lib_ccx/es_userdata.c \
../src/lib_ccx/ffmpeg_intgr.c \
../src/lib_ccx/ffmpeg_intgr.h \
../src/lib_ccx/file_buffer.h \
../src/lib_ccx/file_functions.c \
../src/lib_ccx/general_loop.c \
../src/lib_ccx/hamming.h \
../src/lib_ccx/hardsubx.c \
../src/lib_ccx/hardsubx_classifier.c \
../src/lib_ccx/hardsubx_decoder.c \
../src/lib_ccx/hardsubx.h \
../src/lib_ccx/hardsubx_imgops.c \
../src/lib_ccx/hardsubx_utility.c \
../src/lib_ccx/lib_ccx.c \
../src/lib_ccx/lib_ccx.h \
../src/lib_ccx/list.h \
../src/lib_ccx/matroska.c \
../src/lib_ccx/matroska.h \
../src/lib_ccx/vobsub_decoder.c \
../src/lib_ccx/vobsub_decoder.h \
../src/lib_ccx/mp4.c \
../src/lib_ccx/myth.c \
../src/lib_ccx/networking.c \
../src/lib_ccx/networking.h \
../src/lib_ccx/ocr.c \
../src/lib_ccx/ocr.h \
../src/lib_ccx/output.c \
../src/lib_ccx/params.c \
../src/lib_ccx/params_dump.c \
../src/lib_ccx/sequencing.c \
../src/lib_ccx/stdintmsc.h \
../src/lib_ccx/stream_functions.c \
../src/lib_ccx/teletext.h \
../src/lib_ccx/telxcc.c \
../src/lib_ccx/ts_functions.c \
../src/lib_ccx/ts_functions.h \
../src/lib_ccx/ts_info.c \
../src/lib_ccx/ts_tables.c \
../src/lib_ccx/ts_tables_epg.c \
../src/lib_ccx/wtv_constants.h \
../src/lib_ccx/wtv_functions.c \
../src/thirdparty/zlib/adler32.c \
../src/thirdparty/zlib/compress.c \
../src/thirdparty/zlib/crc32.c \
../src/thirdparty/zlib/crc32.h \
../src/thirdparty/zlib/deflate.c \
../src/thirdparty/zlib/deflate.h \
../src/thirdparty/zlib/gzclose.c \
../src/thirdparty/zlib/gzguts.h \
../src/thirdparty/zlib/gzlib.c \
../src/thirdparty/zlib/gzread.c \
../src/thirdparty/zlib/gzwrite.c \
../src/thirdparty/zlib/infback.c \
../src/thirdparty/zlib/inffast.c \
../src/thirdparty/zlib/inffast.h \
../src/thirdparty/zlib/inffixed.h \
../src/thirdparty/zlib/inflate.c \
../src/thirdparty/zlib/inflate.h \
../src/thirdparty/zlib/inftrees.c \
../src/thirdparty/zlib/inftrees.h \
../src/thirdparty/zlib/trees.c \
../src/thirdparty/zlib/trees.h \
../src/thirdparty/zlib/uncompr.c \
../src/thirdparty/zlib/zconf.h \
../src/thirdparty/zlib/zlib.h \
../src/thirdparty/zlib/zutil.c \
../src/thirdparty/zlib/zutil.h \
../src/thirdparty/utf8proc/utf8proc.c \
../src/thirdparty/utf8proc/utf8proc.h \
../src/thirdparty/lib_hash/sha2.c \
../src/thirdparty/lib_hash/sha2.h \
../src/lib_ccx/zvbi/bcd.h \
../src/lib_ccx/zvbi/bit_slicer.c \
../src/lib_ccx/zvbi/bit_slicer.h \
../src/lib_ccx/zvbi/decoder.c \
../src/lib_ccx/zvbi/macros.h \
../src/lib_ccx/zvbi/misc.h \
../src/lib_ccx/zvbi/raw_decoder.c \
../src/lib_ccx/zvbi/raw_decoder.h \
../src/lib_ccx/zvbi/sampling_par.c \
../src/lib_ccx/zvbi/sampling_par.h \
../src/lib_ccx/zvbi/sliced.h \
../src/lib_ccx/zvbi/zvbi_decoder.h \
../src/freetype/* \
../src/thirdparty/freetype/autofit/autofit.c \
../src/thirdparty/freetype/base/ftbase.c \
../src/thirdparty/freetype/base/ftbbox.c \
../src/thirdparty/freetype/base/ftbdf.c \
../src/thirdparty/freetype/base/ftbitmap.c \
../src/thirdparty/freetype/base/ftcid.c \
../src/thirdparty/freetype/base/ftfntfmt.c \
../src/thirdparty/freetype/base/ftfstype.c \
../src/thirdparty/freetype/base/ftgasp.c \
../src/thirdparty/freetype/base/ftglyph.c \
../src/thirdparty/freetype/base/ftgxval.c \
../src/thirdparty/freetype/base/ftinit.c \
../src/thirdparty/freetype/base/ftlcdfil.c \
../src/thirdparty/freetype/base/ftmm.c \
../src/thirdparty/freetype/base/ftotval.c \
../src/thirdparty/freetype/base/ftpatent.c \
../src/thirdparty/freetype/base/ftpfr.c \
../src/thirdparty/freetype/base/ftstroke.c \
../src/thirdparty/freetype/base/ftsynth.c \
../src/thirdparty/freetype/base/ftsystem.c \
../src/thirdparty/freetype/base/fttype1.c \
../src/thirdparty/freetype/base/ftwinfnt.c \
../src/thirdparty/freetype/bdf/bdf.c \
../src/thirdparty/freetype/bzip2/ftbzip2.c \
../src/thirdparty/freetype/cache/ftcache.c \
../src/thirdparty/freetype/cff/cff.c \
../src/thirdparty/freetype/cid/type1cid.c \
../src/thirdparty/freetype/gzip/ftgzip.c \
../src/thirdparty/freetype/include/ft2build.h \
../src/thirdparty/freetype/lzw/ftlzw.c \
../src/thirdparty/freetype/pcf/pcf.c \
../src/thirdparty/freetype/pfr/pfr.c \
../src/thirdparty/freetype/psaux/psaux.c \
../src/thirdparty/freetype/pshinter/pshinter.c \
../src/thirdparty/freetype/psnames/psnames.c \
../src/thirdparty/freetype/raster/raster.c \
../src/thirdparty/freetype/sfnt/sfnt.c \
../src/thirdparty/freetype/smooth/smooth.c \
../src/thirdparty/freetype/truetype/truetype.c \
../src/thirdparty/freetype/type1/type1.c \
../src/thirdparty/freetype/type42/type42.c \
../src/thirdparty/freetype/winfonts/winfnt.c
if SYS_IS_APPLE_SILICON
ccextractor_SOURCES += ../src/thirdparty/libpng/arm/arm_init.c \
../src/thirdparty/libpng/arm/filter_neon_intrinsics.c \
../src/thirdparty/libpng/arm/palette_neon_intrinsics.c
endif
ccextractor_CFLAGS = -std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP
ccextractor_LDFLAGS = $(shell pkg-config --libs gpac)
GPAC_CPPFLAGS = $(shell pkg-config --cflags gpac)
ccextractor_CPPFLAGS =-I../src/lib_ccx/ -I../src/thirdparty/libpng/ -I../src/thirdparty/zlib/ -I../src/lib_ccx/zvbi/ -I../src/thirdparty/lib_hash/ -I../src/thirdparty -I../src/ -I../src/thirdparty/freetype/include/
ccextractor_CPPFLAGS += $(GPAC_CPPFLAGS)
ccextractor_CPPFLAGS += $(FFMPEG_CPPFLAGS)
ccextractor_LDADD=-lm -lpthread -ldl
if SYS_IS_LINUX
ccextractor_CFLAGS += -O3 -s
endif
if SYS_IS_MAC
ccextractor_CFLAGS += -Dfopen64=fopen -Dopen64=open -Dlseek64=lseek
ccextractor_LDADD += -liconv -lz
endif
if SYS_IS_64_BIT
ccextractor_CFLAGS += -DGPAC_64_BITS
endif
HARDSUBX_FEATURE_RUST=
if HARDSUBX_IS_ENABLED
ccextractor_CFLAGS += -DENABLE_HARDSUBX
ccextractor_CPPFLAGS+= ${libavcodec_CFLAGS}
ccextractor_CPPFLAGS+= ${libavformat_CFLAGS}
ccextractor_CPPFLAGS+= ${libavutil_CFLAGS}
ccextractor_CPPFLAGS+= ${libswscale_CFLAGS}
AV_LIB = ${libavcodec_LIBS}
AV_LIB += ${libavformat_LIBS}
AV_LIB += ${libavutil_LIBS}
AV_LIB += ${libswscale_LIBS}
ccextractor_LDADD += $(AV_LIB)
HARDSUBX_FEATURE_RUST += --features "hardsubx_ocr"
endif
if OCR_IS_ENABLED
ccextractor_CFLAGS += -DENABLE_OCR -DPNG_NO_CONFIG_H
LEPT_LIB = ${lept_LIBS}
LEPT_CPPFLAG = ${lept_CFLAGS}
if TESSERACT_PRESENT
TESS_LIB = ${tesseract_LIBS}
TESS_CPPFLAG = ${tesseract_CFLAGS}
else
#fix for raspberry pi not having a pkgconfig file for tesseract
if TESSERACT_PRESENT_RPI
TESS_LIB = -ltesseract
TESS_CPPFLAG = -I/usr/include/tesseract
endif
endif
ccextractor_CPPFLAGS += $(TESS_CPPFLAG)
ccextractor_CPPFLAGS += $(LEPT_CPPFLAG)
ccextractor_LDADD += $(TESS_LIB)
ccextractor_LDADD += $(LEPT_LIB)
endif
ccextractor_LDADD += ./rust/@RUST_TARGET_SUBDIR@/libccx_rust.a
if DEBUG_RELEASE
CARGO_RELEASE_ARGS=
else
CARGO_RELEASE_ARGS=--release
endif
./rust/@RUST_TARGET_SUBDIR@/libccx_rust.a:
cd ../src/rust && \
CARGO_TARGET_DIR=../../mac/rust $(CARGO) build $(HARDSUBX_FEATURE_RUST) $(CARGO_RELEASE_ARGS);
EXTRA_DIST = ../src/lib_ccx/ccfont2.xbm ../src/thirdparty/utf8proc/utf8proc_data.c fonts/ icon/


@@ -1,8 +0,0 @@
Note: I don't currently have a Mac to test Mac builds. An effort is made to keep CCExtractor portable,
which is why it compiles and works on Mac without any extra effort. But the build script (either of its 2 lines) is not
maintained. If it doesn't compile for this version, please fix it and send me the new file so I can add it to the
official version.
I know this sucks but I can't really do much more.
Carlos

4
mac/autogen.sh Executable file

@@ -0,0 +1,4 @@
#!/bin/bash
./pre-build.sh
autoreconf -i
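
Reviewer note: autogen.sh only regenerates the build system, so the usual autotools steps still follow it. The sketch below assumes the conventional `./configure` then `make` sequence and that it is run from the mac/ directory. The function name `build_with_autotools` is hypothetical and not part of this change.

```shell
# Hypothetical wrapper, not part of the repository.
# Assumes the current directory is mac/ and that the standard
# autotools sequence applies. Extra arguments go to ./configure.
build_with_autotools() {
    # autogen.sh runs ./pre-build.sh and then autoreconf -i
    ./autogen.sh &&
    # configure options are the ones declared in mac/configure.ac
    ./configure "$@" &&
    make
}
```

For example, `build_with_autotools --enable-hardsubx` would pass the hardsubx switch declared in mac/configure.ac through to `./configure`.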


@@ -1,17 +1,320 @@
#!/bin/bash
cd `dirname $0`
BLD_FLAGS="-std=gnu99 -Wno-write-strings -DGPAC_CONFIG_DARWIN -D_FILE_OFFSET_BITS=64 -Dfopen64=fopen -Dopen64=open -Dlseek64=lseek"
BLD_INCLUDE="-I../src/lib_ccx -I../src/gpacmp4 -I../src/libpng -I../src/zlib -I../src/zvbi -I../src/lib_hash"
SRC_LIBPNG="$(find ../src/libpng -name '*.c')"
SRC_ZLIB="$(find ../src/zlib -name '*.c')"
SRC_ZVBI="$(find ../src/zvbi -name '*.c')"
RUST_LIB="rust/release/libccx_rust.a"
RUST_PROFILE="--release"
RUST_FEATURES=""
# Parse command line arguments
while [[ $# -gt 0 ]]; do
case $1 in
OCR)
ENABLE_OCR=true
shift
;;
-debug)
DEBUG=true
RUST_PROFILE=""
RUST_LIB="rust/debug/libccx_rust.a"
shift
;;
-hardsubx)
HARDSUBX=true
ENABLE_OCR=true
# Allow overriding FFmpeg version via environment variable
if [ -n "$FFMPEG_VERSION" ]; then
RUST_FEATURES="--features hardsubx_ocr,$FFMPEG_VERSION"
else
RUST_FEATURES="--features hardsubx_ocr"
fi
shift
;;
-system-libs)
# Use system-installed libraries via pkg-config instead of bundled ones
# This is required for Homebrew formula compatibility
USE_SYSTEM_LIBS=true
shift
;;
-*)
echo "Unknown option $1"
exit 1
;;
esac
done
# Determine architecture based on cargo (to ensure consistency with Rust part)
CARGO_ARCH=$(file $(which cargo) | grep -o 'x86_64\|arm64')
if [[ "$CARGO_ARCH" == "x86_64" ]]; then
echo "Detected Intel (x86_64) Cargo. Forcing x86_64 build to match Rust and libraries..."
BLD_ARCH="-arch x86_64"
else
BLD_ARCH="-arch arm64"
fi
BLD_FLAGS="$BLD_ARCH -std=gnu99 -Wno-write-strings -Wno-pointer-sign -D_FILE_OFFSET_BITS=64 -DVERSION_FILE_PRESENT -Dfopen64=fopen -Dopen64=open -Dlseek64=lseek"
# Add flags for bundled libraries (not needed when using system libs)
if [[ "$USE_SYSTEM_LIBS" != "true" ]]; then
BLD_FLAGS="$BLD_FLAGS -DFT2_BUILD_LIBRARY -DGPAC_DISABLE_VTT -DGPAC_DISABLE_OD_DUMP -DGPAC_DISABLE_REMOTERY -DNO_GZIP"
fi
# Add debug flags if needed
if [[ "$DEBUG" == "true" ]]; then
BLD_FLAGS="$BLD_FLAGS -g -fsanitize=address"
fi
# Add OCR support if requested
if [[ "$ENABLE_OCR" == "true" ]]; then
BLD_FLAGS="$BLD_FLAGS -DENABLE_OCR"
fi
# Add hardsubx support if requested
if [[ "$HARDSUBX" == "true" ]]; then
BLD_FLAGS="$BLD_FLAGS -DENABLE_HARDSUBX"
fi
# Set up include paths based on whether we're using system libs or bundled
if [[ "$USE_SYSTEM_LIBS" == "true" ]]; then
# Use system libraries via pkg-config (for Homebrew compatibility)
# Note: -I../src/thirdparty/lib_hash is needed so that "../lib_hash/sha2.h" resolves correctly
# (the .. goes up from lib_hash to thirdparty, then lib_hash/sha2.h finds the file)
BLD_INCLUDE="-I../src/ -I../src/lib_ccx -I../src/thirdparty/lib_hash -I../src/thirdparty"
BLD_INCLUDE="$BLD_INCLUDE $(pkg-config --cflags --silence-errors freetype2)"
BLD_INCLUDE="$BLD_INCLUDE $(pkg-config --cflags --silence-errors gpac)"
BLD_INCLUDE="$BLD_INCLUDE $(pkg-config --cflags --silence-errors libpng)"
BLD_INCLUDE="$BLD_INCLUDE $(pkg-config --cflags --silence-errors libprotobuf-c)"
BLD_INCLUDE="$BLD_INCLUDE $(pkg-config --cflags --silence-errors libutf8proc)"
else
# Use bundled libraries (default for standalone builds)
BLD_INCLUDE="-I../src/ -I../src/lib_ccx -I../src/thirdparty/lib_hash -I../src/thirdparty/libpng -I../src/thirdparty -I../src/thirdparty/zlib -I../src/thirdparty/freetype/include $(pkg-config --cflags --silence-errors gpac)"
fi
# Add FFmpeg include path for Mac
if [[ -d "/opt/homebrew/Cellar/ffmpeg" ]]; then
FFMPEG_VERSION=$(ls -1 /opt/homebrew/Cellar/ffmpeg | head -1)
if [[ -n "$FFMPEG_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/opt/homebrew/Cellar/ffmpeg/$FFMPEG_VERSION/include"
fi
elif [[ -d "/usr/local/Cellar/ffmpeg" ]]; then
FFMPEG_VERSION=$(ls -1 /usr/local/Cellar/ffmpeg | head -1)
if [[ -n "$FFMPEG_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/usr/local/Cellar/ffmpeg/$FFMPEG_VERSION/include"
fi
fi
# Add Leptonica include path for Mac
if [[ -d "/opt/homebrew/Cellar/leptonica" ]]; then
LEPT_VERSION=$(ls -1 /opt/homebrew/Cellar/leptonica | head -1)
if [[ -n "$LEPT_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/opt/homebrew/Cellar/leptonica/$LEPT_VERSION/include"
fi
elif [[ -d "/usr/local/Cellar/leptonica" ]]; then
LEPT_VERSION=$(ls -1 /usr/local/Cellar/leptonica | head -1)
if [[ -n "$LEPT_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/usr/local/Cellar/leptonica/$LEPT_VERSION/include"
fi
elif [[ -d "/opt/homebrew/include/leptonica" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/opt/homebrew/include"
elif [[ -d "/usr/local/include/leptonica" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/usr/local/include"
fi
# Add Tesseract include path for Mac
if [[ -d "/opt/homebrew/Cellar/tesseract" ]]; then
TESS_VERSION=$(ls -1 /opt/homebrew/Cellar/tesseract | head -1)
if [[ -n "$TESS_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/opt/homebrew/Cellar/tesseract/$TESS_VERSION/include"
fi
elif [[ -d "/usr/local/Cellar/tesseract" ]]; then
TESS_VERSION=$(ls -1 /usr/local/Cellar/tesseract | head -1)
if [[ -n "$TESS_VERSION" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/usr/local/Cellar/tesseract/$TESS_VERSION/include"
fi
elif [[ -d "/opt/homebrew/include/tesseract" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/opt/homebrew/include"
elif [[ -d "/usr/local/include/tesseract" ]]; then
BLD_INCLUDE="$BLD_INCLUDE -I/usr/local/include"
fi
if [[ "$ENABLE_OCR" == "true" ]]; then
BLD_INCLUDE="$BLD_INCLUDE `pkg-config --cflags --silence-errors tesseract`"
fi
SRC_CCX="$(find ../src/lib_ccx -name '*.c')"
SRC_GPAC="$(find ../src/gpacmp4 -name '*.c')"
SRC_LIB_HASH="$(find ../src/lib_hash -name '*.c')"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_GPAC $SRC_ZVBI $SRC_ZLIB $SRC_LIBPNG $SRC_LIB_HASH"
BLD_LINKER="-lm -liconv"
SRC_LIB_HASH="$(find ../src/thirdparty/lib_hash -name '*.c')"
# Set up sources and linker based on whether we're using system libs or bundled
if [[ "$USE_SYSTEM_LIBS" == "true" ]]; then
# Use system libraries - don't compile bundled sources
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_LIB_HASH"
BLD_LINKER="-lm -liconv -lpthread -ldl"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors freetype2)"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors gpac)"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors libpng)"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors libprotobuf-c)"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors libutf8proc)"
BLD_LINKER="$BLD_LINKER $(pkg-config --libs --silence-errors zlib)"
else
# Use bundled libraries (default)
SRC_LIBPNG="$(find ../src/thirdparty/libpng -name '*.c')"
SRC_UTF8="../src/thirdparty/utf8proc/utf8proc.c"
SRC_ZLIB="$(find ../src/thirdparty/zlib -name '*.c')"
SRC_FREETYPE="../src/thirdparty/freetype/autofit/autofit.c \
../src/thirdparty/freetype/base/ftbase.c \
../src/thirdparty/freetype/base/ftbbox.c \
../src/thirdparty/freetype/base/ftbdf.c \
../src/thirdparty/freetype/base/ftbitmap.c \
../src/thirdparty/freetype/base/ftcid.c \
../src/thirdparty/freetype/base/ftfntfmt.c \
../src/thirdparty/freetype/base/ftfstype.c \
../src/thirdparty/freetype/base/ftgasp.c \
../src/thirdparty/freetype/base/ftglyph.c \
../src/thirdparty/freetype/base/ftgxval.c \
../src/thirdparty/freetype/base/ftinit.c \
../src/thirdparty/freetype/base/ftlcdfil.c \
../src/thirdparty/freetype/base/ftmm.c \
../src/thirdparty/freetype/base/ftotval.c \
../src/thirdparty/freetype/base/ftpatent.c \
../src/thirdparty/freetype/base/ftpfr.c \
../src/thirdparty/freetype/base/ftstroke.c \
../src/thirdparty/freetype/base/ftsynth.c \
../src/thirdparty/freetype/base/ftsystem.c \
../src/thirdparty/freetype/base/fttype1.c \
../src/thirdparty/freetype/base/ftwinfnt.c \
../src/thirdparty/freetype/bdf/bdf.c \
../src/thirdparty/freetype/bzip2/ftbzip2.c \
../src/thirdparty/freetype/cache/ftcache.c \
../src/thirdparty/freetype/cff/cff.c \
../src/thirdparty/freetype/cid/type1cid.c \
../src/thirdparty/freetype/gzip/ftgzip.c \
../src/thirdparty/freetype/lzw/ftlzw.c \
../src/thirdparty/freetype/pcf/pcf.c \
../src/thirdparty/freetype/pfr/pfr.c \
../src/thirdparty/freetype/psaux/psaux.c \
../src/thirdparty/freetype/pshinter/pshinter.c \
../src/thirdparty/freetype/psnames/psnames.c \
../src/thirdparty/freetype/raster/raster.c \
../src/thirdparty/freetype/sfnt/sfnt.c \
../src/thirdparty/freetype/smooth/smooth.c \
../src/thirdparty/freetype/truetype/truetype.c \
../src/thirdparty/freetype/type1/type1.c \
../src/thirdparty/freetype/type42/type42.c \
../src/thirdparty/freetype/winfonts/winfnt.c"
BLD_SOURCES="../src/ccextractor.c $SRC_CCX $SRC_LIB_HASH $SRC_LIBPNG $SRC_UTF8 $SRC_ZLIB $SRC_FREETYPE"
BLD_LINKER="-lm -liconv -lpthread -ldl $(pkg-config --libs --silence-errors gpac)"
fi
if [[ "$ENABLE_OCR" == "true" ]]; then
BLD_LINKER="$BLD_LINKER `pkg-config --libs --silence-errors tesseract` `pkg-config --libs --silence-errors lept`"
fi
if [[ "$HARDSUBX" == "true" ]]; then
# Add FFmpeg library path for Mac
if [[ -d "/opt/homebrew/Cellar/ffmpeg" ]]; then
FFMPEG_VERSION=$(ls -1 /opt/homebrew/Cellar/ffmpeg | head -1)
if [[ -n "$FFMPEG_VERSION" ]]; then
BLD_LINKER="$BLD_LINKER -L/opt/homebrew/Cellar/ffmpeg/$FFMPEG_VERSION/lib"
fi
elif [[ -d "/usr/local/Cellar/ffmpeg" ]]; then
FFMPEG_VERSION=$(ls -1 /usr/local/Cellar/ffmpeg | head -1)
if [[ -n "$FFMPEG_VERSION" ]]; then
BLD_LINKER="$BLD_LINKER -L/usr/local/Cellar/ffmpeg/$FFMPEG_VERSION/lib"
fi
fi
# Add library paths for Leptonica and Tesseract from Cellar
if [[ -d "/opt/homebrew/Cellar/leptonica" ]]; then
LEPT_VERSION=$(ls -1 /opt/homebrew/Cellar/leptonica | head -1)
if [[ -n "$LEPT_VERSION" ]]; then
BLD_LINKER="$BLD_LINKER -L/opt/homebrew/Cellar/leptonica/$LEPT_VERSION/lib"
fi
fi
if [[ -d "/opt/homebrew/Cellar/tesseract" ]]; then
TESS_VERSION=$(ls -1 /opt/homebrew/Cellar/tesseract | head -1)
if [[ -n "$TESS_VERSION" ]]; then
BLD_LINKER="$BLD_LINKER -L/opt/homebrew/Cellar/tesseract/$TESS_VERSION/lib"
fi
fi
# Also add homebrew lib path as fallback
if [[ -d "/opt/homebrew/lib" ]]; then
BLD_LINKER="$BLD_LINKER -L/opt/homebrew/lib"
elif [[ -d "/usr/local/lib" ]]; then
BLD_LINKER="$BLD_LINKER -L/usr/local/lib"
fi
BLD_LINKER="$BLD_LINKER -lswscale -lavutil -pthread -lavformat -lavcodec -lavfilter -lleptonica -ltesseract"
fi
echo "Running pre-build script..."
./pre-build.sh
gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER
echo "Trying to compile..."
# Check for cargo
echo "Checking for cargo..."
if ! [ -x "$(command -v cargo)" ]; then
echo 'Error: cargo is not installed.' >&2
exit 1
fi
# Check rust version
rustc_version="$(rustc --version)"
semver=( ${rustc_version//./ } )
version="${semver[1]}.${semver[2]}.${semver[3]}"
MSRV="1.87.0"
if [ "$(printf '%s\n' "$MSRV" "$version" | sort -V | head -n1)" = "$MSRV" ]; then
echo "rustc >= MSRV(${MSRV})"
else
echo "Minimum supported rust version(MSRV) is ${MSRV}, please upgrade rust"
exit 1
fi
echo "Building rust files..."
(cd ../src/rust && CARGO_TARGET_DIR=../../mac/rust cargo build $RUST_PROFILE $RUST_FEATURES) || { echo "Failed building Rust components." ; exit 1; }
# Copy the Rust library
cp $RUST_LIB ./libccx_rust.a
# Add Rust library to linker flags
BLD_LINKER="$BLD_LINKER ./libccx_rust.a"
echo "Building ccextractor"
out=$((LC_ALL=C gcc $BLD_FLAGS $BLD_INCLUDE -o ccextractor $BLD_SOURCES $BLD_LINKER) 2>&1)
res=$?
# Handle common error cases
if [[ $out == *"gcc: command not found"* ]]; then
echo "Error: please install gcc or Xcode command line tools"
exit 1
fi
if [[ $out == *"curl.h: No such file or directory"* ]]; then
echo "Error: please install curl development library"
exit 2
fi
if [[ $out == *"capi.h: No such file or directory"* ]]; then
echo "Error: please install tesseract development library"
exit 3
fi
if [[ $out == *"allheaders.h: No such file or directory"* ]]; then
echo "Error: please install leptonica development library"
exit 4
fi
if [[ $res -ne 0 ]]; then # Unknown error
echo "Compiled with errors"
>&2 echo "$out"
exit 5
fi
if [[ "$out" != "" ]]; then
echo "$out"
echo "Compilation successful, compiler message shown in previous lines"
else
echo "Compilation successful, no compiler messages."
fi
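
Reviewer note: the MSRV check above depends on `sort -V` ordering version strings numerically per component. A minimal sketch of the same comparison, pulled into a helper so it can be tried in isolation, might look like this. The name `version_ge` and the candidate versions are illustrative assumptions and do not appear in the script.

```shell
# Hypothetical helper (not in build.command): succeeds when $1 >= $2.
# Uses the same `sort -V | head -n1` idea as the script: if the lowest
# of the two versions is the required one, the candidate is new enough.
version_ge() {
    lowest=$(printf '%s\n' "$2" "$1" | sort -V | head -n1)
    [ "$lowest" = "$2" ]
}

MSRV="1.87.0"
# Illustrative candidates only; the real script reads `rustc --version`.
for candidate in 1.86.1 1.87.0 1.93.0 1.100.0; do
    if version_ge "$candidate" "$MSRV"; then
        echo "$candidate: rustc >= MSRV($MSRV)"
    else
        echo "$candidate: too old, MSRV is $MSRV"
    fi
done
```

The `1.100.0` case is the reason for version sort: a plain string comparison would rank it below `1.87.0`.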

4
mac/cleanup Executable file

@@ -0,0 +1,4 @@
#!/bin/bash
make distclean > /dev/null 2>&1 || true
rm -rf Makefile configure *.in config.status config.log aclocal.m4 build-conf autom4te.cache

159
mac/configure.ac Normal file

@@ -0,0 +1,159 @@
# -*- Autoconf -*-
# Process this file with autoconf to produce a configure script.
AC_PREREQ([2.71])
AC_INIT([CCExtractor],[0.96.5],[carlos@ccextractor.org])
AC_CONFIG_AUX_DIR([build-conf])
AC_CONFIG_SRCDIR([../src/ccextractor.c])
AM_INIT_AUTOMAKE([foreign subdir-objects])
AC_CONFIG_MACRO_DIRS([m4])
# Checks for programs.
AC_PROG_CC
AC_PROG_INSTALL
AC_PROG_MAKE_SET
#Checks for "pkg-config" utility
AC_MSG_CHECKING([pkg-config m4 macros])
if test m4_ifdef([PKG_CHECK_MODULES], [yes], [no]) = yes; then
AC_MSG_RESULT([yes]);
else
AC_MSG_RESULT([no]);
AC_MSG_ERROR([
pkg-config is required.])
fi
# Checks for libraries.
AC_CHECK_LIB([m], [sin], [], [AC_MSG_ERROR(Math library not installed. Install it before proceeding.)])
AC_CHECK_LIB([leptonica], [getLeptonicaVersion], [HAS_LEPT=1 && PKG_CHECK_MODULES([lept], [lept])], [HAS_LEPT=0])
AC_CHECK_LIB([tesseract], [TessVersion], [HAS_TESSERACT=1 && PKG_CHECK_MODULES([tesseract], [tesseract])], [HAS_TESSERACT=0])
AC_CHECK_LIB([avcodec], [avcodec_version], [HAS_AVCODEC=1 && PKG_CHECK_MODULES([libavcodec], [libavcodec])], [HAS_AVCODEC=0])
AC_CHECK_LIB([avformat], [avformat_version], [HAS_AVFORMAT=1 && PKG_CHECK_MODULES([libavformat], [libavformat])], [HAS_AVFORMAT=0])
AC_CHECK_LIB([avutil], [avutil_version], [HAS_AVUTIL=1 && PKG_CHECK_MODULES([libavutil], [libavutil])], [HAS_AVUTIL=0])
AC_CHECK_LIB([swscale], [swscale_version], [HAS_SWSCALE=1 && PKG_CHECK_MODULES([libswscale], [libswscale])], [HAS_SWSCALE=0])
# Checks for header files.
AC_CHECK_HEADERS([arpa/inet.h fcntl.h float.h inttypes.h limits.h locale.h malloc.h netdb.h netinet/in.h stddef.h stdint.h stdlib.h string.h sys/socket.h sys/time.h sys/timeb.h termios.h unistd.h wchar.h])
# Checks for typedefs, structures, and compiler characteristics.
AC_CHECK_HEADER_STDBOOL
AC_C_INLINE
AC_TYPE_INT16_T
AC_TYPE_INT32_T
AC_TYPE_INT64_T
AC_TYPE_INT8_T
AC_TYPE_OFF_T
AC_TYPE_PID_T
AC_TYPE_SIZE_T
AC_TYPE_SSIZE_T
AC_TYPE_UINT16_T
AC_TYPE_UINT32_T
AC_TYPE_UINT64_T
AC_TYPE_UINT8_T
AC_CHECK_TYPES([ptrdiff_t])
# Checks for library functions.
AC_FUNC_ERROR_AT_LINE
AC_FUNC_FSEEKO
AC_FUNC_MALLOC
AC_FUNC_MKTIME
AC_FUNC_REALLOC
AC_FUNC_STRERROR_R
AC_CHECK_FUNCS([floor ftruncate gethostbyname gettimeofday inet_ntoa mblen memchr memmove memset mkdir modf pow realpath rmdir select setlocale socket sqrt strcasecmp strchr strdup strerror strndup strrchr strstr strtol])
# Checks for arguments with configure
AC_ARG_ENABLE([hardsubx],
AS_HELP_STRING([--enable-hardsubx],[Enables extraction of burnt subtitles (hard subtitles)]),
[case "${enableval}" in
yes) hardsubx=true ;;
no) hardsubx=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-hardsubx]) ;;
esac],[hardsubx=false])
AC_ARG_ENABLE([ocr],
AS_HELP_STRING([--enable-ocr],[Enables Optical Character Recognition]),
[case "${enableval}" in
yes) ocr=true ;;
no) ocr=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-ocr]) ;;
esac],[ocr=false])
AC_ARG_ENABLE([ffmpeg],
AS_HELP_STRING([--enable-ffmpeg],[Enable FFmpeg integration]),
[case "${enableval}" in
yes) ffmpeg=true ;;
no) ffmpeg=false ;;
*) AC_MSG_ERROR([bad value ${enableval} for --enable-ffmpeg]) ;;
esac],[ffmpeg=false])
#Add argument for rust
AC_ARG_WITH([rust],
AS_HELP_STRING([--with-rust],[Builds CCExtractor with rust library]),
[with_rust=$withval],
[with_rust=yes])
AC_MSG_CHECKING(whether to build with rust library)
if test "x$with_rust" = "xyes" ; then
AC_MSG_RESULT(yes)
#Check if cargo and rust is installed
AC_PATH_PROG([CARGO], [cargo], [notfound])
AS_IF([test "$CARGO" = "notfound"], [AC_MSG_ERROR([cargo is required])])
AC_PATH_PROG([RUSTC], [rustc], [notfound])
AS_IF([test "$RUSTC" = "notfound"], [AC_MSG_ERROR([rustc is required])])
rustc_version=$(rustc --version)
MSRV="1.87.0"
AX_COMPARE_VERSION($rustc_version, [ge], [$MSRV],
[AC_MSG_RESULT(rustc >= $MSRV)],
[AC_MSG_ERROR([Minimum supported rust version(MSRV) is $MSRV, please upgrade rust])])
else
AC_MSG_RESULT(no)
fi
AM_CONDITIONAL([WITH_RUST], [test "x$with_rust" = "xyes"])
AC_ARG_ENABLE(debug,
AS_HELP_STRING([--enable-debug],[Build Rust code with debugging information [default=no]]),
[debug_release=$enableval],
[debug_release=no])
AC_MSG_CHECKING(whether to build Rust code with debugging information)
if test "x$debug_release" = "xyes" ; then
AC_MSG_RESULT(yes)
RUST_TARGET_SUBDIR=debug
else
AC_MSG_RESULT(no)
RUST_TARGET_SUBDIR=release
fi
AM_CONDITIONAL([DEBUG_RELEASE], [test "x$debug_release" = "xyes"])
AC_SUBST([RUST_TARGET_SUBDIR])
#Checks and prompts if libraries found/not found to avoid failure while building
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVCODEC -gt 0 ], [AC_MSG_NOTICE(avcodec library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVCODEC -gt 0 ], [AC_MSG_ERROR(avcodec library not found. Please install the avcodec library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVFORMAT -gt 0 ], [AC_MSG_NOTICE(avformat library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVFORMAT -gt 0 ], [AC_MSG_ERROR(avformat library not found. Please install the avformat library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_AVUTIL -gt 0 ], [AC_MSG_NOTICE(avutil library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_AVUTIL -gt 0 ], [AC_MSG_ERROR(avutil library not found. Please install the avutil library before proceeding)])
AS_IF([ test x$hardsubx = xtrue && test $HAS_SWSCALE -gt 0 ], [AC_MSG_NOTICE(swscale library found)])
AS_IF([ test x$hardsubx = xtrue && test ! $HAS_SWSCALE -gt 0 ], [AC_MSG_ERROR(swscale library not found. Please install the swscale library before proceeding)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test $HAS_TESSERACT -gt 0 ], [TESS_VERSION=$(tesseract --version 2>&1 | grep tesseract) && AC_MSG_NOTICE(tesseract library found... $TESS_VERSION)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test ! $HAS_TESSERACT -gt 0 ], [AC_MSG_ERROR(tesseract library not found. Please install the tesseract library before proceeding)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test $HAS_LEPT -gt 0 ], [LEPT_VERSION=$(tesseract --version 2>&1 | grep leptonica) && AC_MSG_NOTICE(leptonica library found... $LEPT_VERSION)])
AS_IF([ (test x$ocr = xtrue || test x$hardsubx = xtrue) && test ! $HAS_LEPT -gt 0 ], [AC_MSG_ERROR(leptonica library not found. Please install the leptonica library before proceeding)])
#AM_CONDITIONAL(s) for setting values to enable/disable flags in Makefile.am
AM_CONDITIONAL(HARDSUBX_IS_ENABLED, [ test x$hardsubx = xtrue ])
AM_CONDITIONAL(OCR_IS_ENABLED, [ test x$ocr = xtrue || test x$hardsubx = xtrue ])
AM_CONDITIONAL(FFMPEG_IS_ENABLED, [ test x$ffmpeg = xtrue ])
AM_CONDITIONAL(TESSERACT_PRESENT, [ test ! -z "$(pkg-config --libs-only-l --silence-errors tesseract)" ])
AM_CONDITIONAL(TESSERACT_PRESENT_RPI, [ test -d "/usr/include/tesseract" && test $(ls -A /usr/include/tesseract | wc -l) -gt 0 ])
AM_CONDITIONAL(SYS_IS_LINUX, [ test $(uname -s) = "Linux"])
AM_CONDITIONAL(SYS_IS_MAC, [ test $(uname -s) = "Darwin"])
AM_CONDITIONAL(SYS_IS_APPLE_SILICON, [ test $(uname -a | awk '{print $NF}') = "arm64" ])
AM_CONDITIONAL(SYS_IS_64_BIT,[test $(getconf LONG_BIT) = "64"])
AC_CONFIG_FILES([Makefile])
AC_OUTPUT
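
Reviewer note: the platform conditionals at the end of configure.ac are plain shell tests, so they can be previewed on a given machine before running configure. The sketch below reuses the same commands (`uname -s`, `uname -a | awk '{print $NF}'`, `getconf LONG_BIT`). The helper names `report` and `preview_conditionals` are hypothetical, and the command substitutions are quoted here, which the original tests do not do.

```shell
# Hypothetical helpers, not part of configure.ac.
# report NAME <test expression...> prints NAME=yes or NAME=no.
report() {
    name=$1
    shift
    if test "$@"; then
        echo "$name=yes"
    else
        echo "$name=no"
    fi
}

# Evaluates the same expressions configure.ac hands to AM_CONDITIONAL.
preview_conditionals() {
    report SYS_IS_LINUX "$(uname -s)" = "Linux"
    report SYS_IS_MAC "$(uname -s)" = "Darwin"
    report SYS_IS_APPLE_SILICON "$(uname -a | awk '{print $NF}')" = "arm64"
    report SYS_IS_64_BIT "$(getconf LONG_BIT)" = "64"
}

preview_conditionals
```

This only shows which branches of Makefile.am would be active. It does not replace the library checks that configure performs.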

Some files were not shown because too many files have changed in this diff.