Mirror of https://github.com/CCExtractor/ccextractor.git (synced 2026-02-03 21:23:48 +00:00)
Fixes #1550 - Docker builds were broken after PR #1535 switched from vendored GPAC to system GPAC.

Changes:
- Switch from Alpine to Debian Bookworm (Alpine's musl libc has issues with Rust bindgen's libclang dynamic loading)
- Support three build variants via BUILD_TYPE argument:
  - minimal: No OCR support
  - ocr (default): Tesseract OCR for bitmap subtitles
  - hardsubx: OCR + FFmpeg for burned-in subtitle extraction
- Support dual source modes via USE_LOCAL_SOURCE argument:
  - 0 (default): Clone from GitHub (standalone Dockerfile)
  - 1: Use local source (faster for developers)
- Add .dockerignore to exclude build artifacts (~2.7GB -> ~900KB context)
- Update README.md with comprehensive build instructions

Tested all three variants successfully:
- minimal: ~130MB image
- ocr: ~215MB image
- hardsubx: ~610MB image

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
CCExtractor Docker Image
This Dockerfile builds CCExtractor with support for multiple build variants.
Build Variants
| Variant | Description | Features |
|---|---|---|
| `minimal` | Basic CCExtractor | No OCR support |
| `ocr` | With OCR support (default) | Tesseract OCR for bitmap subtitles |
| `hardsubx` | With burned-in subtitle extraction | OCR + FFmpeg for hardcoded subtitles |
Building
Standalone Build (from Dockerfile only)
You can build CCExtractor using just the Dockerfile; it clones the source from GitHub during the build:
# Default build (OCR enabled)
docker build -t ccextractor docker/
# Minimal build (no OCR)
docker build --build-arg BUILD_TYPE=minimal -t ccextractor docker/
# HardSubX build (OCR + FFmpeg for burned-in subtitles)
docker build --build-arg BUILD_TYPE=hardsubx -t ccextractor docker/
Build from Cloned Repository (faster)
If you have already cloned the repository, you can use local source for faster builds:
git clone https://github.com/CCExtractor/ccextractor.git
cd ccextractor
# Default build (OCR enabled)
docker build --build-arg USE_LOCAL_SOURCE=1 -f docker/Dockerfile -t ccextractor .
# Minimal build
docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=minimal -f docker/Dockerfile -t ccextractor .
# HardSubX build
docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=hardsubx -f docker/Dockerfile -t ccextractor .
Build Arguments
| Argument | Default | Description |
|---|---|---|
| `BUILD_TYPE` | `ocr` | Build variant: `minimal`, `ocr`, or `hardsubx` |
| `USE_LOCAL_SOURCE` | `0` | Set to `1` to use local source instead of cloning |
| `DEBIAN_VERSION` | `bookworm-slim` | Debian version to use as base |
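The two main arguments combine in a predictable way, so the build commands above can be generated rather than typed out. The following is a minimal sketch, not part of the repository: `ccx_build_cmd` is a hypothetical helper that validates its inputs against the documented values and prints the matching `docker build` command without running it.

```shell
# Hypothetical helper (not shipped with CCExtractor): print the docker build
# command for a given BUILD_TYPE and USE_LOCAL_SOURCE value.
ccx_build_cmd() {
    build_type="${1:-ocr}"   # mirrors the documented BUILD_TYPE default
    use_local="${2:-0}"      # mirrors the documented USE_LOCAL_SOURCE default

    # Reject anything other than the three documented variants.
    case "$build_type" in
        minimal|ocr|hardsubx) ;;
        *) echo "unknown BUILD_TYPE: $build_type" >&2; return 1 ;;
    esac

    case "$use_local" in
        # Standalone mode: the build context is the docker/ directory.
        0) echo "docker build --build-arg BUILD_TYPE=$build_type -t ccextractor docker/" ;;
        # Local-source mode: the build context is the repository root.
        1) echo "docker build --build-arg USE_LOCAL_SOURCE=1 --build-arg BUILD_TYPE=$build_type -f docker/Dockerfile -t ccextractor ." ;;
        *) echo "USE_LOCAL_SOURCE must be 0 or 1" >&2; return 1 ;;
    esac
}
```

For example, `ccx_build_cmd hardsubx 1` prints the local-source HardSubX command; run it from the repository root once you have checked that it looks right.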
Usage
Basic Usage
# Show version
docker run --rm ccextractor --version
# Show help
docker run --rm ccextractor --help
Processing Local Files
Mount your local directory to process files:
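# Process a video file and write the subtitles to an output file
# (the quotes keep paths containing spaces intact)
docker run --rm -v "$(pwd)":"$(pwd)" -w "$(pwd)" ccextractor input.mp4 -o output.srt
# Write the subtitles to stdout and redirect them to a file on the host
docker run --rm -v "$(pwd)":"$(pwd)" -w "$(pwd)" ccextractor input.mp4 --stdout > output.srt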
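The same mount pattern extends to several files at once. The sketch below assumes the image is tagged `ccextractor` as in the build commands above; `srt_name` and `ccx_batch` are hypothetical helper names, and the `DOCKER` variable is an illustrative override for dry runs, not something the image or CCExtractor defines.

```shell
# Hypothetical helper: derive the .srt output name from an input file name.
srt_name() {
    printf '%s\n' "${1%.*}.srt"
}

# Run the container once for every .mp4 file in the current directory.
# Set DOCKER=echo to print the arguments instead of running the container.
ccx_batch() {
    for f in ./*.mp4; do
        [ -e "$f" ] || continue   # no .mp4 files: the glob stays literal, skip it
        "${DOCKER:-docker}" run --rm -v "$PWD:$PWD" -w "$PWD" \
            ccextractor "$f" -o "$(srt_name "$f")"
    done
}
```

Running `DOCKER=echo ccx_batch` first shows what would be executed for each file before any container is started.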
Interactive Mode
docker run --rm -it --entrypoint=/bin/bash ccextractor
Image Size
The multi-stage build produces runtime images of approximately the following sizes:
- minimal: ~130MB
- ocr: ~215MB (includes Tesseract)
- hardsubx: ~610MB (includes Tesseract + FFmpeg)
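To compare these sizes on your own machine, each variant needs its own tag, because the build commands above all reuse the single tag `ccextractor`. The following is a rough sketch under that assumption: the per-variant tags (`ccextractor:minimal` and so on) and the `ccx_build_all` helper are illustrative and not defined by the repository, and `DOCKER` is again a hypothetical dry-run override.

```shell
# Hypothetical helper: build every variant under its own tag, then list the
# resulting images so their sizes can be compared.
# Set DOCKER=echo to print the commands' arguments instead of running them.
ccx_build_all() {
    for variant in minimal ocr hardsubx; do
        "${DOCKER:-docker}" build --build-arg BUILD_TYPE="$variant" \
            -t "ccextractor:$variant" docker/
    done
    # List all tags of the ccextractor repository, including a size column.
    "${DOCKER:-docker}" image ls ccextractor
}
```

Actual sizes may differ from the figures above depending on the base image and package versions at build time.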