[PR #2004] [MERGED] fix(matroska): Prevent infinite loop on truncated MKV files #2806

Open
opened 2026-01-29 17:23:59 +00:00 by claunia · 0 comments

📋 Pull Request Information

Original PR: https://github.com/CCExtractor/ccextractor/pull/2004
Author: @cfsmp3
Created: 1/10/2026
Status: Merged
Merged: 1/10/2026
Merged by: @cfsmp3

Base: master ← Head: fix/matroska-eof-infinite-loop


📝 Commits (1)

  • 067045c fix(matroska): Prevent infinite loop on truncated MKV files

📊 Changes

1 file changed (+17 additions, -0 deletions)

📝 src/lib_ccx/matroska.c (+17 -0)

📄 Description

Summary

  • Fix infinite loop when parsing truncated MKV files
  • Add EOF checks after each mkv_read_byte() call in all Matroska parsing loops
  • Files that previously caused timeouts now run to completion (instantly for one test file, in ~36 seconds for the other)

Problem

When parsing truncated MKV files, the Matroska parser would enter an infinite loop printing millions of "Unknown element 0xffffffff" warnings. This happened because:

  1. At EOF, fgetc() returns EOF (-1), which becomes 0xFF when cast to UBYTE
  2. Reading 4 such bytes yields element code 0xFFFFFFFF, which the parser treats as an unknown element
  3. The "skip unknown element" logic reads another 0xFF as the vint length (127 bytes)
  4. FSEEK past EOF succeeds and clears the EOF flag, so no error is reported
  5. The while loop condition (pos + len > get_current_byte) never becomes false because the recorded segment length is larger than the truncated file
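
Steps 1, 2 and 4 above can be reproduced outside the parser with plain stdio. The following is a minimal sketch, not the project's code: read_byte_unchecked(), read_code_unchecked() and seek_past_eof_clears_flag() are hypothetical helper names, the UBYTE typedef is assumed to be an unsigned char, and the 0xFF result assumes the usual EOF value of -1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

typedef unsigned char UBYTE; /* assumption: mirrors the project's typedef */

/* Hypothetical stand-in for an unchecked byte read. At end of file,
 * fgetc() returns EOF and the cast silently truncates it to 0xFF. */
static UBYTE read_byte_unchecked(FILE *f)
{
    assert(f != NULL);
    return (UBYTE)fgetc(f);
}

/* Assemble a 4-byte big-endian code from four unchecked reads.
 * At end of file every read yields 0xFF, so the result is 0xFFFFFFFF. */
static uint32_t read_code_unchecked(FILE *f)
{
    uint32_t code = 0;
    for (int i = 0; i < 4; i++)
        code = (code << 8) | read_byte_unchecked(f);
    return code;
}

/* Returns 1 if the stream reports EOF after being drained, and a
 * forward seek of 127 bytes then succeeds and clears the EOF indicator. */
static int seek_past_eof_clears_flag(FILE *f)
{
    assert(f != NULL);
    while (fgetc(f) != EOF) {
        /* drain whatever is left so the EOF indicator gets set */
    }
    int eof_before = feof(f);
    int rc = fseek(f, 127, SEEK_CUR);
    return eof_before != 0 && rc == 0 && feof(f) == 0;
}
```

Because the seek succeeds and resets the indicator, a loop that only inspects the stream position or the EOF flag at the top of each iteration has nothing left to stop it.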

Solution

Add feof() checks immediately after each mkv_read_byte() call in all 8 parsing loops. This detects EOF after the failed read and breaks out of the loop cleanly.
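
The shape of such a check can be sketched as follows. This is an illustration under assumptions, not the patch itself: read_byte() and parse_until() are hypothetical names, and the loop body is reduced to counting bytes. The point is that feof() is consulted right after the read, so a bogus 0xFF produced by a failed read is never interpreted, while a genuine 0xFF data byte still is.

```c
#include <assert.h>
#include <stdio.h>

typedef unsigned char UBYTE; /* assumption: mirrors the project's typedef */

/* Hypothetical unchecked byte read, as in the problem description. */
static UBYTE read_byte(FILE *f)
{
    return (UBYTE)fgetc(f);
}

/* Hypothetical parsing loop. `end` is the position the container claims
 * the element extends to; for a truncated file it lies beyond the real
 * end of the stream. Returns the number of bytes actually consumed. */
static long parse_until(FILE *f, long end)
{
    assert(f != NULL);
    long consumed = 0;
    while (ftell(f) < end) {
        UBYTE b = read_byte(f);
        if (feof(f))
            break; /* the read failed: b is not real data, stop cleanly */
        (void)b;   /* a real parser would dispatch on the byte here */
        consumed++;
    }
    return consumed;
}
```

Without the feof() test, the position would stop advancing at the real end of the file while staying below `end`, and the loop would spin forever.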

Test plan

  • Tested with ticket1398-orig.mkv (10MB truncated, 3hr duration) - previously hung, now completes instantly
  • Tested with azumi.mkv (33MB truncated, 2hr duration) - previously hung, now completes in ~36 seconds with correct subtitle extraction
  • CI regression tests

🤖 Generated with Claude Code


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

claunia added the pull-request label 2026-01-29 17:23:59 +00:00
Reference: starred/ccextractor#2806