[PR #1175] Fix DataDescriptorStream infinite loop on signature boundary match #1613

Open
opened 2026-01-29 22:21:23 +00:00 by claunia · 0 comments

📋 Pull Request Information

Original PR: https://github.com/adamhathcock/sharpcompress/pull/1175
Author: @Copilot
Created: 1/28/2026
Status: 🔄 Open

Base: release ← Head: copilot/fix-data-descriptor-stream-bug


📝 Commits (4)

  • 82071af Initial plan
  • c726a13 Fix DataDescriptorStream boundary bug to prevent infinite loop
  • e896179 Add regression test for DataDescriptorStream boundary bug
  • c994a9f Fix DataDescriptorStream to handle legitimate cross-boundary signatures

📊 Changes

2 files changed (+123 additions, -0 deletions)

📝 src/SharpCompress/IO/DataDescriptorStream.cs (+32 -0)
📝 tests/SharpCompress.Test/Zip/ZipReaderTests.cs (+91 -0)

📄 Description

Fix DataDescriptorStream boundary bug when signature starts at buffer edge

  • Understand the bug in DataDescriptorStream.Read() method
  • Identify root cause: infinite loop when rewinding to same position
  • Implement correct fix: detect infinite loop vs legitimate partial match
  • Test the fix with existing ZIP tests (38/38 passing)
  • Format code with CSharpier
  • Add regression test for the boundary bug scenario
  • Address feedback: fix handles legitimate cross-boundary signatures
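
The diff itself is not reproduced on this page, so the following is only a sketch of one way to tell a stray trailing `P` from a signature that really straddles a read boundary: peek at the bytes that follow before deciding whether to hold anything back. It is written in Python as a stateless toy model; `read_with_lookahead` and its parameters are hypothetical names, and nothing here should be taken as the change made in this PR.

```python
from typing import Tuple

SIGNATURE = b"PK\x07\x08"  # data descriptor signature named in the issue


def read_with_lookahead(data: bytes, position: int, count: int) -> Tuple[bytes, int]:
    """Return (payload bytes, new position), stopping just before a full signature.

    Hypothetical helper: `data` stands in for the underlying stream and
    `position` for its current offset.
    """
    chunk = data[position:position + count]
    end = position + len(chunk)

    # A complete signature inside the chunk ends the payload right there.
    found = chunk.find(SIGNATURE)
    if found != -1:
        return chunk[:found], position + found

    # The chunk may end with a proper prefix of the signature ("P", "PK", ...).
    for held in range(len(SIGNATURE) - 1, 0, -1):
        if chunk.endswith(SIGNATURE[:held]):
            # Peek at the bytes that would complete the signature (not consumed).
            peek = data[end:end + len(SIGNATURE) - held]
            if chunk[-held:] + peek == SIGNATURE:
                # Real cross-boundary signature: hold the prefix back.
                return chunk[:-held], end - held

    # False alarm (or no partial match): release every byte that was read.
    return chunk, end


plain = b"ABCDEFG" + b"P" + b"HIJKLMNO"
marked = b"ABCDEFG" + SIGNATURE + b"\x00" * 12
print(read_with_lookahead(plain, 0, 8))   # trailing 'P' is ordinary data
print(read_with_lookahead(marked, 0, 8))  # trailing 'P' starts the signature
```

In this model a read of at least one byte returns nothing only when the position sits exactly on a signature or at the end of the data, so a caller cannot mistake a stray `P` for end of stream.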
Original prompt

This section details the original issue you should resolve

<issue_title>DataDescriptorStream boundary bug when signature starts at buffer edge</issue_title>
<issue_description>When the ZIP data is read in streaming mode and the first byte of the data descriptor signature (0x50 = P) appears as the last byte of a read buffer, the reader gets stuck and the stream positioning breaks.

This causes extraction to fail.

This seems to be caused by how DataDescriptorStream.Read() handles partial matches of PK\07\08 across buffer boundaries.

Observed Behavior

When the last byte of a buffer is 0x50 ('P'):
1. _searchPosition becomes 1
2. At end of buffer, this code runs:
   if (_searchPosition > 0)
   {
       read -= _searchPosition;
       _stream.Position -= _searchPosition;
       _searchPosition = 0;
   }
3. One byte is “rewound”
4. On next read, only 1 byte is returned
5. That byte is again P
6. Same logic repeats
7. Read returns 0 → stream appears finished
8. Extraction stops prematurely
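
The steps above can be traced with a small Python model. This is a simplified sketch, not the SharpCompress source: `RewindingScanner` and `fill_buffer` are hypothetical names, the scan loop is an assumption about how `_searchPosition` advances, and only the final rewind mirrors the snippet quoted in step 2. The caller is assumed to keep requesting the bytes still missing from its buffer, which is how the 1-byte read in step 4 arises.

```python
SIGNATURE = b"PK\x07\x08"  # data descriptor signature named in the issue


class RewindingScanner:
    """Toy model of the quoted rewind step; not the SharpCompress source."""

    def __init__(self, data: bytes) -> None:
        self.data = data
        self.position = 0         # stands in for _stream.Position
        self.search_position = 0  # stands in for _searchPosition

    def read(self, count: int) -> bytes:
        chunk = self.data[self.position:self.position + count]
        self.position += len(chunk)
        read = len(chunk)
        for i, byte in enumerate(chunk):
            if byte == SIGNATURE[self.search_position]:
                self.search_position += 1
                if self.search_position == len(SIGNATURE):
                    # Full signature: hand back only the bytes before it.
                    start = i + 1 - len(SIGNATURE)
                    self.position -= read - start
                    self.search_position = 0
                    return chunk[:start]
            else:
                self.search_position = 0
        if self.search_position > 0:
            # The step quoted in the issue: un-read the partial match.
            read -= self.search_position
            self.position -= self.search_position
            self.search_position = 0
        return chunk[:read]


def fill_buffer(scanner: RewindingScanner, size: int) -> bytes:
    """Caller that reads until `size` bytes arrive or a read returns nothing."""
    out = bytearray()
    while len(out) < size:
        piece = scanner.read(size - len(out))
        if not piece:
            break  # a zero-length read is treated as end of stream
        out += piece
    return bytes(out)


# 'P' is the last byte of the first 8-byte read; 8 more bytes follow it.
data = b"ABCDEFG" + b"P" + b"HIJKLMNO"
scanner = RewindingScanner(data)
first = fill_buffer(scanner, 8)
print(first, scanner.position)  # b'ABCDEFG' 7
```

In this model the first read hands back 7 bytes and rewinds by one; the follow-up request for the single missing byte reads `P`, rewinds again, and returns nothing, so the caller stops at offset 7 even though 9 bytes remain.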

Reproduction

This happens reliably with:
• Streaming ZIP reader
• DataDescriptorStream
• BufferSize = 64k
• Payload length > 64k bytes
• Payload filled with 0x50 ('P')

repro.zip</issue_description>
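
For readers without the attachment, an archive of the same general shape can be built with Python's standard `zipfile` module: writing to a sink that has no `seek()`/`tell()` makes `zipfile` set general-purpose flag bit 3 and append a data descriptor after the entry. This is a sketch under assumptions: `WriteOnly` and `build_streamed_zip` are hypothetical helpers, the entry name is arbitrary, and whether the result matches the attached repro.zip or triggers the bug has not been verified.

```python
import io
import zipfile

SIGNATURE = b"PK\x07\x08"  # data descriptor signature


class WriteOnly:
    """Write-only sink with no seek()/tell(), so zipfile cannot patch headers."""

    def __init__(self) -> None:
        self.buffer = io.BytesIO()

    def write(self, data) -> int:
        return self.buffer.write(data)

    def flush(self) -> None:
        pass


def build_streamed_zip(payload: bytes, name: str = "payload.bin") -> bytes:
    """Store `payload` uncompressed in an archive written in streaming mode."""
    sink = WriteOnly()
    with zipfile.ZipFile(sink, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr(name, payload)
    return sink.buffer.getvalue()


# Payload longer than 64 KiB and filled with 0x50 ('P'), as in the list above.
payload = b"P" * (64 * 1024 + 1)
archive_bytes = build_streamed_zip(payload)

# Bit 3 of the general-purpose flags marks "sizes follow in a data descriptor".
uses_descriptor = bool(archive_bytes[6] & 0x08)
print(len(archive_bytes), uses_descriptor)
```

Because the entry is stored rather than deflated, the run of `P` bytes appears verbatim in the archive and is followed directly by the descriptor signature.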

Comments on the Issue (you are @copilot in this section)

@adamhathcock Sounds similar to https://github.com/adamhathcock/sharpcompress/issues/1168


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

Fixes adamhathcock/sharpcompress#1173
Repro archive: https://github.com/user-attachments/files/24890867/repro.zip
claunia added the pull-request label 2026-01-29 22:21:23 +00:00
Reference: starred/sharpcompress#1613