DataDescriptorStream boundary bug when signature starts at buffer edge #773

Open
opened 2026-01-29 22:17:19 +00:00 by claunia · 1 comment
Owner

Originally created by @p247 on GitHub (Jan 27, 2026).

Originally assigned to: @adamhathcock, @Copilot on GitHub.

When the ZIP data is read in streaming mode and the first byte of the data descriptor signature (0x50 = P) appears as the last byte of a read buffer, the reader gets stuck and the stream positioning breaks.

This causes extraction to fail.

This seems to be caused by how DataDescriptorStream.Read() handles partial matches of PK\07\08 across buffer boundaries.

Observed Behavior

When the last byte of a buffer is 0x50 ('P'):
1. _searchPosition becomes 1
2. At end of buffer, this code runs:
   if (_searchPosition > 0)
   {
       read -= _searchPosition;
       _stream.Position -= _searchPosition;
       _searchPosition = 0;
   }
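3. One byte is “rewound”: read drops by 1 and the stream position moves back by 1
4. On the next read, the underlying stream returns only 1 byte
5. That byte is again P, so _searchPosition becomes 1 again
6. The same logic repeats: read -= _searchPosition leaves read at 0 and the byte is rewound once more
7. Read returns 0 → the stream appears finished
8. Extraction stops prematurely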
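
The steps above can be sketched as a small stand-alone model. This is not the SharpCompress code: it is a simplified Python model of the logic quoted in step 2, and the names RewindableSource, BuggyDescriptorReader and extract are hypothetical. It assumes that a rewound byte is replayed on its own by the underlying stream, which is what step 4 describes.

```python
SIGNATURE = b"PK\x07\x08"  # data descriptor signature


class RewindableSource:
    """Hypothetical stand-in for the underlying stream."""

    def __init__(self, data):
        self._data = data
        self._pos = 0
        self._pending = 0  # number of rewound bytes waiting to be replayed

    def read(self, n):
        if self._pending:
            # Assumption: rewound bytes are served alone (step 4).
            n = min(n, self._pending)
            self._pending -= n
        chunk = self._data[self._pos:self._pos + n]
        self._pos += len(chunk)
        return chunk

    def rewind(self, count):
        self._pos -= count
        self._pending += count


class BuggyDescriptorReader:
    """Simplified model of the end-of-buffer handling quoted in step 2."""

    def __init__(self, source, buffer_size):
        self.source = source
        self.buffer_size = buffer_size
        self.done = False  # set once the full signature has been seen

    def read(self):
        if self.done:
            return b""
        chunk = self.source.read(self.buffer_size)
        search_position = 0
        for i, byte in enumerate(chunk):
            if byte == SIGNATURE[search_position]:
                search_position += 1
                if search_position == len(SIGNATURE):
                    self.done = True
                    # Payload is everything before the signature.
                    return chunk[:i + 1 - len(SIGNATURE)]
            else:
                search_position = 1 if byte == SIGNATURE[0] else 0
        if search_position > 0:
            # Partial match at the end of the buffer: give the bytes back.
            self.source.rewind(search_position)
            chunk = chunk[:len(chunk) - search_position]
        return chunk


def extract(data, buffer_size):
    """Read until the reader returns an empty result, as a caller would."""
    reader = BuggyDescriptorReader(RewindableSource(data), buffer_size)
    out = bytearray()
    while True:
        piece = reader.read()
        if not piece:
            break
        out += piece
    return bytes(out)


tail = SIGNATURE + bytes(12)  # signature plus dummy CRC and size fields
print(len(extract(b"a" * 20 + tail, 8)))  # full payload is recovered
print(len(extract(b"P" * 20 + tail, 8)))  # stops after the first buffer
```

With a buffer size of 8, the payload of 'a' bytes is returned in full, while the payload of 'P' bytes stops after 7 bytes: the second read receives only the rewound 'P', subtracts it again, and returns nothing.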

Reproduction

This happens reliably with:
• Streaming ZIP reader
• DataDescriptorStream
• BufferSize = 64k
• Payload length > 64k bytes
• Payload filled with 0x50 ('P')
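
For anyone without access to the attachment, an archive with a similar shape can probably be generated with Python's zipfile module. This is a sketch under assumptions, not the attached repro.zip: it assumes a stored (uncompressed) entry is sufficient to hit the problem, and it has not been verified against SharpCompress. WriteOnly and build_archive are hypothetical helper names. Writing to a sink without tell() or seek() makes zipfile set the data descriptor flag and append a PK\07\08 record after the payload.

```python
import zipfile


class WriteOnly:
    """Sink with no tell()/seek(), so zipfile treats it as unseekable."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass


def build_archive(payload_size=64 * 1024 + 1, name="p.bin"):
    """Return ZIP bytes holding one stored entry filled with 0x50 ('P')."""
    sink = WriteOnly()
    with zipfile.ZipFile(sink, mode="w", compression=zipfile.ZIP_STORED) as zf:
        zf.writestr(name, b"P" * payload_size)
    return b"".join(sink.chunks)


if __name__ == "__main__":
    with open("p-payload.zip", "wb") as fh:
        fh.write(build_archive())
```

The default payload is one byte longer than 64k so that at least one 64k read buffer ends inside the run of 'P' bytes.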

repro.zip: https://github.com/user-attachments/files/24890867/repro.zip
Author
Owner

@adamhathcock commented on GitHub (Jan 28, 2026):

Sounds similar to https://github.com/adamhathcock/sharpcompress/issues/1168

Reference: starred/sharpcompress#773