Regression: ZIP parsing fails depending on Stream.Read chunking on non-seekable streams #771

Open
opened 2026-01-29 22:17:18 +00:00 by claunia · 3 comments

Originally created by @rleroux-regnology on GitHub (Jan 27, 2026).

Originally assigned to: @adamhathcock, @Copilot on GitHub.

Hi,

I’m hitting what looks like a stream chunking sensitivity regression when reading ZIP archives from a non-seekable stream. With the exact same ZIP bytes, SharpCompress will sometimes fail depending only on how the underlying Stream.Read() splits the data.


Regression note

The exact same code path works correctly with SharpCompress 0.40.0.
The failure (ZlibException in the chunked case) starts occurring in newer versions.


Context / real-world scenario

This happens in a real ASP.NET Core streaming pipeline (multipart/form-data):

  • Source stream: HttpRequest.Body
  • Read via MultipartReader (multipart/form-data)
  • Archive entries are processed sequentially using ReaderFactory.Open(...).MoveToNextEntry()
  • Entry streams are non-seekable by design

A seemingly unrelated change (for example changing a text field value from "my-value" to "my-valu") shifts the alignment of the ZIP part by 1 byte, which changes the short-read pattern seen by SharpCompress and triggers a failure.

To make this report independent of ASP.NET / multipart, the repro below uses a custom non-seekable stream that returns legal short reads.


Reproduction

This snippet (with the attached stream.zip, https://github.com/user-attachments/files/24886895/stream.zip) reads the same ZIP bytes three ways:

  1. Baseline: MemoryStream
  2. Non-seekable stream with short reads (first = 3816)
  3. Non-seekable stream with short reads (first = 3815)

Only case (3) fails with a ZlibException. Wrapping the stream with a simple coalescing wrapper fixes the issue.

var bytes = System.IO.File.ReadAllBytes("stream.zip");

Console.WriteLine($"SharpCompress: {typeof(ReaderFactory).Assembly.FullName}");
Console.WriteLine($"Input bytes: {bytes.Length}");

//
// Case 1 - Baseline
// Read the ZIP from a file-like stream (MemoryStream).
// This always works and serves as the reference behavior.
//
Console.WriteLine("\n== Baseline (MemoryStream) ==");
Dump(ReadEntries(new MemoryStream(bytes, writable: false)));

//
// Case 2 - Chunked non-seekable stream (first read = 3816 bytes)
// This simulates a network/multipart stream with legal short reads.
// This case still works.
//
Console.WriteLine("\n== Chunked (first=3816, then=4096) ==");
Dump(ReadEntries(new PatternReadStream(bytes, first: 3816, chunk: 4096)));

try
{
	//
	// Case 3 - Chunked non-seekable stream (first read = 3815 bytes)
	// Exact same input bytes, only the first Read() returns 1 byte less.
	//
	// This case works correctly on SharpCompress 0.40.0,
	// but throws a ZlibException on newer versions.
	//
	Console.WriteLine("\n== Chunked (first=3815, then=4096) ==");
	Dump(ReadEntries(new PatternReadStream(bytes, first: 3815, chunk: 4096)));
}
catch (ZlibException)
{
	//
	// Case 4 - Workaround
	// Wrap the same failing stream with a coalescing wrapper that
	// fills short reads. This makes SharpCompress behave correctly again.
	//
	Console.WriteLine("\n== Workaround: FillReadStream over chunked(3815/4096) ==");
	using (var s = new PatternReadStream(bytes, first: 3815, chunk: 4096))
	using (var fill = new FillReadStream(s))
	{
		Dump(ReadEntries(fill));
	}
}
static List<string> ReadEntries(Stream s)
{
	var names = new List<string>();
	using var reader = ReaderFactory.Open(s, new ReaderOptions { LeaveStreamOpen = true });

	while (reader.MoveToNextEntry())
	{
		if (reader.Entry.IsDirectory)
		{
			continue;
		}

		names.Add(reader.Entry.Key);

		using var es = reader.OpenEntryStream();
		es.CopyTo(Stream.Null);
	}

	return names;
}

static void Dump(List<string> names)
{
	Console.WriteLine($"Count={names.Count}");

	foreach (var n in names.Take(10))
	{
		Console.WriteLine(" - " + n);
	}

	if (names.Count > 10)
	{
		Console.WriteLine(" - ...");
	}
}

sealed class PatternReadStream : Stream
{
    private readonly MemoryStream inner;
    private readonly int first;
    private readonly int chunk;
    private bool firstDone;

    public PatternReadStream(byte[] bytes, int first, int chunk)
    {
        inner = new MemoryStream(bytes, writable: false);
        this.first = first;
        this.chunk = chunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int limit = !firstDone ? first : chunk;
        firstDone = true;

        int toRead = Math.Min(count, limit);
        return inner.Read(buffer, offset, toRead);
    }

    public override bool CanRead => true;

    public override bool CanSeek => false;

    public override bool CanWrite => false;

    public override long Length => throw new NotSupportedException();

    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }

    public override void Flush() => throw new NotSupportedException();

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();

    public override void SetLength(long value) => throw new NotSupportedException();

    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

sealed class FillReadStream : Stream
{
    private readonly Stream inner;

    public FillReadStream(Stream inner) => this.inner = inner;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;

        while (total < count)
        {
            int n = inner.Read(buffer, offset + total, count - total);

            if (n == 0)
            {
                break;
            }

            total += n;
        }

        return total;
    }

    public override bool CanRead => inner.CanRead;

    public override bool CanSeek => false;

    public override bool CanWrite => false;

    public override long Length => throw new NotSupportedException();

    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }

    public override void Flush() => inner.Flush();

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();

    public override void SetLength(long value) => throw new NotSupportedException();

    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}


Observed behavior

  • Baseline (MemoryStream) works.
  • Chunked(first=3816, then=4096) works.
  • Chunked(first=3815, then=4096) throws a ZlibException.
  • Wrapping the chunked stream with FillReadStream makes it behave like the baseline again.

The only difference between the failing and non-failing cases is one byte in the first Read() result.


Why this is problematic

From a consumer point of view:

  • Returning short reads is valid Stream behavior.
  • Non-seekable forward-only streams are common in real-world pipelines (HTTP, multipart, network).
  • Archive parsing should be independent of read chunking.

Currently, SharpCompress becomes sensitive to buffer alignment and short-read patterns, which makes it unreliable in legitimate streaming scenarios.


Expected behavior / suggestion

SharpCompress should not assume that Read() will return “enough” bytes in a single call.

With identical input bytes, parsing should be deterministic regardless of legal short-read patterns.
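As a sketch of the required pattern (hypothetical helper, not SharpCompress internals): any fixed-size read has to loop until the buffer is full, which on .NET 7+ is exactly what Stream.ReadExactly does.

```csharp
using System;
using System.IO;

// Reading the 4-byte ZIP local-file-header signature. ReadExactly loops
// over however many short reads the stream produces, so the result does
// not depend on chunking.
var pk = new byte[] { 0x50, 0x4B, 0x03, 0x04 }; // "PK\x03\x04"
Console.WriteLine($"0x{ReadSignature(new MemoryStream(pk)):X8}"); // 0x04034B50

// Hypothetical helper, not actual SharpCompress code.
static uint ReadSignature(Stream s)
{
    Span<byte> buf = stackalloc byte[4];
    s.ReadExactly(buf);                // throws EndOfStreamException if truncated
    return BitConverter.ToUInt32(buf); // little-endian, as ZIP stores it
}
```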


Workaround on consumer side

Wrapping the input stream with a simple coalescing stream (like FillReadStream) fixes the issue, but this should not be required for valid non-seekable inputs.
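On .NET 7+, this coalescing loop ships in the base library as Stream.ReadAtLeast, so a wrapper equivalent to FillReadStream can be sketched like this (illustrative only; ReadAtLeastStream is not a SharpCompress type):

```csharp
using System;
using System.IO;

// Coalescing wrapper built on Stream.ReadAtLeast (.NET 7+): fills short
// reads like FillReadStream does, but delegates the loop to the runtime.
// With throwOnEndOfStream: false, a partial result is only possible at EOF.
sealed class ReadAtLeastStream : Stream
{
    private readonly Stream inner;

    public ReadAtLeastStream(Stream inner) => this.inner = inner;

    public override int Read(byte[] buffer, int offset, int count) =>
        inner.ReadAtLeast(buffer.AsSpan(offset, count), count, throwOnEndOfStream: false);

    public override bool CanRead => inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() => inner.Flush();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Usage mirrors FillReadStream above, e.g. ReadEntries(new ReadAtLeastStream(new PatternReadStream(bytes, first: 3815, chunk: 4096))).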


Summary

  • Same ZIP bytes + only different chunking → different behavior.
  • A 1-byte difference in the first read can trigger a ZlibException.
  • This breaks legitimate streaming pipelines.
  • A trivial coalescing wrapper fixes it, suggesting an internal assumption about Read() semantics.
claunia added the bug label 2026-01-29 22:17:18 +00:00

@adamhathcock commented on GitHub (Jan 27, 2026):

I think the linked PR covers this. If I'm understanding, a shorter read than requested blows up the parsing. The new code takes that into account.

https://github.com/adamhathcock/sharpcompress/pull/1169


@adamhathcock commented on GitHub (Jan 28, 2026):

One of the latest betas should have this fix https://www.nuget.org/packages/SharpCompress/0.44.5-beta.27


@rleroux-regnology commented on GitHub (Jan 29, 2026):

> One of the latest betas should have this fix https://www.nuget.org/packages/SharpCompress/0.44.5-beta.27

Works perfectly in 0.44.5, thank you very much for your responsiveness on the various fixes ❤

Reference: starred/sharpcompress#771