Regression: ZIP parsing fails depending on Stream.Read chunking on non-seekable streams #771

Open
opened 2026-01-29 22:17:18 +00:00 by claunia · 3 comments

Originally created by @rleroux-regnology on GitHub (Jan 27, 2026).

Originally assigned to: @adamhathcock, @Copilot on GitHub.

Hi,

I’m hitting what looks like a stream chunking sensitivity regression when reading ZIP archives from a non-seekable stream. With the exact same ZIP bytes, SharpCompress will sometimes fail depending only on how the underlying Stream.Read() splits the data.


Regression note

The exact same code path works correctly with SharpCompress 0.40.0.
The failure (ZlibException in the chunked case) starts occurring in newer versions.


Context / real-world scenario

This happens in a real ASP.NET Core streaming pipeline (multipart/form-data):

  • Source stream: HttpRequest.Body
  • Read via MultipartReader (multipart/form-data)
  • Archive entries are processed sequentially using ReaderFactory.Open(...).MoveToNextEntry()
  • Entry streams are non-seekable by design

A seemingly unrelated change (for example changing a text field value from "my-value" to "my-valu") shifts the alignment of the ZIP part by 1 byte, which changes the short-read pattern seen by SharpCompress and triggers a failure.

To make this report independent of ASP.NET / multipart, the repro below uses a custom non-seekable stream that returns legal short reads.


Reproduction

This snippet (with the attached stream.zip, https://github.com/user-attachments/files/24886895/stream.zip) reads the same ZIP bytes three ways:

  1. Baseline: MemoryStream
  2. Non-seekable stream with short reads (first = 3816)
  3. Non-seekable stream with short reads (first = 3815)

Only case (3) fails with a ZlibException. Wrapping the stream with a simple coalescing wrapper fixes the issue.

var bytes = System.IO.File.ReadAllBytes("stream.zip");

Console.WriteLine($"SharpCompress: {typeof(ReaderFactory).Assembly.FullName}");
Console.WriteLine($"Input bytes: {bytes.Length}");

//
// Case 1 - Baseline
// Read the ZIP from a file-like stream (MemoryStream).
// This always works and serves as the reference behavior.
//
Console.WriteLine("\n== Baseline (MemoryStream) ==");
Dump(ReadEntries(new MemoryStream(bytes, writable: false)));

//
// Case 2 - Chunked non-seekable stream (first read = 3816 bytes)
// This simulates a network/multipart stream with legal short reads.
// This case still works.
//
Console.WriteLine("\n== Chunked (first=3816, then=4096) ==");
Dump(ReadEntries(new PatternReadStream(bytes, first: 3816, chunk: 4096)));

try
{
	//
	// Case 3 - Chunked non-seekable stream (first read = 3815 bytes)
	// Exact same input bytes, only the first Read() returns 1 byte less.
	//
	// This case works correctly on SharpCompress 0.40.0,
	// but throws a ZlibException on newer versions.
	//
	Console.WriteLine("\n== Chunked (first=3815, then=4096) ==");
	Dump(ReadEntries(new PatternReadStream(bytes, first: 3815, chunk: 4096)));
}
catch (ZlibException)
{
	//
	// Case 4 - Workaround
	// Wrap the same failing stream with a coalescing wrapper that
	// fills short reads. This makes SharpCompress behave correctly again.
	//
	Console.WriteLine("\n== Workaround: FillReadStream over chunked(3815/4096) ==");
	using (var s = new PatternReadStream(bytes, first: 3815, chunk: 4096))
	using (var fill = new FillReadStream(s))
	{
		Dump(ReadEntries(fill));
	}
}
static List<string> ReadEntries(Stream s)
{
	var names = new List<string>();
	using var reader = ReaderFactory.Open(s, new ReaderOptions { LeaveStreamOpen = true });

	while (reader.MoveToNextEntry())
	{
		if (reader.Entry.IsDirectory)
		{
			continue;
		}

		names.Add(reader.Entry.Key);

		using var es = reader.OpenEntryStream();
		es.CopyTo(Stream.Null);
	}

	return names;
}

static void Dump(List<string> names)
{
	Console.WriteLine($"Count={names.Count}");

	foreach (var n in names.Take(10))
	{
		Console.WriteLine(" - " + n);
	}

	if (names.Count > 10)
	{
		Console.WriteLine(" - ...");
	}
}

sealed class PatternReadStream : Stream
{
    private readonly MemoryStream inner;
    private readonly int first;
    private readonly int chunk;
    private bool firstDone;

    public PatternReadStream(byte[] bytes, int first, int chunk)
    {
        inner = new MemoryStream(bytes, writable: false);
        this.first = first;
        this.chunk = chunk;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int limit = !firstDone ? first : chunk;
        firstDone = true;

        int toRead = Math.Min(count, limit);
        return inner.Read(buffer, offset, toRead);
    }

    public override bool CanRead => true;

    public override bool CanSeek => false;

    public override bool CanWrite => false;

    public override long Length => throw new NotSupportedException();

    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }

    public override void Flush() => throw new NotSupportedException();

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();

    public override void SetLength(long value) => throw new NotSupportedException();

    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

sealed class FillReadStream : Stream
{
    private readonly Stream inner;

    public FillReadStream(Stream inner) => this.inner = inner;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int total = 0;

        while (total < count)
        {
            int n = inner.Read(buffer, offset + total, count - total);

            if (n == 0)
            {
                break;
            }

            total += n;
        }

        return total;
    }

    public override bool CanRead => inner.CanRead;

    public override bool CanSeek => false;

    public override bool CanWrite => false;

    public override long Length => throw new NotSupportedException();

    public override long Position { get => throw new NotSupportedException(); set => throw new NotSupportedException(); }

    public override void Flush() => inner.Flush();

    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();

    public override void SetLength(long value) => throw new NotSupportedException();

    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}


Observed behavior

  • Baseline (MemoryStream) works.
  • Chunked(first=3816, then=4096) works.
  • Chunked(first=3815, then=4096) throws a ZlibException.
  • Wrapping the chunked stream with FillReadStream makes it behave like the baseline again.

The only difference between the failing and non-failing cases is one byte in the first Read() result.


Why this is problematic

From a consumer point of view:

  • Returning short reads is valid Stream behavior.
  • Non-seekable forward-only streams are common in real-world pipelines (HTTP, multipart, network).
  • Archive parsing should be independent of read chunking.

Currently, SharpCompress becomes sensitive to buffer alignment and short-read patterns, which makes it unreliable in legitimate streaming scenarios.


Expected behavior / suggestion

SharpCompress should not assume that Read() will return “enough” bytes in a single call.

With identical input bytes, parsing should be deterministic regardless of legal short-read patterns.
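As a sketch of the required pattern (hypothetical helper, not SharpCompress internals): any fixed-size read has to loop until the buffer is full, which on .NET 7+ is exactly what Stream.ReadExactly does.

```csharp
using System;
using System.IO;

// Reading the 4-byte ZIP local-file-header signature. ReadExactly loops
// over however many short reads the stream produces, so the result does
// not depend on chunking.
var pk = new byte[] { 0x50, 0x4B, 0x03, 0x04 }; // "PK\x03\x04"
Console.WriteLine($"0x{ReadSignature(new MemoryStream(pk)):X8}"); // 0x04034B50

// Hypothetical helper, not actual SharpCompress code.
static uint ReadSignature(Stream s)
{
    Span<byte> buf = stackalloc byte[4];
    s.ReadExactly(buf);                // throws EndOfStreamException if truncated
    return BitConverter.ToUInt32(buf); // little-endian, as ZIP stores it
}
```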


Workaround on consumer side

Wrapping the input stream with a simple coalescing stream (like FillReadStream) fixes the issue, but this should not be required for valid non-seekable inputs.
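On .NET 7+, this coalescing loop ships in the base library as Stream.ReadAtLeast, so a wrapper equivalent to FillReadStream can be sketched like this (illustrative only; ReadAtLeastStream is not a SharpCompress type):

```csharp
using System;
using System.IO;

// Coalescing wrapper built on Stream.ReadAtLeast (.NET 7+): fills short
// reads like FillReadStream does, but delegates the loop to the runtime.
// With throwOnEndOfStream: false, a partial result is only possible at EOF.
sealed class ReadAtLeastStream : Stream
{
    private readonly Stream inner;

    public ReadAtLeastStream(Stream inner) => this.inner = inner;

    public override int Read(byte[] buffer, int offset, int count) =>
        inner.ReadAtLeast(buffer.AsSpan(offset, count), count, throwOnEndOfStream: false);

    public override bool CanRead => inner.CanRead;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() => inner.Flush();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

Usage mirrors FillReadStream above, e.g. ReadEntries(new ReadAtLeastStream(new PatternReadStream(bytes, first: 3815, chunk: 4096))).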


Summary

  • Same ZIP bytes + only different chunking → different behavior.
  • A 1-byte difference in the first read can trigger a ZlibException.
  • This breaks legitimate streaming pipelines.
  • A trivial coalescing wrapper fixes it, suggesting an internal assumption about Read() semantics.
claunia added the bug label 2026-01-29 22:17:18 +00:00

@adamhathcock commented on GitHub (Jan 27, 2026):

I think the linked PR covers this. If I'm understanding, a shorter read than requested blows up the parsing. The new code takes that into account.

https://github.com/adamhathcock/sharpcompress/pull/1169


@adamhathcock commented on GitHub (Jan 28, 2026):

One of the latest betas should have this fix https://www.nuget.org/packages/SharpCompress/0.44.5-beta.27


@rleroux-regnology commented on GitHub (Jan 29, 2026):

> One of the latest betas should have this fix https://www.nuget.org/packages/SharpCompress/0.44.5-beta.27

Works perfectly in 0.44.5, thank you very much for your responsiveness on the various fixes ❤

Reference: starred/sharpcompress#771