[PR #1114] Fix async decompression of .7z files by implementing Memory<byte> ReadAsync overload #1546

Closed
opened 2026-01-29 22:21:04 +00:00 by claunia (Owner) · 0 comments

Original Pull Request: https://github.com/adamhathcock/sharpcompress/pull/1114

State: closed
Merged: Yes


Async extraction of .7z files with LZMA/LZMA2 compression threw DataErrorException when using CopyToAsync(), while synchronous CopyTo() worked correctly.

Root Cause

In .NET 6+, ReadExactlyAsync calls ReadAsync(Memory<byte>, CancellationToken). BufferedSubStream only implemented the legacy byte[] overload, causing the base Stream class to fall back to synchronous reads. This corrupted cache state when LZMA's RangeCoder mixed sync ReadByte() calls with async operations.
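To make the shape of the fix concrete, here is a minimal, self-contained sketch of the pattern, assuming a byte[] cache with offset/length fields; the class, field, and method names are illustrative, not the actual BufferedSubStream source:

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only: a sub-stream that caches reads and exposes a
// true async path by overriding ReadAsync(Memory<byte>, CancellationToken).
internal sealed class BufferedSubStreamSketch : Stream
{
    private readonly Stream _source;
    private readonly byte[] _cache = new byte[81920];
    private int _cacheOffset;
    private int _cacheLength;

    public BufferedSubStreamSketch(Stream source) => _source = source;

    private async ValueTask RefillCacheAsync(CancellationToken token)
    {
        _cacheOffset = 0;
        _cacheLength = await _source.ReadAsync(_cache, token).ConfigureAwait(false);
    }

    // Without this override, Stream.ReadAsync(Memory<byte>) falls back to the
    // byte[] overload and ultimately to a synchronous Read on another thread.
    public override async ValueTask<int> ReadAsync(
        Memory<byte> buffer, CancellationToken cancellationToken = default)
    {
        if (_cacheOffset == _cacheLength)
        {
            await RefillCacheAsync(cancellationToken).ConfigureAwait(false);
            if (_cacheLength == 0)
            {
                return 0; // end of stream
            }
        }
        int count = Math.Min(buffer.Length, _cacheLength - _cacheOffset);
        _cache.AsMemory(_cacheOffset, count).CopyTo(buffer);
        _cacheOffset += count;
        return count;
    }

    // The sync path shares the same cache fields, so callers that mix
    // ReadByte() with async reads (e.g. LZMA's RangeCoder) see one
    // consistent position instead of corrupted state.
    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_cacheOffset == _cacheLength)
        {
            _cacheOffset = 0;
            _cacheLength = _source.Read(_cache, 0, _cache.Length);
            if (_cacheLength == 0)
            {
                return 0;
            }
        }
        int n = Math.Min(count, _cacheLength - _cacheOffset);
        Array.Copy(_cache, _cacheOffset, buffer, offset, n);
        _cacheOffset += n;
        return n;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```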

Changes

  • BufferedSubStream: Added ReadAsync(Memory<byte>, CancellationToken) and RefillCacheAsync() for true async I/O (see the sketch above)
  • Tests: Added async test coverage for LZMA, LZMA2, Solid, BZip2, and PPMd archives (an illustrative test sketch follows below)
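
The added tests exercise exactly the failing path. As a rough illustration (hypothetical test name, archive path, and assertion; the PR's actual tests may differ):

```csharp
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using SharpCompress.Archives;
using Xunit;

public class SevenZipAsyncTests
{
    // Hypothetical sketch: the archive path and assertions are illustrative.
    [Fact]
    public async Task Lzma2_CopyToAsync_DecompressesEveryEntry()
    {
        using var archive = ArchiveFactory.Open("TestArchives/7Zip.LZMA2.7z");
        foreach (var entry in archive.Entries.Where(e => !e.IsDirectory))
        {
            using var input = await entry.OpenEntryStreamAsync(CancellationToken.None);
            await using var output = new MemoryStream();

            // Before the fix this CopyToAsync threw DataErrorException on
            // LZMA/LZMA2 entries; now it completes with the expected size.
            await input.CopyToAsync(output, CancellationToken.None);
            Assert.Equal(entry.Size, output.Length);
        }
    }
}
```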

Example

```csharp
// This now works correctly with async operations
using var archive = ArchiveFactory.Open(archivePath);
foreach (var entry in archive.Entries.Where(e => !e.IsDirectory))
{
    using var stream = await entry.OpenEntryStreamAsync(cancellationToken);
    await stream.CopyToAsync(outputStream, cancellationToken);  // Previously threw DataErrorException
}
```

The fix ensures async operations remain async throughout the decompression pipeline, preventing sync-over-async patterns.
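For context, the fallback itself is easy to observe outside SharpCompress. In the self-contained demo below (LegacyOnlyStream is a made-up type, not library code), only the legacy byte[] Read is overridden, yet an await of the Memory<byte> overload still lands on that synchronous method:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Demo type, not SharpCompress code: only the legacy synchronous Read is
// overridden, mirroring the pre-fix BufferedSubStream.
class LegacyOnlyStream : Stream
{
    public override int Read(byte[] buffer, int offset, int count)
    {
        Console.WriteLine("sync Read bridged from an async caller");
        return 0; // report end-of-stream immediately for the demo
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

class Demo
{
    static async Task Main()
    {
        var buffer = new byte[16];
        // The base Stream has no true async implementation to dispatch to,
        // so this prints "sync Read bridged from an async caller".
        int read = await new LegacyOnlyStream().ReadAsync(buffer.AsMemory());
        Console.WriteLine($"read {read} bytes");
    }
}
```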

Original prompt

This section details the original issue you should resolve

<issue_title>decompressing big .7z file throws error</issue_title>
<issue_description>lib version 0.42.1
under .net 10

code:

```csharp
public class SharpCompressExtractor : IArchiveExtractor
{
    public async Task<IReadOnlyCollection<FileInfo>> ExtractAsync(
        string archivePath,
        string destinationDirectory,
        CancellationToken token)
    {
        if (!File.Exists(archivePath))
        {
            throw new FileNotFoundException($"Archive not found: {archivePath}");
        }

        var extractedFiles = new List<FileInfo>();

        using var archive = ArchiveFactory.Open(archivePath);

        foreach (var entry in archive.Entries)
        {
            if (entry.IsDirectory)
            {
                continue;
            }

            token.ThrowIfCancellationRequested();

            var targetPath = Path.Combine(destinationDirectory, entry.Key);

            var targetDir = Path.GetDirectoryName(targetPath);

            if (!string.IsNullOrEmpty(targetDir) && !Directory.Exists(targetDir))
            {
                Directory.CreateDirectory(targetDir);
            }

            using var sourceStream = await entry.OpenEntryStreamAsync(token);

            var fileOptions = new FileStreamOptions
            {
                Mode = FileMode.Create,
                Access = FileAccess.Write,
                Share = FileShare.None,
                BufferSize = 1_024 * 1_024,
                Options = FileOptions.Asynchronous
            };

            await using var targetStream = new FileStream(targetPath, fileOptions);

            await sourceStream.CopyToAsync(targetStream, token);

            var fileInfo = new FileInfo(targetPath);

            extractedFiles.Add(fileInfo);
        }

        return extractedFiles;
    }
}
```

usage

```csharp
try
{
    var files = await fileExtractor.ExtractAsync(
        "C:\\whitelist\\20251223.7z",
        "c:/whitelist/out",
        CancellationToken.None);
}
catch (Exception ex)
{
}
```

Data Error

```
at SharpCompress.Compressors.LZMA.LzmaStream.<DecodeChunkHeaderAsync>d__52.MoveNext()
at SharpCompress.Compressors.LZMA.LzmaStream.<ReadAsync>d__56.MoveNext()
at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
at SharpCompress.IO.ReadOnlySubStream.<ReadAsync>d__24.MoveNext()
at System.IO.Stream.<<CopyToAsync>g__Core|30_0>d.MoveNext()
at SharpCompressExtractor.<ExtractAsync>d__0.MoveNext() in SharpCompressExtractor.cs
```

file: probably any from https://www.podatki.gov.pl/narzedzia/white-list/

but one that definitely exists is at this URL: https://plikplaski.mf.gov.pl/pliki//20251223.7z

this file decompresses perfectly under Win11 using the native extractor or the 7-Zip manager
</issue_description>

<agent_instructions>compare the synchronous code path to the new asynchronous code path to reproduce the issue</agent_instructions>

Comments on the Issue (you are @copilot in this section)

<comment_new>@adamhathcock
Sounds like the fix I did for https://github.com/adamhathcock/sharpcompress/pull/1081

I can validate next week though.</comment_new>
<comment_new>@adamhathcock
You're right... testing this myself, the async path is broken. However, the sync path works.

If you change CopyToAsync to CopyTo then it works for me.

Gonna look at a real fix though.</comment_new>


claunia added the pull-request label 2026-01-29 22:21:04 +00:00

Reference: starred/sharpcompress#1546