[PR #1114] Fix async decompression of .7z files by implementing Memory<byte> ReadAsync overload #1546

Closed
opened 2026-01-29 22:21:04 +00:00 by claunia (Owner) · 0 comments

Original Pull Request: https://github.com/adamhathcock/sharpcompress/pull/1114

State: closed
Merged: Yes


Async extraction of .7z files with LZMA/LZMA2 compression threw DataErrorException when using CopyToAsync(), while synchronous CopyTo() worked correctly.

Root Cause

In .NET 6+, ReadExactlyAsync calls ReadAsync(Memory<byte>, CancellationToken). BufferedSubStream only implemented the legacy byte[] overload, causing the base Stream class to fall back to synchronous reads. This corrupted cache state when LZMA's RangeCoder mixed sync ReadByte() calls with async operations.
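To make the shape of the fix concrete, here is a minimal, self-contained sketch of the pattern, assuming a byte[] cache with offset/length fields; the class, field, and method names are illustrative, not the actual BufferedSubStream source:

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Illustrative sketch only: a sub-stream that caches reads and exposes a
// true async path by overriding ReadAsync(Memory<byte>, CancellationToken).
internal sealed class BufferedSubStreamSketch : Stream
{
    private readonly Stream _source;
    private readonly byte[] _cache = new byte[81920];
    private int _cacheOffset;
    private int _cacheLength;

    public BufferedSubStreamSketch(Stream source) => _source = source;

    private async ValueTask RefillCacheAsync(CancellationToken token)
    {
        _cacheOffset = 0;
        _cacheLength = await _source.ReadAsync(_cache, token).ConfigureAwait(false);
    }

    // Without this override, Stream.ReadAsync(Memory<byte>) falls back to the
    // byte[] overload and ultimately to a synchronous Read on another thread.
    public override async ValueTask<int> ReadAsync(
        Memory<byte> buffer, CancellationToken cancellationToken = default)
    {
        if (_cacheOffset == _cacheLength)
        {
            await RefillCacheAsync(cancellationToken).ConfigureAwait(false);
            if (_cacheLength == 0)
            {
                return 0; // end of stream
            }
        }
        int count = Math.Min(buffer.Length, _cacheLength - _cacheOffset);
        _cache.AsMemory(_cacheOffset, count).CopyTo(buffer);
        _cacheOffset += count;
        return count;
    }

    // The sync path shares the same cache fields, so callers that mix
    // ReadByte() with async reads (e.g. LZMA's RangeCoder) see one
    // consistent position instead of corrupted state.
    public override int Read(byte[] buffer, int offset, int count)
    {
        if (_cacheOffset == _cacheLength)
        {
            _cacheOffset = 0;
            _cacheLength = _source.Read(_cache, 0, _cache.Length);
            if (_cacheLength == 0)
            {
                return 0;
            }
        }
        int n = Math.Min(count, _cacheLength - _cacheOffset);
        Array.Copy(_cache, _cacheOffset, buffer, offset, n);
        _cacheOffset += n;
        return n;
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```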

Changes

  • BufferedSubStream: Added ReadAsync(Memory<byte>, CancellationToken) and RefillCacheAsync() for true async I/O (see the sketch above)
  • Tests: Added async test coverage for LZMA, LZMA2, Solid, BZip2, and PPMd archives (an illustrative test sketch follows below)
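
The added tests exercise exactly the failing path. As a rough illustration (hypothetical test name, archive path, and assertion; the PR's actual tests may differ):

```csharp
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using SharpCompress.Archives;
using Xunit;

public class SevenZipAsyncTests
{
    // Hypothetical sketch: the archive path and assertions are illustrative.
    [Fact]
    public async Task Lzma2_CopyToAsync_DecompressesEveryEntry()
    {
        using var archive = ArchiveFactory.Open("TestArchives/7Zip.LZMA2.7z");
        foreach (var entry in archive.Entries.Where(e => !e.IsDirectory))
        {
            using var input = await entry.OpenEntryStreamAsync(CancellationToken.None);
            await using var output = new MemoryStream();

            // Before the fix this CopyToAsync threw DataErrorException on
            // LZMA/LZMA2 entries; now it completes with the expected size.
            await input.CopyToAsync(output, CancellationToken.None);
            Assert.Equal(entry.Size, output.Length);
        }
    }
}
```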

Example

```csharp
// This now works correctly with async operations
using var archive = ArchiveFactory.Open(archivePath);
foreach (var entry in archive.Entries.Where(e => !e.IsDirectory))
{
    using var stream = await entry.OpenEntryStreamAsync(cancellationToken);
    await stream.CopyToAsync(outputStream, cancellationToken);  // Previously threw DataErrorException
}
```

The fix ensures async operations remain async throughout the decompression pipeline, preventing sync-over-async patterns.
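For context, the fallback itself is easy to observe outside SharpCompress. In the self-contained demo below (LegacyOnlyStream is a made-up type, not library code), only the legacy byte[] Read is overridden, yet an await of the Memory<byte> overload still lands on that synchronous method:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Demo type, not SharpCompress code: only the legacy synchronous Read is
// overridden, mirroring the pre-fix BufferedSubStream.
class LegacyOnlyStream : Stream
{
    public override int Read(byte[] buffer, int offset, int count)
    {
        Console.WriteLine("sync Read bridged from an async caller");
        return 0; // report end-of-stream immediately for the demo
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}

class Demo
{
    static async Task Main()
    {
        var buffer = new byte[16];
        // The base Stream has no true async implementation to dispatch to,
        // so this prints "sync Read bridged from an async caller".
        int read = await new LegacyOnlyStream().ReadAsync(buffer.AsMemory());
        Console.WriteLine($"read {read} bytes");
    }
}
```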

Original prompt

This section details the original issue you should resolve

<issue_title>decompressing big .7z file throws error</issue_title>
<issue_description>lib version 0.42.1
under .net 10

code:

```csharp
public class SharpCompressExtractor : IArchiveExtractor
{
    public async Task<IReadOnlyCollection<FileInfo>> ExtractAsync(
        string archivePath,
        string destinationDirectory,
        CancellationToken token)
    {
        if (!File.Exists(archivePath))
        {
            throw new FileNotFoundException($"Archive not found: {archivePath}");
        }

        var extractedFiles = new List<FileInfo>();

        using var archive = ArchiveFactory.Open(archivePath);

        foreach (var entry in archive.Entries)
        {
            if (entry.IsDirectory)
            {
                continue;
            }

            token.ThrowIfCancellationRequested();

            var targetPath = Path.Combine(destinationDirectory, entry.Key);

            var targetDir = Path.GetDirectoryName(targetPath);

            if (!string.IsNullOrEmpty(targetDir) && !Directory.Exists(targetDir))
            {
                Directory.CreateDirectory(targetDir);
            }

            using var sourceStream = await entry.OpenEntryStreamAsync(token);

            var fileOptions = new FileStreamOptions
            {
                Mode = FileMode.Create,
                Access = FileAccess.Write,
                Share = FileShare.None,
                BufferSize = 1_024 * 1_024,
                Options = FileOptions.Asynchronous
            };

            await using var targetStream = new FileStream(targetPath, fileOptions);

            await sourceStream.CopyToAsync(targetStream, token);

            var fileInfo = new FileInfo(targetPath);

            extractedFiles.Add(fileInfo);
        }

        return extractedFiles;
    }
}
```

usage

```csharp
try
{
    var files = await fileExtractor.ExtractAsync(
        "C:\\whitelist\\20251223.7z",
        "c:/whitelist/out",
        CancellationToken.None);
}
catch (Exception ex)
{
}
```

Data Error

```
at SharpCompress.Compressors.LZMA.LzmaStream.<DecodeChunkHeaderAsync>d__52.MoveNext()
at SharpCompress.Compressors.LZMA.LzmaStream.<ReadAsync>d__56.MoveNext()
at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
at SharpCompress.IO.ReadOnlySubStream.<ReadAsync>d__24.MoveNext()
at System.IO.Stream.<<CopyToAsync>g__Core|30_0>d.MoveNext()
at SharpCompressExtractor.<ExtractAsync>d__0.MoveNext() in SharpCompressExtractor.cs
```

file: probably any from https://www.podatki.gov.pl/narzedzia/white-list/

but one that definitely exists is at this URL: https://plikplaski.mf.gov.pl/pliki//20251223.7z

this file decompresses perfectly under Win11 using the native extractor or the 7-Zip manager
</issue_description>

<agent_instructions>compare the synchronous code path to the new asynchronous code path to reproduce the issue</agent_instructions>

Comments on the Issue (you are @copilot in this section)

<comment_new>@adamhathcock
Sounds like the fix I did for https://github.com/adamhathcock/sharpcompress/pull/1081

I can validate next week though.</comment_new>
<comment_new>@adamhathcock
You're right... testing this myself, the async path is broken. However, the sync path works.

If you change CopyToAsync to CopyTo then it works for me.

Gonna look at a real fix though.</comment_new>


claunia added the pull-request label 2026-01-29 22:21:04 +00:00

Reference: starred/sharpcompress#1546