mirror of
https://github.com/adamhathcock/sharpcompress.git
synced 2026-02-04 05:25:00 +00:00
[PR #1106] [CLOSED] Cache decoder streams in 7Zip extraction to eliminate redundant allocations #1535
📋 Pull Request Information
Original PR: https://github.com/adamhathcock/sharpcompress/pull/1106
Author: @Copilot
Created: 1/4/2026
Status: ❌ Closed
Base: master ← Head: copilot/fix-7zip-extraction-performance

📝 Commits (5)
- 679f210 Initial plan
- 74cfc98 Add stream caching to ArchiveDatabase for 7Zip extraction performance
- f093339 Update ArchiveDatabase.cs
- 1b404d6 Fix decoder stream caching to handle partial reads correctly
- 49abd31 Format test file with CSharpier

📊 Changes
2 files changed (+311 additions, -12 deletions)
- 📝 src/SharpCompress/Archives/SevenZip/SevenZipArchive.cs (+97 -12)
- ➕ tests/SharpCompress.Test/SevenZip/SevenZipStreamCachingTests.cs (+214 -0)

📄 Description
Fix 7Zip Extraction Performance (Revised Implementation)
Problem Analysis
The original caching implementation had a critical bug identified in code review: it optimistically updated the cached position before data was actually read, causing corruption if entries were only partially consumed.
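To illustrate that failure mode, here is a minimal, hypothetical sketch (simplified bookkeeping, not the actual SharpCompress source):

```csharp
using System;

// Sketch of the bug: the cached position is advanced to
// skipSize + fileSize as soon as the entry stream is handed out,
// before any bytes are actually read.
long cachedPosition = 0;   // what the cache records
long actualPosition = 0;   // where the decoder stream really is

// Entry 1: starts at offset 0, is 100 bytes long.
cachedPosition = 0 + 100;  // optimistic: assumes a full read
actualPosition += 60;      // but the caller only consumed 60 bytes

// Entry 2: starts at offset 100. The cache thinks no skip is needed...
long additionalSkip = 100 - cachedPosition; // == 0
// ...yet the decoder is really at 60, so entry 2's data would be read
// starting 40 bytes too early, producing corrupted output.
Console.WriteLine(actualPosition + additionalSkip); // 60, not 100
```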
Root Cause of Bug
The original `ArchiveDatabase.GetFolderStream()` approach:
- Optimistically advanced the cached position to `skipSize + fileSize`, assuming the full `fileSize` bytes would be read
- If an entry was only partially consumed, the next entry's `additionalSkip` was computed from the wrong position and read corrupted data

Revised Solution ✅
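The revised bookkeeping described below boils down to roughly the following (hypothetical, simplified sketch; the real code lives in `SevenZipReader` and differs in detail):

```csharp
using System;

// Hypothetical sketch of the revised position tracking. skipSize is an
// entry's offset within the folder stream; fileSize is its length.
long cachedPosition = 0;

long ComputeAdditionalSkip(long skipSize, long fileSize)
{
    long additionalSkip = skipSize - cachedPosition;
    if (additionalSkip < 0)
    {
        // Non-sequential access: the cached stream would have to seek
        // backwards, so discard it and decode from the start again
        // (represented here by resetting the position).
        cachedPosition = 0;
        additionalSkip = skipSize;
    }
    // Record the position expected *after* the entry is fully read.
    cachedPosition = skipSize + fileSize;
    return additionalSkip;
}

// Sequential entries: the skip is zero (the common fast path).
Console.WriteLine(ComputeAdditionalSkip(0, 100));   // 0
Console.WriteLine(ComputeAdditionalSkip(100, 50));  // 0
// A gap before the next entry: skip forward by the difference.
Console.WriteLine(ComputeAdditionalSkip(200, 10));  // 50
```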
Moved caching from ArchiveDatabase to SevenZipReader where sequential access is enforced:
SevenZipReader-level caching
- Tracks `_cachedFolderStream`, `_cachedFolder`, and `_cachedPosition`, managed in `MoveToNextEntry()`

Correct position tracking
- Sets `_cachedPosition` to `skipSize + fileSize` (the expected position after the read)
- Computes `additionalSkip = skipSize - _cachedPosition` for the next entry:
  - `additionalSkip == 0`: already at the correct position (common case for sequential reads)
  - `additionalSkip > 0`: skips forward to the correct position (handles partial reads)
  - `additionalSkip < 0`: disposes the cached stream and creates a fresh one (non-sequential access)

Archive API unaffected
- `ArchiveDatabase.GetFolderStream()` reverted to always create fresh streams
- Only the reader path (`ExtractAllEntries()`) uses caching, so performance is optimized

Performance Verification ✅
Original Issue Benchmark:
Solution Impact:
Benchmark Results (Solid Archive):
Testing ✅
Added a comprehensive test suite in `SevenZipStreamCachingTests.cs`.
Results:
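A test in such a suite presumably looks something like the following sketch, using the public SharpCompress API (the archive file name and assertions are illustrative, not taken from the actual test file):

```csharp
using System.IO;
using SharpCompress.Archives.SevenZip;
using Xunit;

public class SevenZipStreamCachingSketch
{
    [Fact]
    public void Sequential_Extraction_Produces_Complete_Data()
    {
        // "solid.7z" is a placeholder for a solid test archive.
        using var archive = SevenZipArchive.Open("solid.7z");
        // ExtractAllEntries() returns the reader path, which is
        // where the stream caching applies.
        using var reader = archive.ExtractAllEntries();
        while (reader.MoveToNextEntry())
        {
            if (reader.Entry.IsDirectory)
                continue;
            using var entryStream = reader.OpenEntryStream();
            using var ms = new MemoryStream();
            entryStream.CopyTo(ms);
            // Each entry must decode to its full declared size.
            Assert.Equal(reader.Entry.Size, ms.Length);
        }
    }
}
```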
🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.