[PR #1100] [MERGED] Consolidate stream extension methods and simplify with framework methods #1530

Open
opened 2026-01-29 22:20:59 +00:00 by claunia · 0 comments
Owner

📋 Pull Request Information

Original PR: https://github.com/adamhathcock/sharpcompress/pull/1100
Author: @Copilot
Created: 1/3/2026
Status: Merged
Merged: 1/5/2026
Merged by: @adamhathcock

Base: adam/asyncHead: copilot/fix-read-method-implementations


📝 Commits (6)

  • 5464054 Consolidate reads
  • 1a71c01 Consolidate ReadExact and ReadFully methods into Utility.cs
  • 05642cb Use ArrayPool for temporary buffers in BinaryReaderExtensions
  • 372ecb7 Use threshold-based ArrayPool strategy for BinaryReaderExtensions
  • 7701522 Add input validation for ReadBytesAsync count parameter
  • 9bd86f6 Replace manual TransferTo implementation with Stream.CopyTo framework methods

📊 Changes

7 files changed (+429 additions, -377 deletions)

View changed files

📝 src/SharpCompress/Common/AsyncBinaryReader.cs (+5 -27)
📝 src/SharpCompress/Compressors/LZMA/Utilites/Utils.cs (+0 -35)
📝 src/SharpCompress/Compressors/Xz/MultiByteIntegers.cs (+2 -35)
src/SharpCompress/Polyfills/BinaryReaderExtensions.cs (+65 -0)
📝 src/SharpCompress/Polyfills/StreamExtensions.cs (+79 -43)
📝 src/SharpCompress/Utility.cs (+186 -237)
📝 tests/SharpCompress.Test/UtilityTests.cs (+92 -0)

📄 Description

The codebase had duplicate implementations of stream reading extension methods scattered across multiple files. This PR consolidates all ReadFully, ReadExact, ReadByteAsync, and ReadBytesAsync extension methods into a single location for better maintainability and optimizes memory allocation using a threshold-based ArrayPool strategy. Additionally, simplifies TransferTo implementation by leveraging framework's Stream.CopyTo methods.

Changes

  • Moved BinaryReader extension methods: Relocated ReadByteAsync and ReadBytesAsync from Compressors/Xz/MultiByteIntegers.cs to BinaryReaderExtensions.cs
  • Added centralized ReadExact methods: Added ReadExact and ReadExactAsync to Utility.cs with proper parameter validation
  • Fixed missing ReadFully implementations: Restored missing ReadFully implementations for non-.NET 6.0+ targets with conditional compilation
  • Removed duplicate implementations: Consolidated ReadExact from:
    • Compressors/LZMA/Utilites/Utils.cs (removed duplicate)
    • Common/AsyncBinaryReader.cs (now uses centralized version)
    • Polyfills/BinaryReaderExtensions.cs (simplified to use centralized version)
    • Polyfills/StreamExtensions.cs (delegates to centralized version)
  • Fixed bug: BinaryReaderExtensions.ReadBytesAsync was reading only 1 byte instead of the requested count
  • Updated callers: Modified MultiByteIntegers.ReadXZIntegerAsync to call reader.ReadByteAsync() instead of local implementation
  • Added test coverage: 7 tests covering edge cases (empty streams, insufficient data, sequential reads)
  • Memory optimization: Implemented threshold-based ArrayPool strategy in BinaryReaderExtensions:
    • ReadByteAsync: Uses simple allocation for single byte (ArrayPool overhead not justified)
    • ReadBytesAsync: Uses direct allocation for small reads (≤256 bytes), ArrayPool for larger reads (>256 bytes)
    • Balances performance vs. GC pressure based on allocation size
  • Input validation: Added parameter validation for ReadBytesAsync (negative check, zero-length optimization)
  • Simplified TransferTo implementation: Replaced manual buffer management with framework's Stream.CopyTo() and CopyToAsync() methods:
    • Uses ReadOnlySubStream to limit reading to specified maxLength
    • Removed 84 lines of manual buffer management code
    • Leverages optimized framework implementation

Result

All stream and binary reader extension methods now live in centralized locations with optimized memory allocation:

// Before: scattered across multiple files
internal static class MultiByteIntegers {
    public static async Task<byte> ReadByteAsync(...) { }
}
internal static class Utils {
    public static void ReadExact(...) { }
}
internal class AsyncBinaryReader {
    private async ValueTask ReadExactAsync(...) { }
}

// After: consolidated with memory optimization
internal static class Utility {
    public static bool ReadFully(this Stream stream, ...) { }
    public static void ReadExact(this Stream stream, ...) { }
    public static async Task ReadExactAsync(this Stream stream, ...) { }
    
    // Simplified using framework methods
    public long TransferTo(Stream destination, long maxLength) { }
    public async Task<long> TransferToAsync(...) { }
}
public static class BinaryReaderExtensions {
    // Optimized with threshold-based ArrayPool strategy
    public static async Task<byte> ReadByteAsync(this BinaryReader reader, ...) { }
    public static async Task<byte[]> ReadBytesAsync(this BinaryReader reader, ...) { }
}

Performance Characteristics

  • Memory: Reduced allocations for large reads (>256 bytes) via ArrayPool
  • CPU: Minimal copy overhead only for large reads where GC savings justify it
  • Threshold: 256 bytes chosen as sweet spot balancing typical header/small data sizes vs. large buffers
  • TransferTo: More efficient by using framework's optimized CopyTo implementation

Testing

  • All 63 UtilityTests pass
  • All 72 XZ tests pass
  • All 102 Tar tests pass
  • All 46 SevenZip/LZMA tests pass
  • Code formatted with CSharpier
  • No security alerts
Original prompt

branch off of adam/async and fix the multiple implementations of ReadFully or ReadByte to have a single extension method for all streams


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.


🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.

## 📋 Pull Request Information **Original PR:** https://github.com/adamhathcock/sharpcompress/pull/1100 **Author:** [@Copilot](https://github.com/apps/copilot-swe-agent) **Created:** 1/3/2026 **Status:** ✅ Merged **Merged:** 1/5/2026 **Merged by:** [@adamhathcock](https://github.com/adamhathcock) **Base:** `adam/async` ← **Head:** `copilot/fix-read-method-implementations` --- ### 📝 Commits (6) - [`5464054`](https://github.com/adamhathcock/sharpcompress/commit/54640548ed191980a09e0eb34834ec52a74e460e) Consolidate reads - [`1a71c01`](https://github.com/adamhathcock/sharpcompress/commit/1a71c01fd4734741997cc6ae18ec6bd9fe45914b) Consolidate ReadExact and ReadFully methods into Utility.cs - [`05642cb`](https://github.com/adamhathcock/sharpcompress/commit/05642cbdc6dbce6983fdf32b3329f3db1e296024) Use ArrayPool for temporary buffers in BinaryReaderExtensions - [`372ecb7`](https://github.com/adamhathcock/sharpcompress/commit/372ecb77d0a92e599a8735eee5fa6d3ec1d4f876) Use threshold-based ArrayPool strategy for BinaryReaderExtensions - [`7701522`](https://github.com/adamhathcock/sharpcompress/commit/77015224f6be8a6524c604bb8810953fa14780ec) Add input validation for ReadBytesAsync count parameter - [`9bd86f6`](https://github.com/adamhathcock/sharpcompress/commit/9bd86f64c927e474c29b70185c8b80af667f62a3) Replace manual TransferTo implementation with Stream.CopyTo framework methods ### 📊 Changes **7 files changed** (+429 additions, -377 deletions) <details> <summary>View changed files</summary> 📝 `src/SharpCompress/Common/AsyncBinaryReader.cs` (+5 -27) 📝 `src/SharpCompress/Compressors/LZMA/Utilites/Utils.cs` (+0 -35) 📝 `src/SharpCompress/Compressors/Xz/MultiByteIntegers.cs` (+2 -35) ➕ `src/SharpCompress/Polyfills/BinaryReaderExtensions.cs` (+65 -0) 📝 `src/SharpCompress/Polyfills/StreamExtensions.cs` (+79 -43) 📝 `src/SharpCompress/Utility.cs` (+186 -237) 📝 `tests/SharpCompress.Test/UtilityTests.cs` (+92 -0) </details> ### 📄 Description The codebase had duplicate implementations of stream reading extension methods scattered across multiple files. This PR consolidates all `ReadFully`, `ReadExact`, `ReadByteAsync`, and `ReadBytesAsync` extension methods into a single location for better maintainability and optimizes memory allocation using a threshold-based ArrayPool strategy. Additionally, simplifies `TransferTo` implementation by leveraging framework's `Stream.CopyTo` methods. ## Changes - **Moved BinaryReader extension methods**: Relocated `ReadByteAsync` and `ReadBytesAsync` from `Compressors/Xz/MultiByteIntegers.cs` to `BinaryReaderExtensions.cs` - **Added centralized ReadExact methods**: Added `ReadExact` and `ReadExactAsync` to `Utility.cs` with proper parameter validation - **Fixed missing ReadFully implementations**: Restored missing `ReadFully` implementations for non-.NET 6.0+ targets with conditional compilation - **Removed duplicate implementations**: Consolidated `ReadExact` from: - `Compressors/LZMA/Utilites/Utils.cs` (removed duplicate) - `Common/AsyncBinaryReader.cs` (now uses centralized version) - `Polyfills/BinaryReaderExtensions.cs` (simplified to use centralized version) - `Polyfills/StreamExtensions.cs` (delegates to centralized version) - **Fixed bug**: `BinaryReaderExtensions.ReadBytesAsync` was reading only 1 byte instead of the requested count - **Updated callers**: Modified `MultiByteIntegers.ReadXZIntegerAsync` to call `reader.ReadByteAsync()` instead of local implementation - **Added test coverage**: 7 tests covering edge cases (empty streams, insufficient data, sequential reads) - **Memory optimization**: Implemented threshold-based ArrayPool strategy in `BinaryReaderExtensions`: - `ReadByteAsync`: Uses simple allocation for single byte (ArrayPool overhead not justified) - `ReadBytesAsync`: Uses direct allocation for small reads (≤256 bytes), ArrayPool for larger reads (&gt;256 bytes) - Balances performance vs. GC pressure based on allocation size - **Input validation**: Added parameter validation for `ReadBytesAsync` (negative check, zero-length optimization) - **Simplified TransferTo implementation**: Replaced manual buffer management with framework's `Stream.CopyTo()` and `CopyToAsync()` methods: - Uses `ReadOnlySubStream` to limit reading to specified maxLength - Removed 84 lines of manual buffer management code - Leverages optimized framework implementation ## Result All stream and binary reader extension methods now live in centralized locations with optimized memory allocation: ```csharp // Before: scattered across multiple files internal static class MultiByteIntegers { public static async Task<byte> ReadByteAsync(...) { } } internal static class Utils { public static void ReadExact(...) { } } internal class AsyncBinaryReader { private async ValueTask ReadExactAsync(...) { } } // After: consolidated with memory optimization internal static class Utility { public static bool ReadFully(this Stream stream, ...) { } public static void ReadExact(this Stream stream, ...) { } public static async Task ReadExactAsync(this Stream stream, ...) { } // Simplified using framework methods public long TransferTo(Stream destination, long maxLength) { } public async Task<long> TransferToAsync(...) { } } public static class BinaryReaderExtensions { // Optimized with threshold-based ArrayPool strategy public static async Task<byte> ReadByteAsync(this BinaryReader reader, ...) { } public static async Task<byte[]> ReadBytesAsync(this BinaryReader reader, ...) { } } ``` ## Performance Characteristics - **Memory**: Reduced allocations for large reads (&gt;256 bytes) via ArrayPool - **CPU**: Minimal copy overhead only for large reads where GC savings justify it - **Threshold**: 256 bytes chosen as sweet spot balancing typical header/small data sizes vs. large buffers - **TransferTo**: More efficient by using framework's optimized CopyTo implementation ## Testing - ✅ All 63 UtilityTests pass - ✅ All 72 XZ tests pass - ✅ All 102 Tar tests pass - ✅ All 46 SevenZip/LZMA tests pass - ✅ Code formatted with CSharpier - ✅ No security alerts <!-- START COPILOT CODING AGENT SUFFIX --> <!-- START COPILOT ORIGINAL PROMPT --> <details> <summary>Original prompt</summary> > branch off of adam/async and fix the multiple implementations of ReadFully or ReadByte to have a single extension method for all streams </details> <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs. --- <sub>🔄 This issue represents a GitHub Pull Request. It cannot be merged through Gitea due to API limitations.</sub>
claunia added the pull-request label 2026-01-29 22:20:59 +00:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/sharpcompress#1530