[PR #1100] Consolidate stream extension methods and simplify with framework methods #1533

Closed
opened 2026-01-29 22:20:59 +00:00 by claunia · 0 comments
Owner

Original Pull Request: https://github.com/adamhathcock/sharpcompress/pull/1100

State: closed
Merged: Yes


The codebase had duplicate implementations of stream reading extension methods scattered across multiple files. This PR consolidates all ReadFully, ReadExact, ReadByteAsync, and ReadBytesAsync extension methods into a single location for better maintainability and optimizes memory allocation using a threshold-based ArrayPool strategy. Additionally, simplifies TransferTo implementation by leveraging framework's Stream.CopyTo methods.

Changes

  • Moved BinaryReader extension methods: Relocated ReadByteAsync and ReadBytesAsync from Compressors/Xz/MultiByteIntegers.cs to BinaryReaderExtensions.cs
  • Added centralized ReadExact methods: Added ReadExact and ReadExactAsync to Utility.cs with proper parameter validation
  • Fixed missing ReadFully implementations: Restored missing ReadFully implementations for non-.NET 6.0+ targets with conditional compilation
  • Removed duplicate implementations: Consolidated ReadExact from:
    • Compressors/LZMA/Utilites/Utils.cs (removed duplicate)
    • Common/AsyncBinaryReader.cs (now uses centralized version)
    • Polyfills/BinaryReaderExtensions.cs (simplified to use centralized version)
    • Polyfills/StreamExtensions.cs (delegates to centralized version)
  • Fixed bug: BinaryReaderExtensions.ReadBytesAsync was reading only 1 byte instead of the requested count
  • Updated callers: Modified MultiByteIntegers.ReadXZIntegerAsync to call reader.ReadByteAsync() instead of local implementation
  • Added test coverage: 7 tests covering edge cases (empty streams, insufficient data, sequential reads)
  • Memory optimization: Implemented threshold-based ArrayPool strategy in BinaryReaderExtensions:
    • ReadByteAsync: Uses simple allocation for single byte (ArrayPool overhead not justified)
    • ReadBytesAsync: Uses direct allocation for small reads (≤256 bytes), ArrayPool for larger reads (>256 bytes)
    • Balances performance vs. GC pressure based on allocation size
  • Input validation: Added parameter validation for ReadBytesAsync (negative check, zero-length optimization)
  • Simplified TransferTo implementation: Replaced manual buffer management with framework's Stream.CopyTo() and CopyToAsync() methods:
    • Uses ReadOnlySubStream to limit reading to specified maxLength
    • Removed 84 lines of manual buffer management code
    • Leverages optimized framework implementation

Result

All stream and binary reader extension methods now live in centralized locations with optimized memory allocation:

// Before: scattered across multiple files
internal static class MultiByteIntegers {
    public static async Task<byte> ReadByteAsync(...) { }
}
internal static class Utils {
    public static void ReadExact(...) { }
}
internal class AsyncBinaryReader {
    private async ValueTask ReadExactAsync(...) { }
}

// After: consolidated with memory optimization
internal static class Utility {
    public static bool ReadFully(this Stream stream, ...) { }
    public static void ReadExact(this Stream stream, ...) { }
    public static async Task ReadExactAsync(this Stream stream, ...) { }
    
    // Simplified using framework methods
    public long TransferTo(Stream destination, long maxLength) { }
    public async Task<long> TransferToAsync(...) { }
}
public static class BinaryReaderExtensions {
    // Optimized with threshold-based ArrayPool strategy
    public static async Task<byte> ReadByteAsync(this BinaryReader reader, ...) { }
    public static async Task<byte[]> ReadBytesAsync(this BinaryReader reader, ...) { }
}

Performance Characteristics

  • Memory: Reduced allocations for large reads (>256 bytes) via ArrayPool
  • CPU: Minimal copy overhead only for large reads where GC savings justify it
  • Threshold: 256 bytes chosen as sweet spot balancing typical header/small data sizes vs. large buffers
  • TransferTo: More efficient by using framework's optimized CopyTo implementation

Testing

  • All 63 UtilityTests pass
  • All 72 XZ tests pass
  • All 102 Tar tests pass
  • All 46 SevenZip/LZMA tests pass
  • Code formatted with CSharpier
  • No security alerts
Original prompt

branch off of adam/async and fix the multiple implementations of ReadFully or ReadByte to have a single extension method for all streams


💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more Copilot coding agent tips in the docs.

**Original Pull Request:** https://github.com/adamhathcock/sharpcompress/pull/1100 **State:** closed **Merged:** Yes --- The codebase had duplicate implementations of stream reading extension methods scattered across multiple files. This PR consolidates all `ReadFully`, `ReadExact`, `ReadByteAsync`, and `ReadBytesAsync` extension methods into a single location for better maintainability and optimizes memory allocation using a threshold-based ArrayPool strategy. Additionally, simplifies `TransferTo` implementation by leveraging framework's `Stream.CopyTo` methods. ## Changes - **Moved BinaryReader extension methods**: Relocated `ReadByteAsync` and `ReadBytesAsync` from `Compressors/Xz/MultiByteIntegers.cs` to `BinaryReaderExtensions.cs` - **Added centralized ReadExact methods**: Added `ReadExact` and `ReadExactAsync` to `Utility.cs` with proper parameter validation - **Fixed missing ReadFully implementations**: Restored missing `ReadFully` implementations for non-.NET 6.0+ targets with conditional compilation - **Removed duplicate implementations**: Consolidated `ReadExact` from: - `Compressors/LZMA/Utilites/Utils.cs` (removed duplicate) - `Common/AsyncBinaryReader.cs` (now uses centralized version) - `Polyfills/BinaryReaderExtensions.cs` (simplified to use centralized version) - `Polyfills/StreamExtensions.cs` (delegates to centralized version) - **Fixed bug**: `BinaryReaderExtensions.ReadBytesAsync` was reading only 1 byte instead of the requested count - **Updated callers**: Modified `MultiByteIntegers.ReadXZIntegerAsync` to call `reader.ReadByteAsync()` instead of local implementation - **Added test coverage**: 7 tests covering edge cases (empty streams, insufficient data, sequential reads) - **Memory optimization**: Implemented threshold-based ArrayPool strategy in `BinaryReaderExtensions`: - `ReadByteAsync`: Uses simple allocation for single byte (ArrayPool overhead not justified) - `ReadBytesAsync`: Uses direct allocation for small reads (≤256 bytes), ArrayPool for larger reads (&gt;256 bytes) - Balances performance vs. GC pressure based on allocation size - **Input validation**: Added parameter validation for `ReadBytesAsync` (negative check, zero-length optimization) - **Simplified TransferTo implementation**: Replaced manual buffer management with framework's `Stream.CopyTo()` and `CopyToAsync()` methods: - Uses `ReadOnlySubStream` to limit reading to specified maxLength - Removed 84 lines of manual buffer management code - Leverages optimized framework implementation ## Result All stream and binary reader extension methods now live in centralized locations with optimized memory allocation: ```csharp // Before: scattered across multiple files internal static class MultiByteIntegers { public static async Task<byte> ReadByteAsync(...) { } } internal static class Utils { public static void ReadExact(...) { } } internal class AsyncBinaryReader { private async ValueTask ReadExactAsync(...) { } } // After: consolidated with memory optimization internal static class Utility { public static bool ReadFully(this Stream stream, ...) { } public static void ReadExact(this Stream stream, ...) { } public static async Task ReadExactAsync(this Stream stream, ...) { } // Simplified using framework methods public long TransferTo(Stream destination, long maxLength) { } public async Task<long> TransferToAsync(...) { } } public static class BinaryReaderExtensions { // Optimized with threshold-based ArrayPool strategy public static async Task<byte> ReadByteAsync(this BinaryReader reader, ...) { } public static async Task<byte[]> ReadBytesAsync(this BinaryReader reader, ...) { } } ``` ## Performance Characteristics - **Memory**: Reduced allocations for large reads (&gt;256 bytes) via ArrayPool - **CPU**: Minimal copy overhead only for large reads where GC savings justify it - **Threshold**: 256 bytes chosen as sweet spot balancing typical header/small data sizes vs. large buffers - **TransferTo**: More efficient by using framework's optimized CopyTo implementation ## Testing - ✅ All 63 UtilityTests pass - ✅ All 72 XZ tests pass - ✅ All 102 Tar tests pass - ✅ All 46 SevenZip/LZMA tests pass - ✅ Code formatted with CSharpier - ✅ No security alerts <!-- START COPILOT CODING AGENT SUFFIX --> <!-- START COPILOT ORIGINAL PROMPT --> <details> <summary>Original prompt</summary> > branch off of adam/async and fix the multiple implementations of ReadFully or ReadByte to have a single extension method for all streams </details> <!-- START COPILOT CODING AGENT TIPS --> --- 💡 You can make Copilot smarter by setting up custom instructions, customizing its development environment and configuring Model Context Protocol (MCP) servers. Learn more [Copilot coding agent tips](https://gh.io/copilot-coding-agent-tips) in the docs.
claunia added the pull-request label 2026-01-29 22:21:00 +00:00
Sign in to join this conversation.
1 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: starred/sharpcompress#1533