Excessive memory consumption when scanning large MS-CAB files #200

Closed
opened 2026-01-29 21:07:04 +00:00 by claunia · 3 comments

Originally created by @Zopolis4 on GitHub (Jun 14, 2025).

When attempting to scan large MS-CAB files, an excessive amount of memory is consumed, appearing to be more than double the size of the file.

To demonstrate, this is what happens when I run ProtectionScan on Content1.cab with its memory capped at 2 GB. (If I do not cap it like this, the high memory consumption makes my computer unresponsive and I have to reboot.)

$ sudo systemd-run --scope -p MemoryMax=2000M /tmp/ProtectionScan -d Content1.cab
Running as unit: run-ree8d5917c911412094017ea6a37891d7.scope; invocation ID: 30aa8b43f21d442db81b2cbb4a8167a5
0.00%:  - 
0.00%: Content1.cab - Checking file
System.OutOfMemoryException: Exception of type 'System.OutOfMemoryException' was thrown.
   at System.IO.MemoryStream.set_Capacity(Int32 value)
   at System.IO.MemoryStream.EnsureCapacity(Int32 value)
   at SabreTools.Compression.MSZIP.Decompressor.CopyTo(Stream source, Stream dest)
   at BinaryObjectScanner.FileType.MicrosoftCAB.Extract(Stream stream, String file, String outDir, Boolean includeDebug)
Time elapsed: 00:00:11.3040508
100.00%: Content1.cab - 

For context, Content1.cab is 878 MB.
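
The stack trace fails inside `MemoryStream.EnsureCapacity`, so the whole decompressed output is being collected in one growable in-memory buffer. A minimal sketch of why such a buffer can briefly need more memory than the data it holds, assuming a simple capacity-doubling policy and that the old and new backing arrays coexist during the copy. This is an illustrative model in Python, not the SabreTools or .NET code; `simulate_doubling_buffer` and its parameters are hypothetical names.

```python
def simulate_doubling_buffer(total_bytes, write_size=32 * 1024, initial_capacity=256):
    """Model a growable in-memory buffer that doubles its capacity when full.

    Returns (final_capacity, peak_bytes). peak_bytes counts the old and new
    backing arrays together, since both exist while contents are copied over.
    This is an assumed growth policy, used only for illustration.
    """
    capacity = 0
    length = 0
    peak = 0
    while length < total_bytes:
        step = min(write_size, total_bytes - length)
        needed = length + step
        if needed > capacity:
            # Grow to at least double the current capacity.
            new_capacity = max(needed, initial_capacity, capacity * 2)
            # Old and new arrays are both alive during the copy.
            peak = max(peak, capacity + new_capacity)
            capacity = new_capacity
        length = needed
    return capacity, peak


if __name__ == "__main__":
    mib = 1024 * 1024
    capacity, peak = simulate_doubling_buffer(878 * mib)
    print(f"final capacity: {capacity // mib} MiB, transient peak: {peak // mib} MiB")
```

Under this model, a buffer that only has to hold 878 MiB ends up with a 1024 MiB capacity and a transient peak of 1536 MiB. The decompressed contents of a cabinet are normally larger than the file itself, so the real figure would be higher still.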

This issue has left me unable to dump multiple discs (http://redump.org/disc/49677/, for example) with protection scanning enabled: the scan consumes all available memory, everything becomes unresponsive, and the scan never finishes.


@mnadareski commented on GitHub (Sep 7, 2025):

This should be addressed as of the latest build due to massive internal changes to MS-CAB handling.


@Zopolis4 commented on GitHub (Sep 8, 2025):

The scanning is near-instant now, which I presume is correct, and I don't OOM anymore, which was this issue.


@mnadareski commented on GitHub (Sep 8, 2025):

It was probably LZX-based, which means it can't effectively be scanned. But it does still cache the chunks, so seeing that it didn't OOM is a very good sign.
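
For readers wondering how chunked handling keeps memory flat, here is a rough sketch of decompressing in bounded pieces and writing each piece out immediately, instead of growing one large in-memory buffer. It uses Python's `zlib` on a plain raw-deflate stream and ignores CAB/MSZIP block framing entirely; `stream_inflate` is a hypothetical helper and is not how BinaryObjectScanner or SabreTools implement it.

```python
import io
import zlib


def stream_inflate(source, dest, chunk_size=32 * 1024):
    """Inflate a raw-deflate stream from source to dest in bounded chunks.

    Only about chunk_size bytes of input and output are held at a time,
    so memory use does not scale with the size of the archive.
    """
    inflater = zlib.decompressobj(-15)  # -15 selects raw deflate (no zlib header)
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        pending = chunk
        while pending:
            # Cap each call's output at chunk_size; leftover input is
            # exposed through unconsumed_tail and fed back in.
            out = inflater.decompress(pending, chunk_size)
            dest.write(out)
            total += len(out)
            pending = inflater.unconsumed_tail
    # flush() returns any output still buffered inside zlib.
    tail = inflater.flush()
    dest.write(tail)
    total += len(tail)
    return total


if __name__ == "__main__":
    original = b"example payload " * 100_000
    compressor = zlib.compressobj(9, zlib.DEFLATED, -15)
    packed = compressor.compress(original) + compressor.flush()

    out = io.BytesIO()  # stand-in for a file opened with open(path, "wb")
    written = stream_inflate(io.BytesIO(packed), out)
    print(written == len(original))
```

In real use `dest` would be a file on disk rather than a `BytesIO`, which is what keeps the output out of memory.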


Reference: SabreTools/BinaryObjectScanner#200