Correct mszip extraction on some cabs currently relies on an exception being thrown #17

Closed
opened 2026-01-29 21:16:35 +00:00 by claunia · 1 comment

Originally created by @HeroponRikiBestest on GitHub (Nov 30, 2025).

To preface: technically, there is no issue with mszip extraction (that I know of) right now; what I describe below still results in the proper file data being extracted.

At the moment, on some mszip cabs, extraction will hit the following:

https://github.com/SabreTools/SabreTools.Serialization/blob/4b39ee8d0058ce20101c0ae41ef755e6fe6e3f50/SabreTools.Serialization/Wrappers/MicrosoftCabinet.cs#L268

The program pads the block out to 32768 bytes and writes it to the stream.
Every time this happens, the immediately following block throws an exception on this line:

https://github.com/SabreTools/SabreTools.Serialization/blob/4b39ee8d0058ce20101c0ae41ef755e6fe6e3f50/SabreTools.Serialization/Wrappers/MicrosoftCabinet.cs#L262

This results in log output like the following:

Data block 27168 in folder 0 had mismatching sizes. Expected: 32768, Got: 13139
System.IO.InvalidDataException: source
   at SabreTools.Serialization.Wrappers.MicrosoftCabinet.DecompressMSZIPBlock(Int32 folderIndex, Decompressor mszip, Int32 blockIndex, CFDATA block, Byte[] blockData, Boolean includeDebug)
System.IO.InvalidDataException: source
   at SabreTools.Serialization.Wrappers.MicrosoftCabinet.DecompressMSZIPBlock(Int32 folderIndex, Decompressor mszip, Int32 blockIndex, CFDATA block, Byte[] blockData, Boolean includeDebug)
Data block 10321 in folder 0 had mismatching sizes. Expected: 32768, Got: 13139
System.IO.InvalidDataException: source
   at SabreTools.Serialization.Wrappers.MicrosoftCabinet.DecompressMSZIPBlock(Int32 folderIndex, Decompressor mszip, Int32 blockIndex, CFDATA block, Byte[] blockData, Boolean includeDebug)
Data block 16068 in folder 0 had mismatching sizes. Expected: 32768, Got: 16904
System.IO.InvalidDataException: source
   at SabreTools.Serialization.Wrappers.MicrosoftCabinet.DecompressMSZIPBlock(Int32 folderIndex, Decompressor mszip, Int32 blockIndex, CFDATA block, Byte[] blockData, Boolean includeDebug)

However, this doesn't actually seem to cause any issues. The extracted files contain all the right data and match 1:1 against extraction from other programs such as 7z.

Despite this, I'm opening an issue since it seems potentially dangerous to rely on an exception being thrown for proper output, especially if any other changes are made to this section. It also makes it more difficult to find actual issues in the console output.
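
To illustrate the concern, here is a minimal sketch of reporting a block's outcome as an explicit status value instead of relying on a thrown exception. It is written in Python purely for illustration and is not the SabreTools code: `decompress_block` and `BlockStatus` are hypothetical names, and the sketch assumes only the general MSZIP layout (each block starts with the two-byte `CK` signature followed by raw deflate data, expands to at most 32768 bytes, and uses the previous block's output as its history).

```python
import zlib
from enum import Enum

MSZIP_BLOCK_SIZE = 32768  # maximum uncompressed size of one MSZIP block
MSZIP_SIGNATURE = b"CK"   # two-byte signature that starts every MSZIP block


class BlockStatus(Enum):
    """Hypothetical outcome codes for a single block."""
    OK = "ok"
    BAD_SIGNATURE = "bad signature"
    SIZE_MISMATCH = "size mismatch"
    CORRUPT = "corrupt deflate data"


def decompress_block(payload, expected_size, history=b""):
    """Decompress one MSZIP block and return (status, data).

    Nothing propagates to the caller as an exception; every failure
    mode is reported through the returned status instead.
    """
    if payload[:2] != MSZIP_SIGNATURE:
        return BlockStatus.BAD_SIGNATURE, b""

    # Raw deflate stream (wbits=-15). The previous block's output, if any,
    # is supplied as the preset dictionary so back-references resolve.
    if history:
        inflater = zlib.decompressobj(wbits=-15, zdict=history)
    else:
        inflater = zlib.decompressobj(wbits=-15)

    try:
        data = inflater.decompress(payload[2:])
    except zlib.error:
        return BlockStatus.CORRUPT, b""

    # A size mismatch is reported together with the data that was
    # produced, so the caller decides explicitly what to do with it.
    if len(data) != expected_size:
        return BlockStatus.SIZE_MISMATCH, data
    return BlockStatus.OK, data
```

With something along these lines, the caller can branch on `BlockStatus.SIZE_MISMATCH` directly, and the console output stays free of stack traces for cases that are already handled.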

This happens on some mszip cabs, but not all of them. The cab used in this issue was the 27-part cab inside 2K Sports Major League Baseball 2K9 (USA).iso http://redump.org/disc/59967/ , where this occurs several times.

Other examples include the multi-part cab inside Singularity (Europe) (En,Fr,Es,It).iso http://redump.org/disc/52122/ , inside Lost Planet - Extreme Condition - Colonies Edition (Russia) (En,Ja,Fr,De,Es,It,Ko,Pl,Ru) (Disc 1) http://redump.org/disc/80571/ , and inside Brothers in Arms - Hell's Highway (Germany) (En,De) http://redump.org/disc/85478/ . More examples can be provided if necessary.


@mnadareski commented on GitHub (Jan 5, 2026):

Reopening as the first PR doesn't fully cover this, as already discussed with @HeroponRikiBestest

Reference: SabreTools/SabreTools.Serialization#17