Zip64 - 0 Size on Entry > 4GiB #436

Closed
opened 2026-01-29 22:12:03 +00:00 by claunia · 5 comments

Originally created by @Nanook on GitHub (Feb 10, 2021).

I have a zip containing 1 file that's greater than 4GiB.

The Entry returns a Size of 0. I've had a look at the code and the correct size appears to be in the RelativeOffsetOfEntryHeader property of the Zip64ExtendedInformationExtraField class.

I've made a workaround, but I don't know how correct it is:

var zip64ExtraData = Extra.OfType<Zip64ExtendedInformationExtraField>().FirstOrDefault();
if (zip64ExtraData != null)
{
    if (CompressedSize == uint.MaxValue)
    {
        CompressedSize = zip64ExtraData.CompressedSize;
    }

    if (UncompressedSize == uint.MaxValue)
    {
        //hacky fix for resolving 0 Size being returned for files > 4GiB. This returns the correct size for the test case
        if (zip64ExtraData.UncompressedSize == 0 && zip64ExtraData.RelativeOffsetOfEntryHeader > (long)uint.MaxValue)
            UncompressedSize = zip64ExtraData.RelativeOffsetOfEntryHeader;
        else
            UncompressedSize = zip64ExtraData.UncompressedSize;
    }

    if (RelativeOffsetOfEntryHeader == uint.MaxValue)
    {
        RelativeOffsetOfEntryHeader = zip64ExtraData.RelativeOffsetOfEntryHeader;
    }
}

There appears to be no other header set when `if (RelativeOffsetOfEntryHeader == uint.MaxValue)` is false.
Any advice would be appreciated. I'm happy to try to create a PR, though I've not done this before.

@adamhathcock commented on GitHub (Feb 10, 2021):

Might need to check whether the Zip64 loading of the Extra field is correct.

From APPNOTE.txt:

  4.5.3 -Zip64 Extended Information Extra Field (0x0001):

      The following is the layout of the zip64 extended 
      information "extra" block. If one of the size or
      offset fields in the Local or Central directory
      record is too small to hold the required data,
      a Zip64 extended information record is created.
      The order of the fields in the zip64 extended 
      information record is fixed, but the fields MUST
      only appear if the corresponding Local or Central
      directory record field is set to 0xFFFF or 0xFFFFFFFF.

      Note: all fields stored in Intel low-byte/high-byte order.

        Value      Size       Description
        -----      ----       -----------
(ZIP64) 0x0001     2 bytes    Tag for this "extra" block type
        Size       2 bytes    Size of this "extra" block
        Original 
        Size       8 bytes    Original uncompressed file size
        Compressed
        Size       8 bytes    Size of compressed data
        Relative Header
        Offset     8 bytes    Offset of local header record
        Disk Start
        Number     4 bytes    Number of the disk on which
                              this file starts 

      This entry in the Local header MUST include BOTH original
      and compressed file size fields. If encrypting the 
      central directory and bit 13 of the general purpose bit
      flag is set indicating masking, the value stored in the
      Local Header for the original file size will be zero.
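
To make the layout above concrete, here is a minimal sketch that assembles a Zip64 extra block from optional fields. The class `Zip64ExtraBuilder` and its `Build` method are hypothetical helpers written for this illustration and are not part of SharpCompress. It assumes the caller decides which fields to include, and it writes them in the fixed order the spec describes.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not a SharpCompress type.
public static class Zip64ExtraBuilder
{
    // Builds a Zip64 extended information extra block (tag 0x0001).
    // Pass null for any field that should be left out of the block.
    public static byte[] Build(long? uncompressedSize, long? compressedSize,
                               long? localHeaderOffset, uint? diskStart)
    {
        var payload = new List<byte>();

        // Fields are appended in the fixed order given by the spec.
        if (uncompressedSize.HasValue) AppendInt64(payload, uncompressedSize.Value);
        if (compressedSize.HasValue) AppendInt64(payload, compressedSize.Value);
        if (localHeaderOffset.HasValue) AppendInt64(payload, localHeaderOffset.Value);
        if (diskStart.HasValue)
        {
            var scratch = new byte[4];
            BinaryPrimitives.WriteUInt32LittleEndian(scratch, diskStart.Value);
            payload.AddRange(scratch);
        }

        // 2-byte tag, 2-byte payload size, then the payload (all little-endian).
        var block = new byte[4 + payload.Count];
        BinaryPrimitives.WriteUInt16LittleEndian(block, 0x0001);
        BinaryPrimitives.WriteUInt16LittleEndian(block.AsSpan(2), (ushort)payload.Count);
        payload.CopyTo(block, 4);
        return block;
    }

    private static void AppendInt64(List<byte> target, long value)
    {
        var scratch = new byte[8];
        BinaryPrimitives.WriteInt64LittleEndian(scratch, value);
        target.AddRange(scratch);
    }
}
```

Calling `Zip64ExtraBuilder.Build(5000000000L, null, null, null)` produces a 12-byte block: the 4-byte tag/size prefix followed by an 8-byte payload holding only the uncompressed size. An 8-byte payload is the case discussed in this thread.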

The code does different things based on the length, which feels wrong:
https://github.com/adamhathcock/sharpcompress/blob/faf1a9f7e4fa6b3ac71a4cbf7f73abf7257ed87a/src/SharpCompress/Common/Zip/Headers/LocalEntryHeaderExtraFactory.cs#L83

You could be in the situation where only the uncompressed size is provided but it is being put in the wrong field.

@Nanook commented on GitHub (Feb 10, 2021):

Thanks. I was worried I'd made a bad assumption, so I went for the defensive change.

I think you're correct. I don't know which app created the file I'm using, so I created another with 7-Zip: less than 4GiB compressed but more than 4GiB decompressed. It shows the same behaviour.

            switch (DataBytes.Length)
            {
                case 4:
                    VolumeNumber = BinaryPrimitives.ReadUInt32LittleEndian(DataBytes);
                    return;
                case 8:
                    UncompressedSize = BinaryPrimitives.ReadInt64LittleEndian(DataBytes); //works for test zip and a new 7zip created zip
                    //Original - RelativeOffsetOfEntryHeader = BinaryPrimitives.ReadInt64LittleEndian(DataBytes);
                    return;
                case 12:
                    RelativeOffsetOfEntryHeader = BinaryPrimitives.ReadInt64LittleEndian(DataBytes);
                    VolumeNumber = BinaryPrimitives.ReadUInt32LittleEndian(DataBytes.AsSpan(8));
                    return;
                case 16:
                    UncompressedSize = BinaryPrimitives.ReadInt64LittleEndian(DataBytes);
                    CompressedSize = BinaryPrimitives.ReadInt64LittleEndian(DataBytes.AsSpan(8));
                    return;
                case 20:

Both files indicate that a length of 8 should be handled as above. Another test zip, with a file > 4GiB that also compresses to > 4GiB, hits `case 16:` and works correctly as is. I tested that case because I don't know how to create the other examples.

@adamhathcock commented on GitHub (Feb 10, 2021):

I think that switch statement shouldn't be there. Looking at the spec, the values come in a fixed order, but which of them are present can vary.

So the first 8 bytes, if they exist, are always the uncompressed length.
The next 8 bytes, if they exist, are always the compressed length.
The next 8 bytes, if they exist, are always the offset of the local header.
etc.
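
The APPNOTE excerpt quoted earlier in this thread says a field appears only when the corresponding header field is set to 0xFFFF or 0xFFFFFFFF. One possible stricter reading of "fixed order, variable presence" is therefore to gate each read on those sentinel values rather than on the payload length alone. The sketch below illustrates that idea under stated assumptions: `Zip64Values` and `Zip64SentinelParser` are hypothetical names, not the SharpCompress API, and the 32-bit and 16-bit header values are assumed to be passed in by the caller.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical result type for illustration; a null field means "not present".
public sealed class Zip64Values
{
    public long? UncompressedSize;
    public long? CompressedSize;
    public long? LocalHeaderOffset;
    public uint? DiskStart;
}

// Hypothetical parser; not part of SharpCompress.
public static class Zip64SentinelParser
{
    // 'payload' is the extra block body, after the 4-byte tag/size prefix.
    // The header* arguments are the values stored in the directory record itself.
    public static Zip64Values Parse(ReadOnlySpan<byte> payload,
                                    uint headerUncompressed, uint headerCompressed,
                                    uint headerOffset, ushort headerDisk)
    {
        var result = new Zip64Values();
        int pos = 0;

        // Each field is consumed only if its header counterpart holds the sentinel.
        // Slice throws ArgumentOutOfRangeException if the payload is too short.
        if (headerUncompressed == uint.MaxValue)
        {
            result.UncompressedSize = BinaryPrimitives.ReadInt64LittleEndian(payload.Slice(pos, 8));
            pos += 8;
        }
        if (headerCompressed == uint.MaxValue)
        {
            result.CompressedSize = BinaryPrimitives.ReadInt64LittleEndian(payload.Slice(pos, 8));
            pos += 8;
        }
        if (headerOffset == uint.MaxValue)
        {
            result.LocalHeaderOffset = BinaryPrimitives.ReadInt64LittleEndian(payload.Slice(pos, 8));
            pos += 8;
        }
        if (headerDisk == ushort.MaxValue)
        {
            result.DiskStart = BinaryPrimitives.ReadUInt32LittleEndian(payload.Slice(pos, 4));
            pos += 4;
        }
        return result;
    }
}
```

With this approach, an 8-byte payload is interpreted as whichever single 64-bit field the header marks as overflowed, so the same bytes can mean an uncompressed size in one entry and a compressed size in another.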

@Nanook commented on GitHub (Feb 10, 2021):

        private void Process()
        {
            if (DataBytes.Length >= 8)
                UncompressedSize = BinaryPrimitives.ReadInt64LittleEndian(DataBytes); //as per spec
            if (DataBytes.Length >= 16)
                CompressedSize = BinaryPrimitives.ReadInt64LittleEndian(DataBytes.AsSpan(8));
            if (DataBytes.Length >= 24)
                RelativeOffsetOfEntryHeader = BinaryPrimitives.ReadInt64LittleEndian(DataBytes.AsSpan(16));
            if (DataBytes.Length >= 28)
                VolumeNumber = BinaryPrimitives.ReadUInt32LittleEndian(DataBytes.AsSpan(24));

            if (DataBytes.Length > 28)
                throw new ArchiveException("Unexpected size of Zip64 extended information extra field");
        }

I think this matches the spec. You beat me to it ha
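
For anyone who wants to try the length-based logic above without a 4GiB test archive, here is a minimal standalone sketch. `Zip64ExtraSketch` is a hypothetical stand-in for the extra field class, and it throws `ArgumentException` instead of SharpCompress's `ArchiveException` so that it compiles on its own. The property names follow the snippet above.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical stand-in for the extra field class discussed in this thread.
public sealed class Zip64ExtraSketch
{
    public long UncompressedSize { get; private set; }
    public long CompressedSize { get; private set; }
    public long RelativeOffsetOfEntryHeader { get; private set; }
    public uint VolumeNumber { get; private set; }

    // 'dataBytes' is the extra block body, without the 4-byte tag/size prefix.
    public Zip64ExtraSketch(byte[] dataBytes)
    {
        // ArgumentException is used here only because ArchiveException
        // belongs to SharpCompress and is not available in this sketch.
        if (dataBytes.Length > 28)
            throw new ArgumentException("Unexpected size of Zip64 extended information extra field");

        // Fields are read in the fixed spec order, as far as the payload extends.
        if (dataBytes.Length >= 8)
            UncompressedSize = BinaryPrimitives.ReadInt64LittleEndian(dataBytes);
        if (dataBytes.Length >= 16)
            CompressedSize = BinaryPrimitives.ReadInt64LittleEndian(dataBytes.AsSpan(8));
        if (dataBytes.Length >= 24)
            RelativeOffsetOfEntryHeader = BinaryPrimitives.ReadInt64LittleEndian(dataBytes.AsSpan(16));
        if (dataBytes.Length >= 28)
            VolumeNumber = BinaryPrimitives.ReadUInt32LittleEndian(dataBytes.AsSpan(24));
    }
}
```

Feeding it an 8-byte payload that encodes 5000000000 sets `UncompressedSize` and leaves `RelativeOffsetOfEntryHeader` at 0, which is the behaviour the 7-Zip test file needed.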

@adamhathcock commented on GitHub (Feb 14, 2021):

Released https://github.com/adamhathcock/sharpcompress/releases/tag/0.28

Reference: starred/sharpcompress#436