[PR #211] Add zip64 #903

opened 2026-01-29 22:18:08 +00:00 by claunia

Original Pull Request: https://github.com/adamhathcock/sharpcompress/pull/211

State: closed
Merged: Yes


This adds zip64 writing support.

Zip64 is implemented by appending a set of "extra" values to the header.
In a zip file there are two headers for each entry: the local header (written before the stream) and the central directory header (written at the end of the file).

The central header is simple enough, and most implementations simply use it and mostly ignore the local header. By the time we are writing the central header, we have all the information required, so we can just write it.
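For illustration, the central-header decision is roughly the following (a sketch only, not the actual writer code): a size that fits stays in the normal 32-bit field; a size that does not fit is replaced by the `0xFFFFFFFF` sentinel the zip spec defines, and the real 64-bit value is queued for the zip64 extra field.

```csharp
using System.Collections.Generic;

static class CentralHeaderSizes
{
    // Illustrative sketch: when the central directory is written, the real
    // sizes are already known, so the decision is purely local. A value that
    // fits goes into the 32-bit field; otherwise the field gets the
    // 0xFFFFFFFF sentinel and the 64-bit value moves to the zip64 extra field.
    public static uint Mask(ulong actual, List<ulong> zip64Payload)
    {
        if (actual < uint.MaxValue)
        {
            return (uint)actual;        // fits: write it directly
        }
        zip64Payload.Add(actual);       // overflows: queue for the extra field
        return uint.MaxValue;           // sentinel: readers consult the extra field
    }
}
```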

For the local header, there is a tradeoff. The "extra" bytes take up 2+2+8+8=20 bytes per entry. This header is only required if either stream size (compressed or uncompressed) exceeds `uint.MaxValue`, but we do not know the length of the stream before writing.
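The 20 bytes break down as a 2-byte header ID (`0x0001`, the zip64 extended information field), a 2-byte payload length, and two 8-byte sizes. Writing that record looks roughly like this (an illustrative sketch, not the PR's actual code; `BinaryWriter` is little-endian, which matches the zip format):

```csharp
using System.IO;

static class Zip64LocalExtra
{
    // Writes the 20-byte zip64 extra record used in the local header:
    // 2 bytes id + 2 bytes payload length + 8 + 8 bytes of sizes.
    public static void Write(BinaryWriter w, ulong uncompressedSize, ulong compressedSize)
    {
        w.Write((ushort)0x0001);   // zip64 extended information header id
        w.Write((ushort)16);       // payload length: two 8-byte sizes
        w.Write(uncompressedSize); // original size
        w.Write(compressedSize);   // compressed size
    }
}
```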

The dilemma is: do we write it for all files, in case one is too long? Or do we not write it and risk overflowing the values?

Since the header is "mostly ignored" we could live with this being broken. On the other hand, if we can use it we should.

I have added a hint value to the ZipWriteEntryOptions that can force this header on or off. I have also added a flag in ZipWriterOptions that enables the extra fields for all entries by default. If the caller does not set any flags, as I assume most will not, I use a threshold of 2GiB to toggle zip64 headers: if the underlying stream is already 2GiB or more, zip64 is automatically enabled in the local header.
This is not a perfect solution, but I figure most users write smaller zip files and will never notice. Larger zip files are not really impacted by 20 bytes here and there. You can of course defeat the scheme by writing a 1.9GiB file and then a 4GiB+ file, thus hitting the limitations before the automatic upgrade kicks in.
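Usage would look roughly like this. Treat the type and member names here (`ZipWriterEntryOptions`, `UseZip64`, `EnableZip64`) as indicative only; check the diff for the real API:

```csharp
using System.IO;
using SharpCompress.Common;
using SharpCompress.Writers.Zip;

// Archive-wide default: force the zip64 extra field for every entry
// (flag name assumed; without it, the 2GiB heuristic applies).
var options = new ZipWriterOptions(CompressionType.Deflate)
{
    UseZip64 = true
};

using var destination = File.Create("big.zip");
using var writer = new ZipWriter(destination, options);
using var source = File.OpenRead("huge.bin");

// Per-entry hint overriding the archive-wide default (names assumed):
writer.Write("huge.bin", source, new ZipWriterEntryOptions { EnableZip64 = true });
```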

If the stream is non-seekable, we have another issue: the writer would normally set a flag and then write the Crc, uncompressed size, and compressed size in a special trailing header. This trailing header has not been updated to use zip64, so we cannot write the correct values in it. We also cannot use both trailing headers and "extra" data. This was clarified by PKWARE: https://blogs.oracle.com/xuemingshen/entry/is_zipinput_outputstream_handling_of
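For reference, the classic trailing header (the data descriptor) looks like this; the sketch shows why its 32-bit size fields silently truncate for 4GiB+ entries (illustrative layout per the zip spec, not the actual writer code):

```csharp
using System.IO;

static class DataDescriptor
{
    // The classic, non-zip64 trailing "data descriptor" written after the
    // stream data when the output is non-seekable.
    public static void Write(BinaryWriter w, uint crc32, ulong compressed, ulong uncompressed)
    {
        w.Write(0x08074b50u);        // optional descriptor signature
        w.Write(crc32);              // crc-32 is always correct
        w.Write((uint)compressed);   // truncates if the size is 4GiB or more
        w.Write((uint)uncompressed); // truncates if the size is 4GiB or more
    }
}
```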

In the case of streaming, the local headers are written with the trailing data, which may overflow. But the trailing headers do contain the crc32 value, and they contain correct sizes as long as both are less than `uint.MaxValue`. Again, the central directory header has the correct values.

I am not sure how to deal with testing, as it requires files larger than 4GiB to hit the limitations.
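One possible approach (just a sketch of an idea, not something included in this PR): feed the writer a synthetic stream, so no 4GiB fixture has to exist on disk. Zeros deflate to almost nothing, so the output archive stays small while the uncompressed size still crosses the 32-bit limit:

```csharp
using System;
using System.IO;

// A read-only stream of zeros with a chosen logical length, e.g.
// new ZeroStream(5L << 30) for 5GiB, usable as a zip entry source in tests.
sealed class ZeroStream : Stream
{
    private long _remaining;
    public ZeroStream(long length) => _remaining = length;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = (int)Math.Min(count, _remaining);
        Array.Clear(buffer, offset, n); // hand back zeros
        _remaining -= n;
        return n;                       // 0 signals end of stream
    }

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

The test would still spend time pushing 4GiB+ through the compressor, but it avoids any large files on disk and exercises the overflow and automatic-upgrade paths deterministically.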

claunia added the pull-request label 2026-01-29 22:18:08 +00:00

Reference: starred/sharpcompress#903