Streams larger than uint.MaxValue cause broken zip files #154

Open
opened 2026-01-29 22:07:28 +00:00 by claunia · 0 comments
Owner

Originally created by @kenkendk on GitHub (Mar 9, 2017).

I have tracked down an issue causing failures when attempting to read zip files with SharpCompress (files created AND read by SharpCompress).

The error message is `Unknown header {value}`, where `{value}` is some random bytes from the file. This is similar to issue #33, but that report concerns a much smaller file (I have tested the file mentioned there, and it does not appear to cause any errors).

The problem is that the Central Directory Entry is limited to storing the `Size` and `HeaderOffset` as `uint` values.
SharpCompress performs no checks when this limit is exceeded, so creation succeeds but reading fails later. In the example below, this is triggered with a single file of 4 GB, but it can also be achieved with many smaller files, as long as the `HeaderOffset` value becomes larger than 2^32.
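The 4-byte limit can be illustrated outside of SharpCompress. A minimal Python sketch (illustration of the zip central directory field widths only, not SharpCompress code):

```python
import struct

# The central directory file header stores compressed size, uncompressed
# size, and local header offset as 4-byte little-endian unsigned integers
# ("<I" in struct notation).
filesize = 4 * 1024 * 1024 * 1024  # 4 GB, as in the repro below

# A strict writer cannot represent this value in 4 bytes at all:
try:
    struct.pack("<I", filesize)
except struct.error as e:
    print("does not fit in uint32:", e)

# A writer that silently truncates (masking to 32 bits) records 0 instead,
# which is what renders the resulting archive unreadable:
truncated = filesize & 0xFFFFFFFF
print(truncated)  # 0
```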

There is another issue in that the number of files is limited to a `ushort`, but this appears to have no effect other than reporting the wrong number of files, which is not directly exposed.
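For the file count, the same truncation behavior can be sketched in Python (illustration only; the 2-byte entry-count field lives in the end of central directory record):

```python
# With e.g. 70000 entries, a writer that truncates the count to a 2-byte
# value ("<H", max 65535) records a wrapped-around number:
print(70000 & 0xFFFF)  # 4464
```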

A workaround for reading such a file is the forward-only interface, which does not use the Central Directory Entry; it can correctly read the file contents unless a single file is larger than 2^32 bytes.

I can think of two solutions:

  1. Prevent creating files with offsets larger than 2^32 by throwing an exception when this is detected.

  2. Support zip64, which replaces the 4-byte size and offset fields with `0xffffffff` and stores the real 64-bit values in the extended information extra field.
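The zip64 convention from option 2 can be sketched in Python. This is a hand-rolled illustration of the on-disk layout (sentinel in the legacy fields, 64-bit values in an extra field with header ID `0x0001`), not SharpCompress code; field order and which fields appear follow the values that overflowed:

```python
import struct

SENTINEL = 0xFFFFFFFF    # stored in the legacy 4-byte size/offset fields
ZIP64_EXTRA_ID = 0x0001  # header ID of the zip64 extended information extra field

def make_zip64_extra(uncompressed_size, compressed_size, header_offset):
    # Payload carries the real 64-bit values; the 4-byte header is
    # (ID, payload length), both little-endian 2-byte values.
    payload = struct.pack("<QQQ", uncompressed_size, compressed_size, header_offset)
    return struct.pack("<HH", ZIP64_EXTRA_ID, len(payload)) + payload

extra = make_zip64_extra(4 * 2**30, 4 * 2**30, 5 * 2**30)

# A zip64-aware reader sees 0xffffffff in the fixed fields and looks up
# the 64-bit values in the extra field instead:
tag, size = struct.unpack_from("<HH", extra, 0)
usize, csize, offset = struct.unpack_from("<QQQ", extra, 4)
print(hex(tag), usize, offset)
```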

I think support for zip64 is the better choice here. Reading support for zip64 has already been added, but it requires that the correct zip64 records are written.

I will see if I can make a PR that adds zip64 support.

Example code that reproduces the issue:

using System;
using System.IO;
using System.Linq;
using SharpCompress.Archives;
using SharpCompress.Common;
using SharpCompress.Readers;
using SharpCompress.Readers.Zip;
using SharpCompress.Writers;
using SharpCompress.Writers.Zip;

namespace BreakTester
{
	class MainClass
	{
		public static void Main(string[] args)
		{
			// Set up the parameters
			var files = 1;
			var filesize = 4 * 1024 * 1024 * 1024L;
			var write_chunk_size = 1024 * 1024;

			var filename = "broken.zip";
			if (!File.Exists(filename))
				CreateBrokenZip(filename, files, filesize, write_chunk_size);

			var res = ReadBrokenZip(filename);
			if (res.Item1 != files)
				throw new Exception($"Incorrect number of items reported: {res.Item1}, should have been {files}");
			if (res.Item2 != files * filesize)
				throw new Exception($"Incorrect total size reported: {res.Item2}, should have been {files * filesize}");
		}

		public static void CreateBrokenZip(string filename, long files, long filesize, long chunksize)
		{
			var data = new byte[chunksize];

			// We use the store method to make sure our data is not compressed,
			// and thus it is easier to make large files
			var opts = new ZipWriterOptions(CompressionType.None) { };

			using (var zip = File.OpenWrite(filename))
			using (var zipWriter = (ZipWriter)WriterFactory.Open(zip, ArchiveType.Zip, opts))
			{

				for (var i = 0; i < files; i++)
					using (var str = zipWriter.WriteToStream(i.ToString(), new ZipWriterEntryOptions() { DeflateCompressionLevel = SharpCompress.Compressors.Deflate.CompressionLevel.None }))
					{
						var left = filesize;
						while (left > 0)
						{
							var b = (int)Math.Min(left, data.Length);
							str.Write(data, 0, b);
							left -= b;
						}
					}
			}
		}

		public static Tuple<long, long> ReadBrokenZip(string filename)
		{
			using (var archive = ArchiveFactory.Open(filename))
			{
				return new Tuple<long, long>(
					archive.Entries.Count(),
					archive.Entries.Select(x => x.Size).Sum()
				);
			}
		}

		public static Tuple<long, long> ReadBrokenZipForwardOnly(string filename)
		{
			long count = 0;
			long size = 0;
			using (var fs = File.OpenRead(filename))
			using (var rd = ZipReader.Open(fs, new ReaderOptions() { LookForHeader = false }))
				while (rd.MoveToNextEntry())
				{
					count++;
					size += rd.Entry.Size;
				}

			return new Tuple<long, long>(count, size);
		}
	}
}

Reference: starred/sharpcompress#154