Expose more deflate options #223

Open
opened 2026-01-29 22:08:37 +00:00 by claunia · 9 comments

Originally created by @Coloris on GitHub (Jul 21, 2017).

Is it possible to achieve the following with SharpCompress? https://stackoverflow.com/questions/45225914/how-to-get-the-compressed-size-ziparchive-of-a-filestream-block

claunia added the "enhancement" and "up for grabs" labels 2026-01-29 22:08:37 +00:00

@adamhathcock commented on GitHub (Jul 22, 2017):

Just use DeflateStream on each 64k block. No need to actually make a zip file.


@Coloris commented on GitHub (Jul 22, 2017):

I'm writing a Windows Store package editor. The appx format used here is basically a zip (possibly zip64) deflate file with some XML metadata.

In order to create a package, I need, for example, to split a file into several 64 KB chunks, compress each of them using the deflate method, then store the compressed length of each chunk inside an XML file. So I need to make a zip (appx) file in the end.

Are you saying that whether I compress a file with a deflate stream or directly into a ZIP, the compressed data will be strictly identical in both cases? If not, is it possible to compress data chunk by chunk and get the compressed length of each chunk?


@adamhathcock commented on GitHub (Jul 22, 2017):

You could make a zip and get the length but it’s more efficient just to make a deflate stream write to a file or memory and get the length that way. It doesn’t matter if it’s 64k chunks or whole files.
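
As a rough illustration of the approach described above, here is a minimal sketch in Python using the standard `zlib` module rather than SharpCompress's DeflateStream (an assumption made only so the example is self-contained). It compresses each 64 KiB slice in memory as its own raw Deflate stream and reports the compressed lengths. The names `deflate_block` and `compressed_block_sizes` are hypothetical.

```python
import zlib

BLOCK_SIZE = 64 * 1024  # 64 KiB, the chunk size discussed in this thread


def deflate_block(block, level=6):
    """Compress one block as a standalone raw Deflate stream (no zlib header)."""
    # wbits=-15 selects raw Deflate, the format stored inside zip entries.
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(block) + compressor.flush()


def compressed_block_sizes(data, block_size=BLOCK_SIZE):
    """Return the compressed length of every block_size slice of data."""
    return [
        len(deflate_block(data[offset:offset + block_size]))
        for offset in range(0, len(data), block_size)
    ]
```

The same function works on a whole file by passing a `block_size` at least as large as the data, which matches the point that the technique does not depend on the chunk size.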

I can’t help with the appx spec. I thought it was a regular zip file.


@Coloris commented on GitHub (Jul 22, 2017):

Thanks for your help, Adam. It is a regular zip file :p But it requires an XML file that contains, for every single entry, a hash of each 64 KB uncompressed block and the size of the compressed block inside the archive.

A better explanation from Microsoft:

> When an app file is added to the app package, it is first divided into 64-KB blocks, and each block is hashed using the algorithm specified. If the size of the file is not an even multiple of 64 KB, the size of the final block is inferred as the remainder of the file size divided by 64 KB.

> The Size attribute value is the size of the data block as stored in the app package. This is usually smaller than 64 KB because each block is commonly compressed before being stored in the app package. Because data compression (Deflate algorithm) produces a variable-length result, the Size attribute must be specified for all blocks of a file stored in compressed form within the package.

In the end, I need to make a zip (appx) file. Can I first compress each block with a deflate stream, collect the compressed lengths on the fly, and finally compress the source file into a zip entry?

Will processing it this way give me the same compressed data in both cases? I ask because I imagine that compressing a file chunk by chunk is not the same as doing it in a single shot. In my case, it does not really matter if the compression is not very efficient.
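
The block hashing described in the Microsoft excerpt can be sketched as follows. This is a minimal Python illustration, not the appx tooling itself: the excerpt only says "the algorithm specified", so SHA-256 and Base64 encoding of the digest are assumptions here, and `block_hashes` is a hypothetical name.

```python
import base64
import hashlib

BLOCK_SIZE = 64 * 1024  # 64 KiB blocks, as in the quoted description


def block_hashes(data, block_size=BLOCK_SIZE, algorithm="sha256"):
    """Return (uncompressed_size, base64_digest) for each block of data.

    The hash algorithm and the Base64 encoding are assumptions for
    illustration; use whatever the package's block map actually specifies.
    """
    records = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        digest = hashlib.new(algorithm, block).digest()
        records.append((len(block), base64.b64encode(digest).decode("ascii")))
    return records
```

Every block except possibly the last has exactly `block_size` bytes, and the last one holds `len(data) % block_size` bytes when the file size is not a multiple of 64 KiB, which is the "inferred" final block size in the excerpt.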


@adamhathcock commented on GitHub (Jul 23, 2017):

No, compressing as a single stream won’t produce the same output as compressing in 64k chunks. You can do everything you’re talking about with the DeflateStream and/or Zip classes. It’s just a matter of figuring out the algorithm for making appx files, which I can’t help with.
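
To make the difference concrete, here is a small Python sketch (using the standard `zlib` module as a stand-in for DeflateStream, which is an assumption) that compresses the same data once as a single raw Deflate stream and once as independent 64 KiB chunks. The helper names are hypothetical.

```python
import zlib

BLOCK_SIZE = 64 * 1024


def raw_deflate(data, level=6):
    """Compress data as one complete raw Deflate stream."""
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def whole_versus_chunked(data, block_size=BLOCK_SIZE):
    """Return (single_stream, list_of_independent_chunk_streams)."""
    whole = raw_deflate(data)
    chunks = [
        raw_deflate(data[offset:offset + block_size])
        for offset in range(0, len(data), block_size)
    ]
    return whole, chunks
```

Each chunk stream ends with its own final-block marker and cannot reference data from earlier chunks, so the concatenated chunks are a different byte sequence from the single stream even though both decode to the same content.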


@Coloris commented on GitHub (Jul 23, 2017):

Sure, I wasn't expecting you to help me with the appx part :D How can I access the deflate stream with SharpCompress while adding a zip entry? Is the compression done in real time or only when the stream is closed?


@adamhathcock commented on GitHub (Jul 23, 2017):

It's done in real time when writing to the stream. Closing the stream just guarantees that the final bits are written, which is what you need to do for each 64k chunk.

You're going to have to chunk the file and compress first to get the data you want, then later create the final zip file.
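
The streaming behaviour described above can be observed with a short Python sketch. It uses the standard `zlib` module rather than SharpCompress (an assumption for the sake of a runnable example), and `stream_compress` is a hypothetical name. Output is produced as data is written, but some of it may stay in the compressor's internal buffers until the stream is finished.

```python
import zlib


def stream_compress(blocks, level=6):
    """Feed blocks to one raw Deflate compressor and record what each write emits.

    Returns (emitted, final): the bytes returned by each write, and the bytes
    produced only when the stream is finished (the equivalent of closing it).
    """
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    emitted = []
    for block in blocks:
        # May return b"" if the compressor is still buffering this input.
        emitted.append(compressor.compress(block))
    final = compressor.flush()  # finishes the stream and writes the final block
    return emitted, final
```

The compressed length is only reliable once `final` has been appended, which is why each chunk has to be finished (or flushed) before its size is recorded.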


@Coloris commented on GitHub (Jul 24, 2017):

OK, this makes sense! One last question, if you don't mind: is there any equivalent of zlib's Z_FULL_FLUSH when using a DeflateStream with the Zip class? This is what I need (and cannot do with the native .NET System.IO.Compression, since it is too high level).


@adamhathcock commented on GitHub (Jul 24, 2017):

I don’t currently expose it, but I could easily do so on ZipWriterOptions. The enum is FlushType and could be used alongside DeflateCompression.

A pull request is welcome :)
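
For reference, this is what a full flush at each 64 KiB boundary looks like at the zlib level. The sketch below is in Python with the standard `zlib` module, since the option is not exposed by SharpCompress at the time of this thread; it does not show any SharpCompress API. `deflate_with_full_flush` is a hypothetical name, and attributing the stream's trailing bytes to the last block is an assumption, not something the thread or the appx documentation quoted here confirms.

```python
import zlib

BLOCK_SIZE = 64 * 1024


def deflate_with_full_flush(data, block_size=BLOCK_SIZE, level=6):
    """Compress data as ONE raw Deflate stream, full-flushing after each block.

    Returns (stream, sizes), where sizes[i] is the number of compressed bytes
    produced for block i.
    """
    compressor = zlib.compressobj(level, zlib.DEFLATED, -15)
    pieces = []
    for offset in range(0, len(data), block_size):
        piece = compressor.compress(data[offset:offset + block_size])
        # Z_FULL_FLUSH emits all pending output, aligns it to a byte boundary
        # and resets the compression state, so the next block has no
        # back-references into this one.
        piece += compressor.flush(zlib.Z_FULL_FLUSH)
        pieces.append(piece)
    tail = compressor.flush()  # finish the stream (final block marker)
    if pieces:
        pieces[-1] += tail  # assumption: count the trailer with the last block
    else:
        pieces.append(tail)
    return b"".join(pieces), [len(piece) for piece in pieces]
```

Unlike independently compressed chunks, the result is a single valid Deflate stream that can be stored as one zip entry, while the per-block sizes are still known and each block can be decoded starting from its own offset.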

Reference: starred/sharpcompress#223