[Question] LZMA2 Compression? / LZMA Compression level? #594

Open
opened 2026-01-29 22:14:19 +00:00 by claunia · 8 comments

Originally created by @Shivansps on GitHub (Oct 6, 2023).

Hi, first off I wanted to thank everyone involved in this project for all the amazing work. I've been using this lib for a while and it's great.

- First, I wanted to ask about the status of LZMA2 compression. To my knowledge there is only LZMA compression support; it may already be in and I might not know about it.

- Second, is there any LZMA compression level option? Compared to the final size of a file compressed with 7z or the LZMA C# SDK at max level, the files produced by SharpCompress are considerably bigger.

Thank you.

claunia added the enhancement and up for grabs labels 2026-01-29 22:14:19 +00:00

@adamhathcock commented on GitHub (Oct 8, 2023):

LZMA2 could be added. Would need some source and a PR.

I think the compression level is settable directly on LZMAStream but not exposed outwardly. Needs a PR.
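
For anyone looking for the knob today, here is a minimal sketch of using the stream directly. It assumes the LzmaEncoderProperties(eos, dictionary, numFastBytes) and LzmaStream(properties, isLzma2, outputStream) constructor shapes found in the SharpCompress sources linked later in this thread; treat it as an illustration, not a supported API.

```csharp
using System.IO;
using SharpCompress.Compressors.LZMA;

// Sketch only: compress one file as a raw LZMA stream with a bigger
// dictionary and more fast bytes than ZipWriter's 32 fastbytes / 1MB default.
// Constructor shapes are assumptions based on the current sources.
using var input = File.OpenRead("input.bin");
using var output = File.Create("output.lzma.raw");

// (writeEndMarker, dictionarySize, numFastBytes): 16MB dictionary, 64 fast bytes
var props = new LzmaEncoderProperties(true, 1 << 24, 64);

using (var lzma = new LzmaStream(props, false /* plain LZMA, not LZMA2 */, output))
{
    input.CopyTo(lzma);
}

// Note (assumption): this writes only the raw compressed data; the encoder
// properties blob (LzmaStream.Properties) would have to be stored separately
// to build a standalone .lzma file.
```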

@GamingCity commented on GitHub (Oct 11, 2023):

I took a look here:
https://github.com/adamhathcock/sharpcompress/blob/632b83f75ddd1eca52ce633048ce4235485b5db7/src/SharpCompress/Writers/Zip/ZipWriter.cs#L395
Default LZMA settings are 32 fastbytes and a 1MB dictionary (1<<20).

I tested compressing a single 26MB file:
SharpCompress
5.30MB 32 fastbytes / 1MB dictionary
5.29MB 32 fastbytes / 8MB dictionary
2.92MB 32 fastbytes / 16MB dictionary
2.79MB 64 fastbytes / 16MB dictionary
5.13MB 128 fastbytes / 1MB dictionary
5.13MB 128 fastbytes / 8MB dictionary
2.80MB 128 fastbytes / 16MB dictionary
2.79MB 160 fastbytes / 16MB dictionary
2.79MB 192 fastbytes / 16MB dictionary
2.79MB 256 fastbytes / 16MB dictionary

7z(22.01)
2.82MB 48 wordsize / Ultra / LZMA / 64MB dictionary
2.79MB 64 wordsize / Ultra / LZMA / 16MB dictionary
2.79MB 64 wordsize / Ultra / LZMA / 64MB dictionary
2.79MB 64 wordsize / Ultra / LZMA2 / 64MB dictionary
2.78MB 128 wordsize / Ultra / LZMA / 64MB dictionary

So it's not only the compression level (fastbytes) but the dictionary size as well. Be warned that compressing with a 16MB dictionary uses a lot more RAM in SharpCompress.

But that was a single file, so I went ahead and compressed a 940MB folder:
2:38 - 152MB 7z(22.01) LZMA Max 16MB Dictionary 64 words
2:58 - 148MB 7z(22.01) LZMA Ultra 16MB Dictionary 64 words
2:03 - 143MB 7z(22.01) LZMA2 Ultra 64MB Dictionary 64 words

16:23 - 184MB Sharpcompress default
25:43 - 153MB Sharpcompress 64 fastbytes / 16MB dictionary

>35m - 152MB Sharpcompress 128 fastbytes / 16MB dictionary

And I've been unable to get it down to 148MB like 7z Ultra does by increasing the fastbytes further, not sure why.

Yeah, compression speed is what has me worried here because it is unusable. Not sure what kind of magic 7z does, but this is way too slow, and the C# LZMA SDK is just as slow as well.
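
A rough sweep harness along these lines reproduces that kind of size/time table. It reuses the same assumed LzmaEncoderProperties/LzmaStream constructors as the sketch earlier in the thread, and "sample.bin" is just a placeholder input:

```csharp
using System;
using System.Diagnostics;
using System.IO;
using SharpCompress.Compressors.LZMA;

// Sweep over (fastbytes, dictionary) combinations, timing each run and
// recording the compressed size. Constructor shapes are assumptions.
var combos = new (int FastBytes, int DictShift)[]
{
    (32, 20),   // ZipWriter default: 32 fastbytes, 1MB dictionary
    (32, 24),
    (64, 24),
    (128, 24),
};

foreach (var (fastBytes, dictShift) in combos)
{
    var outPath = $"sample_{fastBytes}fb_{1 << (dictShift - 20)}mb.lzma.raw";
    var sw = Stopwatch.StartNew();

    using (var input = File.OpenRead("sample.bin"))
    using (var output = File.Create(outPath))
    using (var lzma = new LzmaStream(
               new LzmaEncoderProperties(true, 1 << dictShift, fastBytes),
               false,   // plain LZMA, not LZMA2
               output))
    {
        input.CopyTo(lzma);
    }

    sw.Stop();
    var sizeMb = new FileInfo(outPath).Length / (1024.0 * 1024.0);
    Console.WriteLine(
        $"{fastBytes} fastbytes / {1 << (dictShift - 20)}MB dict: {sizeMb:F2}MB in {sw.Elapsed}");
}
```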

@TimLCondor commented on GitHub (Oct 11, 2023):

Note that the fastBytes parameter is not a "higher = higher compression" setting. The default in LZMA is 128 (of 273), and in my case 100 to 128 often resulted in a tiny bit better compression than the highest value (but mostly the same). Together with a dictionary of size 1<<23 (8MB) I get nearly the same compression as 7zip.

SharpCompress is slow because it uses the official LZMA C# SDK, which is an unoptimized, never-updated, sloppy translation from 2013. 7zip uses the algorithm written and optimized in ANSI C.

@GamingCity commented on GitHub (Oct 11, 2023):

A 1<<24 dictionary size seems to be giving me the best results, along with a fastbytes value of at least 64 (the default is 32).
7z has another setting called "word size" that seems to have a huge impact on final size, but I'm not seeing it in the LZMA SDK.

I guess I will have to profile this to see where it is wasting so much time.
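
For anyone skimming the shift notation, the dictionary sizes compared in this thread work out as follows (plain arithmetic, nothing SharpCompress-specific):

```csharp
using System;

// Dictionary sizes discussed above, as powers of two:
Console.WriteLine(1 << 20);   // 1048576  bytes =  1MB (ZipWriter default)
Console.WriteLine(1 << 23);   // 8388608  bytes =  8MB
Console.WriteLine(1 << 24);   // 16777216 bytes = 16MB
```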

@Shivansps commented on GitHub (Oct 13, 2023):

I've been looking at the methods where the LZMA SDK spends so much time, and I only managed to get marginal gains in speed. Maybe someone with more C# experience than me can figure something out.

I can't believe the state of the LZMA SDK for C#: no LZMA2, no XZ, and LZMA compression is just unusable, and it was like this 10 years ago from what I'm seeing. C# is slower than C++, but come on, it can't take over 35 minutes to compress a 900MB folder.
Sorry for the little rant, but I just can't believe this.

@adamhathcock commented on GitHub (Oct 13, 2023):

If you find an implementation that is faster, let me know.

@Shivansps commented on GitHub (Oct 13, 2023):

I've looked, but I don't think there is one. I'm not giving up on trying to fix it myself, though; there is one method that seems to be causing all the problems:

https://github.com/adamhathcock/sharpcompress/blob/632b83f75ddd1eca52ce633048ce4235485b5db7/src/SharpCompress/Compressors/LZMA/LZ/LzBinTree.cs#L142

This is one of those places where SIMD extensions may be applicable, but I'm not experienced in any of that.
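
For what it's worth, the hot part of that method is essentially a byte-wise common-prefix comparison between the current position and each match candidate, which is the piece SIMD could widen. A hypothetical illustration of the idea using System.Numerics vectors (not SharpCompress's actual code, and unbenchmarked):

```csharp
using System.Numerics;

static class MatchLengthSketch
{
    // Hypothetical helper: count how many bytes match between window positions
    // a and b, comparing Vector<byte>.Count bytes per step and falling back to
    // a scalar loop to pinpoint the exact mismatch. The caller must guarantee
    // that a + limit and b + limit stay within the window buffer.
    public static int CommonLength(byte[] window, int a, int b, int limit)
    {
        var len = 0;
        var width = Vector<byte>.Count;

        while (len + width <= limit)
        {
            var va = new Vector<byte>(window, a + len);
            var vb = new Vector<byte>(window, b + len);
            if (!Vector.EqualsAll(va, vb))
            {
                break;   // the mismatch is somewhere inside this block
            }
            len += width;
        }

        while (len < limit && window[a + len] == window[b + len])
        {
            len++;       // scalar tail
        }

        return len;
    }
}
```

Whether that actually speeds up the bin-tree search as a whole would need measuring; the match-length loop is only one part of it.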

@adamhathcock commented on GitHub (Oct 16, 2023):

Good luck. I've never looked at compression algorithms myself. I was more concerned about the interface and maybe some archive formats.
