Not clear how to trade most resources for most compression #26

Open
ribasushi opened this issue Feb 13, 2020 · 3 comments

@ribasushi

The documentation of WriterConfig is somewhat sparse. I would like to emulate (in spirit, I understand the algorithm is not perfect) the result of xz --lzma2=preset=9,dict=128MiB.

Could you please point me to a "starting point"?

Thanks!

@ulikunitz
Owner

The DictCap field defines the dictionary size, so DictCap = 128 * 1024 * 1024 sets a 128 MiB dictionary. Note that the actual memory consumption is a multiple of the dictionary size, because the dictionary needs to be hashed. The default size is already 8 MiB, so a larger dictionary only has an effect on files bigger than that.

The LZMA properties (lc, lp, pb) are the same as described in the xz manual.

A larger buffer size might increase compression speed a little, but it has almost no effect on the compression ratio.

@mfischr

mfischr commented Mar 6, 2020

Surprisingly, with xz, the values are always lc=3,lp=0,pb=2 no matter what preset you choose. According to the manual, the preset affects other settings like dictionary size, match finder, 'nice', and 'depth'.

Btw, the values of LC, LP, PB are stored in a single byte encoded according to this formula, and in one .xz file I tried, that byte appeared at position 0x1d.
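
The link behind "this formula" did not survive in this copy of the thread. Assuming it refers to the standard LZMA properties byte, the encoding is `(pb * 5 + lp) * 9 + lc`, which can be sketched as below. The function names `encodeProps` and `decodeProps` are made up for illustration and are not part of any package.

```go
package main

import "fmt"

// encodeProps packs lc, lp and pb into the single LZMA properties byte
// using the standard encoding (pb*5 + lp)*9 + lc.
func encodeProps(lc, lp, pb int) (byte, error) {
	if lc < 0 || lc > 8 || lp < 0 || lp > 4 || pb < 0 || pb > 4 {
		return 0, fmt.Errorf("invalid properties lc=%d lp=%d pb=%d", lc, lp, pb)
	}
	return byte((pb*5+lp)*9 + lc), nil
}

// decodeProps reverses encodeProps.
func decodeProps(b byte) (lc, lp, pb int, err error) {
	if int(b) >= 9*5*5 {
		return 0, 0, 0, fmt.Errorf("invalid properties byte 0x%02x", b)
	}
	lc = int(b) % 9
	rest := int(b) / 9
	lp = rest % 5
	pb = rest / 5
	return lc, lp, pb, nil
}

func main() {
	// The values xz uses for every preset, per the comment above.
	b, err := encodeProps(3, 0, 2)
	if err != nil {
		panic(err)
	}
	fmt.Printf("0x%02x\n", b) // prints 0x5d
}
```

With lc=3, lp=0, pb=2 this gives (2 * 5 + 0) * 9 + 3 = 93, that is 0x5d, so that is the byte value to look for when inspecting a file.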

That said, I can't get this package to replicate the results I'm getting with xz at preset=2 (the output is about 20% larger).

@ulikunitz
Owner

The package doesn't implement the same algorithm as xz, so the output and the compression ratio will differ.
