
high allocation ratio #47

Open
yuvalgut opened this issue Apr 10, 2022 · 3 comments
@yuvalgut

I have a processing scenario where I read LZMA objects and need to decompress them.
Profiling with pprof, I can see that the lzma reader allocates a dictionary buffer for every message:
```
0 0% 0.0046% 433804685.70MB 96.50% github.com/ulikunitz/xz/lzma.NewReader (inline)
4122.21MB 0.00092% 0.0056% 433804685.70MB 96.50% github.com/ulikunitz/xz/lzma.ReaderConfig.NewReader
2414.61MB 0.00054% 0.0061% 432805222.15MB 96.28% github.com/ulikunitz/xz/lzma.newDecoderDict (inline)
432802807.54MB 96.28% 96.28% 432802807.54MB 96.28% github.com/ulikunitz/xz/lzma.newBuffer (inline)
```

Could we add an option to maintain a pool of those buffers, or some other way to reuse a reader?
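One shape the requested pooling could take, sketched with the standard library's `sync.Pool` over plain byte slices. Everything here is hypothetical: `dictPool`, `decompressOne`, and the 8 MiB size are illustrations, not part of the xz API.

```go
package main

import (
	"fmt"
	"sync"
)

// dictPool hands out dictionary-sized buffers for reuse across
// decompressions instead of allocating a fresh one per message.
var dictPool = sync.Pool{
	New: func() any { return make([]byte, 8<<20) }, // 8 MiB, hypothetical size
}

// decompressOne borrows a buffer, uses it as the decoder window,
// and returns it to the pool when done.
func decompressOne() int {
	dict := dictPool.Get().([]byte)
	defer dictPool.Put(dict)
	// ... decode one LZMA message through dict here ...
	return len(dict)
}

func main() {
	fmt.Println(decompressOne())
}
```

Under steady load, Get usually returns a previously used buffer, so the per-message allocation seen in the profile would disappear.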

@ulikunitz
Owner

Why is that a problem? The buffer is allocated once per LZMA object and collected by the GC. You can control the size of the buffer when creating the LZMA object.

@yuvalgut
Author

Hi, thanks for the response!
From the reader's simple test:
```go
r, err := NewReader(xz)
if err != nil {
	t.Fatalf("NewReader error %s", err)
}
var buf bytes.Buffer
if _, err = io.Copy(&buf, r); err != nil {
	t.Fatalf("io.Copy error %s", err)
}
```
When r, err := NewReader(xz) is called, the dict buffer is allocated.
Then io.Copy(&buf, r) reads the uncompressed data into the 'client' buffer.
At that point the dict buffer is already allocated and could be reused to decompress another LZMA stream, but there is no 'reset' option, so we have to create a new reader with NewReader(xz), which allocates another dict buffer instead of reusing the one we already have.
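The reset pattern being asked for already exists elsewhere in the Go standard library: compress/flate readers implement flate.Resetter, so one reader (and its internal buffers) can decode many streams. A sketch of that pattern as an analogy only; lzma.Reader had no such method at the time of this issue:

```go
package main

import (
	"bytes"
	"compress/flate"
	"fmt"
	"io"
)

// deflate compresses b so we have independent streams to decode.
func deflate(b []byte) []byte {
	var buf bytes.Buffer
	w, _ := flate.NewWriter(&buf, flate.DefaultCompression)
	w.Write(b)
	w.Close()
	return buf.Bytes()
}

// decodeAll decodes every stream with a single flate reader,
// calling Reset between streams so internal buffers are reused.
func decodeAll(streams [][]byte) []string {
	var out []string
	r := flate.NewReader(bytes.NewReader(streams[0]))
	for i, s := range streams {
		if i > 0 {
			r.(flate.Resetter).Reset(bytes.NewReader(s), nil)
		}
		b, _ := io.ReadAll(r)
		out = append(out, string(b))
	}
	return out
}

func main() {
	fmt.Println(decodeAll([][]byte{deflate([]byte("first")), deflate([]byte("second"))}))
}
```

An equivalent Reset on the lzma reader would let the already-allocated dict buffer serve every message.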

Let me know if that makes sense.
Thanks again!

@ulikunitz
Owner

I'm currently reworking the LZMA package to support parallel and faster compression as well as faster decompression. I will look into Reset options.
