Question about memory management in tgzip #35

Closed
Christian-Sander opened this issue Dec 2, 2020 · 6 comments

@Christian-Sander

I'm currently looking into compressing some data and have two questions about the example implementation in tgzip and its usage in other programs:

  1. It appears that tgzip.c doesn't free any memory and lets the OS clean up the allocated memory. If so, what is the user meant to free to avoid memory leaks? I think I'm meant to free comp.out.outbuf and comp.hash_table, as those appear to be heap-allocated. Is there anything else I'm missing? As far as I can tell, there is only one internal call to realloc, for the comp.out.outbuf buffer.
  2. Does dict_size need to be 32 kB, even when compressing smaller data and in memory-limited environments? Does it correlate with the hash_bits field? I have a maximum of 1 MB of data that needs to be compressed, usually much less, and in PC-based tests using Python gzip at the maximum compression level it compresses extremely well (to 0.1-1% of the original size).

@pfalcon
Owner

pfalcon commented Dec 3, 2020

What I can say from my memory:

  1. The library itself doesn't do any memory allocation; it follows the "dependency injection" pattern, where any buffers are allocated by the client and passed in as pointers.
  2. tgzip.c is a sample application which is intended to be simple, and thus may indeed rely on the behavior of a POSIX OS which guarantees that any resources allocated by a process will be freed on the process exit.
  3. dict_size is a DEFLATE/gzip param. hash_bits is a param of uzlib's compression algorithm. So, they're orthogonal params. Making both more configurable is a long-standing TODO task. For now you can patch the source.
  4. The compression quality of uzlib is not comparable to that of gzip. I specifically coded an algo that is as simple as possible, and thus as small as realistically possible. Though if hash_bits approaches infinity, the compression rate also approaches the highest possible for LZ compression ;-).
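
To make points 1 and 3 concrete, here is a minimal sketch of the "caller owns the buffers" pattern, assuming a hash table with 2^hash_bits slots. Everything prefixed with demo_ is a hypothetical stand-in invented for illustration; it is not uzlib's real struct layout or API, and the slot type is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical slot type; the real library's entry type may differ. */
typedef const uint8_t *demo_hash_entry_t;

/* Hypothetical compressor context, NOT uzlib's real struct. */
struct demo_comp {
    unsigned hash_bits;            /* log2 of the number of hash slots */
    demo_hash_entry_t *hash_table; /* allocated and freed by the caller */
};

/* Bytes needed for a table with 2^hash_bits slots. */
size_t demo_hash_table_bytes(unsigned hash_bits)
{
    assert(hash_bits <= 24); /* arbitrary sanity bound for this sketch */
    return sizeof(demo_hash_entry_t) * ((size_t)1 << hash_bits);
}

/* The caller allocates the zeroed table and hands it to the context.
 * Returns 0 on success, -1 if the allocation failed. */
int demo_comp_init(struct demo_comp *c, unsigned hash_bits)
{
    c->hash_bits = hash_bits;
    c->hash_table = calloc(1, demo_hash_table_bytes(hash_bits));
    return c->hash_table ? 0 : -1;
}

/* The caller is also responsible for releasing what it allocated. */
void demo_comp_release(struct demo_comp *c)
{
    free(c->hash_table);
    c->hash_table = NULL;
}
```

Under these assumptions the table's memory cost doubles with every extra hash bit, independently of dict_size, which is why a memory-limited client might choose a smaller hash_bits and accept a lower compression ratio.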

@Christian-Sander
Author

Thank you for your replies. One comment on the first point:

  1. In defl_static.c:
    out->outbuf = sresize(out->outbuf, out->outsize, unsigned char);
    (sresize is a macro for realloc). So there is actually one allocation happening, at least when outbuf has not been allocated beforehand. This means we're both right!
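
A minimal sketch of why such a line allocates: realloc called with a NULL pointer behaves like malloc, so a buffer that starts out as NULL is allocated on first growth and must later be freed by whoever owns the struct. The macro and struct below (demo_sresize, demo_outbuf) are hypothetical and only modelled on the description above, not copied from defl_static.c; the doubling growth policy is also an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical realloc wrapper in the spirit of the sresize macro. */
#define demo_sresize(ptr, count, type) \
    ((type *)realloc((ptr), (count) * sizeof(type)))

/* Hypothetical output buffer; field names mirror those quoted above. */
struct demo_outbuf {
    unsigned char *outbuf; /* NULL until the first byte is written */
    size_t outlen;         /* bytes used */
    size_t outsize;        /* bytes allocated */
};

/* Append one byte, growing the buffer when it is full.
 * Returns 0 on success, -1 if realloc failed (old buffer stays valid). */
int demo_out_byte(struct demo_outbuf *out, unsigned char b)
{
    assert(out != NULL);
    if (out->outlen >= out->outsize) {
        size_t new_size = out->outsize ? out->outsize * 2 : 64;
        unsigned char *p = demo_sresize(out->outbuf, new_size, unsigned char);
        if (p == NULL)
            return -1;
        out->outbuf = p; /* first call: realloc(NULL, n) acts like malloc */
        out->outsize = new_size;
    }
    out->outbuf[out->outlen++] = b;
    return 0;
}

/* The owner of the struct frees the buffer that was grown on its behalf. */
void demo_out_free(struct demo_outbuf *out)
{
    free(out->outbuf);
    out->outbuf = NULL;
    out->outlen = 0;
    out->outsize = 0;
}
```

With this shape, a client that zero-initialises the struct never calls malloc itself, yet still has to call the equivalent of demo_out_free (or free(comp.out.outbuf) in the case discussed here) to avoid a leak.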

@pfalcon
Owner

pfalcon commented Jan 12, 2021

Where do you see that called?

@github-actions

github-actions bot commented May 9, 2021

Thanks for your submission. However there was no (further) activity
on it for some time. Below are possible reasons and/or means to
proceed further:

  • Please make sure that your submission contains clear and complete
    description of the issue and information required to understand and
    reproduce it (exact version of all components involved, etc.). Please
    refer to numerous guides on submitting helpful bugreports, e.g.
    https://www.chiark.greenend.org.uk/~sgtatham/bugs.html

  • Please make sure that your feature request/suggestion/change aligns
    well with the project purpose, goals, and process.

  • If you face the issue you report while working on a community-oriented
    open-source project, feel free to provide additional context and
    information - that may help to prioritize the issue better.

  • As many open-source projects, this project is severely under-staffed
    and under-resourced. If you don't run a community-oriented open-source
    project, and would like to get additional support/maintenance, feel
    free to request a support contract.

Thanks for your understanding!

@github-actions github-actions bot added the Stale label May 9, 2021
@github-actions

Closing due to inactivity.

@smdjeff

smdjeff commented Nov 11, 2021

I noticed the exact same thing and didn't see this one closed, so I opened another. See #41. I laid out the call chain that is used to sneakily allocate memory.
