Add lzip support #126

Closed
hohav opened this issue Dec 23, 2023 · 4 comments
Labels
enhancement New feature or request

Comments

@hohav

hohav commented Dec 23, 2023

Lzip is a lesser-known format, but offers advantages over other formats. The ability to mount .lz files would be very nice.

@mxmlnkn
Owner

mxmlnkn commented Dec 23, 2023

I have stumbled upon this format a few times already but never saw it used. Are you using it or are there public datasets distributed in this format?

Like many formats in #109, it would probably be at least partially supported by libarchive. There are two issues:

  • The last time I checked, libarchive had no fully usable Python bindings, so it is hard to integrate.
    • This lzip module seems to have no API for seeking, which ratarmount would require.
  • I doubt that libarchive implements fast seeking for generic lzip files, for the same reasons it was hard to do for gzip, xz, and others: the files are not indexed and, even worse, lzip allows a window size of up to 512 MiB. Current solutions would have to store the full window per seek point, although I'm trying to reduce that somewhat.
    • The kind of indexing ratarmount requires seems to be supported in general, but the file has to have been created with tarlz or plzip, and even tarlz has an option that controls seekability and parallel decompression for a single file inside the TAR. A file created that way would be a perfect fit for ratarmount. I would hope that libarchive supports seeking in such files, but that would have to be checked.
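
For illustration only (this is not ratarmount or libarchive code): multimember lzip files are seekable because every member ends with a trailer that records its own compressed and decompressed size. A reader can therefore walk the trailers backwards from the end of the file and collect one seek point per member. The sketch below assumes the layout described in the lzip manual (a 6-byte header starting with the magic "LZIP", and a 20-byte little-endian trailer holding CRC32, data size, and member size). It also assumes there is no trailing data after the last member. The names build_member_index and decode_dictionary_size are hypothetical.

```python
import io
import struct

LZIP_MAGIC = b"LZIP"
HEADER_SIZE = 6    # magic (4) + version (1) + coded dictionary size (1)
TRAILER_SIZE = 20  # CRC32 (4) + data size (8) + member size (8)


def decode_dictionary_size(coded: int) -> int:
    """Decode the dictionary-size byte of an lzip header.

    Bits 4-0 hold log2 of a base size; bits 7-5 hold how many
    sixteenths of that base size to subtract.
    """
    base = 1 << (coded & 0x1F)
    return base - (base // 16) * (coded >> 5)


def build_member_index(fileobj):
    """Return one seek point per lzip member, ordered by file position.

    Only headers and trailers are read; no data is decompressed.
    Assumption: the file ends exactly at the last member's trailer.
    """
    fileobj.seek(0, io.SEEK_END)
    end = fileobj.tell()
    members = []
    while end > 0:
        if end < HEADER_SIZE + TRAILER_SIZE:
            raise ValueError("file too short for an lzip member")
        fileobj.seek(end - TRAILER_SIZE)
        _crc, data_size, member_size = struct.unpack(
            "<IQQ", fileobj.read(TRAILER_SIZE)
        )
        start = end - member_size
        if member_size < HEADER_SIZE + TRAILER_SIZE or start < 0:
            raise ValueError("implausible member size in trailer")
        fileobj.seek(start)
        header = fileobj.read(HEADER_SIZE)
        if header[:4] != LZIP_MAGIC:
            raise ValueError("member does not start with the lzip magic")
        members.append((start, data_size, decode_dictionary_size(header[5])))
        end = start

    # Members were collected last-to-first; restore file order and
    # accumulate the decompressed offset at which each member begins.
    members.reverse()
    index = []
    decompressed_offset = 0
    for start, data_size, dictionary_size in members:
        index.append({
            "compressed_offset": start,
            "decompressed_offset": decompressed_offset,
            "decompressed_size": data_size,
            "dictionary_size": dictionary_size,
        })
        decompressed_offset += data_size
    return index
```

A file written as a single member yields only one entry, so this approach only gives useful seek points when the creating tool split the data into several members.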

@hohav
Author

hohav commented Dec 24, 2023

I'm evaluating lzip for long-term archival purposes, and the ability to mount a large archive would be a point in its favor. But as you say it's not widely used, and there are reasonable alternatives. So waiting for support via libarchive seems perfectly reasonable, even if that's not in the cards for now. Thanks for looking into it; feel free to close the issue if that makes sense.

@mxmlnkn
Owner

mxmlnkn commented Dec 24, 2023

Other projects very similar to tarlz are pixz and t2sz. They also create indexes and compress the files in the TAR independently, and their output is therefore easy to mount.
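
As a rough illustration of why independently compressed blocks plus an index make mounting easy (a generic sketch with hypothetical names, not the actual index format of tarlz, pixz, or t2sz): once the decompressed start offset of every block is known, a read at an arbitrary offset only needs a binary search to find the one block to decompress.

```python
from bisect import bisect_right


def find_block(block_starts, offset):
    """Locate the block that contains a decompressed offset.

    block_starts: sorted decompressed start offsets, one per block
                  (hypothetical index layout for this sketch).
    Returns (block_number, offset_inside_block).
    """
    block_number = bisect_right(block_starts, offset) - 1
    if block_number < 0:
        raise ValueError("offset lies before the first block")
    return block_number, offset - block_starts[block_number]
```

Because each block can be decompressed on its own, the cost of a random read is bounded by the size of one block rather than by the size of the whole archive.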

I'll leave this issue open as there might be others wanting the same.

@mxmlnkn mxmlnkn added the enhancement New feature or request label Feb 23, 2024
@mxmlnkn
Owner

mxmlnkn commented Apr 5, 2024

@hohav I'm finished with the libarchive backend in #130, which also supports lzip. If you want to test it out, I'd welcome feedback. That branch can be installed directly from git with:

python3 -m pip install --user --force-reinstall \
    'git+https://github.com/mxmlnkn/ratarmount.git@libarchive#egginfo=ratarmountcore&subdirectory=core' \
    'git+https://github.com/mxmlnkn/ratarmount.git@libarchive#egginfo=ratarmount'

There are some inherent performance problems with libarchive, but I tried to avoid the bigger pitfalls that e.g. archivemount suffers from.

mxmlnkn added a commit that referenced this issue Apr 7, 2024
@mxmlnkn mxmlnkn closed this as completed Apr 7, 2024