does gcsfuse support content-encoding: gzip? #671
Comments
Thanks for the request here. As you pointed out, a while back there was a good reason for gcsfuse to not support content encoding but we can definitely investigate if that is possible currently. Please let us take a look and we'll post an update soon.
@avidullu any update on this? As a user of GCP's Vertex Pipelines, I'd really, really like to have an option to read compressed files on executors via locally mounted GCS. Actually, I was quite surprised that something like this doesn't work ;)
Same for zip. Currently are
With v1.1, Cloud Storage FUSE now supports reading back objects as gzip if the content-encoding: gzip metadata is set. The detailed behavior is documented under "File transcoding". For the reasons already mentioned, Cloud Storage FUSE only supports reading the file back as gzip and does not do decompressive transcoding over the wire. Previously, Cloud Storage FUSE would just return an error and not even allow the file to be returned as gzip.

Attempting to use Cloud Storage FUSE to edit or modify objects with content-encoding: gzip can produce unpredictable behavior. Cloud Storage FUSE uploads the object content as it is (without compressing it) while retaining content-encoding: gzip. If that content is not properly gzip-compressed, other clients such as gsutil might fail to read it from the server, because they employ decompressive transcoding while reading, and that fails on improper gzip content.

So, unzipping the file within the GCSfuse directory creates the unzipped files as new files in GCS; alternatively, the file can be unzipped to a local directory. If the files are unzipped within a GCSfuse directory and then modified, they are written back raw and uncompressed, not as gzip. If the files need to be modified and written back as gzip, this should be done outside of the GCSfuse mounted directory: gunzip to a local directory -> read/modify the files locally -> re-zip locally -> replace the old gzip file in the GCSfuse directory with the new gzip.
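For illustration, the workflow above could be sketched in Python roughly as follows. This is only a sketch under assumptions: `rewrite_gzip_object` and the `modify` callback are hypothetical names, not part of gcsfuse, and it assumes the file fits in memory. The gzip file is read from the mount as-is, all decompression and recompression happens in a local temporary directory, and the mount only ever receives one whole-file copy of properly gzip-compressed content.

```python
import gzip
import os
import shutil
import tempfile


def rewrite_gzip_object(mounted_path, modify):
    """Modify a gzip file that lives under a mounted directory.

    mounted_path: path to the .gz file (hypothetical, e.g. under a gcsfuse mount).
    modify: hypothetical callback taking the decompressed bytes and
            returning the new bytes.
    """
    with tempfile.TemporaryDirectory() as workdir:
        local_gz = os.path.join(workdir, "replacement.gz")

        # 1. Read the gzip file back and decompress it on the client side.
        with gzip.open(mounted_path, "rb") as src:
            data = src.read()

        # 2. Modify the decompressed content locally.
        new_data = modify(data)

        # 3. Re-compress into a local file, outside the mounted directory.
        with gzip.open(local_gz, "wb") as dst:
            dst.write(new_data)

        # 4. Replace the old gzip file with the new gzip file in one copy,
        #    so the mount never sees uncompressed content.
        shutil.copyfile(local_gz, mounted_path)
```

A call might look like `rewrite_gzip_object("/mnt/bucket/data.gz", lambda b: b.replace(b"old", b"new"))`, where the mount path is made up for the example. Whether the object keeps its content-encoding: gzip metadata after the replacement is assumed from the description above and is worth verifying on a real bucket.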
I have a gz file on GCS with content-encoding: gzip, and this is the error I saw when I tried to gunzip it.
By looking at issue #165, it seems that gcsfuse didn't intend to support this back in 2016. Is this still the decision nowadays? Thanks!