
[Storage] Switch to GCSFuse 2.0 #3442

Closed

romilbhardwaj opened this issue Apr 16, 2024 · 1 comment

Comments

@romilbhardwaj
Collaborator

GCSFuse 2.0 was recently announced. Most importantly, it adds read caching:

Cloud Storage FUSE V2 provides important stability, functionality, and performance enhancements, including the introduction of a file cache that allows repeat file reads to be served from a local, faster cache storage of choice, such as a Local SSD, Persistent Disk, or even in-memory /tmpfs. The Cloud Storage FUSE file cache makes AI/ML training faster and more cost-effective by reducing the time spent waiting for data, with up to 2.3x faster training time and 3.4x higher throughput observed in training runs. This is especially valuable for multi epoch training and can serve small and random I/O operations significantly faster. The file cache feature is disabled by default and is enabled by passing a directory to 'cache-dir'. See overview of caching for more details.

We should update to v2 and turn on read caching (and perhaps set the cache size to a fixed percentage of the available disk?).
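For reference, a minimal sketch of what sizing the cache as a fraction of free disk could look like (not an actual implementation; the `cache-dir` and `file-cache.max-size-mb` keys follow the GCSFuse v2 config-file format quoted above, while the fraction, paths, and helper name are made up for illustration):

```python
import shutil
import subprocess
import tempfile

# Assumption: use up to 50% of the free space on the disk backing the cache dir.
CACHE_FRACTION = 0.5


def mount_with_file_cache(bucket: str, mount_point: str, cache_dir: str) -> None:
    # Size the file cache relative to currently free disk space.
    free_bytes = shutil.disk_usage(cache_dir).free
    max_size_mb = int(free_bytes * CACHE_FRACTION / (1024 * 1024))

    # GCSFuse v2 enables the file cache when a cache-dir is set in the config file.
    config = (
        f"cache-dir: {cache_dir}\n"
        "file-cache:\n"
        f"  max-size-mb: {max_size_mb}\n"
    )
    with tempfile.NamedTemporaryFile("w", suffix=".yaml", delete=False) as f:
        f.write(config)
        config_path = f.name

    subprocess.run(
        ["gcsfuse", "--config-file", config_path, bucket, mount_point],
        check=True,
    )
```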

@romilbhardwaj
Collaborator Author

Closed with #3619.
