Add option to skip downloading output from S3 to local for AWS runs #578
Comments
Would welcome a PR. I propose to make this configurable by adding a parameter to the
I don't know off the top of my head whether downloading the results file is always required for the remainder of the logic to work (to know whether a task has finished). If it is, then
Currently for AWS runs, the results dir of each task is downloaded from S3 to the local machine that executed the AWS run.
With large-scale runs and additional metadata, these downloads can become very large (multiple terabytes), leading to out-of-disk errors on the host machine and potential network errors or bandwidth limitations.
It would be nice to have an option to skip the local download (while still saving the files to S3 from the worker nodes). I primarily work with the S3 files directly for post-run aggregation, which is generally more convenient.
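To make the request concrete, here is a minimal sketch of how such an option could behave. It is not the project's actual code: the function names, the `skip_download` parameter, and the bucket/key layout are all hypothetical. It assumes a boto3-style S3 client and uses only `list_objects_v2` and `download_file`. The idea is that task completion can be detected by checking that the results object exists on S3, so the download itself becomes optional.

```python
from typing import Any


def result_exists(s3_client: Any, bucket: str, results_key: str) -> bool:
    """Check whether a results object exists on S3 without downloading it.

    Uses a prefix listing limited to one key, so any key that starts with
    `results_key` counts as a match (a deliberately loose check).
    """
    response = s3_client.list_objects_v2(
        Bucket=bucket, Prefix=results_key, MaxKeys=1
    )
    return response.get("KeyCount", 0) > 0


def collect_task_output(
    s3_client: Any,
    bucket: str,
    results_key: str,
    local_path: str,
    skip_download: bool = False,  # hypothetical option, not an existing setting
) -> bool:
    """Return True if the task's results are on S3.

    The file is copied to `local_path` only when `skip_download` is False,
    so the completion check never depends on local disk space.
    """
    if not result_exists(s3_client, bucket, results_key):
        return False  # task has not produced its results yet
    if not skip_download:
        s3_client.download_file(bucket, results_key, local_path)
    return True
```

With a real client this would be called as `collect_task_output(boto3.client("s3"), bucket, key, path, skip_download=True)`, leaving the files on S3 for post-run aggregation.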