
Allow different GCP credentials for reading and writing in the same Spark Session #1249

Open · sid-habu opened this issue Sep 12, 2024 · 1 comment


@sid-habu

The S3 and ADLS connectors allow setting credentials per bucket (or per container) in the same Spark session. However, I am unable to find a similar configuration for GCS.
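For reference, this is the kind of per-bucket override the S3A connector supports; a rough sketch with hypothetical bucket names and environment variables:

```scala
import org.apache.spark.sql.SparkSession

// S3A resolves fs.s3a.bucket.<bucket>.* properties per bucket, so two buckets
// in one session can use different credentials. Bucket names and env vars here
// are hypothetical.
val spark = SparkSession.builder()
  .appName("per-bucket-credentials")
  // Credentials used only for s3a://source-bucket/...
  .config("spark.hadoop.fs.s3a.bucket.source-bucket.access.key", sys.env("SRC_ACCESS_KEY"))
  .config("spark.hadoop.fs.s3a.bucket.source-bucket.secret.key", sys.env("SRC_SECRET_KEY"))
  // Credentials used only for s3a://dest-bucket/...
  .config("spark.hadoop.fs.s3a.bucket.dest-bucket.access.key", sys.env("DST_ACCESS_KEY"))
  .config("spark.hadoop.fs.s3a.bucket.dest-bucket.secret.key", sys.env("DST_SECRET_KEY"))
  .getOrCreate()
```

The GCS connector's fs.gs.auth.* properties have no equivalent per-bucket form, which is what this issue is asking about.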

My use case is to read from a GCS bucket in one project and write to a GCS bucket in another project with a different set of credentials. Overriding the Hadoop configuration just before writing causes failures due to race conditions in the token caching.
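For illustration, this is the kind of mid-session mutation that runs into the caching problem; bucket names and the keyfile path are hypothetical:

```scala
// Naive approach: mutate the session-wide Hadoop configuration before the write.
val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.gs.auth.type", "SERVICE_ACCOUNT_JSON_KEYFILE")
hadoopConf.set("fs.gs.auth.service.account.json.keyfile", "/secrets/writer-sa.json")

// Racy: Hadoop caches FileSystem instances, so the write may reuse a GCS
// FileSystem (and its cached token) created earlier with the reader credentials,
// while concurrent tasks see a half-updated configuration.
df.write.parquet("gs://dest-bucket/output")
```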

Is there a workaround to achieve this in Spark for GCS?

@sid-habu (Author)

We resolved this by extending GoogleHadoopFileSystem and overriding its initialize method to read our custom auth properties and set the fs.gs.auth.* properties for the URI representing the GCS bucket.
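A minimal sketch of that approach, assuming a custom property of the form custom.gs.auth.keyfile.&lt;bucket&gt; (the property name and keyfile paths are illustrative, not from the original comment):

```scala
import java.net.URI

import com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem
import org.apache.hadoop.conf.Configuration

// Hypothetical subclass: picks a service-account keyfile per bucket.
// Hadoop caches FileSystem instances per (scheme, authority), so initialize
// runs once per bucket and each instance keeps its own credentials.
class PerBucketGoogleHadoopFileSystem extends GoogleHadoopFileSystem {

  override def initialize(path: URI, conf: Configuration): Unit = {
    // For gs://<bucket>/..., the URI authority is the bucket name.
    val bucket = path.getAuthority
    val keyfile = conf.get(s"custom.gs.auth.keyfile.$bucket")

    if (keyfile == null) {
      // No per-bucket override: fall back to the session-wide auth settings.
      super.initialize(path, conf)
    } else {
      // Copy the configuration so this bucket's auth settings don't leak
      // into the shared Configuration used by other FileSystem instances.
      val bucketConf = new Configuration(conf)
      bucketConf.set("fs.gs.auth.type", "SERVICE_ACCOUNT_JSON_KEYFILE")
      bucketConf.set("fs.gs.auth.service.account.json.keyfile", keyfile)
      super.initialize(path, bucketConf)
    }
  }
}
```

To wire it in, the custom class would be registered in place of the stock connector via the standard fs.&lt;scheme&gt;.impl mechanism, e.g. spark.hadoop.fs.gs.impl=PerBucketGoogleHadoopFileSystem (fully qualified), alongside one custom.gs.auth.keyfile.&lt;bucket&gt; entry per bucket.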
