[object-store]: Implement credential_process support for S3 #6422
Comments
@alamb since the linked PR is closed, can we mark this as closed?
This is new, and although related to the linked issues, it is not closed
For additional context see #5143. Copying some of the info here: I think the use case this feature would support is
Since the users don't control the pola.rs source or distribution, they cannot use the existing object_store
The proposal on this ticket is to add a mechanism that can call out to an external program / process to get credentials. While less efficient, this would allow someone to plug in whatever authentication mechanism they wanted without having to change the source code.
@tustvold notes that we need to ensure this type of mechanism does not compromise system security (e.g. perhaps it has to be enabled by default).
Also, he mentioned that the Azure client has something similar --
See also related ticket in pola-rs that @tustvold filed: pola-rs/polars#18979
TBC I view this very much as a hack around an API limitation in Polars. I would prefer we try to fix this there before resorting to this - pola-rs/polars#18979 (comment)
Can you explain why you think credential process support is related to Polars? To me it is a gap in
Your original request concerned supporting a broader range of auth within the context of polars. Credential process support was then proposed as a way to work around the inability to override the credential configuration within polars. By fixing this limitation of polars we not only provide a way for users to use credential process, via an AWS SDK that implements it, but also the full flexibility of all the other auth possibilities exposed by these SDKs. I'd naturally prefer the solution that gives users the most flexibility and avoids needing to revisit this again when someone comes along requesting SSO or similar.
I would also like to avoid a similar conversation with users of systems other than polars. I agree in an ideal world, perhaps polars would implement user APIs to fully configure S3 auth via the
Even if they did this, however, I think we will continue to have similar conversations with other downstream users.
I view this credential process not as a hack but as a general purpose configuration convention that works for any subsequent user (similarly to how object_store also supports the standard AWS configuration environment variables without any downstream crate configuration).
I think this proposal doesn't meaningfully help in this regard because:
So we'd end up with users continuing to come here asking about this, and we'd have to direct them to some object_store specific environment configuration to call out to some external process they have to set up. Encouraging the downstreams to expose, or otherwise utilize, the object_store credential provider API would avoid this entirely.
fwiw, the deltalake-aws crate glues the aws-sdk and a CredentialProvider together, which is what I think most people generally want, so Polars can yoink that code if they'd like, or we could yoink it into a different shared crate 🤷
Polars exposed it as part of their API and configured it to use boto when available. I'm going to close this as I don't think it is planned anymore
Credential process is a flexible solution for providing custom authentication mechanisms for object store. It is described as a part of the AWS SDK documentation and implementing it would allow more complex use cases to be fully supported by the current setup, without adding particular complexity.
How does it work?
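When a user opts in to the credential process, the client invokes the process whenever it needs credentials, and the process replies on stdout with a JSON document in a defined schema: Version (currently 1), AccessKeyId, SecretAccessKey, and optionally SessionToken and Expiration (an ISO 8601 timestamp).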
The client knows when the expiration will occur, and will re-invoke the process when required.
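The invoke, cache, and re-invoke cycle described above can be sketched in a few lines. This is only an illustration of the protocol, not the object_store implementation: `ProcessCredentialProvider`, `FAKE_PROCESS`, and the 60-second refresh margin are hypothetical names and values chosen for the example, and the stand-in process simply prints dummy credentials in the AWS `credential_process` JSON shape.

```python
import json
import subprocess
import sys
from datetime import datetime, timedelta, timezone

# Hypothetical stand-in for a real credential process: prints dummy
# credentials that expire one hour from now, in the credential_process schema.
FAKE_PROCESS = [
    sys.executable,
    "-c",
    "import json\n"
    "from datetime import datetime, timedelta, timezone\n"
    "exp = datetime.now(timezone.utc) + timedelta(hours=1)\n"
    "print(json.dumps({'Version': 1, 'AccessKeyId': 'AKIAEXAMPLE',\n"
    "                  'SecretAccessKey': 'secret', 'SessionToken': 'token',\n"
    "                  'Expiration': exp.strftime('%Y-%m-%dT%H:%M:%SZ')}))\n",
]


class ProcessCredentialProvider:
    """Hypothetical helper: fetches credentials from an external command
    and re-invokes it only when the cached credentials are about to expire."""

    def __init__(self, command, refresh_margin=timedelta(seconds=60)):
        self.command = command                # argv list of the external process
        self.refresh_margin = refresh_margin  # assumed safety margin before expiry
        self.invocations = 0                  # counts how often the process ran
        self._cached = None
        self._expires_at = None

    def _invoke(self):
        # Run the process and parse the JSON document it writes to stdout.
        result = subprocess.run(
            self.command, check=True, capture_output=True, text=True
        )
        doc = json.loads(result.stdout)
        if doc.get("Version") != 1:
            raise ValueError("unsupported credential_process version")
        expiration = doc.get("Expiration")
        # A missing Expiration is treated here as "never expires".
        self._expires_at = (
            datetime.fromisoformat(expiration.replace("Z", "+00:00"))
            if expiration
            else None
        )
        self._cached = doc
        self.invocations += 1

    def get(self, now=None):
        """Return cached credentials, refreshing them first if needed."""
        now = now or datetime.now(timezone.utc)
        stale = self._cached is None or (
            self._expires_at is not None
            and now >= self._expires_at - self.refresh_margin
        )
        if stale:
            self._invoke()
        return self._cached
```

Calling `ProcessCredentialProvider(FAKE_PROCESS).get()` twice in a row runs the external process only once; a later call made after the expiration time (minus the margin) triggers a fresh invocation.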
What can we do?
We can then extend the AmazonS3Builder to support this use case via an environment variable