[object-store]: Implement credential_process support for S3 #6422

Closed
edmondop opened this issue Sep 19, 2024 · 11 comments
Labels
enhancement Any new improvement worthy of a entry in the changelog object-store Object Store Interface

Comments

@edmondop

Credential process is a flexible mechanism for supplying custom authentication to object_store. It is described in the AWS SDK documentation, and implementing it would let the current setup fully support more complex use cases without adding much complexity.

How does it work?

When a user opts in to the credential process, the client invokes the configured process whenever it needs credentials, and the process replies using a defined schema like so:

{
    "Version": 1,
    "AccessKeyId": "an AWS access key",
    "SecretAccessKey": "your AWS secret access key",
    "SessionToken": "the AWS session token for temporary credentials", 
    "Expiration": "RFC3339 timestamp for when the credentials expire"
}  

The client knows when the expiration will occur, and will re-invoke the process when required.
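
The flow above can be sketched roughly as follows. This is only an illustration in Python using the standard library, not object_store code: the names `parse_credentials` and `ProcessCredentialCache`, and the 60-second refresh margin, are assumptions made for the example. Treating a missing `Expiration` as long-lived credentials follows the behaviour described in the AWS documentation.

```python
import json
import subprocess
from datetime import datetime, timedelta, timezone


def parse_credentials(raw: str) -> dict:
    """Parse and validate the JSON a credential process writes to stdout."""
    doc = json.loads(raw)
    if doc.get("Version") != 1:
        raise ValueError(f"unsupported Version: {doc.get('Version')!r}")
    for key in ("AccessKeyId", "SecretAccessKey"):
        if not doc.get(key):
            raise ValueError(f"missing {key}")
    expiration = doc.get("Expiration")
    if expiration is not None:
        # Normalise a trailing "Z" so fromisoformat accepts it on older Pythons.
        expiration = datetime.fromisoformat(expiration.replace("Z", "+00:00"))
        if expiration.tzinfo is None:
            # Assumption for this sketch: a timestamp without an offset is UTC.
            expiration = expiration.replace(tzinfo=timezone.utc)
    return {
        "access_key_id": doc["AccessKeyId"],
        "secret_access_key": doc["SecretAccessKey"],
        "session_token": doc.get("SessionToken"),
        "expiration": expiration,
    }


class ProcessCredentialCache:
    """Run `command` for credentials and re-run it shortly before they expire."""

    def __init__(self, command, refresh_margin=timedelta(seconds=60)):
        self.command = command  # argv list, e.g. ["my-helper", "--role", "x"]
        self.refresh_margin = refresh_margin
        self._cached = None

    def _is_fresh(self, now):
        if self._cached is None:
            return False
        expiration = self._cached["expiration"]
        # No Expiration means the credentials are treated as long-lived.
        return expiration is None or now + self.refresh_margin < expiration

    def get(self, now=None):
        now = now or datetime.now(timezone.utc)
        if not self._is_fresh(now):
            # check=True raises if the helper exits with a non-zero status.
            out = subprocess.run(
                self.command, check=True, capture_output=True, text=True
            )
            self._cached = parse_credentials(out.stdout)
        return self._cached
```

The command is passed as an argv list rather than through a shell, which avoids shell interpretation of the configured string; a real implementation would still need to decide how that command gets configured and whether the feature must be explicitly enabled.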

What can we do?

We can then extend the AmazonS3Builder to support this use case via an environment variable.

@edmondop added the enhancement label Sep 19, 2024
@ByteBaker
Contributor

@alamb since the linked PR is closed, can we mark this as closed?

@edmondop
Author

This is a new issue; although it is related to the linked issues, it is not closed.

@alamb
Contributor

alamb commented Sep 24, 2024

For additional context see #5143. Copying some of the info here:

I think the use case this feature would support is:

  1. User uses object_store indirectly via polars
  2. polars does not provide any way to modify / configure s3 connections at runtime

Since users don't control the polars source or distribution, they cannot use the existing object_store CredentialProvider trait.

The proposal on this ticket is to add a mechanism that can call out to an external program / process to get credentials. While less efficient, this would allow someone to plug in whatever authentication mechanism they wanted without having to change the source code.

@tustvold notes that we need to ensure this type of mechanism does not compromise system security (e.g. perhaps it has to be enabled by default).

Also, he mentioned that the Azure client has something similar, MicrosoftAzureBuilder::with_use_azure_cli, that we could use as a model.

@alamb
Contributor

alamb commented Sep 28, 2024

See also related ticket in pola-rs that @tustvold filed: pola-rs/polars#18979

@tustvold
Contributor

To be clear, I view this very much as a hack around an API limitation in Polars. I would prefer we try to fix it there before resorting to this: pola-rs/polars#18979 (comment)

@edmondop
Author

Can you explain why you think credential process support is related to Polars? To me it is a gap in AmazonS3Builder in object_store.

@tustvold
Contributor

Your original request concerned supporting a broader range of auth within the context of polars. Credential process support was then proposed as a way to work around the inability to override the credential configuration within polars. By fixing this limitation of polars we not only provide a way for users to use credential process, via an AWS SDK that implements it, but also the full flexibility of all the other auth possibilities exposed by these SDKs.

I'd naturally prefer the solution that gives users the most flexibility and avoids needing to revisit this again when someone comes along requesting SSO or similar

@alamb
Contributor

alamb commented Sep 30, 2024

I'd naturally prefer the solution that gives users the most flexibility and avoids needing to revisit this again when someone comes along requesting SSO or similar

I would also like to avoid a similar conversation with users of systems other than polars.

I agree that, in an ideal world, polars would implement user APIs to fully configure S3 auth via object_store using the existing APIs.

Even if they did this, however, I think we will continue to have similar conversations with other downtream users

I view this credential process not as a hack but as a general-purpose configuration convention that works for any subsequent user (similar to how object_store also supports the standard AWS configuration environment variables without any downstream crate configuration).

@tustvold
Contributor

tustvold commented Sep 30, 2024

Even if they did this, however, I think we will continue to have similar conversations with other downtream users

I think this proposal doesn't meaningfully help in this regard because:

  • It is specific to AWS
  • AWS only supports credential process as part of AWS_PROFILE which we don't support

So we'd end up with users continuing to come here asking about this, and we'd have to direct them to some object_store-specific environment configuration to call out to some external process they have to set up.

Encouraging the downstreams to expose, or otherwise utilize the object_store credential provider API would avoid this entirely.

@rtyler
Contributor

rtyler commented Nov 12, 2024

fwiw, the deltalake-aws crate glues the aws-sdk and a CredentialProvider together, which is what I think most people generally want, so Polars can yoink that code if they'd like, or we could yoink it into a different shared crate 🤷

@tustvold
Contributor

Polars exposed it as part of their API and configured it to use boto when available.

I'm going to close this as I don't think it is planned anymore.

@tustvold closed this as not planned Nov 12, 2024
@alamb added the object-store label Nov 16, 2024