-
Notifications
You must be signed in to change notification settings - Fork 249
Initial SigV4 Auth Support for Catalog Federation #1805
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
42458a7
to
ef131e2
Compare
db0f8c9
to
5246c12
Compare
5246c12
to
f123692
Compare
f123692
to
b5bc2b5
Compare
Previously, the `maxCode` was generated using `AuthenticationType.values()`. It worked fine because Polaris supported the same number of Connection and Authentication types. The enum conversion would break if we add new authenication or connection types independently as in apache#1805.
Previously, the `maxCode` was generated using `AuthenticationType.values()`. It worked fine because Polaris supported the same number of Connection and Authentication types. The enum conversion would break if we add new authenication or connection types independently as in apache#1805.
polaris-core/src/main/java/org/apache/polaris/core/identity/mutation/EntityMutationEngine.java
Show resolved
Hide resolved
* based on configured or contextual logic (e.g., injecting service identity info, computing derived | ||
* fields). | ||
*/ | ||
public interface EntityMutationEngine { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unfortunate that we haven't had a chance yet to make PolarisBaseEntity
immutable, but maybe our naming for this set of classes should be something that more strongly implies producing a "new transformed copy of the entity" rather than make it tempting for anyone to try mutating them in-place.
What does everyone think of naming these instead:
- EntityTransformationEngine/EntityTransformer/TransformationPoint
- EntityDecorationEngine/EntityDecorator/DecorationPoint
?
connectionConfigInfoDpo.withServiceIdentity( | ||
serviceIdentityRegistry.assignServiceIdentity(ServiceIdentityType.AWS_IAM)); | ||
builder.setConnectionConfigInfoDpo(injectedConnectionConfigInfoDpo); | ||
builder.setEntityVersion(entity.getEntityVersion() + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally we should only increment the entityVersion if the intermediate version is possibly visible to the outside world. It may or may not end up being harmless to modify the version multiple times during construction of the entity, but this would be inconsistent with other steps of the entity-building process.
On the other side, it's true that it might be nice to have something that represents the complete entity contents as a version or hashCode so that we can compare across different transformations.
For now, we should probably avoid incrementing the entityVersion here for consistency with how other things also still construct parts of the entity during creation/updates without incrementing the version for every intermediate state.
...ava/org/apache/polaris/core/persistence/transactional/TransactionalMetaStoreManagerImpl.java
Outdated
Show resolved
Hide resolved
@@ -473,6 +475,12 @@ private void revokeGrantRecord( | |||
|
|||
ms.persistStorageIntegrationIfNeededInCurrentTxn(callCtx, catalog, integration); | |||
|
|||
// CATALOG_PRE_PERSIST mutation point |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, unfortunately we didn't quite manage to refactor all of the shared logic nryerrn AtomicOperationMetaStoreManager
and TransactionalMetaStoreManagerImpl
into a base class yet; the best we have is the prepareToPersistNewEntity
hook defined in BaseMetaStoreManager
.
This means either this logic needs to be added in both the AtomicOperationMetaStoreManager
and TransactionalMetaStoreManagerImpl
, or needs to be abstracted into BaseMetaStoreManager
somehow while making sure both of those impls invoke the shared logic. persistNewEntity
isn't specific to catalog-creation so you'd need an enum on entityType there if you just put this logic in there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it, thanks for catching this issue and the suggestion! Let me revise my PR!
@@ -188,6 +188,17 @@ polaris.oidc.principal-roles-mapper.type=default | |||
# polaris.storage.gcp.token=token | |||
# polaris.storage.gcp.lifespan=PT1H | |||
|
|||
# Polaris Entity Mutation Config | |||
polaris.entity-mutation.mutators=no-op,catalog-connection-config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the one hand, it's useful to be able to have this kind of generalized transformer/mutator hook, but if the hook is also strongly coupled to particular features, we don't want to require separate configuration of the info injection here vs the core feature (e.g. sigv4) somewhere else.
Especially if we end up having feature configurations for the core feature, then it's important that the feature itself auto-configures itself to be functional. In a way then the classes that define the sigv4 functionality should be what injects the mutator/transformer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey Dennis, that's a good point, perhaps we can consider making some transformers tied to the feature parameters rather than the config itself if they are part of the core functionality of the feature.
Introduction
This PR adds SigV4 authentication support for Catalog Federation in Polaris, enabling integrations with services like AWS Glue.
It introduces a complete and extensible credential management flow based on the Apache Polaris Creds Management Proposal. The key design goals are to:
The implementation supports both self-managed (single-tenant) and SaaS (multi-tenant) deployments, while ensuring a clean separation between user-supplied authentication parameters and Polaris-managed identities/credentials.
How to review this PR?
This PR touches many test files because a new field was added to
PolarisCallContext
.Pls focus on the core logic. You can safely ignore most of the test churn, please concentrate on the new components and flow listed below.
For the class name, I'm open to choose other names to make it clearer.
End-to-End flow
The full flow of accessing remote catalog now looks like:
ConnectionConfigInfoModel
with SigV4 authserviceIdentity
field is left nullCatalogEntityConnectionConfigMutator
injectsServiceIdentityInfoDpo
into the entity usingServiceIdentityRegistry
ConnectionConfigInfoDpo
identityReference
for credential lookupPolarisCredentialManager
usesServiceIdentityRegistry
to resolve aResolvedServiceIdentity
Key Components Introduced
ServiceIdentityInfoModel
: The model we use to surface polaris service identity to users.ServiceIdentityInfoDpo
: Persisted identity info injected into the catalog entity (includes hidden identityReference), the dpo won't contain the credential.ResolvedServiceIdentity
: Encapsulates resolved metadata and credentials needed to interact with cloud services.ServiceIdentityRegistry
: Interface for managing and resolving service identities.DefaultServiceIdentityRegistry
: Loads service identities from static Quarkus config (per realm, per identity type).CatalogEntityConnectionConfigMutator
: An EntityMutator that injects the appropriate service identity based on config.EntityMutationEngine
: Applies configured mutators to polaris entities at some injection points.PolarisCredentialManager
/DefaultPolarisCredentialManager
: Central interface for retrieving credentials to access external services (initially supports connection config, extensible to storage).Example Configuration:
Here is the example of configuring service identities for each realm statically. But for real production, service vendor should implement their own
ServiceIdentityRegistry
to assign/resolve service identity dynamically.Next Steps:
Flowchart
A demo video will come soon