-
Notifications
You must be signed in to change notification settings - Fork 2.7k
GCP: Add Google Authentication Support #13212
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Hi @nastra, when you have time Could you review this ? |
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adutra Thank you for reviewing
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
…-specific in this manner.
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthSession.java
Outdated
Show resolved
Hide resolved
@@ -62,6 +62,9 @@ public class GCPProperties implements Serializable { | |||
*/ | |||
public static final int GCS_DELETE_BATCH_SIZE_DEFAULT = 50; | |||
|
|||
public static final String GCP_CREDENTIALS_PATH_PROPERTY = "gcp.auth.credentials-path"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm currently not sure whether these properties should be here vs in GoogleAuthManager
. The properties in this class are typically related to the FileIO implementation of GCS, whereas these new properties aren't
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved these configurations to GoogleAuthManager. May be we should rename GCPProperties.java as GCSProperties.java ?
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/test/java/org/apache/iceberg/gcp/auth/TestGoogleAuthSession.java
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adutra @nastra - we are helping review a similar PR in PyIceberg, and I wanted to bring this discussion here first.
With the AuthManager, we now have a pluggable way of introducing authentication mechanisms to Iceberg clients. Given that's the case, do we want to encourage vendors to implement and manager their AuthManager separately from this project, or encourage them to contribute and have those code implementations maintained in the parent repositry?
Dremio seems to have taken the former approach: https://github.com/dremio/iceberg-auth-manager
Whereas there's precedent for us supporting AWS SigV4 authentication on the project itself: https://github.com/apache/iceberg/blob/main/aws/src/main/java/org/apache/iceberg/aws/RESTSigV4AuthManager.java
Would this warrant a mailing list discussion?
@sungwy thanks for raising the question!
Once you have a pluggable API, you cannot prevent users from implementing it :-) IOW, there will necessarily be vendor-specific implementations living outside of Iceberg repo eventually. But any implementation that is generic enough to be useful to most Iceberg users deserves being integrated into Iceberg. I think we need to strike a balance between the two possibilities and decide on a case-by-case basis. For example, this PR does introduce a vendor-specific Auth Manager, since it can only work when running on Google's infra. But since it's fairly generic, I don't see any problem in having this Auth Manager in Iceberg.
That is not true: Dremio open-sourced this project recently, but it does not contain anything Dremio-specific. It is a greenfield project that aims at providing an alternate implementation of the Auth Manager API for OAuth, offering support for many more features than the ones supported by the built-in OAuth manager. It is actually Dremio's intention to donate this project, but indeed that conversation didn't officially start yet.
Yes, definitely! I will start this discussion shortly. |
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthManager.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthSession.java
Outdated
Show resolved
Hide resolved
gcp/src/main/java/org/apache/iceberg/gcp/auth/GoogleAuthSession.java
Outdated
Show resolved
Hide resolved
return newHeaders.equals(request.headers()) | ||
? request | ||
: ImmutableHTTPRequest.builder().from(request).headers(newHeaders).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why would we need this? I don't think we reuse requests and if we did, the putIfAbsent
would resolve it. Seems like unnecessary complexity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to keep this because we re-use headers. https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/rest/HTTPClient.java#L256
If someone add Authorizationheader it can broke workflow. I dont want to allow that behaviour.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @talatuyarer
This PR introduces a dedicated GoogleAuthManager for the Iceberg REST client, enabling robust and streamlined authentication with Google Cloud Platform (GCP) services by delegating authentication to GCP's auth library.
Why add a specific GoogleAuthManager?
Currently, Iceberg's AuthManager and the generic OAuth2Manager provide a good baseline for authentication. However, connecting to services on Google Cloud often uses specific patterns that a generic manager doesn't fully simplify. This new GoogleAuthManager makes life easier for folks using Iceberg REST catalogs behind GCP Auth Service
The existing OAuth2Manager is great for standard OAuth2 flows (like client credentials) but it does not support authorization code workflow.
The GoogleAuthManager introduces authentication methods that are more convenient for GCP users, differing from the generic OAuth2 flow:
While our current OAuth2Manager can be used with tokens from service accounts, it doesn't natively understand refresh token workflow and the ADC discovery process or how to directly consume a service account JSON file for its credentials.
This dedicated manager ensures that Iceberg can authenticate with GCP-backed REST catalog services in the most efficient, secure, and user-friendly manner, aligning with common GCP practices. The GoogleAuthManager bridges this gap for a more seamless Iceberg Rest experience on GCP.