-
Notifications
You must be signed in to change notification settings - Fork 2.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
AWS,Core: Add S3 REST Signer client + REST Spec #6169
Conversation
07dc9f2
to
b2f7900
Compare
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Show resolved
Hide resolved
1 general question regarding this PR, @nastra @rdblue @danielcweeks this is a feature very specific to AWS S3. What is the general guideline in the community for adding this as a part of the OpenAPI spec? In my mind the spec should be something generic for Iceberg, and not related to any specific cloud provider. Although many cloud storage providers have built S3 compatible API, but mostly just do it to some extent to support basic operations like get and put object. I would imaging signing with sigv4 is still a very AWS-specific thing, but maybe I am wrong about that. Overall, have we thought about making this spec more generic, or is that not possible to achieve without being cloud provider specific? |
@jackye1995 I think the purpose behind the open-api doc is just to be clear about what the expectations are from a call/response/error perspective. While it has some overlap with the other open-api doc, they are separate entities. This signer is specific to AWS and I think that's fine since the signer implementation is in the As a general pattern, all cloud providers sign requests in some form and we may see similar implementations using different protocols in the future, but it may also be that this can be generalized in the future. |
I have moved the S3 Signer REST Spec to the |
} | ||
} | ||
|
||
@SuppressWarnings("checkstyle:NestedTryDepth") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using more standard loading would avoid this problem so we don't need to suppress it.
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Outdated
Show resolved
Hide resolved
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Show resolved
Hide resolved
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Outdated
Show resolved
Hide resolved
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1. There are some very minor things (like how we set kebab case is deprecated), but other than that, this looks good.
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Outdated
Show resolved
Hide resolved
private static final Cache<Key, SignedComponent> SIGNED_COMPONENT_CACHE = | ||
Caffeine.newBuilder().expireAfterWrite(30, TimeUnit.SECONDS).maximumSize(100).build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just curious, is it worth making the cache expiration and size configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think for now we'd probably go as is and follow up and make it configurable if we see that it helps with certain use cases.
aws/src/main/java/org/apache/iceberg/aws/s3/signer/S3V4RestSignerClient.java
Outdated
Show resolved
Hide resolved
This introduces an S3 REST signer client and defines a REST spec (`s3-signer-open-api.yml`) for a server implementation. Below is a high-level overview of the introduced changes: * the main logic and functionality resides in the `S3V4RestSignerClient` class * it uses the same **credential/token** exchange flow as we have in `RESTSessionCatalog` and also uses the same token refresh mechanism. In order to achieve that, a few refactorings have been done in `RESTSessionCatalog` / `OAuth2Util`. * the default endpoint the signer connects to is `v1/aws/s3/sign` but can be customized. * The server decides which headers to sign and can indicate to the `S3V4RestSignerClient` whether a response with signed headers can be cached by sending a `Cache-Control: private` header * `AwsProperties` introduce `s3.signer.class` that allows to dynamically load an S3 Signer implementation and apply it when creating an S3 client. This can be any Signer class that implements `software.amazon.awssdk.core.signer.Signer`. * `S3SignRequest` and `S3SignResponse` classes define how the request and response looks like * an `S3ObjectMapper` class has been introduced that is similar to `RESTObjectMapper` but only contains what's necessary for the S3 REST signer, which are the request/response classes with OAuth-related classes and error handling. * Testing is done by using `MinioContainer` (`TestContainers` + `MinIO`) in `TestS3RestSigner` * The `S3SignerServlet` defines the minimum amount of work that a server-side implementation might have. It is by no means complete and only serves the purpose of testing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the fixes, looks good to me!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me @nastra!
Thanks @nastra !! |
* AWS,Core: Add S3 Signer client + REST Spec This introduces an S3 REST signer client and defines a REST spec (`s3-signer-open-api.yml`) for a server implementation. Below is a high-level overview of the introduced changes: * the main logic and functionality resides in the `S3V4RestSignerClient` class * it uses the same **credential/token** exchange flow as we have in `RESTSessionCatalog` and also uses the same token refresh mechanism. In order to achieve that, a few refactorings have been done in `RESTSessionCatalog` / `OAuth2Util`. * the default endpoint the signer connects to is `v1/aws/s3/sign` but can be customized. * The server decides which headers to sign and can indicate to the `S3V4RestSignerClient` whether a response with signed headers can be cached by sending a `Cache-Control: private` header * `AwsProperties` introduce `s3.signer.class` that allows to dynamically load an S3 Signer implementation and apply it when creating an S3 client. This can be any Signer class that implements `software.amazon.awssdk.core.signer.Signer`. * `S3SignRequest` and `S3SignResponse` classes define how the request and response looks like * an `S3ObjectMapper` class has been introduced that is similar to `RESTObjectMapper` but only contains what's necessary for the S3 REST signer, which are the request/response classes with OAuth-related classes and error handling. * Testing is done by using `MinioContainer` (`TestContainers` + `MinIO`) in `TestS3RestSigner` * The `S3SignerServlet` defines the minimum amount of work that a server-side implementation might have. It is by no means complete and only serves the purpose of testing * make http client static * review feedback * change dynamic loading of signer
This introduces an S3 REST signer client and defines a REST spec (
s3-signer-open-api.yml
) for a server implementation. Below is a high-level overview of the introduced changes:S3V4RestSignerClient
classRESTSessionCatalog
and also uses the same token refresh mechanism. In order to achieve that, a few refactorings have been done inRESTSessionCatalog
/OAuth2Util
.v1/aws/s3/sign
but can be customized.S3V4RestSignerClient
whether a response with signed headers can be cached by sending aCache-Control: private
headerAwsProperties
introduces3.signer.class
that allows to dynamically load an S3 Signer implementation and apply it when creating an S3 client. This can be any Signer class that implementssoftware.amazon.awssdk.core.signer.Signer
.S3SignRequest
andS3SignResponse
classes define how the request and response looks likeS3ObjectMapper
class has been introduced that is similar toRESTObjectMapper
but only contains what's necessary for the S3 REST signer, which are the request/response classes with OAuth-related classes and error handling.MinioContainer
(TestContainers
+MinIO
) inTestS3RestSigner
S3SignerServlet
defines the minimum amount of work that a server-side implementation might have. It is by no means complete and only serves the purpose of testing