Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for SigV4 request signing to REST Catalog requests #6951

Merged
merged 1 commit into from
Mar 6, 2023

Conversation

danielcweeks
Copy link
Contributor

This PR adds support for configuring the HTTPClient to allow for a request interceptor to be configured and an implementation in the AWS module for signing all requests with the SigV4 protocol.

@@ -64,13 +71,31 @@ public class HTTPClient implements RESTClient {
private final String uri;
private final CloseableHttpClient httpClient;
private final ObjectMapper mapper;
private final Map<String, String> baseHeaders;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default headers are now built as part of the client builder. This should have no affect on behavior since they are immutable in this implementation.

@danielcweeks
Copy link
Contributor Author

@jackye1995 It would be great if someone from the AWS could validate how we're configuration and signing these requests to make sure they're SigV4 compliant. We're leveraging the SDKv2 for most of the mechanics, but verification from your side would be great since I don't know if we can actually validate the signatures are correct.

Copy link
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mostly looks good to me! just some nit comments

.signingRegion(signingRegion)
.awsCredentials(credentialsProvider.resolveCredentials())
.checksumParams(
SignerChecksumParams.builder()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackye1995 it appears the SDK is producing an incorrect content hash for requests without a body. Based on the docs, SigV4 should produce e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855 as the body hash for empty requests (GET, HEAD). However, the actual value is 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU= which appears to be a base64 encoded representation.

We can work around this by either omitting the hash in those cases, which seems to work, or overriding the produce value, but it seams like we should be able to use the SDK like this by default.

Any thoughts on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The content hash appears to be added here, but this looks like the wrong encoding based on the docs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I filed an issue with SDK maintainers to see what they they say, but added a workaround to correctly produce the base16 encoded empty body sha256 value.

@danielcweeks danielcweeks requested a review from rdblue March 3, 2023 23:24
* #REST_SIGV4_SESSION_TOKEN} is set, session credential is used, otherwise basic credential is
* used.
*/
public static final String REST_SIGV4_ACCESS_KEY_ID = "rest.sigv4.access-key-id";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need both rest and sigv4 prefixes?

If we removed "rest" then this would appear more generic because it isn't clear what SigV4 connections this applies to. We should probably keep rest then.

If we remove "sigv4" then we would have rest.access-key-id, rest.session-token, rest.sigv4-enabled, etc. Those seem fairly reasonable to me... but I'm definitely not sure they belong in AwsProperties.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the issue here is that there are multiple aws keys used and just using rest isn't clear that it's for request signing vs fileio (which has an s3 prefix).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are also a lot of other AWS services that are configurable, so I think we want to be pretty explicit about the fact that these credentials are used only for this specific purpose.

Copy link
Contributor

@rdblue rdblue Mar 4, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't rest a specific enough purpose, without sigv4? That would also make these more consistent with other properties that don't add sig4v, like s3.access-key-id.

If the concern is that this conflicts with other properties in this class, then I'd fix it by moving them rather than adding more to a user-facing property. These are more associated with REST than with AWS. I think it's fine to have them here, but I don't think we should make user-facing properties longer because we decided to put them in this class vs another.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the property names

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest. also sounds good to me!

@danielcweeks danielcweeks force-pushed the rest-sigv4 branch 3 times, most recently from fd669c3 to 8b2f9fa Compare March 5, 2023 18:43
Copy link
Contributor

@rdblue rdblue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good. One minor comment on headers.

@rdblue
Copy link
Contributor

rdblue commented Mar 6, 2023

The Java build-checks task was stuck after running for almost 3 hours. I just restarted it.

/** The service name to be used by the SigV4 protocol for signing requests. */
public static final String REST_SIGNING_NAME = "rest.signing-name";

/** The default service name (API Gateway and lambda) used during SigV4 signing. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I think this is for API Gateway only? It does not have anything to do with Lambda.

public static final String REST_SIGNING_NAME = "rest.signing-name";

/** The default service name (API Gateway and lambda) used during SigV4 signing. */
public static final String REST_SIGNING_NAME_DEFAULT = "execute-api";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: regarding Ryan's comment in #6951 (comment), there are some other variations of this internally, but publicly this is the only possible value. We could name this REST_SIGNING_NAME_APIGATEWAY and then do REST_SIGNING_NAME_DEFAULT = REST_SIGNING_NAME_APIGATEWAY, but that seems like an overkill for now we can keep it as is until any further variation come.

Copy link
Contributor

@jackye1995 jackye1995 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me!

@danielcweeks danielcweeks merged commit 50863d7 into apache:master Mar 6, 2023
@rdblue
Copy link
Contributor

rdblue commented Mar 7, 2023

Thanks for reviewing this, @jackye1995!

krvikash pushed a commit to krvikash/iceberg that referenced this pull request Mar 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants