-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-14661. Add S3 requester pays bucket support to S3A #3962
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
b3c8ff2
to
a401c36
Compare
Looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
@mukund-thakur would you be able to review this PR? |
Sure will check early next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good. Some minor comments.
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/S3ATestConstants.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this, sorry I've been unreliable at reviewing.
if the SDK does now attach the requester pays header everywhere then we don't need the stuff in the request factory...that was in as the discussion implied SDK coverage was incomplete.
Test-wise, made some comments. I think it also might be good to try and do some more API calls to make sure the requests are tagged, listing in particular
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/Constants.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/troubleshooting_s3a.md
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Show resolved
Hide resolved
Thank you both for the comments, I have just pushed a few changes. I also removed unused configuration from the RequestFactoryImpl - I am assuming since its under |
yes. though we do regularly found out where that's not always true (s3 client factory, store context). how do we find out? change things and see what breaks, then add comments in the javadocs to warn others |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i like the tests, much cleaner now. made some minor suggestions there.
if you can do those and cover how to change/disable the tests in testing.md, this is ready to go in
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Outdated
Show resolved
Hide resolved
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/ITestS3ARequesterPays.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
this is good to go. one little detail, which is those line endings in the markdown.
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md:1638:S3A supports buckets with
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md:1639:[Requester Pays](https://docs.aws.amazon.com/AmazonS3/latest/userguide/RequesterPaysBuckets.html)
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md:1640:enabled. When a bucket is configured with requester pays, the requester must cover
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/index.md:1643:For requests to be successful, the S3 client must acknowledge that they will pay
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md:598:By default, the requester pays tests will look for a bucket that exists on Amazon S3
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md:602:The test only requires an object of at least a few bytes in order
hadoop-tools/hadoop-aws/src/site/markdown/tools/hadoop-aws/testing.md:612:If the endpoint does not support requester pays, you can also disable the tests by configuring
can you fix these as the github merge button doesn't do it for us?
@steveloughran I caught those after I pushed most of the other fixes from the last feedback and pushed a second commit ae8e812. There's two Yetus comments, latest one seems happy. Are we good to merge? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1, I am happy
merged to trunk, will cherrypick to 3.3 locally, retest and if all is good merge there too |
Adds the option fs.s3a.requester.pays.enabled, which, if set to true, allows the client to access S3 buckets where the requester is billed for the IO. Contributed by Daniel Carl Jones Change-Id: I51f64d0f9b3be3c4ec493bcf91927fca3b20407a
Thanks Steve, Mukund, and Monthon for the reviews! |
Adds the option fs.s3a.requester.pays.enabled, which, if set to true, allows the client to access S3 buckets where the requester is billed for the IO. Contributed by Daniel Carl Jones
Description of PR
This change introduces support for "requester pays" S3 buckets (HADOOP-14661).
There was already some discussion and patches made available (e.g. #250), although the code base has changed a bit since then.
In summary:
fs.s3a.requester-pays.enabled
) when creating S3A filesystem, which uses S3ClientFactory to create a new S3Client also taking this optionusgs-landsat
bucket (new version, not old) which is configured to require requester pays ('usgs-landsat' on Registry of Open Data on AWS).How was this patch tested?
I created an EC2 instance and an S3 Bucket in eu-west-1 and ran the tests against the global S3 endpoint.
Note, the request pays specific tests introduced do not use the developer supplied bucket.
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?