HADOOP-17198. Support S3 Access Points #3958


steveloughran
Contributor

HADOOP-17198. Support S3 Access Points (#3260)

Add support for S3 Access Points. This provides extra security as it
ensures applications are not working with buckets belonging to third parties.

To bind a bucket to an access point, set the access point (ap) ARN,
which must be done for each specific bucket, using the pattern

fs.s3a.bucket.$BUCKET.accesspoint.arn = ARN

  • Set the global/bucket option fs.s3a.accesspoint.required to
    mandate that buckets declare their access point.
  • This is not compatible with S3Guard.

Consult the documentation for further details.
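
As a rough illustration of the per-bucket pattern above, the following sketch expands the option names for one bucket. It uses a plain `Map` rather than Hadoop's configuration classes, and the class name `AccessPointOptions`, the bucket name, and the ARN value are all hypothetical placeholders; only the two option names come from the commit message.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: builds the option names described in the commit message. */
final class AccessPointOptions {
  // Global form of the option that mandates an access point.
  static final String REQUIRED_KEY = "fs.s3a.accesspoint.required";

  /** Expands the pattern fs.s3a.bucket.$BUCKET.accesspoint.arn for one bucket. */
  static String arnKey(String bucket) {
    return "fs.s3a.bucket." + bucket + ".accesspoint.arn";
  }

  /** Returns the key/value pairs that bind a bucket to an access point ARN. */
  static Map<String, String> bind(String bucket, String arn, boolean required) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(arnKey(bucket), arn);
    conf.put(REQUIRED_KEY, Boolean.toString(required));
    return conf;
  }

  public static void main(String[] args) {
    // Placeholder bucket, account id and access point name.
    bind("example-bucket",
        "arn:aws:s3:eu-west-1:123456789012:accesspoint/example-ap",
        true)
        .forEach((k, v) -> System.out.println(k + " = " + v));
  }
}
```

In a real deployment these pairs would normally be set in the Hadoop configuration files rather than built in code.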

Contributed by Bogdan Stolojan

(this commit contains the changes to TestArnResource from HADOOP-18068,
"upgrade AWS SDK to 1.12.132" so that it works with the later SDK.)

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

Change-Id: I3fac213e52ca6ec1c813effb8496c353964b8e1b
…he#3516)


Follow-on to HADOOP-17198. Support S3 Access Points

Contributed by Bogdan Stolojan
@steveloughran
Contributor Author

this is bogdan's patch of #3958 #3954 done as a cherrypick of the two changes, reconciliation of the s3guard removal and fixup for the sdk update.

testing in progress

@steveloughran
Contributor Author

this is @bogthe's work, I've just dealt with the merge conflict caused by the s3guard cut.

testing s3 london. assuming it and the yetus tests are happy, i will do the merge

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 14m 10s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | markdownlint | 0m 0s | | markdownlint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 7 new or modified test files. |
| | | | | _ branch-3.3 Compile Tests _ |
| +0 🆗 | mvndep | 3m 42s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 33m 22s | | branch-3.3 passed |
| +1 💚 | compile | 19m 45s | | branch-3.3 passed |
| +1 💚 | checkstyle | 3m 22s | | branch-3.3 passed |
| +1 💚 | mvnsite | 2m 48s | | branch-3.3 passed |
| +1 💚 | javadoc | 2m 30s | | branch-3.3 passed |
| +1 💚 | spotbugs | 4m 14s | | branch-3.3 passed |
| +1 💚 | shadedclient | 29m 6s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 28s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 53s | | the patch passed |
| +1 💚 | compile | 21m 55s | | the patch passed |
| +1 💚 | javac | 21m 55s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 3m 9s | | the patch passed |
| +1 💚 | mvnsite | 2m 33s | | the patch passed |
| +1 💚 | xml | 0m 2s | | The patch has no ill-formed XML file. |
| +1 💚 | javadoc | 1m 44s | | hadoop-common in the patch passed. |
| +1 💚 | javadoc | 0m 40s | | hadoop-tools_hadoop-aws generated 0 new + 39 unchanged - 1 fixed = 39 total (was 40) |
| +1 💚 | spotbugs | 4m 41s | | the patch passed |
| +1 💚 | shadedclient | 28m 45s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 19m 5s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 2m 36s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 1m 8s | | The patch does not generate ASF License warnings. |
| | | 203m 38s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/1/artifact/out/Dockerfile |
| GITHUB PR | #3958 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell xml spotbugs checkstyle markdownlint |
| uname | Linux d9a67b9b152b 4.15.0-153-generic #160-Ubuntu SMP Thu Jul 29 06:54:29 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 8088f34 |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/1/testReport/ |
| Max. process+thread count | 2996 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/1/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.

@bogthe
Contributor

bogthe commented Feb 4, 2022

thanks for this!

@steveloughran
Contributor Author

hmm

[ERROR] testAccessPointRequired(org.apache.hadoop.fs.s3a.ITestS3ABucketExistence)  Time elapsed: 0.768 s  <<< ERROR!
java.lang.IllegalArgumentException: The region field of the ARN being passed as a bucket parameter to an S3 operation does not match the region the client was configured with. Provided region: 'eu-west-1'; client region: 'accesspoint-eu-west-1'.
	at com.amazonaws.services.s3.AmazonS3Client.validateIsTrue(AmazonS3Client.java:6584)
	at com.amazonaws.services.s3.AmazonS3Client.validateS3ResourceArn(AmazonS3Client.java:5155)
	at com.amazonaws.services.s3.AmazonS3Client.createRequest(AmazonS3Client.java:4956)
	at com.amazonaws.services.s3.AmazonS3Client.createRequest(AmazonS3Client.java:4920)
	at com.amazonaws.services.s3.AmazonS3Client.getAcl(AmazonS3Client.java:4040)
	at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1278)
	at com.amazonaws.services.s3.AmazonS3Client.getBucketAcl(AmazonS3Client.java:1268)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExistsV2$2(S3AFileSystem.java:731)
	at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:499)
	at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:119)
	at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:348)
	at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:440)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:344)
	at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:319)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExistsV2(S3AFileSystem.java:724)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.doBucketProbing(S3AFileSystem.java:611)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:506)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
	at org.apache.hadoop.fs.s3a.ITestS3ABucketExistence.lambda$testAccessPointRequired$14(ITestS3ABucketExistence.java:189)
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:498)
	at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
	at org.apache.hadoop.fs.s3a.ITestS3ABucketExistence.expectUnknownStore(ITestS3ABucketExistence.java:103)
	at org.apache.hadoop.fs.s3a.ITestS3ABucketExistence.testAccessPointRequired(ITestS3ABucketExistence.java:188)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
	at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
	at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
	at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.lang.Thread.run(Thread.java:748)

[INFO]

sdk merge pain. the 3.3.2 patch is actually easier here
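
For readers unfamiliar with the failure above, here is a minimal sketch of the kind of check that produces it: the region segment of the ARN is compared with the region the client believes it is configured for. This is not the SDK's code; the class `ArnRegionCheck` and its methods are hypothetical, and the account id and access point name are placeholders.

```java
/** Illustrative only: mimics the ARN-region versus client-region comparison. */
final class ArnRegionCheck {

  /** Returns the region segment of arn:partition:service:region:account:resource. */
  static String regionOf(String arn) {
    String[] parts = arn.split(":", 6);
    if (parts.length != 6 || !"arn".equals(parts[0])) {
      throw new IllegalArgumentException("Not an ARN: " + arn);
    }
    return parts[3];
  }

  /** Throws if the ARN's region differs from the client's configured region. */
  static void requireSameRegion(String arn, String clientRegion) {
    String arnRegion = regionOf(arn);
    if (!arnRegion.equals(clientRegion)) {
      throw new IllegalArgumentException("Provided region: '" + arnRegion
          + "'; client region: '" + clientRegion + "'.");
    }
  }

  public static void main(String[] args) {
    String arn = "arn:aws:s3:eu-west-1:123456789012:accesspoint/example-ap";
    requireSameRegion(arn, "eu-west-1"); // passes silently
    try {
      // The mis-detected client region reported in the stack trace.
      requireSameRegion(arn, "accesspoint-eu-west-1");
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```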

@bogthe
Contributor

bogthe commented Feb 4, 2022

This patch makes the AP feature independent of the SDK upgrade. #3902

…lation (apache#3902)


Part of HADOOP-17198. Support S3 Access Points.

HADOOP-18068. "upgrade AWS SDK to 1.12.132" broke the access point endpoint
translation.

Correct endpoints should start with "s3-accesspoint."; after the SDK upgrade they start with
"s3.accesspoint-", which breaks the tests and the SDK's region detection.

Contributed by Bogdan Stolojan
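
To see why the endpoint prefix matters for region detection, consider a deliberately naive parser that treats the second DNS label of the host as the region. This is only an assumption made for illustration, not how the SDK is implemented; the class `EndpointRegion` and the full hostnames are hypothetical, with only the two prefixes taken from the commit message.

```java
/** Illustrative only: a naive "second label is the region" endpoint parser. */
final class EndpointRegion {

  /** Assumes a host shaped like <service>.<region>.amazonaws.com. */
  static String regionFromHost(String host) {
    String[] labels = host.split("\\.");
    if (labels.length < 4) {
      throw new IllegalArgumentException("Unexpected endpoint: " + host);
    }
    return labels[1];
  }

  public static void main(String[] args) {
    // Expected prefix: the region label is intact.
    System.out.println(regionFromHost("s3-accesspoint.eu-west-1.amazonaws.com"));
    // prints "eu-west-1"

    // Prefix seen after the SDK upgrade: the region label is polluted.
    System.out.println(regionFromHost("s3.accesspoint-eu-west-1.amazonaws.com"));
    // prints "accesspoint-eu-west-1"
  }
}
```

The second result matches the client region reported in the test failure earlier in this thread.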
@steveloughran
Contributor Author

confirmed; applying that change to this branch fixes it all.

i'm rerunning the test suites for safety, but then i'm going to merge in this branch as the three separate changes.


@steveloughran
Contributor Author

ok, reran all tests against s3 london, all good. merging locally as a chain of commits, rather than squashing in through the github web interface

@sunchao
Member

sunchao commented Feb 4, 2022

Thanks @steveloughran! Can this be cherry-picked cleanly into branch-3.3.2, or do we need another PR?

@bogthe
Contributor

bogthe commented Feb 4, 2022

awesome work @steveloughran thank you!

@hadoop-yetus

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:--------|
| +0 🆗 | reexec | 0m 41s | | Docker mode activated. |
| | | | | _ Prechecks _ |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 1s | | codespell was not available. |
| +0 🆗 | markdownlint | 0m 1s | | markdownlint was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | test4tests | 0m 0s | | The patch appears to include 7 new or modified test files. |
| | | | | _ branch-3.3 Compile Tests _ |
| +0 🆗 | mvndep | 3m 31s | | Maven dependency ordering for branch |
| +1 💚 | mvninstall | 30m 5s | | branch-3.3 passed |
| +1 💚 | compile | 17m 29s | | branch-3.3 passed |
| +1 💚 | checkstyle | 2m 43s | | branch-3.3 passed |
| +1 💚 | mvnsite | 2m 32s | | branch-3.3 passed |
| +1 💚 | javadoc | 2m 34s | | branch-3.3 passed |
| +1 💚 | spotbugs | 3m 42s | | branch-3.3 passed |
| +1 💚 | shadedclient | 24m 2s | | branch has no errors when building and testing our client artifacts. |
| | | | | _ Patch Compile Tests _ |
| +0 🆗 | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 💚 | mvninstall | 1m 31s | | the patch passed |
| +1 💚 | compile | 16m 45s | | the patch passed |
| +1 💚 | javac | 16m 45s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 ⚠️ | checkstyle | 2m 34s | /results-checkstyle-root.txt | root: The patch generated 1 new + 17 unchanged - 0 fixed = 18 total (was 17) |
| +1 💚 | mvnsite | 2m 32s | | the patch passed |
| +1 💚 | xml | 0m 1s | | The patch has no ill-formed XML file. |
| +1 💚 | javadoc | 1m 39s | | hadoop-common in the patch passed. |
| +1 💚 | javadoc | 0m 46s | | hadoop-tools_hadoop-aws generated 0 new + 39 unchanged - 1 fixed = 39 total (was 40) |
| +1 💚 | spotbugs | 3m 57s | | the patch passed |
| +1 💚 | shadedclient | 25m 25s | | patch has no errors when building and testing our client artifacts. |
| | | | | _ Other Tests _ |
| +1 💚 | unit | 17m 3s | | hadoop-common in the patch passed. |
| +1 💚 | unit | 2m 15s | | hadoop-aws in the patch passed. |
| +1 💚 | asflicense | 0m 59s | | The patch does not generate ASF License warnings. |
| | | 165m 41s | | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/2/artifact/out/Dockerfile |
| GITHUB PR | #3958 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell xml spotbugs checkstyle markdownlint |
| uname | Linux 08abbe2e20c7 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | branch-3.3 / 305ab1a |
| Default Java | Private Build-1.8.0_312-8u312-b07-0ubuntu1~18.04-b07 |
| Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/2/testReport/ |
| Max. process+thread count | 1251 (vs. ulimit of 5500) |
| modules | C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws U: . |
| Console output | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3958/2/console |
| versions | git=2.17.1 maven=3.6.0 spotbugs=4.2.2 |
| Powered by | Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org |

This message was automatically generated.
