Skip to content

HADOOP-13551. AWS metrics wire-up #2778

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 24, 2021

Conversation

steveloughran
Copy link
Contributor

This moves to a parameter-class and builder pattern for passing
in configuration parameters to the S3 client factory, so as to
stop adding any future parameters from breaking external implementations.

Extracted from HADOOP-17511 for isolated review/commit

@steveloughran
Copy link
Contributor Author

Tested, s3 london, with the new suite to force tests against landsat and common crawl.

If you comment out the withForceGlobalBucketAccessEnabled() change in DefaultS3ClientFactory the new test suite ITestAWSStatisticCollection fails with

org.apache.hadoop.fs.s3a.AWSRedirectException: getObjectMetadata on s3a://landsat-pds/scene_list.gz: com.amazonaws.services.s3.model.AmazonS3Exception: The bucket is in this region: us-west-2. Please use this region to retry the request (Service: Amazon S3; Status Code: 301; Error Code: 301 Moved Permanently; 

That is: central endpoint's 301 responses don't trigger a switch to the endpoint.

Maybe we should include that in troubleshooting

@steveloughran
Copy link
Contributor Author

Maybe we should include that in troubleshooting

already in it

@steveloughran
Copy link
Contributor Author

Full test run complete against s3 london; sole failure is #2777

@steveloughran
Copy link
Contributor Author

tests against S3 ireland with endpoint option unset; bucket unversioned.

Test failures are (1) transient and (2) fixed in another PR respectively.


-Dparallel-tests -DtestsThreadCount=8 -Dmarkers=delete -Ds3guard -Ddynamo

...

[ERROR] Failures:
[ERROR]   ITestS3AContractUnbuffer>AbstractContractUnbufferTest.testUnbufferBeforeRead:63->AbstractContractUnbufferTest.validateFullFileContents:132->AbstractContractUnbufferTest.validateFileContents:139->Assert.assertEquals:645->Assert.failNotEquals:834->Assert.fail:88 failed to read expected number of bytes from stream. This may be transient expected:<1024> but was:<93>
[ERROR]   ITestAssumeRole.testAssumeRoleBadInnerAuth:256->expectFileSystemCreateFailure:136  Expected to find 'not a valid key=value pair (missing equal-sign) in Authorization header' but got unexpected exception: org.apache.hadoop.fs.s3a.AWSBadRequestException: Instantiate org.apache.hadoop.fs.s3a.auth.AssumedRoleCredentialProvider: com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException: null (Service: AWSSecurityTokenService; Status Code: 400; Error Code: IncompleteSignature; Request ID: 20f68ae9-221a-49eb-8ecb-2b5eb8a1134c; Proxy: null):IncompleteSignature: null (Service: AWSSecurityTokenService; Status Code: 400; Error Code: IncompleteSignature; Request ID: 20f68ae9-221a-49eb-8ecb-2b5eb8a1134c; Proxy: null)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:243)

@steveloughran steveloughran force-pushed the s3/HADOOP-13551-aws-metrics branch 2 times, most recently from 5d2075b to 0923b23 Compare March 17, 2021 14:03
Moves to the builder API for AWS S3 client creation, and
offers a similar-style API to the S3AFS and tests, hiding
the details of what options are client, what are AWS Conf,
and doing the wiring up of S3A stats interfaces to the AWS
internals.

This will break HBase's HBoss test suite, but I couldn't see a way to
cleanly address this. I'll need to work on that code separately

Change-Id: I4fb05f4a54eece8fac96abf8d6af9c889392a6c1
@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 37s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 42s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 37s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 29s trunk passed
+1 💚 mvnsite 0m 44s trunk passed
+1 💚 javadoc 0m 24s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 32s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 8s trunk passed
+1 💚 shadedclient 13m 57s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 36s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 36s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 0m 20s /results-checkstyle-hadoop-tools_hadoop-aws.txt hadoop-tools/hadoop-aws: The patch generated 1 new + 17 unchanged - 4 fixed = 18 total (was 21)
+1 💚 mvnsite 0m 32s the patch passed
+1 💚 javadoc 0m 16s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
-1 ❌ javadoc 0m 22s /patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.txt hadoop-aws in the patch failed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08.
+1 💚 spotbugs 1m 8s the patch passed
+1 💚 shadedclient 13m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 3s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
73m 43s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/5/artifact/out/Dockerfile
GITHUB PR #2778
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 4913e90d4081 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 0923b23028e484361b2165b36435ac36e4fe1010
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/5/testReport/
Max. process+thread count 670 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/5/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Mar 17, 2021
@apache apache deleted a comment from hadoop-yetus Mar 17, 2021
@apache apache deleted a comment from hadoop-yetus Mar 17, 2021
@apache apache deleted a comment from hadoop-yetus Mar 17, 2021
@steveloughran
Copy link
Contributor Author

[WARNING] The requested profile "docs" could not be activated because it does not exist.
[ERROR] Failed to execute goal org.apache.maven.plugins:maven-javadoc-plugin:3.0.1:javadoc-no-fork (default-cli) on project hadoop-aws: An error has occurred in Javadoc report generation: 
[ERROR] Exit code: 1 - /home/jenkins/jenkins-home/workspace/hadoop-multibranch_PR-2778/src/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/DefaultS3ClientFactory.java:71: error: reference not found
[ERROR]    * and then invoking {@link #buildAmazonS3Client(AWSCredentialsProvider, ClientConfiguration, S3ClientCreationParameters, String, boolean)}

Copy link
Contributor

@mehakmeet mehakmeet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, some nits. Tested on ap-south-1.

private static int invocationCount = 0;

private static AtomicInteger instantiationCount = new AtomicInteger(0);
private static AtomicInteger invocationCount = new AtomicInteger(0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could make this final


/**
* Use the (newer) Builder SDK to create a an AWS S3 client.
* Use the Builder API to create a an AWS S3 client.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a

@steveloughran steveloughran force-pushed the s3/HADOOP-13551-aws-metrics branch from 0923b23 to 66c0d43 Compare March 18, 2021 15:16
Change-Id: I9ea831f5fec48ebcc7d08fe0e5d6918f8dfa2786
@steveloughran steveloughran force-pushed the s3/HADOOP-13551-aws-metrics branch from 66c0d43 to 470d551 Compare March 18, 2021 15:20
Change-Id: I6dd96b6ea4a69e1d11fca66c133ca5e880f57331
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 35s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 5 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 32m 50s trunk passed
+1 💚 compile 0m 44s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 compile 0m 37s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 checkstyle 0m 31s trunk passed
+1 💚 mvnsite 0m 45s trunk passed
+1 💚 javadoc 0m 25s trunk passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 33s trunk passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 6s trunk passed
+1 💚 shadedclient 13m 52s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+1 💚 mvninstall 0m 34s the patch passed
+1 💚 compile 0m 34s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javac 0m 34s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 javac 0m 29s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
+1 💚 checkstyle 0m 19s hadoop-tools/hadoop-aws: The patch generated 0 new + 17 unchanged - 4 fixed = 17 total (was 21)
+1 💚 mvnsite 0m 33s the patch passed
+1 💚 javadoc 0m 14s the patch passed with JDK Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04
+1 💚 javadoc 0m 24s the patch passed with JDK Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
+1 💚 spotbugs 1m 8s the patch passed
+1 💚 shadedclient 13m 54s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 2m 3s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 33s The patch does not generate ASF License warnings.
73m 45s
Subsystem Report/Notes
Docker ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/8/artifact/out/Dockerfile
GITHUB PR #2778
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell
uname Linux 4130bc87c1b5 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 23fe6f4
Default Java Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.10+9-Ubuntu-0ubuntu1.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_282-8u282-b08-0ubuntu1~20.04-b08
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/8/testReport/
Max. process+thread count 672 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2778/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@mukund-thakur mukund-thakur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should include that in troubleshooting

already in it

Have you added the doc changes because I don't see them here.

LGTM, after the doc clarification.
Ran the raw tests using ap-south-1. All good.

@@ -226,31 +170,6 @@ protected static AmazonS3 configureAmazonS3Client(AmazonS3 s3,
throw new IllegalArgumentException(msg, e);
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a bit of commentry about path style access was good.

@apache apache deleted a comment from hadoop-yetus Mar 23, 2021
@apache apache deleted a comment from hadoop-yetus Mar 23, 2021
@steveloughran
Copy link
Contributor Author

Have you added the doc changes because I don't see them here.

There's a lot of stuff in the docs about endpoints, including the exception we've been seeing here. Nothing else to add

Copy link
Contributor

@mukund-thakur mukund-thakur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM +1

@steveloughran
Copy link
Contributor Author

thanks!

@steveloughran steveloughran merged commit 04880f0 into apache:trunk Mar 24, 2021
steveloughran added a commit to steveloughran/hadoop that referenced this pull request Mar 24, 2021
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.
asfgit pushed a commit that referenced this pull request Mar 25, 2021
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.

Change-Id: Ifa6caa8ff56369ad30e4fd01a42bc74f7b8b3d6b
kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.

Change-Id: Ifa6caa8ff56369ad30e4fd01a42bc74f7b8b3d6b
(cherry picked from commit b32b08e1d208c45cf645d63a4630f34bfee5322e)
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
jojochuang pushed a commit to jojochuang/hadoop that referenced this pull request May 23, 2023
Moves to the builder API for AWS S3 client creation, and
offers a similar style of API to the S3A FileSystem and tests, hiding
the details of which options are client, which are in AWS Conf,
and doing the wiring up of S3A statistics interfaces to the AWS
SDK internals. S3A Statistics, including IOStatistics, should now
count throttling events handled in the AWS SDK itself.

This patch restores endpoint determination by probes to US-East-1
if the client isn't configured with fs.s3a.endpoint.

Explicitly setting the endpoint will save the cost of these probe
HTTP requests.

Contributed by Steve Loughran.

Change-Id: Ifa6caa8ff56369ad30e4fd01a42bc74f7b8b3d6b
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants