HADOOP-19399. S3A: support CRT client. #7443

Open
wants to merge 2 commits into
base: trunk

ahmarsuhail
Contributor

@ahmarsuhail ahmarsuhail commented Feb 28, 2025

Description of PR

Adds support for the S3 CRT client.

  • Configuration mapping between clients is getting tricky, so I moved the endpoint/region resolution logic out so that it can be reused.

  • CRT does not support client-level headers for user agent and requester pays, so those have to be added at the per-request level. They are only added for copyObject() and putObject(), as those are the only two places CRT will be used in S3A.
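
As a rough illustration of the per-request approach, here is a minimal plain-Java sketch. It models request headers as a map rather than using SDK types, so the class and method names are hypothetical and not taken from this patch; in the AWS SDK for Java v2 the equivalent would presumably go through the request's override configuration. The only fixed detail is the S3 header `x-amz-request-payer: requester`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: models request headers as a plain map, not SDK types. */
final class PerRequestHeaders {

  /** Header S3 uses to mark a request against a requester-pays bucket. */
  static final String REQUESTER_PAYS_HEADER = "x-amz-request-payer";

  private PerRequestHeaders() {
  }

  /**
   * Returns a copy of the request headers with the settings that would
   * otherwise have been configured once at the client level.
   */
  static Map<String, String> apply(Map<String, String> requestHeaders,
      String userAgent, boolean requesterPays) {
    Map<String, String> headers = new LinkedHashMap<>(requestHeaders);
    if (userAgent != null && !userAgent.isEmpty()) {
      headers.put("User-Agent", userAgent);
    }
    if (requesterPays) {
      headers.put(REQUESTER_PAYS_HEADER, "requester");
    }
    return headers;
  }
}
```

Every request builder that can be routed to CRT (here, put and copy) would need to call such a helper, which is why the change is limited to those two operations.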

How was this patch tested?

Tested in us-east-2 with mvn -Dparallel-tests -DtestsThreadCount=16 clean verify. All good; the failures in ITestS3AEndpointRegion and ITestAwsSdkWorkarounds are related to the SDK upgrade.

For code changes:

  • Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
  • Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?
  • If applicable, have you updated the LICENSE, LICENSE-binary, NOTICE-binary files?

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 50s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 28s Maven dependency ordering for branch
+1 💚 mvninstall 36m 5s trunk passed
+1 💚 compile 17m 35s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 40s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 36s trunk passed
+1 💚 mvnsite 1m 34s trunk passed
+1 💚 javadoc 1m 30s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 23s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 42s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 39m 42s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 0m 48s the patch passed
+1 💚 compile 17m 10s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 17m 10s the patch passed
+1 💚 compile 15m 16s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 16s the patch passed
+1 💚 blanks 0m 1s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 39s /results-checkstyle-root.txt root: The patch generated 6 new + 12 unchanged - 0 fixed = 18 total (was 12)
+1 💚 mvnsite 1m 31s the patch passed
-1 ❌ javadoc 0m 50s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 ❌ javadoc 0m 47s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+0 🆗 spotbugs 0m 36s hadoop-project has no data from spotbugs
+1 💚 shadedclient 39m 49s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 33s hadoop-project in the patch passed.
-1 ❌ unit 3m 21s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 1m 2s The patch does not generate ASF License warnings.
221m 7s
Reason Tests
Failed junit tests hadoop.fs.s3a.impl.TestClientManager
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/1/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux 7146247aa952 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 982826f
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/1/testReport/
Max. process+thread count 581 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/1/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@@ -0,0 +1,70 @@
package org.apache.hadoop.fs.s3a.impl;

Contributor Author

@steveloughran I'm starting to work on the CRT changes and wanted to get your opinion on this. What do you think about refactoring this region logic so that we have our own builder and container class, AWSRegionEndpointInformation, which all clients can get information out of?

Without this, you have to duplicate the endpoint/region logic for CRT and the other clients, since they use different builder classes. CRT uses S3CrtAsyncClientBuilder, which does not extend S3BaseClientBuilder, so you can't use the existing method.
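
To make the idea concrete, here is a minimal sketch of such a container in plain Java. The class name `RegionEndpointInfo` and its `applyTo` method are hypothetical stand-ins, not the actual AWSRegionEndpointInformation code; the point is only that resolved values can be pushed into unrelated builder types through setter callbacks.

```java
import java.net.URI;
import java.util.function.Consumer;

/**
 * Hypothetical stand-in for the region/endpoint container discussed above.
 * Field and method names are illustrative, not the PR's actual code.
 */
final class RegionEndpointInfo {

  /** Resolved region name, or null if none was resolved. */
  private final String region;

  /** Endpoint override, or null to use the service default. */
  private final URI endpoint;

  RegionEndpointInfo(String region, URI endpoint) {
    this.region = region;
    this.endpoint = endpoint;
  }

  String getRegion() {
    return region;
  }

  URI getEndpoint() {
    return endpoint;
  }

  /**
   * Pushes the resolved values into any client builder through setter
   * callbacks, so builders without a common base type can share one
   * resolution path. A value is only applied when it is non-null.
   */
  void applyTo(Consumer<String> regionSetter, Consumer<URI> endpointSetter) {
    if (region != null) {
      regionSetter.accept(region);
    }
    if (endpoint != null) {
      endpointSetter.accept(endpoint);
    }
  }
}
```

Each client factory would pass in the setters of its own builder, so the resolution logic lives in one place and only the thin adapter differs per client. How the real SDK builders would be wired in is an assumption left out of the sketch.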

Contributor

Pulling out is good.

We need to fix that region resolution logic:

  • work in EC2/k8s without going out of region, automatically
  • add an option to use the default chain without the "add a space" trick, e.g. a region called "(sdk)". Not backwards compatible, but makes clear what happens.
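
A sentinel like that could be checked along these lines. This is only a sketch: the class and method names are hypothetical, and "(sdk)" is the value floated above rather than an existing option. It deliberately leaves the handling of unset or blank values to the caller.

```java
/** Hypothetical helper; "(sdk)" is the sentinel floated above, not an existing option. */
final class RegionSetting {

  /** Region value that would hand region resolution over to the SDK. */
  static final String SDK_REGION = "(sdk)";

  private RegionSetting() {
  }

  /**
   * Returns true only when the configured region explicitly asks for the
   * SDK's default region chain. Unset and blank values return false here;
   * how those should behave is left to the caller.
   */
  static boolean delegatesToSdk(String configuredRegion) {
    return configuredRegion != null
        && SDK_REGION.equals(configuredRegion.trim());
  }
}
```

An explicit value makes the intent visible in the configuration, instead of relying on whitespace to mean "let the SDK decide".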

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 18m 57s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 41s Maven dependency ordering for branch
+1 💚 mvninstall 36m 22s trunk passed
+1 💚 compile 17m 33s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 13s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 36s trunk passed
+1 💚 mvnsite 1m 32s trunk passed
+1 💚 javadoc 1m 29s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 20s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 40s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 39m 15s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 35s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 20s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 16m 3s /patch-compile-root-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ javac 16m 3s /patch-compile-root-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 14m 23s /patch-compile-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root in the patch failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
-1 ❌ javac 14m 23s /patch-compile-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root in the patch failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 33s /results-checkstyle-root.txt root: The patch generated 22 new + 21 unchanged - 3 fixed = 43 total (was 24)
-1 ❌ mvnsite 0m 43s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ javadoc 0m 49s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
-1 ❌ javadoc 0m 47s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0)
+0 🆗 spotbugs 0m 36s hadoop-project has no data from spotbugs
-1 ❌ spotbugs 0m 42s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 38m 28s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 34s hadoop-project in the patch passed.
-1 ❌ unit 0m 45s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ asflicense 1m 3s /results-asflicense.txt The patch generated 2 ASF License warnings.
231m 54s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/2/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle
uname Linux 59e71f68690e 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 1e9f143
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/2/testReport/
Max. process+thread count 535 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/2/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Mar 7, 2025
@ahmarsuhail ahmarsuhail force-pushed the HADOOP-19399-crt-client-support branch from badf0db to 032c9a8 Compare March 7, 2025 13:06
@ahmarsuhail ahmarsuhail marked this pull request as ready for review March 7, 2025 13:06
@@ -324,6 +326,19 @@
</properties>
</profile>

<!-- Use the S3 CRT client -->
Contributor Author

This doesn't seem to be working, by the way. I ran mvn clean verify -Dit.test=ITestAwsSdkWorkarounds -Dtest=none -Pcrt, expecting ITestAwsSdkWorkarounds to fail when using CRT, but it still passes. Need to look into it.

Contributor

ITestAwsSdkWorkarounds stops looking for the error text with the SDK update patch.

@ahmarsuhail
Contributor Author

@steveloughran @mukund-thakur @shameersss1

Could you please review? Internal TPC-DS benchmarks for AAL seem to be consistently showing better numbers with CRT, possibly due to the CRT's enhanced connection management, so want to get this in for 3.4.2.

durationTrackerFactory,
STORE_CLIENT_CREATION.getSymbol(),
() -> clientFactory.createS3AsyncClient(getUri(), clientCreationParameters));
return trackDurationOfOperation(durationTrackerFactory, STORE_CLIENT_CREATION.getSymbol(),
Contributor Author

this should actually go into the client factory, not here. moving this over

if (regionEndpointInformation.getEndpoint() == null) {
s3CrtAsyncClientBuilder.endpointOverride(regionEndpointInformation.getEndpoint());
}

@rajdchak rajdchak Mar 7, 2025

This doesn't look right. We are doing a null check for both the region and the endpoint and then setting to that if null?

Contributor Author

thanks, it was not right :) fixed!

Contributor

i'm thinking we need a special region "sdk" to let the sdk take over region resolution; would that help here too?

@@ -0,0 +1,232 @@
/*

Can't we modify the configureEndpointAndRegion method in the DefaultS3ClientFactory class to support AWSRegionEndpointInformation, instead of creating a new class for this?

Contributor Author

I'd prefer to move it out to a separate class; this logic doesn't really belong in the client factory, IMO.

Contributor

This stuff is getting too complicated, and the endpoint/region code is a significant source of pain. Isolation, more tests, etc. would be good.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 5m 38s Maven dependency ordering for branch
+1 💚 mvninstall 19m 16s trunk passed
+1 💚 compile 8m 26s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 7m 23s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 1m 59s trunk passed
+1 💚 mvnsite 1m 5s trunk passed
+1 💚 javadoc 1m 1s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 59s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 31s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 21m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 9m 28s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 9m 28s the patch passed
+1 💚 compile 8m 9s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 8m 9s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 10 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 2m 10s /results-checkstyle-root.txt root: The patch generated 12 new + 21 unchanged - 3 fixed = 33 total (was 24)
+1 💚 mvnsite 0m 49s the patch passed
-1 ❌ javadoc 0m 33s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-1 ❌ javadoc 0m 34s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
+0 🆗 spotbugs 0m 26s hadoop-project has no data from spotbugs
+1 💚 shadedclient 20m 43s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 24s hadoop-project in the patch passed.
+1 💚 unit 2m 36s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 36s The patch does not generate ASF License warnings.
120m 33s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/9/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 226abc4127b3 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 04f83c6
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/9/testReport/
Max. process+thread count 555 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/9/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 19m 4s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 5m 57s Maven dependency ordering for branch
+1 💚 mvninstall 36m 40s trunk passed
+1 💚 compile 17m 51s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 13s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 34s trunk passed
+1 💚 mvnsite 1m 35s trunk passed
+1 💚 javadoc 1m 29s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 21s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 41s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 39m 34s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 35s Maven dependency ordering for patch
+1 💚 mvninstall 0m 47s the patch passed
+1 💚 compile 16m 52s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 16m 52s the patch passed
+1 💚 compile 15m 22s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 22s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 9 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 31s /results-checkstyle-root.txt root: The patch generated 13 new + 21 unchanged - 3 fixed = 34 total (was 24)
+1 💚 mvnsite 1m 31s the patch passed
-1 ❌ javadoc 0m 51s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-1 ❌ javadoc 0m 47s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
+0 🆗 spotbugs 0m 36s hadoop-project has no data from spotbugs
+1 💚 shadedclient 39m 30s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 34s hadoop-project in the patch passed.
-1 ❌ unit 3m 21s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch passed.
+1 💚 asflicense 1m 3s The patch does not generate ASF License warnings.
238m 21s
Reason Tests
Failed junit tests hadoop.fs.s3a.impl.TestClientManager
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/8/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 7233acf0b060 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 032c9a8
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/8/testReport/
Max. process+thread count 525 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/8/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 20s Maven dependency ordering for branch
+1 💚 mvninstall 32m 45s trunk passed
+1 💚 compile 15m 48s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 1s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 31s trunk passed
+1 💚 mvnsite 1m 32s trunk passed
+1 💚 javadoc 1m 49s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 17s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 38s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 35m 23s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 35s Maven dependency ordering for patch
+1 💚 mvninstall 0m 43s the patch passed
+1 💚 compile 15m 29s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 15m 29s the patch passed
+1 💚 compile 15m 19s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 19s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 9 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 26s /results-checkstyle-root.txt root: The patch generated 13 new + 21 unchanged - 3 fixed = 34 total (was 24)
+1 💚 mvnsite 1m 28s the patch passed
-1 ❌ javadoc 0m 57s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-1 ❌ javadoc 0m 41s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
+0 🆗 spotbugs 0m 32s hadoop-project has no data from spotbugs
+1 💚 shadedclient 36m 57s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 32s hadoop-project in the patch passed.
+1 💚 unit 3m 19s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 57s The patch does not generate ASF License warnings.
205m 46s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/10/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 891171ec517d 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / f72e64a
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/10/testReport/
Max. process+thread count 554 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/10/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 56s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 6 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 4s Maven dependency ordering for branch
+1 💚 mvninstall 32m 42s trunk passed
+1 💚 compile 17m 34s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 14m 23s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 9s trunk passed
+1 💚 mvnsite 1m 28s trunk passed
+1 💚 javadoc 1m 23s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 16s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 38s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 36m 35s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 0m 42s the patch passed
+1 💚 compile 16m 26s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 16m 26s the patch passed
+1 💚 compile 15m 12s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 12s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 9 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 7s /results-checkstyle-root.txt root: The patch generated 13 new + 21 unchanged - 3 fixed = 34 total (was 24)
+1 💚 mvnsite 1m 25s the patch passed
-1 ❌ javadoc 0m 49s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt hadoop-tools_hadoop-aws-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
-1 ❌ javadoc 0m 44s /results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 2 new + 0 unchanged - 0 fixed = 2 total (was 0)
+0 🆗 spotbugs 0m 33s hadoop-project has no data from spotbugs
+1 💚 shadedclient 38m 7s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 36s hadoop-project in the patch passed.
+1 💚 unit 3m 26s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 5s The patch does not generate ASF License warnings.
208m 27s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/11/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux cf3296fc488f 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / cab8a41
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/11/testReport/
Max. process+thread count 729 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/11/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@shameersss1
Contributor

> @steveloughran @mukund-thakur @shameersss1
>
> Could you please review? Internal TPC-DS benchmarks for AAL seem to be consistently showing better numbers with CRT, possibly due to the CRT's enhanced connection management, so want to get this in for 3.4.2.

Great to hear we see improvements in performance. Just curious: was this for the Parquet file format? And what was the % improvement?

}

if (regionEndpointInformation.getEndpoint() != null) {
s3CrtAsyncClientBuilder.endpointOverride(regionEndpointInformation.getEndpoint());
Contributor

Is setting both endpoint and region expected? Won't the client throw an exception in that case?

Contributor Author

yeah, I haven't made any changes to the logic, just refactored it a bit. We've always had the ability for users to set both the endpoint and the region. I think it may be useful if you want to use a specific endpoint for a region (e.g. FIPS enabled?). This whole logic does need a bit of a rethink and rewrite, see: https://issues.apache.org/jira/browse/HADOOP-19470

@@ -356,7 +362,7 @@ public PutObjectRequest.Builder newPutObjectRequestBuilder(String key,
}

// Set the timeout for object uploads but not directory markers.
Contributor

edit the comment as well

@shameersss1
Contributor

@ahmarsuhail: I see an effort to upgrade SDK V2: https://issues.apache.org/jira/browse/HADOOP-19485

I am not sure whether that will cause any conflicts here. Wouldn't it be better to start this after the upgrade? Is there any dependency between the SDK version and the CRT version?

@ahmarsuhail
Contributor Author

thanks @shameersss1, yeah agreed makes sense to wait for the SDK upgrade to go in first. i just wanted to get the code up for review, but will test again after the upgrade

Contributor

@steveloughran steveloughran left a comment

This is the time to get that region logic under control. Pulling it out helps, and makes for a change that is easier to cherry-pick.

what do we actually need here?

@@ -1767,6 +1767,48 @@ To disable checksum verification in `distcp`, use the `-skipcrccheck` option:
hadoop distcp -update -skipcrccheck -numListstatusThreads 40 /user/alice/datasets s3a://alice-backup/datasets
```

### <a name="distcp"></a> Using the S3 CRT client
Contributor

pull out all crtclient stuff into its own doc

<dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
<scope>compile</scope>
Contributor

provided, unless it really is to be shipped

Contributor

let's leave it at provided, but have the docs indicate this can be left out if the option is disabled

Contributor Author

do you mean leave it at compile? or should I move it to provided?

Also just to confirm, only difference is if it's provided it won't get included in the final release tar, and anyone who wants to use it has to put the jar there?

public S3AsyncClient createS3AsyncClient(final URI uri,
final S3ClientCreationParameters parameters) throws IOException {
if (parameters.isCrtEnabled()) {
LOG_S3_CRT_ENABLED.info("The S3 CRT client is enabled");
Contributor

why not at debug? if everything works, there's no need to print anything


The [AWS CRT-based S3 client](https://docs.aws.amazon.com/sdk-for-java/latest/developer-guide/crt-based-s3-client.html)
is built on top of the AWS Common Runtime (CRT), is an alternative S3 asynchronous client. It has
enhanced connection pool management, and can provide higher transfer from and to S3 due to its
Contributor

higher transfer what? bandwidth, latency, support calls?

also, s3a block output stream splits PUT requests, and vector io/analytics does the GET stuff, so what else does it offer? load balancing?

@@ -0,0 +1,70 @@
package org.apache.hadoop.fs.s3a.impl;

Contributor

Pulling out is good.

we need to fix that region resolution logic

  • work in ec2/k8s without going out of region, automatically
  • add an option to use the default chain without the "add a space" trick, e.g. a region called "(sdk)". Not backwards compatible, but makes clear what happens.

@@ -0,0 +1,232 @@
/*
Contributor

this stuff is getting too complicated and the endpoint/region code is a significant source of pain. Isolation, more tests etc. would be good.

// region configuration was set to empty string.
// allow this if people really want it; it is OK to rely on this
// when deployed in EC2.
WARN_OF_DEFAULT_REGION_CHAIN.warn(SDK_REGION_CHAIN_IN_USE);
Contributor

let's just log at debug

@steveloughran steveloughran changed the title HADOOP-19399. Adds in support for CRT client. HADOOP-19399. S3A: support CRT client. Mar 28, 2025
Contributor

@steveloughran steveloughran left a comment

commented

@ahmarsuhail I think you'd missed that I'd already commented on it, so now I'm adding more homework for you

@@ -205,6 +205,7 @@
<surefire.fork.timeout>900</surefire.fork.timeout>
<aws-java-sdk.version>1.12.720</aws-java-sdk.version>
<aws-java-sdk-v2.version>2.25.53</aws-java-sdk-v2.version>
<software.amazon.awssdk.crt.version>0.29.11</software.amazon.awssdk.crt.version>
Contributor

change to aws-crt-client.version

<dependency>
<groupId>software.amazon.awssdk.crt</groupId>
<artifactId>aws-crt</artifactId>
<scope>compile</scope>
Contributor

let's leave it at provided, but have the docs indicate this can be left out if the option is disabled

@@ -175,6 +163,39 @@ public S3AsyncClient createS3AsyncClient(
return s3AsyncClientBuilder.build();
}

private S3AsyncClient createS3CrtAsyncClient(URI uri, S3ClientCreationParameters parameters)
Contributor

can we pull all async client creation into its own class, so as to be confident that there are no CRT client references elsewhere in the code? I don't want it to become mandatory on the classpath

Contributor Author

The CRT classes referenced here are all included in the SDK bundle. It's just that if you try to initialise the client and you don't have the CRT dependency on the classpath, it'll fail. Do you think we still need to move it?

Contributor

Safe if these are the only places of this source file where the crt classes are referenced.
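
As a rough illustration of the classpath concern above, one option is to probe for the CRT jar by class name before attempting to build the CRT client, so a missing jar can be reported clearly instead of surfacing as a linkage error. This is only a sketch under assumptions: `OptionalDependencyProbe` and `isClassAvailable` are hypothetical names that do not appear in this patch, and `software.amazon.awssdk.crt.CRT` is assumed to be a class shipped in the `aws-crt` artifact.

```java
/**
 * Hypothetical helper (not part of this PR): checks whether an optional
 * dependency is on the classpath without linking against it directly.
 */
final class OptionalDependencyProbe {

  /** Assumed marker class from the aws-crt jar; adjust if it differs. */
  static final String CRT_MARKER_CLASS = "software.amazon.awssdk.crt.CRT";

  private OptionalDependencyProbe() {
  }

  /**
   * Returns true if the named class can be located by this class's loader.
   * The class is looked up but not initialized, so no static
   * initializers (or native library loading) are triggered.
   */
  static boolean isClassAvailable(String className) {
    try {
      Class.forName(className, false,
          OptionalDependencyProbe.class.getClassLoader());
      return true;
    } catch (ClassNotFoundException | LinkageError e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("CRT on classpath: "
        + isClassAvailable(CRT_MARKER_CLASS));
  }
}
```

A factory could call such a probe when the CRT option is enabled and fail fast with a descriptive message if the jar is absent; whether that is preferable to letting the SDK fail on its own is a design choice for the patch.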

@@ -81,6 +81,7 @@ S3Client createS3Client(URI uri,
S3AsyncClient createS3AsyncClient(URI uri,
S3ClientCreationParameters parameters) throws IOException;


Contributor

nit: cut

region = Region.of(AWS_S3_DEFAULT_REGION);
builder.withRegion(region);
origin = "cross region access fallback";
} else if (configuredRegion.isEmpty()) {
Contributor

proposed: extra regions

  • "sdk" which explicitly switches to sdk resolution
  • "auto" which is "us doing the right thing"; document this may change across releases
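
To make the proposal concrete, here is a minimal sketch of how the two keywords might be told apart from a real region name. Everything in it is hypothetical: `RegionSetting`, `Mode` and `parse` are illustrative names, and treating a blank value as "use the SDK chain" is an assumption meant to mirror the existing empty-string branch shown in the diff, not a statement of what the patch does.

```java
import java.util.Locale;

/** Hypothetical sketch of the proposed region keywords; not the patch's code. */
final class RegionSetting {

  /** How the region should be determined. */
  enum Mode { EXPLICIT, SDK_CHAIN, AUTO }

  final Mode mode;
  /** Region name; only non-null when mode is EXPLICIT. */
  final String region;

  private RegionSetting(Mode mode, String region) {
    this.mode = mode;
    this.region = region;
  }

  /** Classify a configured region string (must not be null). */
  static RegionSetting parse(String configured) {
    String value = configured.trim().toLowerCase(Locale.ROOT);
    switch (value) {
    case "sdk":
      // explicit opt-in to the SDK's default region provider chain
      return new RegionSetting(Mode.SDK_CHAIN, null);
    case "auto":
      // connector picks the policy; may change across releases
      return new RegionSetting(Mode.AUTO, null);
    case "":
      // assumption: blank keeps today's "empty string means SDK chain" behaviour
      return new RegionSetting(Mode.SDK_CHAIN, null);
    default:
      return new RegionSetting(Mode.EXPLICIT, value);
    }
  }

  public static void main(String[] args) {
    RegionSetting s = parse(args.length > 0 ? args[0] : "auto");
    System.out.println(s.mode + " " + s.region);
  }
}
```

Keeping the classification separate from the client builder would also make it straightforward to unit test each keyword without creating an S3 client.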

}


protected static URI getS3Endpoint(String endpoint, final Configuration conf) {
Contributor

javadoc; highlight failure condition.

}

try {
return new URI(endpoint);
Contributor

need to think of unit tests to break this, e.g. https://something/http://something-else
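
The example in this comment is worth spelling out, because `java.net.URI` accepts it: the second URL simply becomes the path of the first. A sketch of a stricter check that such a unit test could exercise is below. `EndpointUriCheck` and `isPlausibleEndpoint` are hypothetical names, and the rule that an endpoint has a host and no path, query or fragment is an assumption for illustration; some third-party stores may legitimately need a path.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical validation sketch; not the getS3Endpoint() logic in the patch. */
final class EndpointUriCheck {

  private EndpointUriCheck() {
  }

  /**
   * Returns true only if the string parses as a URI with a host and
   * without any path (other than "/"), query or fragment.
   */
  static boolean isPlausibleEndpoint(String endpoint) {
    final URI uri;
    try {
      uri = new URI(endpoint);
    } catch (URISyntaxException e) {
      return false;
    }
    String path = uri.getPath();
    boolean pathOk = path == null || path.isEmpty() || "/".equals(path);
    return uri.getHost() != null
        && pathOk
        && uri.getQuery() == null
        && uri.getFragment() == null;
  }

  public static void main(String[] args) {
    String[] samples = {
        "https://s3.eu-west-1.amazonaws.com",
        "https://something/http://something-else",
        "not a uri"
    };
    for (String sample : samples) {
      System.out.println(sample + " -> " + isPlausibleEndpoint(sample));
    }
  }
}
```

The nested-URL case is rejected here only because of the path check; the `URI` constructor itself does not throw for it, which is why relying on `URISyntaxException` alone would not catch it.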

@@ -59,6 +60,7 @@ public class ClientManagerImpl

public static final Logger LOG = LoggerFactory.getLogger(ClientManagerImpl.class);


Contributor

cut

@ahmarsuhail
Contributor Author

ahmarsuhail commented Mar 31, 2025

This PR is dependent on the SDK upgrade: #7479

@@ -165,14 +153,41 @@ public S3AsyncClient createS3AsyncClient(
configureClientBuilder(S3AsyncClient.builder(), parameters, conf, bucket)
.httpClientBuilder(httpClientBuilder);

// multipart upload pending with HADOOP-19326.
Contributor Author

safe to cut this now, as the SDK no longer complains with `Multipart download is not yet supported. Instead use the CRT based S3 client for multipart download.` when trying to do a ranged GET with the S3 async client on SDK versions >2.29.52

@ahmarsuhail ahmarsuhail force-pushed the HADOOP-19399-crt-client-support branch 2 times, most recently from 57ad5f2 to e3122d0 Compare March 31, 2025 16:20
@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 8m 11s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 3s Maven dependency ordering for branch
+1 💚 mvninstall 18m 59s trunk passed
+1 💚 compile 8m 14s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 7m 17s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 1m 58s trunk passed
+1 💚 mvnsite 0m 59s trunk passed
+1 💚 javadoc 1m 1s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 55s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 31s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 20m 46s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 7m 59s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 7m 59s the patch passed
+1 💚 compile 7m 20s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 7m 20s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 13 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 1m 52s /results-checkstyle-root.txt root: The patch generated 18 new + 18 unchanged - 3 fixed = 36 total (was 21)
+1 💚 mvnsite 1m 0s the patch passed
+1 💚 javadoc 1m 0s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 58s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 26s hadoop-project has no data from spotbugs
+1 💚 shadedclient 20m 47s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 26s hadoop-project in the patch passed.
+1 💚 unit 2m 51s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 40s The patch does not generate ASF License warnings.
125m 39s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/16/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 58c239697864 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / db44e05
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/16/testReport/
Max. process+thread count 706 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/16/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 20m 32s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 32s Maven dependency ordering for branch
+1 💚 mvninstall 31m 35s trunk passed
+1 💚 compile 15m 46s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 13m 38s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 14s trunk passed
+1 💚 mvnsite 1m 38s trunk passed
+1 💚 javadoc 1m 35s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 28s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 45s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 34m 57s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 36s Maven dependency ordering for patch
+1 💚 mvninstall 0m 44s the patch passed
+1 💚 compile 14m 53s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 14m 53s the patch passed
+1 💚 compile 13m 35s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 13m 35s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 13 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 12s /results-checkstyle-root.txt root: The patch generated 18 new + 18 unchanged - 3 fixed = 36 total (was 21)
+1 💚 mvnsite 1m 37s the patch passed
+1 💚 javadoc 1m 33s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 28s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 39s hadoop-project has no data from spotbugs
+1 💚 shadedclient 35m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 39s hadoop-project in the patch passed.
+1 💚 unit 3m 43s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 3s The patch does not generate ASF License warnings.
219m 51s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/14/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 08d5473b5870 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 470b3dc
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/14/testReport/
Max. process+thread count 546 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/14/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 37m 10s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 1s codespell was not available.
+0 🆗 detsecrets 0m 1s detect-secrets was not available.
+0 🆗 xmllint 0m 1s xmllint was not available.
+0 🆗 markdownlint 0m 1s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 15s Maven dependency ordering for branch
+1 💚 mvninstall 36m 30s trunk passed
+1 💚 compile 17m 42s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 11s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 35s trunk passed
+1 💚 mvnsite 1m 33s trunk passed
+1 💚 javadoc 1m 31s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 21s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 42s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 40m 4s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 0m 43s the patch passed
+1 💚 compile 16m 43s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 16m 43s the patch passed
+1 💚 compile 15m 3s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 3s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 13 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 4m 33s /results-checkstyle-root.txt root: The patch generated 18 new + 18 unchanged - 3 fixed = 36 total (was 21)
+1 💚 mvnsite 1m 31s the patch passed
+1 💚 javadoc 1m 25s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 22s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 36s hadoop-project has no data from spotbugs
+1 💚 shadedclient 39m 26s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 35s hadoop-project in the patch passed.
+1 💚 unit 3m 35s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 0s The patch does not generate ASF License warnings.
256m 36s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/15/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 0c085e1c9780 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / db44e05
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/15/testReport/
Max. process+thread count 531 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/15/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

@ahmarsuhail given the SDK upgrade is still awaiting approval, maybe you'd want to look at it

@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 51s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 9m 7s Maven dependency ordering for branch
+1 💚 mvninstall 36m 26s trunk passed
+1 💚 compile 17m 34s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 15m 8s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 35s trunk passed
+1 💚 mvnsite 1m 32s trunk passed
+1 💚 javadoc 1m 30s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 25s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 43s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 39m 32s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 34s Maven dependency ordering for patch
+1 💚 mvninstall 0m 44s the patch passed
+1 💚 compile 16m 39s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 16m 39s the patch passed
+1 💚 compile 15m 3s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 15m 3s the patch passed
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 36s /results-checkstyle-root.txt root: The patch generated 10 new + 16 unchanged - 5 fixed = 26 total (was 21)
+1 💚 mvnsite 1m 30s the patch passed
+1 💚 javadoc 1m 27s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 22s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 37s hadoop-project has no data from spotbugs
+1 💚 shadedclient 40m 5s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 36s hadoop-project in the patch passed.
+1 💚 unit 3m 37s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 3s The patch does not generate ASF License warnings.
222m 53s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/17/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux 3d7218b5c42d 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 73a4783
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/17/testReport/
Max. process+thread count 524 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/17/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 55s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 1s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 4s Maven dependency ordering for branch
+1 💚 mvninstall 31m 56s trunk passed
+1 💚 compile 15m 37s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 13m 40s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 4m 15s trunk passed
+1 💚 mvnsite 1m 37s trunk passed
+1 💚 javadoc 1m 36s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 26s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 41s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 34m 26s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 37s Maven dependency ordering for patch
-1 ❌ mvninstall 0m 21s /patch-mvninstall-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
-1 ❌ compile 13m 59s /patch-compile-root-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ javac 13m 59s /patch-compile-root-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt root in the patch failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 12m 50s /patch-compile-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root in the patch failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
-1 ❌ javac 12m 50s /patch-compile-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root in the patch failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 6s /results-checkstyle-root.txt root: The patch generated 13 new + 17 unchanged - 5 fixed = 30 total (was 22)
-1 ❌ mvnsite 0m 46s /patch-mvnsite-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 javadoc 1m 31s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 27s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 37s hadoop-project has no data from spotbugs
-1 ❌ spotbugs 0m 44s /patch-spotbugs-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 shadedclient 34m 4s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 37s hadoop-project in the patch passed.
-1 ❌ unit 0m 45s /patch-unit-hadoop-tools_hadoop-aws.txt hadoop-aws in the patch failed.
+1 💚 asflicense 1m 1s The patch does not generate ASF License warnings.
192m 26s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/18/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux e110016c3b98 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 96fd7cb
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/18/testReport/
Max. process+thread count 563 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/18/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 2m 27s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 1s The patch does not contain any @author tags.
+1 💚 test4tests 0m 1s The patch appears to include 4 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 7m 29s Maven dependency ordering for branch
+1 💚 mvninstall 42m 3s trunk passed
-1 ❌ compile 21m 30s /branch-compile-root-jdkUbuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.txt root in trunk failed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04.
-1 ❌ compile 0m 47s /branch-compile-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root in trunk failed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.
-0 ⚠️ checkstyle 0m 29s /buildtool-branch-checkstyle-root.txt The patch fails to run checkstyle in root
+1 💚 mvnsite 1m 38s trunk passed
+1 💚 javadoc 1m 31s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 26s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 45s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
-1 ❌ shadedclient 49m 14s branch has errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 37s Maven dependency ordering for patch
+1 💚 mvninstall 0m 44s the patch passed
+1 💚 compile 15m 3s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 15m 3s the patch passed
+1 💚 compile 13m 40s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
-1 ❌ javac 13m 40s /results-compile-javac-root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06.txt root-jdkPrivateBuild-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06 generated 35 new + 0 unchanged - 0 fixed = 35 total (was 0)
+1 💚 blanks 0m 0s The patch has no blanks issues.
-0 ⚠️ checkstyle 4m 10s /results-checkstyle-root.txt root: The patch generated 30 new + 0 unchanged - 0 fixed = 30 total (was 0)
+1 💚 mvnsite 1m 38s the patch passed
+1 💚 javadoc 1m 33s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 1m 27s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 38s hadoop-project has no data from spotbugs
+1 💚 shadedclient 35m 10s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 39s hadoop-project in the patch passed.
+1 💚 unit 3m 40s hadoop-aws in the patch passed.
+1 💚 asflicense 1m 4s The patch does not generate ASF License warnings.
216m 29s
Subsystem Report/Notes
Docker ClientAPI=1.48 ServerAPI=1.48 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/19/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux e09a6191dbc1 5.15.0-130-generic #140-Ubuntu SMP Wed Dec 18 17:59:53 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / ea3ee79
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/19/testReport/
Max. process+thread count 544 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/19/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.

## Enabling the CRT Client
The CRT client can be enabled as follows:

```
Contributor

nit: xml

can be found
[here](https://aws.amazon.com/blogs/developer/introducing-crt-based-s3-client-and-the-s3-transfer-manager-in-the-aws-sdk-for-java-2-x/).

When making multiple parallel GET requests, using the CRT ensures load is evenly distributed across S3. This can be
Contributor

S3 front end servers, S3 back end stores -or both?

[here](https://aws.amazon.com/blogs/developer/introducing-crt-based-s3-client-and-the-s3-transfer-manager-in-the-aws-sdk-for-java-2-x/).

When making multiple parallel GET requests, using the CRT ensures load is evenly distributed across S3. This can be
useful for all three input streams available with versions > 3.4.2, as:
Contributor

>=

}

/**
* Builds user agent string
Contributor

nit: .

@@ -212,6 +246,27 @@ public static RetryPolicy.Builder createRetryPolicyBuilder(Configuration conf) {
return retryPolicyBuilder;
}


private static S3CrtProxyConfiguration mapCRTProxyConfiguration(
software.amazon.awssdk.http.nio.netty.ProxyConfiguration proxyConfiguration) {
Contributor

Is this the same for the unshaded SDK JARs?

builder.putHeader(REQUESTER_PAYS_HEADER, REQUESTER_PAYS_HEADER_VALUE);
}

builder.putHeader(USER_AGENT, userAgent);
Contributor

not for this PR, but maybe in future we could add the auditor ? strings here so it gets through the CRT and all the way to cloudtrail

Contributor Author

yup, tracked in https://issues.apache.org/jira/browse/HADOOP-19365, will add audit support for CRT

/**
* Requester pays header value. Value {@value}.
*/
public static final String REQUESTER_PAYS_HEADER_VALUE = "requester";
Contributor

make an InternalConstant unless there's a need to expose it

@@ -175,6 +163,39 @@ public S3AsyncClient createS3AsyncClient(
return s3AsyncClientBuilder.build();
}

private S3AsyncClient createS3CrtAsyncClient(URI uri, S3ClientCreationParameters parameters)
Contributor

Safe if these are the only places of this source file where the crt classes are referenced.

@@ -50,6 +50,7 @@ full details.
* [Auditing Architecture](./auditing_architecture.html).
* [Testing](./testing.html)
* [S3Guard](./s3guard.html)
* [Using the S3 CRT Client](./crt_client.html).
Contributor

  1. point to this in the reading.md file
  2. in performance.md, say that the analytics stream and crt client may also provide performance improvements

/**
* Flag to enable the CRT client. Value {@value}.
*/
public static final String CRT_CLIENT_ENABLED = "fs.s3a.crt.enabled";
Contributor

can you export this in the filesystem hasPathCapability, and add it to InternalConstants.S3A_DYNAMIC_CAPABILITIES, for bucket-info and storediag. thanks.

Contributor

@steveloughran left a comment

Tested against S3 Standard in London with stream=analytics and crt=enabled; saw various failures.

This hasn't yet been rebased to the new SDK, has it? Some of the errors (e.g. ITestAwsSdkWorkarounds) are already fixed, so they don't matter.

  • Test failures of vectored reads and elsewhere with "S3Client.makeMetaRequest has invalid options; Operation name must be set for MetaRequestType.DEFAULT."
  • ITestS3AAnalyticsAcceleratorStreamReading.testMalformedParquetFooter failed with "Caused by: java.lang.UnsupportedOperationException: Multipart download is not yet supported. Instead use the CRT based S3 client for multipart download."

Other than these failures and a few minor comments, I'm happy.

[ERROR] ITestS3AContractAnalyticsStreamVectoredRead.testConsecutiveRanges  Time elapsed: 0.377 s  <<< ERROR!
java.io.IOException: Client error accessing s3://stevel-london/test/vectored_file.txt
        at software.amazon.s3.analyticsaccelerator.exceptions.ExceptionHandler.createIOException(ExceptionHandler.java:94)
        at software.amazon.s3.analyticsaccelerator.exceptions.ExceptionHandler.lambda$static$2(ExceptionHandler.java:40)
        at software.amazon.s3.analyticsaccelerator.exceptions.ExceptionHandler.lambda$toIOException$6(ExceptionHandler.java:74)
        at java.util.Optional.map(Optional.java:215)
        at software.amazon.s3.analyticsaccelerator.exceptions.ExceptionHandler.toIOException(ExceptionHandler.java:74)
        at software.amazon.s3.analyticsaccelerator.S3SdkObjectClient.lambda$handleException$5(S3SdkObjectClient.java:197)
        at java.util.concurrent.CompletableFuture.uniExceptionally(CompletableFuture.java:884)
        at java.util.concurrent.CompletableFuture.uniExceptionallyStage(CompletableFuture.java:898)
        at java.util.concurrent.CompletableFuture.exceptionally(CompletableFuture.java:2209)
        at software.amazon.s3.analyticsaccelerator.S3SdkObjectClient.headObject(S3SdkObjectClient.java:139)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.lambda$asyncGet$2(MetadataStore.java:123)
        at java.util.HashMap.computeIfAbsent(HashMap.java:1128)
        at java.util.Collections$SynchronizedMap.computeIfAbsent(Collections.java:2674)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.asyncGet(MetadataStore.java:114)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.MetadataStore.get(MetadataStore.java:92)
        at software.amazon.s3.analyticsaccelerator.io.physical.impl.PhysicalIOImpl.<init>(PhysicalIOImpl.java:87)
        at software.amazon.s3.analyticsaccelerator.S3SeekableInputStreamFactory.createLogicalIO(S3SeekableInputStreamFactory.java:144)
        at software.amazon.s3.analyticsaccelerator.S3SeekableInputStreamFactory.createStream(S3SeekableInputStreamFactory.java:113)
        at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStream.<init>(AnalyticsStream.java:58)
        at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStreamFactory.readObject(AnalyticsStreamFactory.java:70)
        at org.apache.hadoop.fs.s3a.impl.S3AStoreImpl.readObject(S3AStoreImpl.java:964)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.executeOpen(S3AFileSystem.java:1922)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$null$42(S3AFileSystem.java:5503)
        at org.apache.hadoop.util.LambdaUtils.eval(LambdaUtils.java:52)
        at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$openFileWithOptions$43(S3AFileSystem.java:5502)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:750)
Caused by: software.amazon.awssdk.core.exception.SdkClientException: Unable to execute HTTP request: S3Client.makeMetaRequest has invalid options; Operation name must be set for MetaRequestType.DEFAULT.
        at software.amazon.awssdk.core.exception.SdkClientException$BuilderImpl.build(SdkClientException.java:111)
        at software.amazon.awssdk.core.exception.SdkClientException.create(SdkClientException.java:47)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:223)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.utils.RetryableStageHelper.setLastException(RetryableStageHelper.java:218)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeRetryExecute(AsyncRetryableStage.java:182)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.lambda$attemptExecute$1(AsyncRetryableStage.java:159)
        at java.util.concurrent.CompletableFuture.uniWhenComplete(CompletableFuture.java:774)
        at java.util.concurrent.CompletableFuture.uniWhenCompleteStage(CompletableFuture.java:792)
        at java.util.concurrent.CompletableFuture.whenComplete(CompletableFuture.java:2153)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.attemptExecute(AsyncRetryableStage.java:156)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.maybeAttemptExecute(AsyncRetryableStage.java:136)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.execute(AsyncRetryableStage.java:95)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage.execute(AsyncRetryableStage.java:79)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage.execute(AsyncRetryableStage.java:44)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.execute(AsyncExecutionFailureExceptionReportingStage.java:41)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncExecutionFailureExceptionReportingStage.execute(AsyncExecutionFailureExceptionReportingStage.java:29)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallTimeoutTrackingStage.execute(AsyncApiCallTimeoutTrackingStage.java:64)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallTimeoutTrackingStage.execute(AsyncApiCallTimeoutTrackingStage.java:36)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallMetricCollectionStage.execute(AsyncApiCallMetricCollectionStage.java:49)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallMetricCollectionStage.execute(AsyncApiCallMetricCollectionStage.java:32)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.AmazonAsyncHttpClient$RequestExecutionBuilderImpl.execute(AmazonAsyncHttpClient.java:215)
        at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.invoke(BaseAsyncClientHandler.java:288)
        at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.doExecute(BaseAsyncClientHandler.java:227)
        at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.lambda$execute$1(BaseAsyncClientHandler.java:80)
        at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.measureApiCallSuccess(BaseAsyncClientHandler.java:294)
        at software.amazon.awssdk.core.internal.handler.BaseAsyncClientHandler.execute(BaseAsyncClientHandler.java:73)
        at software.amazon.awssdk.awscore.client.handler.AwsAsyncClientHandler.execute(AwsAsyncClientHandler.java:49)
        at software.amazon.awssdk.services.s3.DefaultS3AsyncClient.headObject(DefaultS3AsyncClient.java:7032)
        at software.amazon.awssdk.services.s3.DelegatingS3AsyncClient.lambda$headObject$53(DelegatingS3AsyncClient.java:5259)
        at software.amazon.awssdk.services.s3.internal.crossregion.S3CrossRegionAsyncClient.invokeOperation(S3CrossRegionAsyncClient.java:61)
        at software.amazon.awssdk.services.s3.DelegatingS3AsyncClient.headObject(DelegatingS3AsyncClient.java:5259)
        at software.amazon.awssdk.services.s3.DelegatingS3AsyncClient.lambda$headObject$53(DelegatingS3AsyncClient.java:5259)
        at software.amazon.awssdk.services.s3.DelegatingS3AsyncClient.invokeOperation(DelegatingS3AsyncClient.java:10144)
        at software.amazon.awssdk.services.s3.DelegatingS3AsyncClient.headObject(DelegatingS3AsyncClient.java:5259)
        at software.amazon.s3.analyticsaccelerator.S3SdkObjectClient.headObject(S3SdkObjectClient.java:132)
        ... 19 more
Caused by: java.lang.IllegalArgumentException: S3Client.makeMetaRequest has invalid options; Operation name must be set for MetaRequestType.DEFAULT.
        at software.amazon.awssdk.crt.s3.S3Client.makeMetaRequest(S3Client.java:148)
        at software.amazon.awssdk.services.s3.internal.crt.S3CrtAsyncHttpClient.execute(S3CrtAsyncHttpClient.java:163)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.doExecuteHttpRequest(MakeAsyncHttpRequestStage.java:204)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.executeHttpRequest(MakeAsyncHttpRequestStage.java:151)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.lambda$execute$1(MakeAsyncHttpRequestStage.java:104)
        at java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:670)
        at java.util.concurrent.CompletableFuture.uniAcceptStage(CompletableFuture.java:683)
        at java.util.concurrent.CompletableFuture.thenAccept(CompletableFuture.java:2010)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.execute(MakeAsyncHttpRequestStage.java:100)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.MakeAsyncHttpRequestStage.execute(MakeAsyncHttpRequestStage.java:65)
        at software.amazon.awssdk.core.internal.http.pipeline.RequestPipelineBuilder$ComposingRequestPipelineStage.execute(RequestPipelineBuilder.java:206)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallAttemptMetricCollectionStage.execute(AsyncApiCallAttemptMetricCollectionStage.java:62)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncApiCallAttemptMetricCollectionStage.execute(AsyncApiCallAttemptMetricCollectionStage.java:41)
        at software.amazon.awssdk.core.internal.http.pipeline.stages.AsyncRetryableStage$RetryingExecutor.attemptExecute(AsyncRetryableStage.java:144)
        ... 47 more

different error on ITestS3AAnalyticsAcceleratorStreamReading

[ERROR] ITestS3AAnalyticsAcceleratorStreamReading.testMalformedParquetFooter  Time elapsed: 2.903 s  <<< ERROR!
java.io.IOException: Failed to fetch block data after retries
        at software.amazon.s3.analyticsaccelerator.io.physical.data.Block.generateSourceAndData(Block.java:208)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.Block.<init>(Block.java:156)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.BlockManager.lambda$makeRangeAvailable$1(BlockManager.java:211)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.DefaultTelemetry.measureImpl(DefaultTelemetry.java:158)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.DefaultTelemetry.measure(DefaultTelemetry.java:79)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.Telemetry.measureStandard(Telemetry.java:174)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.BlockManager.makeRangeAvailable(BlockManager.java:185)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.Blob.read(Blob.java:95)
        at software.amazon.s3.analyticsaccelerator.io.physical.impl.PhysicalIOImpl.lambda$read$3(PhysicalIOImpl.java:162)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.DefaultTelemetry.measure(DefaultTelemetry.java:102)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.Telemetry.measureVerbose(Telemetry.java:253)
        at software.amazon.s3.analyticsaccelerator.io.physical.impl.PhysicalIOImpl.read(PhysicalIOImpl.java:151)
        at software.amazon.s3.analyticsaccelerator.io.logical.impl.DefaultLogicalIOImpl.lambda$read$1(DefaultLogicalIOImpl.java:92)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.DefaultTelemetry.measureConditionally(DefaultTelemetry.java:141)
        at software.amazon.s3.analyticsaccelerator.io.logical.impl.DefaultLogicalIOImpl.read(DefaultLogicalIOImpl.java:81)
        at software.amazon.s3.analyticsaccelerator.io.logical.impl.ParquetLogicalIOImpl.read(ParquetLogicalIOImpl.java:73)
        at software.amazon.s3.analyticsaccelerator.S3SeekableInputStream.lambda$read$3(S3SeekableInputStream.java:155)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.DefaultTelemetry.measure(DefaultTelemetry.java:102)
        at software.amazon.s3.analyticsaccelerator.common.telemetry.Telemetry.measureVerbose(Telemetry.java:253)
        at software.amazon.s3.analyticsaccelerator.S3SeekableInputStream.read(S3SeekableInputStream.java:145)
        at org.apache.hadoop.fs.s3a.impl.streams.AnalyticsStream.read(AnalyticsStream.java:123)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.hadoop.fs.s3a.ITestS3AAnalyticsAcceleratorStreamReading.testMalformedParquetFooter(ITestS3AAnalyticsAcceleratorStreamReading.java:159)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
        at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
        at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
        at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
        at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
        at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
        at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
        at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.UnsupportedOperationException: Multipart download is not yet supported. Instead use the CRT based S3 client for multipart download.
        at software.amazon.awssdk.services.s3.internal.multipart.MultipartS3AsyncClient.getObject(MultipartS3AsyncClient.java:116)
        at software.amazon.s3.analyticsaccelerator.S3SdkObjectClient.getObject(S3SdkObjectClient.java:182)
        at software.amazon.s3.analyticsaccelerator.io.physical.data.Block.generateSourceAndData(Block.java:182)
        ... 37 more

```
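
For readers following the per-request header change reviewed earlier in this thread: the CRT client has no client-level setting for the user agent or requester pays headers, so the patch adds them to each request through `builder.putHeader(...)`. A minimal sketch of that decision, using a plain map instead of the SDK request builder, might look like the following. The class name `CrtRequestHeadersSketch` and the method `headersFor` are hypothetical and do not appear in the patch; `x-amz-request-payer` is assumed here as the header name behind `REQUESTER_PAYS_HEADER`, and the value `requester` comes from the constant shown in the diff.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: class and method names are hypothetical, not from the patch. */
final class CrtRequestHeadersSketch {

  /** Assumed header name for requester pays (the standard S3 header). */
  static final String REQUESTER_PAYS_HEADER = "x-amz-request-payer";

  /** Requester pays header value, matching the constant shown in the diff. */
  static final String REQUESTER_PAYS_HEADER_VALUE = "requester";

  /** Standard HTTP user agent header name. */
  static final String USER_AGENT = "User-Agent";

  private CrtRequestHeadersSketch() {
  }

  /**
   * Builds the headers to attach to a single request.
   * The requester pays header is added only when the option is enabled;
   * the user agent is always added.
   */
  static Map<String, String> headersFor(boolean requesterPays, String userAgent) {
    Map<String, String> headers = new LinkedHashMap<>();
    if (requesterPays) {
      headers.put(REQUESTER_PAYS_HEADER, REQUESTER_PAYS_HEADER_VALUE);
    }
    headers.put(USER_AGENT, userAgent);
    return headers;
  }

  public static void main(String[] args) {
    // Example user agent string; the real one is built by S3A.
    System.out.println(headersFor(true, "Hadoop-S3A-example"));
  }
}
```

In the patch this logic only needs to run for `putObject()` and `copyObject()`, since those are the only operations S3A routes through the CRT client.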

@ahmarsuhail
Contributor Author

@steveloughran yeah those errors will go away once I rebase to the latest SDK.

will rebase and address your comments now. are you ok with getting this into 3.4.2?

i'm just starting the setup work for the release, going through https://github.com/apache/hadoop-release-support/tree/main

@steveloughran
Contributor

If you can address those little nits and are confident that the errors reported go away, then yes.

Contributor

@steveloughran left a comment

one more comment on region resolution, just for the code change and not for docs and tests, which can wait for the full PR.

// region configuration was set to empty string.
// allow this if people really want it; it is OK to rely on this
// when deployed in EC2.
WARN_OF_DEFAULT_REGION_CHAIN.debug(SDK_REGION_CHAIN_IN_USE);
Contributor

cut this warn

region = Region.of(AWS_S3_DEFAULT_REGION);
builder.withRegion(region);
origin = "cross region access fallback";
} else if (configuredRegion.isEmpty()) {
Contributor

add the option "sdk" here. The full changes I have in mind, including an "ec2" option which only does the EC2-side resolution, would be a separate patch, but the "sdk" option (which we can leave undocumented) can ship in this release.

Contributor Author

didn't understand, can you clarify, where do i need to add the sdk option?

Contributor

imagine we have a special region "sdk" which if set does the same as the empty newline hack
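
To make the suggestion concrete, here is a rough sketch of how a special "sdk" region value could be treated the same way as an empty region string, i.e. by handing resolution to the SDK's default region chain. Everything in it is hypothetical: the class `RegionResolutionSketch`, the `Origin` enum and the `resolve` method are illustrative names, not code from the patch, and the fallback region is passed in by the caller rather than read from `AWS_S3_DEFAULT_REGION`.

```java
import java.util.Locale;

/** Illustrative only: names here are hypothetical, not from the patch. */
final class RegionResolutionSketch {

  /** Hypothetical special value requesting the SDK's default region chain. */
  static final String SDK_REGION = "sdk";

  /** Where the region decision came from. */
  enum Origin { CONFIGURED, SDK_CHAIN, FALLBACK }

  /** Result of resolution; region is null when the SDK chain should decide. */
  static final class Resolution {
    final String region;
    final Origin origin;

    Resolution(String region, Origin origin) {
      this.region = region;
      this.origin = origin;
    }

    @Override
    public String toString() {
      return origin + ":" + region;
    }
  }

  private RegionResolutionSketch() {
  }

  /**
   * Resolves a region from the configured value.
   * Unset (null) falls back to the supplied default region;
   * an empty string or "sdk" defers to the SDK region chain;
   * anything else is used as the region name.
   */
  static Resolution resolve(String configuredRegion, String fallbackRegion) {
    if (configuredRegion == null) {
      return new Resolution(fallbackRegion, Origin.FALLBACK);
    }
    String trimmed = configuredRegion.trim();
    if (trimmed.isEmpty() || SDK_REGION.equals(trimmed.toLowerCase(Locale.ROOT))) {
      return new Resolution(null, Origin.SDK_CHAIN);
    }
    return new Resolution(trimmed, Origin.CONFIGURED);
  }

  public static void main(String[] args) {
    // "us-east-2" is only an example fallback value.
    System.out.println(resolve("sdk", "us-east-2"));
    System.out.println(resolve("", "us-east-2"));
    System.out.println(resolve("eu-west-2", "us-east-2"));
    System.out.println(resolve(null, "us-east-2"));
  }
}
```

Treating "sdk" and the empty string identically keeps the existing behaviour intact while giving users an explicit, self-describing value to set.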

@ahmarsuhail force-pushed the HADOOP-19399-crt-client-support branch from ea3ee79 to 66e2b65 on April 30, 2025 17:17
@ahmarsuhail
Contributor Author

rebased and addressed most comments (i think!), will test and double check tomorrow.

@hadoop-yetus

💔 -1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 0m 21s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+0 🆗 codespell 0m 0s codespell was not available.
+0 🆗 detsecrets 0m 0s detect-secrets was not available.
+0 🆗 xmllint 0m 0s xmllint was not available.
+0 🆗 markdownlint 0m 0s markdownlint was not available.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 6m 30s Maven dependency ordering for branch
+1 💚 mvninstall 19m 13s trunk passed
+1 💚 compile 8m 16s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 compile 7m 29s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 checkstyle 2m 2s trunk passed
+1 💚 mvnsite 1m 5s trunk passed
+1 💚 javadoc 1m 1s trunk passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 52s trunk passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 32s branch/hadoop-project no spotbugs output file (spotbugsXml.xml)
+1 💚 shadedclient 20m 27s branch has no errors when building and testing our client artifacts.
_ Patch Compile Tests _
+0 🆗 mvndep 0m 24s Maven dependency ordering for patch
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 7m 57s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javac 7m 57s the patch passed
+1 💚 compile 7m 25s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+1 💚 javac 7m 25s the patch passed
-1 ❌ blanks 0m 0s /blanks-eol.txt The patch has 6 line(s) that end in blanks. Use git apply --whitespace=fix <<patch_file>>. Refer https://git-scm.com/docs/git-apply
-0 ⚠️ checkstyle 1m 54s /results-checkstyle-root.txt root: The patch generated 14 new + 24 unchanged - 5 fixed = 38 total (was 29)
+1 💚 mvnsite 1m 5s the patch passed
+1 💚 javadoc 0m 58s the patch passed with JDK Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04
+1 💚 javadoc 0m 58s the patch passed with JDK Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
+0 🆗 spotbugs 0m 26s hadoop-project has no data from spotbugs
+1 💚 shadedclient 20m 56s patch has no errors when building and testing our client artifacts.
_ Other Tests _
+1 💚 unit 0m 21s hadoop-project in the patch passed.
+1 💚 unit 2m 52s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 42s The patch does not generate ASF License warnings.
118m 44s
Subsystem Report/Notes
Docker ClientAPI=1.49 ServerAPI=1.49 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/20/artifact/out/Dockerfile
GITHUB PR #7443
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient codespell detsecrets xmllint spotbugs checkstyle markdownlint
uname Linux f40c9e6c30d3 5.15.0-136-generic #147-Ubuntu SMP Sat Mar 15 15:53:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 66e2b65
Default Java Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.26+4-post-Ubuntu-1ubuntu120.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_442-8u442-b06us1-0ubuntu120.04-b06
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/20/testReport/
Max. process+thread count 552 (vs. ulimit of 5500)
modules C: hadoop-project hadoop-tools/hadoop-aws U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-7443/20/console
versions git=2.25.1 maven=3.6.3 spotbugs=4.2.2
Powered by Apache Yetus 0.14.0 https://yetus.apache.org

This message was automatically generated.
