HADOOP-16080. hadoop-aws does not work with hadoop-client-api #2575


Merged 1 commit into apache:trunk on Mar 9, 2021

Conversation

sunchao
Member

@sunchao sunchao commented Dec 29, 2020

This is a backport of #2522 to trunk

@hadoop-yetus

This comment has been minimized.

@sunchao
Member Author

sunchao commented Dec 29, 2020

Could you review this @aajisaka? No conflicts in the backport. cc @steveloughran too.

@steveloughran
Contributor

steveloughran commented Jan 4, 2021

#2522 should have gone into trunk first

there should be nothing in an older branch which is not in trunk

@sunchao
Member Author

sunchao commented Jan 4, 2021

@steveloughran my bad. Should've done this in a proper way.

@sunchao sunchao force-pushed the HADOOP-16080-trunk branch from 79db74e to ec20769 Compare January 5, 2021 06:45
@steveloughran
Contributor

Usual due diligence query: which s3 endpoint did you run the integration tests against?

(I'll expect some tests failures there from HADOOP-16380 stabilisation; if you don't find them I'd be worried about your test setup...they won't be blockers)

@sunchao
Member Author

sunchao commented Jan 5, 2021

@steveloughran Eh, I only tested this in Spark (verified that the failure here was fixed, while it was reproducible without the PR) using an S3A endpoint of my own. I can run the integration tests as well - are the steps here?

@sunchao
Member Author

sunchao commented Jan 6, 2021

Tried the steps above - most tests succeeded, but I got a bunch of failures where the DynamoDB table was not found:

com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException: Requested resource not found: Table: sunchao not found (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: XXX; Proxy: null)

I'm confused, since I didn't set either -Ds3guard or -Ddynamo.

@steveloughran
Contributor

> I got a bunch of failures related to DynamoDB table not found:

Some of the tests are parameterized to do test runs with/without DynamoDB. They shouldn't be run if the -Ddynamo option wasn't set, but what has inevitably happened is that regressions have crept into the test runs and we've not noticed.

If you are testing against a non-AWS endpoint we should skip tests related to that, plus IAM roles/session tokens and encryption.

File a JIRA about it, tell us which tests are failing, etc...
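The skip behaviour described above can be sketched in plain Java. This is illustrative only - it is not the actual Hadoop test code, and the class and variant names here are hypothetical - but Maven does forward `-D` options to the forked test JVM as system properties, which is how such a switch is typically detected:

```java
// Illustrative sketch: how parameterized S3A test variants could be
// skipped when -Ddynamo was not passed to Maven. Names are hypothetical.
public class DynamoSkipSketch {

    // Maven's -Ddynamo shows up in the forked test JVM as a system property;
    // its absence means the DynamoDB-backed variants should be skipped.
    static boolean dynamoEnabled() {
        return System.getProperty("dynamo") != null;
    }

    public static void main(String[] args) {
        // A parameterized suite typically runs each test once per variant.
        String[] variants = {"raw", "dynamo"};
        for (String variant : variants) {
            if ("dynamo".equals(variant) && !dynamoEnabled()) {
                System.out.println(variant + ": skipped");
            } else {
                System.out.println(variant + ": run");
            }
        }
    }
}
```

In a JUnit 4 suite the same check would normally go through Assume.assumeTrue(...) in setup, so the variant is reported as skipped rather than failing against a missing DynamoDB table.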

@steveloughran
Contributor

BTW, does this mean your initial PR went in without running the ITests? Not good. We like to have a strict "no tests, no review" policy for the hadoop-aws, hadoop-azure branches. We can't give yetus the credentials for test runs, and people need to be set up to debug why their patches don't work anyway. You managed to dodge a bit of the diligence requirements there.

Afraid I'm being strict now. If tests fail, list which ones did and we'll see if they are new/existing intermittent ones/recent regressions. #2594 is trying to fix some which have now surfaced in some test configs

@sunchao
Member Author

sunchao commented Jan 6, 2021

> Some of the tests are parameterized to do test runs with/without DynamoDB. They shouldn't be run if the -Ddynamo option wasn't set, but what has inevitably happened is that regressions have crept into the test runs and we've not noticed.

I didn't specify the -Ddynamo option. The command I used is:

mvn -Dparallel-tests -DtestsThreadCount=8 clean verify

I'm testing against my own S3A endpoint "s3a://sunchao/", which is in us-west-1, and I just followed the doc to set up auth-keys.xml. I didn't modify core-site.xml.
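For context, a minimal auth-keys.xml for this kind of run looks roughly like the following sketch. The property names follow the hadoop-aws testing docs; the bucket name and credentials are placeholders:

```xml
<configuration>
  <!-- Bucket used as the test filesystem; placeholder name -->
  <property>
    <name>test.fs.s3a.name</name>
    <value>s3a://your-test-bucket/</value>
  </property>
  <property>
    <name>fs.contract.test.fs.s3a</name>
    <value>${test.fs.s3a.name}</value>
  </property>
  <!-- Credentials; keep this file out of version control -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>
```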

> BTW, does this mean your initial PR went in without running the ITests?

Unfortunately no... sorry, I was not aware of the test steps here (first time contributing to hadoop-aws). I'll try to remedy that in this PR. The test failures I got:

[ERROR] Tests run: 24, Failures: 1, Errors: 16, Skipped: 0, Time elapsed: 20.537 s <<< FAILURE! - in org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost
[ERROR] testDeleteSingleFileInDir[raw-delete-markers](org.apache.hadoop.fs.s3a.performance.ITestS3ADeleteCost)  Time elapsed: 2.036 s  <<< FAILURE!
java.lang.AssertionError: operation returning after fs.delete(simpleFile) action_http_head_request starting=4 current=5 diff=1, directories_created starting=2 current=3 diff=1, fake_directories_deleted starting=6 current=8 diff=2, files_deleted starting=0 current=1 diff=1, object_bulk_delete_request starting=3 current=4 diff=1, object_delete_objects starting=6 current=9 diff=3, object_delete_request starting=0 current=1 diff=1, object_list_request starting=5 current=6 diff=1, object_metadata_request starting=4 current=5 diff=1, object_put_request starting=3 current=4 diff=1, object_put_request_completed starting=3 current=4 diff=1, op_delete starting=0 current=1 diff=1, [remaining counters, all with diff=0, elided]: object_delete_objects expected:<2> but was:<3>

And it seems most of the failures are due to errors like the following:

Caused by: com.amazonaws.services.dynamodbv2.model.ResourceNotFoundException: Requested resource not found: Table: sunchao not found (Service: AmazonDynamoDBv2; Status Code: 400; Error Code: ResourceNotFoundException; Request ID: XXX; Proxy: null)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1828)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1412)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1374)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
  at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
  at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.doInvoke(AmazonDynamoDBClient.java:5413)
  at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.invoke(AmazonDynamoDBClient.java:5380)
  at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.executeDescribeTable(AmazonDynamoDBClient.java:2098)
  at com.amazonaws.services.dynamodbv2.AmazonDynamoDBClient.describeTable(AmazonDynamoDBClient.java:2063)
  at com.amazonaws.services.dynamodbv2.document.Table.describe(Table.java:137)
  at org.apache.hadoop.fs.s3a.s3guard.DynamoDBMetadataStoreTableManager.initTable(DynamoDBMetadataStoreTableManager.java:171)
  ... 23 more

Not sure if I missed some steps in my test setup.

@steveloughran
Contributor

Ignoring the s3guard/DDB ones (we're clearly still trying to run some when those tests aren't enabled), you are going to be seeing the failures covered in https://issues.apache.org/jira/browse/HADOOP-17451 / #2594. I'll get that ready for merging today; mostly it's some of the tests using metrics being brittle to how they are executed.

@steveloughran
Contributor

Something went wrong with Yetus; try a rebase and force push.

@sunchao sunchao force-pushed the HADOOP-16080-trunk branch from ec20769 to c22d19f Compare January 9, 2021 06:34
@hadoop-yetus

🎊 +1 overall

Vote Subsystem Runtime Logfile Comment
+0 🆗 reexec 2m 22s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+0 🆗 mvndep 13m 50s Maven dependency ordering for branch
+1 💚 mvninstall 25m 27s trunk passed
+1 💚 compile 26m 20s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 compile 22m 2s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 checkstyle 2m 58s trunk passed
+1 💚 mvnsite 3m 32s trunk passed
+1 💚 shadedclient 24m 32s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 2m 26s trunk passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javadoc 3m 8s trunk passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+0 🆗 spotbugs 0m 49s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 5m 25s trunk passed
_ Patch Compile Tests _
+0 🆗 mvndep 0m 26s Maven dependency ordering for patch
+1 💚 mvninstall 2m 27s the patch passed
+1 💚 compile 24m 49s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javac 24m 49s the patch passed
+1 💚 compile 22m 11s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 javac 22m 11s the patch passed
-0 ⚠️ checkstyle 3m 5s /diff-checkstyle-root.txt root: The patch generated 1 new + 46 unchanged - 0 fixed = 47 total (was 46)
+1 💚 mvnsite 3m 59s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 xml 0m 4s The patch has no ill-formed XML file.
+1 💚 shadedclient 17m 49s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 2m 35s the patch passed with JDK Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04
+1 💚 javadoc 3m 8s the patch passed with JDK Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
+1 💚 findbugs 5m 39s the patch passed
_ Other Tests _
+1 💚 unit 9m 59s hadoop-common in the patch passed.
+1 💚 unit 1m 29s hadoop-aws in the patch passed.
+1 💚 unit 0m 30s hadoop-aliyun in the patch passed.
+1 💚 unit 0m 32s hadoop-cos in the patch passed.
+1 💚 asflicense 0m 48s The patch does not generate ASF License warnings.
230m 18s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2575/3/artifact/out/Dockerfile
GITHUB PR #2575
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml
uname Linux aa26c37e736b 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality dev-support/bin/hadoop.sh
git revision trunk / 890f2da
Default Java Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.9.1+1-Ubuntu-0ubuntu1.18.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_275-8u275-b01-0ubuntu1~18.04-b01
Test Results https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2575/3/testReport/
Max. process+thread count 2298 (vs. ulimit of 5500)
modules C: hadoop-common-project/hadoop-common hadoop-tools/hadoop-aws hadoop-tools/hadoop-aliyun hadoop-cloud-storage-project/hadoop-cos U: .
Console output https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2575/3/console
versions git=2.17.1 maven=3.6.0 findbugs=4.0.6
Powered by Apache Yetus 0.13.0-SNAPSHOT https://yetus.apache.org

This message was automatically generated.

@steveloughran
Contributor

I've been running on trunk for a while; everything is happy there. Going to +1 and merge this backport.

@steveloughran steveloughran merged commit 176bd88 into apache:trunk Mar 9, 2021
@sunchao
Member Author

sunchao commented Mar 9, 2021

Thanks @steveloughran!

@sunchao sunchao deleted the HADOOP-16080-trunk branch March 9, 2021 21:21
@iwasakims
Member

@sunchao I submitted #2758 as a follow-up of this. Could you take a look?

kiran-maturi pushed a commit to kiran-maturi/hadoop that referenced this pull request Nov 24, 2021