Skip to content

HADOOP-XXXXX. S3Guard tombstones can mislead about directory empty status and other fixes - wip #1077

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

bgaborg
Copy link

@bgaborg bgaborg commented Jul 11, 2019

@bgaborg bgaborg requested a review from steveloughran July 11, 2019 08:16
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 61 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1105 trunk passed
+1 compile 37 trunk passed
+1 checkstyle 26 trunk passed
+1 mvnsite 41 trunk passed
+1 shadedclient 722 branch has no errors when building and testing our client artifacts.
+1 javadoc 25 trunk passed
0 spotbugs 61 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 59 trunk passed
_ Patch Compile Tests _
+1 mvninstall 32 the patch passed
+1 compile 31 the patch passed
+1 javac 31 the patch passed
-0 checkstyle 19 hadoop-tools/hadoop-aws: The patch generated 1 new + 7 unchanged - 0 fixed = 8 total (was 7)
+1 mvnsite 34 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 777 patch has no errors when building and testing our client artifacts.
+1 javadoc 24 the patch passed
+1 findbugs 65 the patch passed
_ Other Tests _
+1 unit 283 hadoop-aws in the patch passed.
+1 asflicense 33 The patch does not generate ASF License warnings.
3460
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/1/artifact/out/Dockerfile
GITHUB PR #1077
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 6ad6c34d2cca 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / ccaa99c
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/1/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/1/testReport/
Max. process+thread count 446 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/1/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@steveloughran steveloughran added the fs/s3 changes related to hadoop-aws; submitter must declare test endpoint label Jul 11, 2019
@steveloughran
Copy link
Contributor

I've checked this out & will work on it. Not failing for me either.

What I'm thinking of doing

  1. Create a dir DIR with files AAAA BBBB
  2. Delete AAAA (+assert tombstone)
  3. PUT AAAA/child to store
  4. Assert that LIST DIR returns AAAA as a prefix
  5. do a getFile status with empty dir flag

@steveloughran
Copy link
Contributor

I can replicate the problem when the dir being worked on is actually the root path /. This is consistent with all the failures happening in the root directory tests. I'd feared it was because the root dir is the one place where any file-under-a-tombstone would cause problems, but it could be that it is the sole place where our filtering breaks, because the prefix returned in the LIST == the name of a file just deleted

Gabor Bota and others added 3 commits July 15, 2019 11:04
…ombstone problem" can be replicated

You don't get it on deeper paths because the full prefix is returned "/test/dir/deleted", which doesn't match the short name of a tombstone.

this helps explain why we only see it on the root dir tests.

Change-Id: I6e00b39684b7f3ff76eb84ee877128fa01c0f2bc
@bgaborg bgaborg changed the title HADOOP-16380. S3Guard tombstones can mislead about directory empty status - wip, trying to reproduce HADOOP-16380. S3Guard tombstones can mislead about directory empty status - wip Jul 15, 2019
@bgaborg bgaborg force-pushed the HADOOP-16380-tombstones-mislead-dirempty branch from c3c60ec to e665f2d Compare July 15, 2019 10:42
@bgaborg bgaborg self-assigned this Jul 15, 2019
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 38 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1077 trunk passed
+1 compile 36 trunk passed
+1 checkstyle 23 trunk passed
+1 mvnsite 39 trunk passed
+1 shadedclient 743 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 63 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 61 trunk passed
_ Patch Compile Tests _
+1 mvninstall 32 the patch passed
+1 compile 31 the patch passed
+1 javac 31 the patch passed
+1 checkstyle 18 the patch passed
+1 mvnsite 36 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 725 patch has no errors when building and testing our client artifacts.
+1 javadoc 20 the patch passed
+1 findbugs 66 the patch passed
_ Other Tests _
+1 unit 280 hadoop-aws in the patch passed.
+1 asflicense 28 The patch does not generate ASF License warnings.
3371
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/2/artifact/out/Dockerfile
GITHUB PR #1077
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 1b8972b59f7a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 18ee109
Default Java 1.8.0_212
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/2/testReport/
Max. process+thread count 447 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/2/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@bgaborg
Copy link
Author

bgaborg commented Jul 15, 2019

the test testRootTombstones will pass if I change the return new S3AFileStatus(Tristate.TRUE, path, username); to return new S3AFileStatus(Tristate.FALSE, path, username); at org.apache.hadoop.fs.s3a.S3AFileSystem#s3GetFileStatus (org/apache/hadoop/fs/s3a/S3AFileSystem.java:2760). I'll update my PR with a test against this.

Although we will have the following errors after the change:

ITestS3GuardToolDynamoDB.testDynamoDBInitDestroyCycle:250->Assert.assertFalse:64->Assert.assertTrue:41->Assert.fail:88 s3guard.test.testDynamoDBInitDestroy-403465466 still exists; 

ITestS3GuardDDBRootOperations.test_400_rm_root_recursive:237->Assert.assertTrue:41->Assert.fail:88 Root directory delete failed;

ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testRmEmptyRootDirNonRecursive:118 ? PathIO

So at least we are checking for this. It would be better if we had a unit test for this imho. I'll create one if I can - I'll check the feasibility.

@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 64 Docker mode activated.
_ Prechecks _
+1 dupname 0 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1065 trunk passed
+1 compile 38 trunk passed
+1 checkstyle 27 trunk passed
+1 mvnsite 41 trunk passed
+1 shadedclient 750 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 61 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 58 trunk passed
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 31 the patch passed
+1 javac 31 the patch passed
+1 checkstyle 17 the patch passed
+1 mvnsite 34 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 731 patch has no errors when building and testing our client artifacts.
+1 javadoc 25 the patch passed
+1 findbugs 62 the patch passed
_ Other Tests _
+1 unit 279 hadoop-aws in the patch passed.
+1 asflicense 30 The patch does not generate ASF License warnings.
3380
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/3/artifact/out/Dockerfile
GITHUB PR #1077
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 8852c8090b0f 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 5446308
Default Java 1.8.0_212
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/3/testReport/
Max. process+thread count 411 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/3/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@bgaborg
Copy link
Author

bgaborg commented Jul 15, 2019

I found the out why the listing and recovering from an OOB after what @steveloughran described is not working as expected: we don't do the tombstoned entry filtering when getting a listing.
To do this I have to add HADOOP-16383 to this PR because of the metadata store has to know the TTLprovider to filter out outdated (expired) entries.

…MetadataStore interface

Fix things based on review and checkstyle issues

Picked from my PR

(cherry picked from commit b7bbdb66ffcf6d6bc5993f3d33cbe26ad5f3618d)

 Conflicts:
	hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java
	hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/DynamoDBMetadataStore.java
	hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestDynamoDBMetadataStore.java
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 112 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 13 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1312 trunk passed
+1 compile 40 trunk passed
+1 checkstyle 25 trunk passed
+1 mvnsite 45 trunk passed
+1 shadedclient 799 branch has no errors when building and testing our client artifacts.
+1 javadoc 26 trunk passed
0 spotbugs 72 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 70 trunk passed
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 30 the patch passed
+1 javac 30 the patch passed
-0 checkstyle 18 hadoop-tools/hadoop-aws: The patch generated 1 new + 66 unchanged - 2 fixed = 67 total (was 68)
+1 mvnsite 35 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 826 patch has no errors when building and testing our client artifacts.
+1 javadoc 22 the patch passed
+1 findbugs 65 the patch passed
_ Other Tests _
+1 unit 279 hadoop-aws in the patch passed.
+1 asflicense 28 The patch does not generate ASF License warnings.
3855
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/4/artifact/out/Dockerfile
GITHUB PR #1077
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 720cab6bfb1e 4.15.0-52-generic #56-Ubuntu SMP Tue Jun 4 22:49:08 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 1411513
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/4/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/4/testReport/
Max. process+thread count 328 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/4/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

Fixed in S3A: rm should return false on non recursice root delete
Todo: testRootTombstones will fail if run parallely, but succeed if run by itself. Maybe not real root dir?
@bgaborg bgaborg changed the title HADOOP-16380. S3Guard tombstones can mislead about directory empty status - wip HADOOP-XXXXX. S3Guard tombstones can mislead about directory empty status and other fixes - wip Jul 16, 2019
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
0 reexec 122 Docker mode activated.
_ Prechecks _
+1 dupname 1 No case conflicting files found.
+1 @author 0 The patch does not contain any @author tags.
+1 test4tests 0 The patch appears to include 14 new or modified test files.
_ trunk Compile Tests _
+1 mvninstall 1058 trunk passed
+1 compile 37 trunk passed
+1 checkstyle 28 trunk passed
+1 mvnsite 41 trunk passed
+1 shadedclient 741 branch has no errors when building and testing our client artifacts.
+1 javadoc 28 trunk passed
0 spotbugs 62 Used deprecated FindBugs config; considering switching to SpotBugs.
+1 findbugs 61 trunk passed
_ Patch Compile Tests _
+1 mvninstall 33 the patch passed
+1 compile 30 the patch passed
+1 javac 30 the patch passed
-0 checkstyle 19 hadoop-tools/hadoop-aws: The patch generated 2 new + 66 unchanged - 2 fixed = 68 total (was 68)
+1 mvnsite 35 the patch passed
+1 whitespace 0 The patch has no whitespace issues.
+1 shadedclient 766 patch has no errors when building and testing our client artifacts.
+1 javadoc 24 the patch passed
+1 findbugs 64 the patch passed
_ Other Tests _
+1 unit 280 hadoop-aws in the patch passed.
+1 asflicense 30 The patch does not generate ASF License warnings.
3488
Subsystem Report/Notes
Docker Client=18.09.7 Server=18.09.7 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/5/artifact/out/Dockerfile
GITHUB PR #1077
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux daa3ffbc403a 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / c5e3ab5
Default Java 1.8.0_212
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/5/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/5/testReport/
Max. process+thread count 443 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-1077/5/console
versions git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.10.0 http://yetus.apache.org

This message was automatically generated.

@bgaborg bgaborg closed this Jul 29, 2019
shanthoosh pushed a commit to shanthoosh/hadoop that referenced this pull request Oct 15, 2019
SAMZA-2249:added 1.2 release blog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
fs/s3 changes related to hadoop-aws; submitter must declare test endpoint
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants