Skip to content

HADOOP-17022 Tune S3AFileSystem.listFiles() api. #2038

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

mukund-thakur
Copy link
Contributor

Perform list operations directly assuming the path to be a directory
and then fallback to head checks for file. This optimization will
reduce the number of remote calls to S3 when the input is a directory.
This will also reduce the number of metastore calls in case of guarded
store.

Ran all tests using below command in ap-south-1 bucket.
mvn clean verify -Ds3guard -Ddynamo -Dauth

Only two failures:
[ERROR] Failures:
[ERROR] ITestS3GuardListConsistency.testConsistentRenameAfterDelete:286 Recently renamed dir should not be visible: [s3a://mthakur-data/test/a]
[ERROR] ITestMagicCommitProtocol>AbstractITCommitProtocol.testCommitterWithDuplicatedCommit:806->AbstractITCommitProtocol.expectFNFEonTaskCommit:876 Expected a java.io.FileNotFoundException to be thrown, but got the result: : "MagicCommitter{}"

Perform list operations directly assuming the path to be a directory
and then fallback to head checks for file. This optimization will
reduce the number of remote calls to S3 when the input is a directory.
This will also reduce the number of metastore calls in case of guarded
store.
@mukund-thakur
Copy link
Contributor Author

mukund-thakur commented May 28, 2020

I have to setup ITestAssumeRole tests as well.

CC @steveloughran

// fallback to file existence check as the path
// can be a file or empty directory.
if (!listFilesAssumingDir.hasNext()) {
final S3AFileStatus fileStatus = (S3AFileStatus) getFileStatus(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

innerGetFileStatus with probes for file and dir marker -HEAD_OR_DIR_MARKER; we don't need to issue a second LIST call, do we? This will save a call on listing a missing path

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can't do this as we will need a list for an directory which is empty. Else we will get FNFE.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea, but an empty dir is that HEAD marker. So no list, right?

Don't worry about it -my dir marker tuning changes things anyway, replacing the HEAD + / with a list.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is an empty directory within a base directory. For example directory structure in test ITestS3AContractGetFileStatus>AbstractContractGetFileStatusTest.testListFilesEmptyDirectoryRecursive
There won't be any files thus listing will be empty.

}
// If we have reached here, it means either there are files
// in this directory or it is empty.
return listFilesAssumingDir;
} catch (AmazonClientException e) {
// TODO S3Guard: retry on file not found exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to consider this and see if the TODO is already done.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any retries in listFiles().
Even the listLocatedStatus() and listStatus() are called with Invoker.Once(). That means just the calls with exception translated without any retries.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets cut that TODO line

} catch (AmazonClientException e) {
// TODO S3Guard: retry on file not found exception
throw translateException("listFiles", path, e);
}
}

/**
* List files under a path assuming the path to be a directory.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wonder if we can/should move this into the Listing class.

in #2046 I'm looking at doing that for openfile, just to pull something complex out of the main fs. I know it hurts maintenance a bit (all subsequent patches will need this patch before cherry picking), but your work is big enough that will happen anyway.

imagine we made the Listing class a subclass of org.apache.hadoop.fs.s3a.impl.AbstractStoreOperation and added a callback interface purely to those s3a FS calls it needs - just like we do with rename with org.apache.hadoop.fs.s3a.impl.OperationCallbacks. we'd then move all the work over there and isolate it better.

of course, one thing this will make harder is for my directory marker changes, but that shouldn't be a blocker.

@mukund-thakur
Copy link
Contributor Author

Setup Assumed role tests. All good there.

[ERROR] ITestS3GuardListConsistency.testConsistentRenameAfterDelete:286 Recently renamed dir should not be visible: [s3a://mthakur-data/test/a]
I debugged this test failure. The reason it started failing is :

  • Previously we used to call getFileStatus() in the start and that will throw away FileNotFoundEx right away which was expected in the test.
  • Now we are doing directory listing first. In this case it won't be empty because of list inconsistency by S3. " Deleted source files after rename may still be visible"

@mukund-thakur
Copy link
Contributor Author

The reason for failure of
ITestMagicCommitProtocol>AbstractITCommitProtocol.testCommitterWithDuplicatedCommit:806->AbstractITCommitProtocol.expectFNFEonTaskCommit:876 Expected a java.io.FileNotFoundException to be thrown, but got the result: : "MagicCommitter{}"

seem similar to the other test failure of ITestS3GuardListConsistency.testConsistentRenameAfterDelete.
In this one, once the task is already committed the listing during next commit still list the older file.

@steveloughran
Copy link
Contributor

Mukund: can you paste in the full stack traces? that way I can use the IDE to see more about the problems. thanks

@mukund-thakur
Copy link
Contributor Author

Full stack traces for test failures:

`java.lang.AssertionError: Recently renamed dir should not be visible: [s3a://mthakur-data/test/a/b/dir3-DELAY_LISTING_ME, s3a://mthakur-data/test/a]

at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:499)
at org.apache.hadoop.fs.s3a.ITestS3GuardListConsistency.testConsistentRenameAfterDelete(ITestS3GuardListConsistency.java:286)`

`java.lang.AssertionError: Expected a java.io.FileNotFoundException to be thrown, but got the result: : "MagicCommitter{}"

at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:499)
at org.apache.hadoop.test.LambdaTestUtils.intercept(LambdaTestUtils.java:384)
at org.apache.hadoop.fs.s3a.commit.AbstractITCommitProtocol.expectFNFEonTaskCommit(AbstractITCommitProtocol.java:876)
at org.apache.hadoop.fs.s3a.commit.AbstractITCommitProtocol.testCommitterWithDuplicatedCommit(AbstractITCommitProtocol.java:806)`

Observed no change from previous implemetation.
@hadoop-yetus
Copy link

🎊 +1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 21m 2s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 2 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 20m 56s trunk passed
+1 💚 compile 0m 31s trunk passed
+1 💚 checkstyle 0m 23s trunk passed
+1 💚 mvnsite 0m 36s trunk passed
+1 💚 shadedclient 16m 12s branch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 24s trunk passed
+0 🆗 spotbugs 0m 58s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 55s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 32s the patch passed
+1 💚 compile 0m 25s the patch passed
+1 💚 javac 0m 25s the patch passed
-0 ⚠️ checkstyle 0m 17s hadoop-tools/hadoop-aws: The patch generated 4 new + 12 unchanged - 1 fixed = 16 total (was 13)
+1 💚 mvnsite 0m 30s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 15m 5s patch has no errors when building and testing our client artifacts.
+1 💚 javadoc 0m 22s the patch passed
+1 💚 findbugs 1m 2s the patch passed
_ Other Tests _
+1 💚 unit 1m 23s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 27s The patch does not generate ASF License warnings.
82m 29s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/2/artifact/out/Dockerfile
GITHUB PR #2038
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux f96e0e2d9d79 4.15.0-101-generic #102-Ubuntu SMP Mon May 11 10:07:26 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / fed6fec
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/2/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/2/testReport/
Max. process+thread count 344 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/2/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@apache apache deleted a comment from hadoop-yetus Jun 23, 2020
@mukund-thakur
Copy link
Contributor Author

Latest commit fixes the above test failures.
After multiple debugging efforts I thought I should run the tests without s3guard as well. So when I did that I see 8 tests failing all with file not found exception. I will debug these further.

[ERROR] Errors: [ERROR] ITestS3AContractGetFileStatus>AbstractContractGetFileStatusTest.testListFilesEmptyDirectoryNonrecursive:99->AbstractContractGetFileStatusTest.listFilesOnEmptyDir:119 » FileNotFound [ERROR] ITestS3AContractGetFileStatus>AbstractContractGetFileStatusTest.testListFilesEmptyDirectoryRecursive:104->AbstractContractGetFileStatusTest.listFilesOnEmptyDir:119 » FileNotFound [ERROR] ITestS3AContractGetFileStatusV1List>AbstractContractGetFileStatusTest.testListFilesEmptyDirectoryNonrecursive:99->AbstractContractGetFileStatusTest.listFilesOnEmptyDir:119 » FileNotFound [ERROR] ITestS3AContractGetFileStatusV1List>AbstractContractGetFileStatusTest.testListFilesEmptyDirectoryRecursive:104->AbstractContractGetFileStatusTest.listFilesOnEmptyDir:119 » FileNotFound [ERROR] ITestMagicCommitProtocol>AbstractITCommitProtocol.testCommitJobButNotTask:1004->AbstractITCommitProtocol.executeWork:552->AbstractITCommitProtocol.executeWork:568->AbstractITCommitProtocol.lambda$testCommitJobButNotTask$9:1010 » FileNotFound [INFO] [ERROR] Tests run: 1204, Failures: 0, Errors: 5, Skipped: 342

[ERROR] Errors: [ERROR] ITestS3AContractRootDir.testListEmptyRootDirectory:82->AbstractContractRootDirectoryTest.testListEmptyRootDirectory:199 » FileNotFound [ERROR] ITestS3AContractRootDir>AbstractContractRootDirectoryTest.testSimpleRootListing:239 » FileNotFound [ERROR] ITestS3AEncryptionSSEC.testListEncryptedDir:197 » FileNotFound No such file or... [INFO] [ERROR] Tests run: 110, Failures: 0, Errors: 3, Skipped: 87

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 40s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 23m 35s trunk passed
+1 💚 compile 0m 41s trunk passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 compile 0m 32s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 checkstyle 0m 21s trunk passed
+1 💚 mvnsite 0m 37s trunk passed
+1 💚 shadedclient 18m 23s branch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 33s hadoop-aws in trunk failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 29s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+0 🆗 spotbugs 1m 12s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 1m 9s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 41s the patch passed
+1 💚 compile 0m 39s the patch passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 javac 0m 39s the patch passed
+1 💚 compile 0m 32s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 javac 0m 32s the patch passed
-0 ⚠️ checkstyle 0m 22s hadoop-tools/hadoop-aws: The patch generated 5 new + 15 unchanged - 1 fixed = 20 total (was 16)
+1 💚 mvnsite 0m 36s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 17m 7s patch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 33s hadoop-aws in the patch failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 27s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 findbugs 1m 16s the patch passed
_ Other Tests _
+1 💚 unit 1m 31s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 32s The patch does not generate ASF License warnings.
73m 38s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/artifact/out/Dockerfile
GITHUB PR #2038
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux 7e3adea5ae02 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 84110d8
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_252-8u252-b09-1~18.04-b09
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
checkstyle https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/artifact/out/diff-checkstyle-hadoop-tools_hadoop-aws.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/testReport/
Max. process+thread count 454 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/3/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

@steveloughran
Copy link
Contributor

yes, I think we will be needed the unguarded operation to work too.

FWIW I do regularly test on both options. Try a -Dscale run too for even more coverage

@mukund-thakur
Copy link
Contributor Author

Fixed the raw test failures. Also ran the scale tests. All went fine apart from some timeout failures.
I think this is almost done but I would like others @bgaborg and @mehakmeet as well to run tests on their configs to achieve good confidence in merging this. Thanks

@mehakmeet
Copy link
Contributor

All tests running fine other than getting timeout error in ITestS3AContractRootDir. Tested on East US, West US region. There is a checkstyle issue. Other than that, LGTM.

@apache apache deleted a comment from hadoop-yetus Jul 6, 2020
@steveloughran
Copy link
Contributor

good to hear you are happy. What do others say?

@hadoop-yetus
Copy link

💔 -1 overall

Vote Subsystem Runtime Comment
+0 🆗 reexec 0m 33s Docker mode activated.
_ Prechecks _
+1 💚 dupname 0m 0s No case conflicting files found.
+1 💚 @author 0m 0s The patch does not contain any @author tags.
+1 💚 test4tests 0m 0s The patch appears to include 3 new or modified test files.
_ trunk Compile Tests _
+1 💚 mvninstall 19m 23s trunk passed
+1 💚 compile 0m 43s trunk passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 compile 0m 37s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 checkstyle 0m 28s trunk passed
+1 💚 mvnsite 0m 40s trunk passed
+1 💚 shadedclient 15m 4s branch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 32s hadoop-aws in trunk failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 30s trunk passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+0 🆗 spotbugs 1m 1s Used deprecated FindBugs config; considering switching to SpotBugs.
+1 💚 findbugs 0m 59s trunk passed
_ Patch Compile Tests _
+1 💚 mvninstall 0m 33s the patch passed
+1 💚 compile 0m 33s the patch passed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04
+1 💚 javac 0m 33s the patch passed
+1 💚 compile 0m 29s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 javac 0m 29s the patch passed
+1 💚 checkstyle 0m 19s hadoop-tools/hadoop-aws: The patch generated 0 new + 15 unchanged - 1 fixed = 15 total (was 16)
+1 💚 mvnsite 0m 31s the patch passed
+1 💚 whitespace 0m 0s The patch has no whitespace issues.
+1 💚 shadedclient 13m 32s patch has no errors when building and testing our client artifacts.
-1 ❌ javadoc 0m 28s hadoop-aws in the patch failed with JDK Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.
+1 💚 javadoc 0m 25s the patch passed with JDK Private Build-1.8.0_252-8u252-b09-1~18.04-b09
+1 💚 findbugs 1m 1s the patch passed
_ Other Tests _
+1 💚 unit 1m 21s hadoop-aws in the patch passed.
+1 💚 asflicense 0m 31s The patch does not generate ASF License warnings.
61m 20s
Subsystem Report/Notes
Docker ClientAPI=1.40 ServerAPI=1.40 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/5/artifact/out/Dockerfile
GITHUB PR #2038
Optional Tests dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle
uname Linux f855abc44374 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Build tool maven
Personality personality/hadoop.sh
git revision trunk / 639acb6
Default Java Private Build-1.8.0_252-8u252-b09-1~18.04-b09
Multi-JDK versions /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_252-8u252-b09-1~18.04-b09
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/5/artifact/out/branch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
javadoc https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/5/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkUbuntu-11.0.7+10-post-Ubuntu-2ubuntu218.04.txt
Test Results https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/5/testReport/
Max. process+thread count 452 (vs. ulimit of 5500)
modules C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws
Console output https://builds.apache.org/job/hadoop-multibranch/job/PR-2038/5/console
versions git=2.17.1 maven=3.6.0 findbugs=3.1.0-RC1
Powered by Apache Yetus 0.12.0 https://yetus.apache.org

This message was automatically generated.

Copy link
Contributor

@steveloughran steveloughran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

code is looking good, I think the next step for me is to pull this into my IDE and actually do a local build and test. Can you move to innerGetFileStatus and I'll check out and play with the updated PR tomorrow. This is probably good time to do a rebase too....this PR has been up for a few weeks.

// If file status was already passed, reuse it.
final S3AFileStatus fileStatus = status != null
? status
: (S3AFileStatus) getFileStatus(path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's go to innerGetFileStatus here so the invocation count is consistent and we can declare we don't want to fall back to the list operation, just StatusProbeEnum.HEAD_OR_DIR_MARKER. Saves a LIST call when listing a nonexistent path. Also lets us avoid that cast

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This won't work as explained in #2038 (comment)

}
// If we have reached here, it means either there are files
// in this directory or it is empty.
return listFilesAssumingDir;
} catch (AmazonClientException e) {
// TODO S3Guard: retry on file not found exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lets cut that TODO line

*/
private RemoteIterator<S3ALocatedFileStatus> getListFilesAssumingDir(
Path path,
boolean recursive, Listing.FileStatusAcceptor acceptor,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add on a new line

if (recursive) {
final PathMetadata pm = metadataStore.get(path, true);
if (pm != null) {
if (pm.isDeleted()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this a copy/paste of the other use of this? if so, what about we pull lines 4250-4255 into a new static method in org.apache.hadoop.fs.s3a.s3guard.S3Guard and use in both places. could even have a test, maybe?

resetMetricDiffs();
intercept(FileNotFoundException.class,
() -> fs.listFiles(dir, true));
verifyOperationCount(2, 2);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

with a move to innerGetFileStatus and only do the file/marker probes, we could get cost to (2, 1)

: (S3AFileStatus) getFileStatus(path);
if (fileStatus.isFile()) {
// if a status was given and it is a file.
if (status != null && status.isFile()) {
// simple case: File
LOG.debug("Path is a file");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if we are keeping this, include the path f in the log message, just to make more sense of it in a busy log

@steveloughran
Copy link
Contributor

merged to trunk. thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants