HADOOP-18501: ABFS: Partial read should add to throttling data: DRAFT #5109
…fsinputstream will retry with remaining data
…eated can be handled by the backend
💔 -1 overall
This message was automatically generated.
🎊 +1 overall
This message was automatically generated.
...ols/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsRestOperation.java
...zure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsClientThrottlingIntercept.java
hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestPartialRead.java
final byte[] b,
final int offset,
final int length) throws IOException {
final Long requiredLen = Math.min(length, contentLength - position);
The same is done in https://github.com/pranavsaxena-microsoft/hadoop/blob/partialReadThrottle2/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java#L482.
Also, AbfsInputStream decides how to loop for the remaining data based on how much data it received in one run. For example, if it had to read 4 MB but the server returned only 1 MB, it would then request the next 3 MB.
How?
- Increment fcursor (the field in AbfsInputStream that tracks where the cursor is in the file) by bytesRead (here 1 MB): https://github.com/pranavsaxena-microsoft/hadoop/blob/partialReadThrottle2/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java#L334
- Check whether it needs to loop further: https://github.com/pranavsaxena-microsoft/hadoop/blob/partialReadThrottle2/hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/services/AbfsInputStream.java#L270-L273
If we did not have this line computing requiredLen, and length were greater than the remaining content, the read API on the backend would still return only (contentLength - position) bytes. Since the response would be the same either way, it is better to send only (contentLength - position) in the request parameters, because the request parameters are used to update the throttling metrics.
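The clamping and looping described above can be sketched as follows. This is only an illustration, not the driver's code: the class `PartialReadSketch`, the `Backend` interface, and the `requestedSizes` list are hypothetical names introduced here, and the backend is simulated rather than a real ABFS call.

```java
import java.util.List;

/** Hypothetical sketch; names are illustrative and not part of the ABFS driver. */
final class PartialReadSketch {

  /** Simulated backend (assumption): returns how many bytes it served for a request. */
  interface Backend {
    int read(long position, int length);
  }

  /** Clamp the request so it never asks for bytes past the end of the file. */
  static int requiredLen(int length, long contentLength, long position) {
    return (int) Math.max(0L, Math.min((long) length, contentLength - position));
  }

  /**
   * Read until the clamped range is served, advancing the cursor by the bytes
   * actually received so that each retry asks only for the remaining bytes.
   * Records the size of every request in requestedSizes and returns the total read.
   */
  static long readFully(Backend backend, long position, int length,
      long contentLength, List<Integer> requestedSizes) {
    long cursor = position;
    int remaining = requiredLen(length, contentLength, position);
    while (remaining > 0) {
      requestedSizes.add(remaining);
      int bytesRead = backend.read(cursor, remaining);
      if (bytesRead <= 0) {
        break; // nothing served; stop instead of looping forever
      }
      cursor += bytesRead;
      remaining -= bytesRead;
    }
    return cursor - position;
  }
}
```

With a simulated backend that serves at most 1 MB per call, a 4 MB read issues requests of 4 MB, 3 MB, 2 MB, and 1 MB, which are the sizes that would feed the throttling metrics.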
@@ -112,5 +112,7 @@ public final class AbfsHttpConstants {
public static final char CHAR_STAR = '*';
public static final char CHAR_PLUS = '+';

public static final String CONNECTION_RESET = "Connection reset";
The JDK code doesn't expose any enum or constant for this; it uses a string literal itself: https://github.com/openjdk-mirror/jdk/blob/jdk8u/jdk8u/master/src/share/classes/java/net/SocketInputStream.java#L209
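As a rough illustration of why a string constant is needed, a connection reset can only be recognised by inspecting the exception message. The sketch below is an assumption about how such a check might look; the class and method names are hypothetical, and requiring a `SocketException` is a choice made for this example, not something confirmed by the patch.

```java
import java.io.IOException;
import java.net.SocketException;

/** Hypothetical sketch; not the check used in the patch. */
final class ConnectionResetSketch {

  /** Same literal as the CONNECTION_RESET constant added in the diff. */
  static final String CONNECTION_RESET = "Connection reset";

  /** True if the exception looks like a socket-level connection reset. */
  static boolean isConnectionReset(IOException e) {
    String message = e.getMessage();
    return e instanceof SocketException
        && message != null
        && message.contains(CONNECTION_RESET);
  }
}
```

Matching on the message is brittle by nature, which is why keeping the literal in one named constant is preferable to repeating it inline.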
What is the status/plan for this? I'm starting to look at the 3.3.6 feature set...
Workload testing still has to be done on this PR. I will request your review once that is done. Thanks.
Description of PR
JIRA: https://issues.apache.org/jira/browse/HADOOP-18501
Error Description:
For a partial read (due to account backend throttling), the ABFS driver retries, but the request is not added to the throttling metrics.
For a partial read with a connection-reset exception, the ABFS driver retries the full request, and it is not added to the throttling metrics.
Mitigation:
In case of a partial read, the ABFS driver should retry only for the remaining bytes, and the partial read should be added to the throttling metrics.
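One way to picture "adding a partial read to the throttling metrics" is the sketch below. It is hypothetical: the class `ThrottlingMetricsSketch` and its methods are invented for illustration, and counting the unread bytes as failed is an assumption, not a description of how the driver's throttling classes actually account for them.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of partial-read accounting; not the driver's implementation. */
final class ThrottlingMetricsSketch {
  private final AtomicLong bytesSuccessful = new AtomicLong();
  private final AtomicLong bytesFailed = new AtomicLong();

  /**
   * Record one read attempt. Bytes actually received count as successful;
   * the shortfall (requested minus received) counts as failed (assumption).
   */
  void recordRead(long requestedLen, long bytesRead) {
    bytesSuccessful.addAndGet(bytesRead);
    long shortfall = requestedLen - bytesRead;
    if (shortfall > 0) {
      bytesFailed.addAndGet(shortfall);
    }
  }

  long getBytesSuccessful() {
    return bytesSuccessful.get();
  }

  long getBytesFailed() {
    return bytesFailed.get();
  }
}
```

Under this assumption, a 4 MB request that returns 1 MB records 1 MB as successful and 3 MB as failed, and the retry for the remaining 3 MB is recorded as its own attempt.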
How was this patch tested?
Ran integration and unit tests on the following accounts:
Test results:
:::: AGGREGATED TEST RESULT ::::
HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:218->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=116).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 568, Failures: 1, Errors: 3, Skipped: 98
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestReadBufferManager.testPurgeBufferManagerForParallelStreams:85 [After closing all streams free list contents should match with [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
Expected size:<16> but was:<13> in:
<[0, 15, 14, 6, 5, 4, 11, 13, 9, 8, 10, 7, 12]>
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 333, Failures: 1, Errors: 1, Skipped: 54
HNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:329 » TestTimedOut test timed o...
[INFO]
[ERROR] Tests run: 568, Failures: 0, Errors: 1, Skipped: 54
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestReadBufferManager.testPurgeBufferManagerForParallelStreams:85 [After closing all streams free list contents should match with [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
Expected size:<16> but was:<11> in:
<[12, 7, 5, 4, 6, 8, 9, 10, 11, 13, 14]>
[INFO]
[ERROR] Tests run: 333, Failures: 1, Errors: 0, Skipped: 41
NonHNS-SharedKey
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 2
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:218->Assert.assertTrue:42->Assert.fail:89 There should not be any network I/O (elapsedTimeMs=124).
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:344->lambda$testAcquireRetry$6:345 » TestTimedOut
[INFO]
[ERROR] Tests run: 568, Failures: 1, Errors: 1, Skipped: 276
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestAbfsTerasort.test_110_teragen:244->executeStage:211->Assert.assertEquals:647->Assert.failNotEquals:835->Assert.fail:89 teragen(1000, abfs://testcontainer@pranavsaxenanonhns.dfs.core.windows.net/ITestAbfsTerasort/sortin) failed expected:<0> but was:<1>
[ERROR] ITestReadBufferManager.testPurgeBufferManagerForParallelStreams:85 [After closing all streams free list contents should match with [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
Expected size:<16> but was:<10> in:
<[7, 6, 15, 14, 11, 12, 10, 13, 9, 8]>
[ERROR] Errors:
[ERROR] ITestAbfsJobThroughManifestCommitter.test_0420_validateJob » OutputValidation ...
[ERROR] ITestAbfsManifestCommitProtocol.testCommitLifecycle » OutputValidation
abfs:/...
[ERROR] ITestAbfsManifestCommitProtocol.testCommitterWithDuplicatedCommit » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testConcurrentCommitTaskWithSubDir » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testMapFileOutputCommitter » OutputValidation ...
[ERROR] ITestAbfsManifestCommitProtocol.testOutputFormatIntegration » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testParallelJobsToAdjacentPaths » OutputValidation
[ERROR] ITestAbfsManifestCommitProtocol.testTwoTaskAttemptsCommit » OutputValidation
...
[INFO]
[ERROR] Tests run: 333, Failures: 2, Errors: 8, Skipped: 46
AppendBlob-HNS-OAuth
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] TestAccountConfiguration.testConfigPropNotFound:386->testMissingConfigKey:399 Expected a org.apache.hadoop.fs.azurebfs.contracts.exceptions.TokenAccessProviderException to be thrown, but got the result: : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider"
[INFO]
[ERROR] Tests run: 107, Failures: 1, Errors: 0, Skipped: 1
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] ITestAzureBlobFileSystemLease.testAcquireRetry:336 » TestTimedOut test timed o...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataContributor:84 » AccessDenied Operat...
[ERROR] ITestAzureBlobFileSystemOauth.testBlobDataReader:143 » AccessDenied Operation ...
[INFO]
[ERROR] Tests run: 568, Failures: 0, Errors: 3, Skipped: 98
[INFO] Results:
[INFO]
[ERROR] Failures:
[ERROR] ITestReadBufferManager.testPurgeBufferManagerForParallelStreams:85 [After closing all streams free list contents should match with [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
Expected size:<16> but was:<7> in:
<[0, 15, 14, 5, 7, 8, 9]>
[ERROR] Errors:
[ERROR] ITestAbfsTerasort.test_120_terasort:262->executeStage:206 » IO The ownership o...
[INFO]
[ERROR] Tests run: 333, Failures: 1, Errors: 1, Skipped: 54
Time taken: 50 mins 20 secs.
For code changes:
LICENSE, LICENSE-binary, NOTICE-binary files?