HADOOP-16916: ABFS: Delegation SAS generator for integration with Ranger #1965
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Thanks for the heads-up, and I'll update this after PR 1956 is merged. Yes, this is a big patch and all of it is related to enabling Delegation SAS support for Apache Ranger. I considered breaking it up into multiple JIRAs but some changes have dependencies between each other. Most of it is testing.
ok, conflicting patch is in
Thanks for the update, I merged and pushed the update. All tests passing against my eastus2euap account:
$ mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Minor comments added. Also latest Yetus has 2 checkstyle issues for parameter count. Please check.
hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/AbfsConfiguration.java
...azure/src/test/java/org/apache/hadoop/fs/azurebfs/ITestAzureBlobFileSystemDelegationSAS.java
Previously we only had a SASGenerator class which generated Service SAS, but I need to add DelegationSASGenerator. I broke SASGenerator out into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator. The code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new. The DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used by an authorization service such as Apache Ranger. Adding this to the tests helps us lock in this behavior.
Added a MockDelegationSASTokenProvider for testing User Delegation SAS.
Enable the Check Access API by default and fix the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that is not configured.
To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds. After this a new SAS will be requested. The default period of 120 seconds can be changed using the configuration setting "fs.azure.sas.token.renew.period.for.streams".
The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these operations must be provided tokens with appropriate SAS parameters to succeed.
Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.
The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API, which requires read permission, while the getFileStatus call only requires execute permission. The ADLS Gen2 Get Status API is supposed to be used for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties parameter which is set to false for getFileStatus and true for getXAttr.
Added SASTokenProvider support for delete recursive.
Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified. This is necessary to avoid passing null paths and to convert relative paths into absolute paths.
Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires that the path in the URL and the path in the SAS token match. Internally the code was sometimes using "//" instead of "/" for the root path. Also related to this, the AzureBlobFileSystemStore.getRelativePath API was updated so that we no longer remove and then add back a leading forward slash on paths.
1) Reverting default version back to 2018-11-09. To run ITestAzureBlobFileSystemDelegationSAS.testList you need to temporarily set this to 2019-12-12 due to a server-side bug that will be fixed in the weeks to come.
2) AzureBlobFileSystem.getFileStatus is currently calling the GetAccessControl REST API. My previous PR changed this to call the GetStatus REST API, but that change depends on a server-side bug fix that will not be available for a few weeks. We'll postpone it until later and continue calling GetAccessControl for now.
With these changes, all tests are passing.
Results for my production account in USWest2 without Delegation SAS tests:
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 41
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
Results for my test account in eastus2euap with Delegation SAS tests:
Tests run: 63, Failures: 0, Errors: 0, Skipped: 0
Tests run: 432, Failures: 0, Errors: 0, Skipped: 33
Tests run: 206, Failures: 0, Errors: 0, Skipped: 24
…ge when hierarchical namespace is disabled. It turns out that when it is disabled, the directory query parameter used by the List Paths API must not start with a forward slash '/'. The continuation token is also affected by this, so I have fixed both of these issues. I also discovered that ITestAzureBlobFileSystemAuthorization tests that use ACLs were missing Assume.assumeTrue(this.getFileSystem().getIsNamespaceEnabled()) so I have added that. Finally, I have changed the default for fs.azure.enable.check.access back to false and we will wait for the next release of ADLS Gen2 REST API before changing this to true.
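For illustration, the directory parameter rule described above could be sketched roughly as follows. This is not the code from the patch: the class name ListDirectoryParamSketch and the method toDirectoryParam are hypothetical, and mapping a null or empty path to an empty string is an assumption made for this sketch.

```java
/** Hypothetical sketch; class and method names are not taken from the patch. */
final class ListDirectoryParamSketch {
  private ListDirectoryParamSketch() {
  }

  /**
   * Builds a value for the "directory" query parameter of List Paths
   * that never starts with a forward slash.
   * Assumption: a null or empty path maps to the empty string.
   */
  static String toDirectoryParam(String relativePath) {
    if (relativePath == null || relativePath.isEmpty()) {
      return "";
    }
    // Drop a single leading '/' so that "/a/b" becomes "a/b" and "/" becomes "".
    return relativePath.charAt(0) == '/' ? relativePath.substring(1) : relativePath;
  }
}
```

With this sketch, the root path "/" yields an empty directory parameter, and a path that already lacks a leading slash passes through unchanged.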
java* javax* any non-org.apache imports any org.apache.* imports
🎊 +1 overall
This message was automatically generated. |
commit b214bbd |
Previously we only had a SASGenerator class which generated Service SAS, but I need to add DelegationSASGenerator. I broke SASGenerator out into a base class and two subclasses, ServiceSASGenerator and DelegationSASGenerator. The code in ServiceSASGenerator is copied from SASGenerator but the DelegationSASGenerator code is new. The DelegationSASGenerator code demonstrates how to use Delegation SAS with minimal permissions, as would be used by an authorization service such as Apache Ranger. Adding this to the tests helps us lock in this behavior.
Added a MockDelegationSASTokenProvider for testing User Delegation SAS.
Enable the Check Access API by default and fix the ITestAzureBlobFileSystemCheckAccess tests to assume oauth client ID so that they are ignored when that is not configured.
To improve performance, AbfsInputStream/AbfsOutputStream re-use SAS tokens until the expiry is within 120 seconds. After this a new SAS will be requested. The default period of 120 seconds can be changed using the configuration setting "fs.azure.sas.token.renew.period.for.streams".
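The token re-use rule above can be sketched as a small cache. This is only an illustration under stated assumptions, not the AbfsInputStream/AbfsOutputStream implementation: the names CachedSasTokenSketch and Token, the Supplier-based provider, and passing the current time in explicitly are all hypothetical choices made to keep the example self-contained.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Supplier;

/** Hypothetical sketch of SAS token re-use; not the actual ABFS classes. */
final class CachedSasTokenSketch {
  /** A token value with its expiry time (assumed shape for this sketch). */
  static final class Token {
    final String value;
    final Instant expiry;

    Token(String value, Instant expiry) {
      this.value = value;
      this.expiry = expiry;
    }
  }

  private final Duration renewPeriod;
  private final Supplier<Token> provider;
  private Token current;

  CachedSasTokenSketch(Duration renewPeriod, Supplier<Token> provider) {
    this.renewPeriod = renewPeriod;
    this.provider = provider;
  }

  /**
   * Returns the cached token unless its expiry is within renewPeriod of
   * "now", in which case a new token is requested from the provider.
   */
  synchronized String get(Instant now) {
    if (current == null || !now.isBefore(current.expiry.minus(renewPeriod))) {
      current = provider.get();
    }
    return current.value;
  }
}
```

With a renew period of 120 seconds, a token that expires at T is served until T minus 120 seconds; the first call at or after that instant asks the provider for a new one.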
The SASTokenProvider operation names were updated to correspond better with the ADLS Gen2 REST API, since these operations must be provided tokens with appropriate SAS parameters to succeed.
Support for the version 2.0 AAD authentication endpoint was added to AzureADAuthenticator.
The getFileStatus method was mistakenly calling the ADLS Gen2 Get Properties API, which requires read permission, while the getFileStatus call only requires execute permission. The ADLS Gen2 Get Status API is supposed to be used for this purpose, so the underlying AbfsClient.getPathStatus API was updated with an includeProperties parameter which is set to false for getFileStatus and true for getXAttr.
Added SASTokenProvider support for delete recursive.
Fixed bugs in AzureBlobFileSystem where public methods were not validating the Path by calling makeQualified. This is necessary to avoid passing null paths and to convert relative paths into absolute paths.
Canonicalized the path used for root path internally so that root path can be used with SAS tokens, which requires that the path in the URL and the path in the SAS token match. Internally the code was sometimes using "//" instead of "/" for the root path. Also related to this, the AzureBlobFileSystemStore.getRelativePath API was updated so that we no longer remove and then add back a leading forward slash on paths.
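To make the "//" versus "/" mismatch concrete, here is a minimal sketch of one way to canonicalize a path before it is placed in both the URL and the SAS token. It is not the patch's implementation: RootPathSketch and canonicalize are hypothetical names, and collapsing every run of slashes (rather than only the root case) is an assumption of this sketch.

```java
/** Hypothetical sketch; not the AzureBlobFileSystemStore implementation. */
final class RootPathSketch {
  private RootPathSketch() {
  }

  /**
   * Collapses runs of '/' into a single '/', so "//" and "/" both yield "/".
   * Assumption: a null or empty path is treated as the root path.
   */
  static String canonicalize(String path) {
    if (path == null || path.isEmpty()) {
      return "/";
    }
    return path.replaceAll("/{2,}", "/");
  }
}
```

If the same canonical string is used when building the request URL and when requesting the SAS token, the two paths cannot differ by a doubled slash.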
All tests passing against my eastus2euap account:
$ mvn -T 1C -Dparallel-tests=abfs -Dscale -DtestsThreadCount=8 clean verify
[INFO] Tests run: 56, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 424, Failures: 0, Errors: 0, Skipped: 33
[WARNING] Tests run: 206, Failures: 0, Errors: 0, Skipped: 24