@@ -257,7 +257,8 @@ will have the URL `abfs://container1@abfswales1.dfs.core.windows.net/`
257257
258258
259259You can create a new container through the ABFS connector, by setting the option
260- ` fs.azure.createRemoteFileSystemDuringInitialization ` to ` true ` .
260+ `fs.azure.createRemoteFileSystemDuringInitialization` to `true`. However,
261+ this is not supported when the authentication type is SAS.
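
A minimal sketch of how this option might be set. Placing it in `core-site.xml` is an assumption for illustration; any Hadoop configuration source the cluster already uses should work the same way.

```xml
<!-- Illustrative only: enables container creation through the ABFS connector. -->
<!-- Per the text above, this is not supported when the auth type is SAS. -->
<property>
  <name>fs.azure.createRemoteFileSystemDuringInitialization</name>
  <value>true</value>
</property>
```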
261262
262263If the container does not exist, an attempt to list it with ` hadoop fs -ls `
263264will fail
@@ -317,8 +318,13 @@ driven by them.
317318
318319What can be changed is what secrets/credentials are used to authenticate the caller.
319320
320- The authentication mechanism is set in ` fs.azure.account.auth.type ` (or the account specific variant),
321- and, for the various OAuth options ` fs.azure.account.oauth.provider.type `
321+ The authentication mechanism is set in `fs.azure.account.auth.type` (or the
322+ account-specific variant). The possible values are SharedKey, OAuth, Custom
323+ and SAS. For the various OAuth options, use the config
324+ `fs.azure.account.oauth.provider.type`. The supported implementations are
325+ ClientCredsTokenProvider, UserPasswordTokenProvider, MsiTokenProvider and
326+ RefreshTokenBasedTokenProvider. An IllegalArgumentException is thrown if
327+ the specified provider type is not one of the supported implementations.
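
As a hedged sketch, selecting OAuth with the client credentials provider could look like the fragment below. The fully qualified package name `org.apache.hadoop.fs.azurebfs.oauth2` is an assumption based on the hadoop-azure module layout and is not stated in the text above; verify it against the release in use.

```xml
<!-- Illustrative only: choose the authentication mechanism. -->
<property>
  <name>fs.azure.account.auth.type</name>
  <value>OAuth</value>
</property>
<!-- Assumed package name; the class name itself is one of the supported providers listed above. -->
<property>
  <name>fs.azure.account.oauth.provider.type</name>
  <value>org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider</value>
</property>
```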
322328
323329All secrets can be stored in JCEKS files. These are encrypted and password
324330protected —use them or a compatible Hadoop Key Management Store wherever
@@ -350,6 +356,15 @@ the password, "key", retrieved from the XML/JCECKs configuration files.
350356*Note*: The source of the account key can be changed through a custom key provider;
351357one exists to execute a shell script to retrieve it.
352358
359+ A custom key provider class can be specified with the config
360+ `fs.azure.account.keyprovider`. If a key provider class is specified, it
361+ is used to get the account key. Otherwise the simple key provider is used,
362+ which reads the key specified for the config `fs.azure.account.key`.
363+
364+ To retrieve the key using a shell script, specify the path to the script in
365+ the config `fs.azure.shellkeyprovider.script`. The ShellDecryptionKeyProvider
366+ class uses the specified script to retrieve the key.
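
A minimal sketch of the shell script setup described above. The script path is hypothetical, and the package name `org.apache.hadoop.fs.azurebfs.services` is an assumption about where ShellDecryptionKeyProvider lives in hadoop-azure.

```xml
<!-- Illustrative only: use the shell-based key provider instead of the simple one. -->
<!-- The package name below is assumed, not taken from the text above. -->
<property>
  <name>fs.azure.account.keyprovider</name>
  <value>org.apache.hadoop.fs.azurebfs.services.ShellDecryptionKeyProvider</value>
</property>
<!-- Hypothetical path to the script that retrieves the key. -->
<property>
  <name>fs.azure.shellkeyprovider.script</name>
  <value>/path/to/retrieve-account-key.sh</value>
</property>
```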
367+
353368### <a name="oauth-client-credentials"></a> OAuth 2.0 Client Credentials
354369
355370OAuth 2.0 credentials of (client id, client secret, endpoint) are provided in the configuration/JCEKS file.
@@ -465,6 +480,13 @@ With an existing Oauth 2.0 token, make a request of the Active Directory endpoin
465480 Refresh token
466481 </description >
467482</property >
483+ <property >
484+ <name >fs.azure.account.oauth2.refresh.endpoint</name >
485+ <value ></value >
486+ <description >
487+ Refresh token endpoint
488+ </description >
489+ </property >
468490<property >
469491 <name >fs.azure.account.oauth2.client.id</name >
470492 <value ></value >
@@ -506,6 +528,13 @@ The Azure Portal/CLI is used to create the service identity.
506528 Optional MSI Tenant ID
507529 </description >
508530</property >
531+ <property >
532+ <name >fs.azure.account.oauth2.msi.endpoint</name >
533+ <value ></value >
534+ <description >
535+ MSI endpoint
536+ </description >
537+ </property >
509538<property >
510539 <name >fs.azure.account.oauth2.client.id</name >
511540 <value ></value >
@@ -542,6 +571,26 @@ and optionally `org.apache.hadoop.fs.azurebfs.extensions.BoundDTExtension`.
542571
543572The declared class also holds responsibility to implement retry logic while fetching access tokens.
544573
574+ ### <a name="delegationtokensupportconfigoptions"></a> Delegation Token Provider
575+
576+ A delegation token provider supplies the ABFS connector with delegation tokens
577+ and helps renew and cancel them by implementing the
578+ CustomDelegationTokenManager interface.
579+
580+ ``` xml
581+ <property >
582+ <name >fs.azure.enable.delegation.token</name >
583+ <value >true</value >
584+ <description >Make this true to use delegation token provider</description >
585+ </property >
586+ <property >
587+ <name >fs.azure.delegation.token.provider.type</name >
588+ <value >{fully-qualified-class-name-for-implementation-of-CustomDelegationTokenManager-interface}</value >
589+ </property >
590+ ```
591+ If delegation tokens are enabled and the config
592+ `fs.azure.delegation.token.provider.type` is not provided, an IllegalArgumentException is thrown.
593+
545594### Shared Access Signature (SAS) Token Provider
546595
547596A Shared Access Signature (SAS) token provider supplies the ABFS connector with SAS
@@ -691,6 +740,84 @@ Config `fs.azure.account.hns.enabled` provides an option to specify whether
691740Config `fs.azure.enable.check.access` needs to be set to true to enable
692741the AzureBlobFileSystem.access().
693742
743+ ### <a name="featureconfigoptions"></a> Primary User Group Options
744+ The group name that is part of FileStatus and AclStatus is set to the same
745+ value as the username if the following config is set to true:
746+ `fs.azure.skipUserGroupMetadataDuringInitialization`.
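
For illustration, the fragment below shows one way this flag could be turned on. The value is only an example, not a recommendation.

```xml
<!-- Illustrative only: report the username as the group name in FileStatus and AclStatus. -->
<property>
  <name>fs.azure.skipUserGroupMetadataDuringInitialization</name>
  <value>true</value>
</property>
```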
747+
748+ ### <a name="ioconfigoptions"></a> IO Options
749+ The following configs are related to read and write operations.
750+
751+ `fs.azure.io.retry.max.retries`: Sets the number of retries for IO operations.
752+ Currently this is used only for the server call retry logic, within the
753+ AbfsClient class as part of the ExponentialRetryPolicy. The value must be
754+ greater than or equal to 0.
755+
756+ `fs.azure.write.request.size`: Sets the write buffer size. Specify the value
757+ in bytes. The value should be between 16384 and 104857600, both inclusive
758+ (16 KB to 100 MB). The default value is 8388608 (8 MB).
759+
760+ `fs.azure.read.request.size`: Sets the read buffer size. Specify the value in
761+ bytes. The value should be between 16384 and 104857600, both inclusive (16 KB
762+ to 100 MB). The default value is 4194304 (4 MB).
763+
764+ `fs.azure.readaheadqueue.depth`: Sets the readahead queue depth in
765+ AbfsInputStream. If the configured value is negative, the readahead queue depth
766+ is set to Runtime.getRuntime().availableProcessors(). The default value
767+ is -1.
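
The IO options above could be combined as in the sketch below. The retry count and queue depth are arbitrary example values chosen for illustration; the buffer sizes fall inside the documented 16 KB to 100 MB range.

```xml
<!-- Illustrative values only; tune for the actual workload. -->
<property>
  <name>fs.azure.io.retry.max.retries</name>
  <value>10</value> <!-- example value, must be 0 or greater -->
</property>
<property>
  <name>fs.azure.write.request.size</name>
  <value>16777216</value> <!-- 16 MB, in bytes -->
</property>
<property>
  <name>fs.azure.read.request.size</name>
  <value>8388608</value> <!-- 8 MB, in bytes -->
</property>
<property>
  <name>fs.azure.readaheadqueue.depth</name>
  <value>4</value> <!-- example value; a negative value falls back to the processor count -->
</property>
```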
768+
769+ ### <a name="securityconfigoptions"></a> Security Options
770+ `fs.azure.always.use.https`: Enforces the use of HTTPS instead of HTTP when the
771+ flag is set to true. Irrespective of the flag, AbfsClient uses HTTPS if the
772+ secure scheme (ABFSS) is used or OAuth is used for authentication. The default
773+ value is true.
774+
775+ `fs.azure.ssl.channel.mode`: Initializes DelegatingSSLSocketFactory with the
776+ specified SSL channel mode. The value should be one of the enum
777+ DelegatingSSLSocketFactory.SSLChannelMode. The default value is
778+ DelegatingSSLSocketFactory.SSLChannelMode.Default.
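
A hedged sketch of the security options, spelled out with their documented defaults. Writing the channel mode as the bare enum constant name `Default` is an assumption about how the value is parsed.

```xml
<!-- Illustrative only: both values match the defaults described above. -->
<property>
  <name>fs.azure.always.use.https</name>
  <value>true</value>
</property>
<!-- Assumes the value is given as the enum constant name. -->
<property>
  <name>fs.azure.ssl.channel.mode</name>
  <value>Default</value>
</property>
```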
779+
780+ ### <a name="serverconfigoptions"></a> Server Options
781+ When the config `fs.azure.io.read.tolerate.concurrent.append` is set to true, the
782+ If-Match header sent to the server for read calls is set to `*`; otherwise it
783+ is set to the ETag. This is a mechanism to handle reads with optimistic
784+ concurrency.
785+ Refer to the following links for further information.
786+ 1. https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/read
787+ 2. https://azure.microsoft.com/de-de/blog/managing-concurrency-in-microsoft-azure-storage-2/
788+
789+ The listStatus API fetches the FileStatus information from the server page by
790+ page. The config `fs.azure.list.max.results` sets the maxResults URI
791+ param, which sets the page size (maximum results per call). The value should
792+ be > 0. The default value is 500. The server caps this
793+ parameter at 5000, so even if the config is above 5000 the response will only
794+ contain 5000 entries. Refer to the following link for further information.
795+ https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/list
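
As a sketch of the server options in this section, the fragment below tolerates concurrent appends on reads and raises the listing page size. Both values are examples chosen for illustration.

```xml
<!-- Illustrative only: send If-Match: * on reads instead of the ETag. -->
<property>
  <name>fs.azure.io.read.tolerate.concurrent.append</name>
  <value>true</value>
</property>
<!-- Example page size; values above 5000 are capped by the server at 5000 entries. -->
<property>
  <name>fs.azure.list.max.results</name>
  <value>2000</value>
</property>
```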
796+
797+ ### <a name="throttlingconfigoptions"></a> Throttling Options
798+ The ABFS driver can throttle read and write operations to achieve
799+ maximum throughput by minimizing errors. The errors occur when the account
800+ ingress or egress limits are exceeded and the server side throttles requests.
801+ Server-side throttling causes the retry policy to be used, but the retry policy
802+ sleeps for long periods of time, causing the total ingress or egress throughput
803+ to be as much as 35% lower than optimal. The retry policy is also after the
804+ fact, in that it applies after a request fails. On the other hand, the
805+ client-side throttling implemented here happens before requests are made and
806+ sleeps just enough to minimize errors, allowing optimal ingress and/or egress
807+ throughput. By default the throttling mechanism is enabled in the driver. It
808+ can be disabled by setting the config `fs.azure.enable.autothrottling`
809+ to false.
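
For illustration, client-side throttling could be switched off as shown below. Whether doing so helps depends on the workload; this is a sketch, not a recommendation.

```xml
<!-- Illustrative only: disable the client-side throttling that is on by default. -->
<property>
  <name>fs.azure.enable.autothrottling</name>
  <value>false</value>
</property>
```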
810+
811+ ### <a name="renameconfigoptions"></a> Rename Options
812+ `fs.azure.atomic.rename.key`: Directories for atomic rename support can be
813+ specified as comma-separated values in this config. The driver logs the
814+ following warning if the source of the rename belongs to one of the
815+ configured directories: "The atomic rename feature is not supported by the
816+ ABFS scheme; however, rename, create and delete operations are atomic if
817+ Namespace is enabled for your Azure Storage account."
818+
819+ The default value is "/hbase".
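
A minimal sketch of a comma-separated directory list for this config. The second directory, `/data/atomic-tables`, is a hypothetical path added only to show the list format.

```xml
<!-- Illustrative only: /hbase is the documented default; the second path is hypothetical. -->
<property>
  <name>fs.azure.atomic.rename.key</name>
  <value>/hbase,/data/atomic-tables</value>
</property>
```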
820+
694821### <a name="perfoptions"></a> Perf Options
695822
696823#### <a name="abfstracklatencyoptions"></a> 1. HTTP Request Tracking Options