ARROW-2661: [Python] Adding the ability to programmatically pass hdfs configuration key/value pairs via pyarrow
https://issues.apache.org/jira/browse/ARROW-2661
Both the JNI-based libhdfs and libhdfs3 support `hdfsBuilderConfSetStr`, so we can use it to pass arbitrary configuration values for an HDFS connection, similar to how hdfs3 (https://hdfs3.readthedocs.io/en/latest/hdfs.html) supports passing them.
I've added a param called `extra_conf` to facilitate it in pyarrow, such as:
```python
import pyarrow
conf = {"dfs.nameservices": "nameservice1",
        "dfs.ha.namenodes.nameservice1": "namenode113,namenode188",
        "dfs.namenode.rpc-address.nameservice1.namenode113": "hostname_of_server1:8020",
        "dfs.namenode.rpc-address.nameservice1.namenode188": "hostname_of_server2:8020",
        "dfs.namenode.http-address.nameservice1.namenode113": "hostname_of_server1:50070",
        "dfs.namenode.http-address.nameservice1.namenode188": "hostname_of_server2:50070",
        "hadoop.security.authentication": "kerberos"
        }
hdfs = pyarrow.hdfs.connect(host='nameservice1', driver='libhdfs3', extra_conf=conf)
```
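Because the HA keys follow a regular pattern, the dict can also be generated instead of typed by hand. The following is only a sketch: `build_ha_conf` is a hypothetical helper that is not part of this change or of pyarrow, and the default ports are simply the ones used in the example above.

```python
def build_ha_conf(nameservice, namenodes, rpc_port=8020, http_port=50070,
                  authentication=None):
    """Build HDFS HA client settings for a single nameservice.

    Hypothetical helper, not part of pyarrow. `namenodes` maps each
    namenode ID to its hostname; insertion order is preserved.
    """
    conf = {
        "dfs.nameservices": nameservice,
        # Joining a dict joins its keys, i.e. the namenode IDs.
        "dfs.ha.namenodes." + nameservice: ",".join(namenodes),
    }
    for node_id, hostname in namenodes.items():
        suffix = "{}.{}".format(nameservice, node_id)
        conf["dfs.namenode.rpc-address." + suffix] = "{}:{}".format(hostname, rpc_port)
        conf["dfs.namenode.http-address." + suffix] = "{}:{}".format(hostname, http_port)
    if authentication is not None:
        conf["hadoop.security.authentication"] = authentication
    return conf


conf = build_ha_conf(
    "nameservice1",
    {"namenode113": "hostname_of_server1", "namenode188": "hostname_of_server2"},
    authentication="kerberos",
)
```

The resulting `conf` holds the same seven key/value pairs as the hand-written dict and can be passed as `extra_conf` to `pyarrow.hdfs.connect` in the same way.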
Author: Matthew Topol <mtopol@factset.com>
Closes #2097 from zeroshade/configs and squashes the following commits:
047dd4b <Matthew Topol> forgot to use make format to fix the order of includes. oops
d27e3c3 <Matthew Topol> switching to unordered_map
858b44b <Matthew Topol> missed a flake8 spot
77eeae0 <Matthew Topol> Adding the ability to programmatically pass hdfs configuration key/value pairs in the C++ and via pyarrow