ARROW-2661: [Python] Adding the ability to programmatically pass hdfs configuration key/value pairs via pyarrow #2097
…ration key/value pairs in the C++ and via pyarrow
Codecov Report
 @@            Coverage Diff             @@
##           master    #2097      +/-   ##
==========================================
- Coverage   86.39%   86.36%   -0.03%     
==========================================
  Files         230      230              
  Lines       40488    40414      -74     
==========================================
- Hits        34979    34904      -75     
- Misses       5509     5510       +1
cpp/src/arrow/io/hdfs.h (outdated)
int port;
std::string user;
std::string kerb_ticket;
std::map<std::string, std::string> extra_conf;
Is ordering relevant? If not, please use an std::unordered_map
@xhochy Switched over to unordered_map as requested. Honestly, I should have thought to do that in the first place, haha. 😃
+1, LGTM
Thanks for adding this!
There's no documentation for this -- can we add to the docstring and/or Sphinx? Feel free to open a new JIRA so we don't forget.
https://issues.apache.org/jira/browse/ARROW-2661
Both the JNI-based libhdfs and libhdfs3 support hdfsBuilderConfSetStr, so we can use it to pass arbitrary configuration values for the HDFS connection, similar to how https://hdfs3.readthedocs.io/en/latest/hdfs.html supports passing them.
I've added a param called extra_conf to facilitate it in pyarrow, such as:
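A minimal sketch of what such a call could look like. It assumes that extra_conf is accepted as a dict of string keys and string values by the pyarrow.hdfs.connect entry point of that era (newer pyarrow releases expose a similar parameter on pyarrow.fs.HadoopFileSystem). The helper names build_extra_conf and connect_with_conf, the host and port, and the configuration keys are hypothetical placeholders for illustration, not values taken from this PR.

```python
def build_extra_conf(options):
    """Normalize arbitrary option values into the str -> str mapping
    that a string-based setter such as hdfsBuilderConfSetStr expects.

    Hypothetical helper, not part of pyarrow.
    """
    conf = {}
    for key, value in options.items():
        if not isinstance(key, str) or not key:
            raise ValueError("configuration keys must be non-empty strings")
        # Hadoop configuration files spell booleans in lowercase.
        if isinstance(value, bool):
            value = "true" if value else "false"
        conf[key] = str(value)
    return conf


def connect_with_conf(host, port, options):
    """Open an HDFS connection, forwarding extra key/value pairs.

    Assumption: pyarrow.hdfs.connect accepts an extra_conf keyword.
    pyarrow is imported lazily so the helper above works without it.
    """
    import pyarrow as pa

    return pa.hdfs.connect(host=host, port=port,
                           extra_conf=build_extra_conf(options))


if __name__ == "__main__":
    # Placeholder keys; substitute whatever the cluster actually needs.
    print(build_extra_conf({
        "dfs.replication": 2,
        "dfs.client.use.datanode.hostname": True,
    }))
```

Converting every value to a string up front keeps the Python side aligned with the C++ struct, which stores extra_conf as a map from std::string to std::string.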