Skip to content

Conversation

@ghost
Copy link

@ghost ghost commented Feb 5, 2019

  • Add docs
  • Format code
  • Include Python in integration tests (requires binding the JSON reader/writer from C++)
  • Validate performance?
  • Complete server bindings if approach makes sense

@wesm
Copy link
Member

wesm commented Feb 6, 2019

@lihalite glad to see you pushing this along. I think there's a JIRA already for Flight+Python so you can assign yourself that issue.

I'll defer to @pitrou on this code review -- if I can be of assistance please ping me and I will make time to review

@ghost ghost changed the title [WIP] Flight Python bindings [WIP] ARROW-3162: Flight Python bindings Feb 6, 2019
@codecov-io
Copy link

Codecov Report

Merging #3566 into master will increase coverage by 0.85%.
The diff coverage is 41.17%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #3566      +/-   ##
==========================================
+ Coverage   87.82%   88.67%   +0.85%     
==========================================
  Files         666      537     -129     
  Lines       82371    73408    -8963     
  Branches     1069        0    -1069     
==========================================
- Hits        72341    65098    -7243     
+ Misses       9919     8310    -1609     
+ Partials      111        0     -111
Impacted Files Coverage Δ
python/pyarrow/includes/common.pxd 75% <ø> (ø) ⬆️
python/pyarrow/lib.pxd 0% <ø> (ø) ⬆️
python/pyarrow/ipc.pxi 66.15% <41.17%> (-1.39%) ⬇️
go/arrow/math/uint64_amd64.go
go/arrow/memory/memory_avx2_amd64.go
js/src/enum.ts
go/arrow/array/builder.go
js/src/Arrow.node.ts
js/src/schema.ts
go/arrow/type_traits_boolean.go
... and 123 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4004b72...1977c45. Read the comment docs.

@pitrou
Copy link
Member

pitrou commented Feb 7, 2019

Sorry for the delay. I have a flu, but will try to review this shortly.

Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this PR is still work in progress, so consider this a preliminary review. Thanks for doing this :-)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you defining those two different classes _CRecordBatchReader and _RecordBatchReader?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will do some more refactoring here, but I'd like to have other bindings of arrow::RecordBatchReader that share the same Python interface. The existing bindings are fairly tied to binding RecordBatchStreamReader even when they don't need to be. Properly, _CRecordBatchReader should be AbstractRecordBatchReader or something similar and _RecordBatchReader should be _RecordBatchStreamReader, if that sounds OK.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are you defining those two different classes _CRecordBatchWriter and _RecordBatchWriter?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should certainly run without the GIL, no?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah good point. One thing I also want to circle back to is this breaks Ctrl-C on the Python side currently.

@ghost
Copy link
Author

ghost commented Feb 7, 2019

Sorry for the delay. I have a flu, but will try to review this shortly.

No worries, hope you feel better soon, and thank you for the comments!

@xhochy
Copy link
Member

xhochy commented Feb 8, 2019

@lihalite Can you rebase on master or would you like help with that?

@ghost
Copy link
Author

ghost commented Feb 8, 2019

@xhochy I can, will do so in a moment (sorry, didn't notice what just went in)

@wesm
Copy link
Member

wesm commented Feb 11, 2019

I had to squash and rebase due to the CMake file reformatting. I will review this also -- rather than try to boil the ocean in a single PR I suggest fixing any obvious problems with this patch and then opening a bunch of JIRAs so that the remaining work can proceed in parallel

@ghost
Copy link
Author

ghost commented Feb 11, 2019

Sounds good, thanks. The one thing I want to fix is the _CRecordBatchReader situation, then I'll open JIRAs for integration testing/completing the server side bindings. (I've been busy with other work but hopefully can pick this back up soon.)

@ghost
Copy link
Author

ghost commented Feb 12, 2019

Okay, if the changes in ipc.pxi around _CRecordBatchReader and _CRecordBatchWriter make sense, this is ready; I'll follow up with the rest soon.

David Li and others added 4 commits February 13, 2019 10:48
Add simple demo Flight client in Python

Add basic docstrings for Python Flight client

Move RecordBatchReader/Writer wrappers to lib.pxd

Link to PROTOBUF_LIBRARY, not protobuf_static

Update for FlightPutWriter

Start binding Flight server to Python

Allow Flight DoPut to be implemented in Python

Revert unnecessary build process changes

Fix failing pyarrow unit tests

Lint Python flight code

Lint Cython code

Delete unnecessary Flight pxd

Use Arrow utilities for interfacing with Python in Flight

Simplify API of Python Flight bindings

Add Python enum to mirror DescriptorType in Flight

Replace static factory methods with constructors
@ghost
Copy link
Author

ghost commented Feb 13, 2019

Rebased to fix conflict.

@pitrou pitrou changed the title [WIP] ARROW-3162: Flight Python bindings ARROW-3162: Flight Python bindings Feb 13, 2019
Copy link
Member

@pitrou pitrou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks mostly fine to me. Are you going to submit tests in a separate PR?

result = (<object> self).get_flight_info(descriptor)
if not result:
info.reset(NULL)
# TODO:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there something missing here and below? I suppose you should convert result to a CFlightInfo?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't finished binding the server methods yet - I'll change this to raise NotImplementedError.

'six >= 1.0.0',
'futures; python_version < "3.2"'
'futures; python_version < "3.2"',
'enum34 >= 1.1.6; python_version < "3.4"',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You must also update requirements.txt.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

@pitrou pitrou closed this in 148213e Feb 14, 2019
@wesm
Copy link
Member

wesm commented Feb 14, 2019

Thanks @lihalite! Seems like there are a number of follow ups, are there JIRA issues for these yet?

@ghost
Copy link
Author

ghost commented Feb 14, 2019

I will follow up those now @wesm. Thanks everyone for the help here!

@ghost
Copy link
Author

ghost commented Feb 14, 2019

Looks like some have already been filed, here are the relevant issues:

https://issues.apache.org/jira/browse/ARROW-3150 (ship wheels)
https://issues.apache.org/jira/browse/ARROW-3162 (complete server side)
https://issues.apache.org/jira/browse/ARROW-4573 (unit tests)
https://issues.apache.org/jira/browse/ARROW-4575 (integration tests)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants