-
Notifications
You must be signed in to change notification settings - Fork 926
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite remaining Python Arrow interop conversions using the C Data Interface #16548
Rewrite remaining Python Arrow interop conversions using the C Data Interface #16548
Conversation
@zeroshade nice work on |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Things seem fine to me, I don't have any significant comments. The complexity of this stuff is so high and there's a lot of "raw" objects like PyCapsule
and void*
everywhere... but I assume that's just the cost we have to pay for dealing with a C data interface.
Thanks! Could we add that "somewhat convoluted test case" to the unit tests so that it doesn't get missed in future changes? 😄 |
@zeroshade @paleolimbot I believe that this PR has uncovered a bug in nanoarrow. I raised apache/arrow-nanoarrow#587 to discuss that, but let me know if anything seems wrong here in how we've implemented things in libcudf. The failing test can be run with |
Based on @paleolimbot's comment here, perhaps what we should be doing is adding |
I think so! (Also will be fixed by apache/arrow-nanoarrow#588, which basically adds the same switch statement to nanoarrow so that you and/or future users don't have to deal with that 🙂 ) |
It seems like we need to map That said, adding the mapping does appear sufficient to resolve the issue. |
Not a typo, as you surmised we need to map If you can think of a way to streamline it, then that would be awesome, but otherwise this sounds like the right solution until we can incorporate just calling it on the nanoarrow side via the PR that @paleolimbot linked to |
/merge |
…C Data Interface (rapidsai#16548)" This reverts commit f955dd7.
Description
This PR rewrites all remaining parts of the Python interop code previously using Arrow C++ types to instead use the C Data Interface. With this change, we no longer require pyarrow in that part of the Cython code. There are further improvements that we should make to streamline the internals, but I would like to keep this changeset minimal since getting it merged unblocks progress on multiple fronts so that we can progress further in parallel.
Contributes to #15193
Checklist