[Python] Use PyCapsule for communicating C Data Interface pointers at the Python level #34031

Closed
@jorisvandenbossche

Description

Describe the enhancement requested

Currently, the various _export_to_c / _import_from_c methods for working with the Arrow C Interface expect integers as arguments for the struct pointers. We could use PyCapsule objects for this instead of integers (or, certainly at first, in addition to them), inspired by a similar interface in DLPack.

DLPack provides a stable in-memory data structure that allows exchanging array data between frameworks. It essentially plays the same role for ndarrays (tensors) as the Arrow C interface does for Arrow-compatible (columnar) data. It also defines a stable C ABI with similar C struct definitions (header file).

In the DLPack project, apart from the stable C ABI struct, they also defined a Python specification (including a method name to access the protocol, i.e. __dlpack__); see https://dmlc.github.io/dlpack/latest/python_spec.html#implementation for the details. That specification does not return a raw pointer but a PyCapsule object (a Python object that represents an "opaque value", such as a pointer, and that can be used by C extensions to pass such values through Python code to other C code; see https://docs.python.org/3/c-api/capsule.html).

Some details based on their implementation:

  • The PyCapsule has a name, and the producer should set that to a well-defined value (giving some protection against receiving an arbitrary pointer)
  • The consumer renames the capsule (which gives some protection against the C Data pointer being consumed more than once)
  • The PyCapsule has a destructor defined that would call the release callback (in case the object never got consumed)

The proposal is to mimic what DLPack does in places where we now expect or return an integer pointer. The interface needs to differ from the current _export_to_c: we would now return a capsule, instead of taking the pointer to export into as a parameter of the method.

Component(s)

Python
