Description
Motivation
Pybind11 provides easy-to-use APIs for writing Python bindings. However, these bindings are often implemented with multiple levels of indirection, which can hurt performance. For example, class_::def_property and class_::def_property_readonly are currently implemented as:
- A PyCFunction wrapper capturing a capsule holding a detail::function_record
- ... wrapped in an instancemethod
- ... wrapped in a property stored in the class dictionary.
Compared to a getset_descriptor directly implemented with PyGetSetDef, this can be 5x-20x slower in both execution time and instruction count. Similarly, constructors implemented in pybind11 can be slower than a low-level constructor function bound directly to the type through the tp_init slot. Both problems show the need for a set of high-performance binding APIs in pybind11.
Design
The new API is called "class builder" because it employs the builder pattern:
pybind11::class_builder<Example>(m, "Example")
.def("method", [](Example& e) {})
.def_class("class_method", [](Example& e) {})
.def_static("static_method", []() { return 1; })
.def_attr_readonly("readonly_attr", []() { return 1; })
.build();
To create high-performance bindings for a class, we would call pybind11::class_builder instead of pybind11::class_ with almost exactly the same parameters. This gives us a class builder, which we can use to define a series of methods and attributes (note that these are getset_descriptors, not properties). After defining all the members of the class, we call the build method, which creates the Python class and returns a pybind11::class_<...> instance.
Internally, def_attr and def_attr_readonly are implemented as adding a new entry to the tp_getset slot, and def, def_class and def_static are implemented as adding a new entry to the tp_methods slot. The build method calls PyType_Ready on the type being built and wraps the ready PyTypeObject in a pybind11::class_<...> instance.
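As a rough illustration of the bookkeeping described above, here is a minimal standalone sketch. It does not use the Python C API: MethodEntry, GetSetEntry, BuiltType and class_builder_sketch are hypothetical stand-ins, and only the entry names are modelled. It assumes that each def call merely appends to a table, and that build appends the null sentinel CPython expects at the end of tp_methods and tp_getset before the type is finalized.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for PyMethodDef and PyGetSetDef. Only the name is
// modelled; the real structs also carry function pointers, flags and docs.
struct MethodEntry { const char* name; };
struct GetSetEntry { const char* name; };

// Hypothetical stand-in for the finished type. The vectors play the role of
// the sentinel-terminated tables that tp_methods and tp_getset point to.
struct BuiltType {
    std::vector<MethodEntry> tp_methods;
    std::vector<GetSetEntry> tp_getset;
};

// Walk a table until the null sentinel, the way a consumer of
// tp_methods / tp_getset would.
template <typename Entry>
std::size_t count_entries(const Entry* table) {
    std::size_t n = 0;
    while (table[n].name != nullptr) ++n;
    return n;
}

class class_builder_sketch {
public:
    // Each def* call only records an entry and returns *this for chaining.
    class_builder_sketch& def(const char* name) {
        methods_.push_back({name});
        return *this;
    }
    class_builder_sketch& def_attr_readonly(const char* name) {
        getset_.push_back({name});
        return *this;
    }

    // Finalize: terminate both tables with a null sentinel and hand them
    // over. The proposed build() would call PyType_Ready at this point.
    BuiltType build() {
        methods_.push_back({nullptr});
        getset_.push_back({nullptr});
        return BuiltType{std::move(methods_), std::move(getset_)};
    }

private:
    std::vector<MethodEntry> methods_;
    std::vector<GetSetEntry> getset_;
};
```

Deferring all work to build means both tables are complete before the type is readied. In a real implementation the tables would also have to outlive the type object, which this sketch sidesteps by storing them in the returned value.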
Misc
- With the class builder API, we are going to set the slots of the type object directly. Hence, we need to split cpp_function into two parts: the object wrapper and the dispatching logic. The class builder API doesn't need the former, but we'll keep it for backward compatibility.
- Should we have a separate class_builder API, or should we just rework the internals of pybind11::class_? Personally I think it depends on whether we are allowed to add new PyMethodDef and PyGetSetDef entries after calling PyType_Ready. I guess the answer is probably no, so a separate set of APIs is likely needed.
- Pybind11 functions are, in fact, closures capturing a detail::function_record, and it is challenging to make them work with PyMethodDef, because PyMethodDef has no field to store the closure pointer (PyGetSetDef has a closure pointer and does not have this problem). Perhaps we will need the closure functionality from libffi to solve this problem?
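To make the asymmetry in the last point concrete, here is a small standalone sketch. It does not use the Python C API: Obj, record, getter_fn and method_fn are hypothetical stand-ins that only mirror the shape of the getter and PyCFunction signatures, with plain ints as return values. The per-binding template trampoline at the end is one possible alternative to libffi, not something proposed above, and it only works when the bound function is known at compile time.

```cpp
#include <cassert>

// Hypothetical stand-in for PyObject.
struct Obj { int value; };

// Shaped like CPython's `getter`: receives an extra closure pointer.
using getter_fn = int (*)(Obj* self, void* closure);
// Shaped like CPython's `PyCFunction`: no closure pointer.
using method_fn = int (*)(Obj* self, Obj* args);

// Hypothetical record in the role of detail::function_record.
struct record { int (*impl)(Obj&); };

// A single generic trampoline can serve every getter, because the record
// travels through the closure pointer.
int getter_trampoline(Obj* self, void* closure) {
    return static_cast<record*>(closure)->impl(*self);
}

// A method callback has nowhere to receive the record, so this sketch bakes
// the target into the trampoline itself: one instantiation per binding.
template <int (*Impl)(Obj&)>
int method_trampoline(Obj* self, Obj* /*args*/) {
    return Impl(*self);
}

// Two example bound functions (hypothetical).
int get_value(Obj& o) { return o.value; }
int doubled(Obj& o) { return o.value * 2; }
```

The template approach cannot cover callables that carry runtime state, such as capturing lambdas, which is the case where generating closures at runtime would still be needed.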