Skip to content

Make Argument Clinic aware of possible UBSan failures  #128180

Closed
@picnixz

Description

@picnixz

Feature or enhancement

A huge part of the CPython code base uses Argument Clinic for generating methods for classes or standalone functions. Generally, methods are then used as follows:

static PyMethodDef foo_methods[] = {
    _FOOMODULE_FOO_BAR_METHODDEF
    {NULL, NULL}
};

where each <MODULE>_<CLASS>_<METHOD>_METHODDEF is something like

PyDoc_STRVAR(_foomodule_Foo_bar__doc__,
"bar($self, /)\n"
"--\n"
"\n");

#define _FOOMODULE_FOO_BAR_METHODDEF 		\
    {										\
        "bar", 								\
        (PyCFunction)_foomodule_Foo_bar, 	\
        METH_NOARGS, 						\
        _foomodule_Foo_bar__doc__           \
    },

We also have

// @file Modules/foomodule.c

/*[clinic input]
_foomodule.Foo.bar

[clinic start generated code]*/

static PyObject *
_foomodule_Foo_bar_impl(FooObject *self)
/*[clinic end generated code: output=... input=...]*/
{
	Py_RETURN_NONE;
}

// @file Modules/clinic/foomodule.c.h

static PyObject *
_foomodule_Foo_bar_impl(FooObject *self);

static PyObject *
_foomodule_Foo_bar(FooObject *self, PyObject *Py_UNUSED(ignored))
{
    return _foomodule_Foo_bar_impl(self);
}

Now, in order to fix #111178, we generally convert non-AC clinic methods by changing FooObject * to PyObject * and doing a pointer cast in the function body. See #111178 for a wider discussion. The issue is that we cannot easily do the same for AC-generated code since the prototype is part of the auto-generated block.

We can do it if we specify self: self(type="PyObject *") instead of not specifying self in the AC input, but this is an overkill. Instead, AC should always generate two functions: an implementation function which takes a typed object (i.e., FooObject) and a public dispatcher which takes a PyObject *self and possibly unused arguments. Stated otherwise:

static PyObject *
-_foomodule_Foo_bar(FooObject *self, PyObject *Py_UNUSED(ignored))
+_foomodule_Foo_bar(PyObject *self, PyObject *Py_UNUSED(ignored))
{
-    return _foomodule_Foo_bar_impl(self);
+    return _foomodule_Foo_bar_impl((FooObject *)self);
}

I decided to open a separate issue to track this change orthogonally to the UBSan failures fixes that can be achieved without any AC changes. I hope that this will not deterioriate the performances of CPython too much.


Note: this does not only affect methods as above; this affects any method that is casted to PyCFunction or anything else with incompatible function pointer signatures.

Metadata

Metadata

Assignees

No one assigned

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions