Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: fix improper use of C-API #12524

Merged
merged 7 commits into from
Dec 14, 2018
Merged

BUG: fix improper use of C-API #12524

merged 7 commits into from
Dec 14, 2018

Conversation

mattip
Copy link
Member

@mattip mattip commented Dec 10, 2018

In running tests on PyPy, there were failures due to improper use of the C-API:

  • Calling InitOperators to initialize the generated ufuncs before calling PyType_Ready(PyUFunc_Type)
  • Using the PySequence_Fast* functions before calling PySequence_Fast which works on CPython but does not always work on PyPy

We could prevent these mistakes by getting PyPy into our CI testing, but there are still test failures due to

  • insufficient ctypes and memoryview support in PyPy
  • late writing docstrings into types and objects via _multiarray_umath.add_docstring

@mattip mattip added 00 - Bug 09 - Backport-Candidate PRs tagged should be backported labels Dec 10, 2018
@mattip
Copy link
Member Author

mattip commented Dec 10, 2018

Something is wrong with LGTM analysis.

if (seq == NULL) {
return -1;
}
Py_DECREF(*out_kwd_obj);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can use Py_SETREF(*out_kwd_obj, seq) here

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this now formally supported? I don't see it in https://docs.python.org/3.7/c-api/refcounting.html

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Searching those docs, Py_SETREF was mentioned in the 3.6 release notes, it was used extensively in solving bpo-20440. It seems the documentation has yet to catch up.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More relevantly though, I added a backport of it, in npy_compat I beleive

@@ -93,8 +93,16 @@ PyUFuncOverride_GetOutObjects(PyObject *kwds, PyObject **out_kwd_obj, PyObject *
return 0;
}
if (PyTuple_CheckExact(*out_kwd_obj)) {
*out_objs = PySequence_Fast_ITEMS(*out_kwd_obj);
return PySequence_Fast_GET_SIZE(*out_kwd_obj);
PyObject *seq = PySequence_Fast(*out_kwd_obj, "cannot convert");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So on pypy, exact tuples are not also always fast sequences?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They need conversion to a "fast strategy tuple". PyPy will convert a tuple of like items (all int) to an array of unboxed values. In order to support fast access, PyPy must box the values into an array of PyObject*, and build a CPython compatible PyTupleObject. Since this is expensive, it is only done when calling PySequence_Fast()

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that the PyPy implementation of PySequence_FAST contradicts the CPython docs (emphasis mine):

PyObject* PySequence_Fast(PyObject *o, const char *m)
Return value: New reference.

Return the sequence or iterable o as a list, unless it is already a tuple or list, in which case o is returned. Use PySequence_Fast_GET_ITEM() to access the members of the result. Returns NULL on failure. If the object is not a sequence or iterable, raises TypeError with m as the message text.

So I think we need to emphasize in the comment that the documentation for PySequence_Fast does not apply to PyPy.

It might be worth contacting the CPython devs and recommending they use weaker language in their documentation, to not make this guarantee.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mhvk
Copy link
Contributor

mhvk commented Dec 10, 2018

Be sure to add some comments for the _Fast that this is needed for PyPy!

@mattip mattip added this to the 1.16.0 release milestone Dec 11, 2018
@@ -4570,6 +4570,9 @@ PyMODINIT_FUNC init_multiarray_umath(void) {
*/
PyArray_Type.tp_hash = PyObject_HashNotImplemented;

if (PyType_Ready(&PyUFunc_Type) < 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: I think we never do the if without curly braces.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except of course we did it in the place where you moved this from. So perhaps my sense of the style guide is wrong...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've even got if without the regular parentheses in a few rare cases thanks to macros substituting them in

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the convention is always use curly braces. Other case should be fixed when encountered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -4580,8 +4584,9 @@ PyMODINIT_FUNC init_multiarray_umath(void) {
}
initialize_casting_tables();
initialize_numeric_types();
if(initscalarmath(m) < 0)
if(initscalarmath(m) < 0) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: space after if

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed throughout the file

/*
* The C-API recommends calling PySequence_Fast before any of the other
* PySequence_Fast* functions. This is required for PyPy
*/
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you might want to move this comment and the PySequence_Fast call to before the /* Recursive case ... */ comment

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

PyObject **in_objs, **out_objs;

/* check inputs */
nin = PyTuple_Size(args);
if (nin < 0) {
return -1;
}
in_objs = PySequence_Fast_ITEMS(args);
fast = PySequence_Fast(args, "could not convert args for fast access");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Be nice to make all the error messages the same. Maybe "Cannot convert args for fast access" or some such if length is a problem.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unified the message

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missed one in ufunc_override.c.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, fixed that too

@charris charris merged commit 8814211 into numpy:master Dec 14, 2018
@charris
Copy link
Member

charris commented Dec 14, 2018

Some rainy day I'm going to fix all those long lines, but not today. Thanks Matti.

}
*out_objs = PySequence_Fast_ITEMS(seq);
ret = PySequence_Fast_GET_SIZE(seq);
Py_SETREF(*out_kwd_obj, seq);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above says that *out_kwd_obj is a borrowed reference, but as a result of PySequence_Fast, it's now a new reference, so the caller needs to decref it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, Py_SETREF is incorrect here - it's not safe to decref *out_kwd_obj, as you don't own it. Simply

*out_kwd_obj = seq

is what you want here

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. But now seq may not be a borrowed ref. I changed the interface so that *out_kwd_obj is never a borrowed ref.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that tests pass indicates there may be a leak of kwds['out'] somewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests pass because you were swapping a refcount between seq and the original out_kwd_obj. In CPython, those objects happen to be the same, so you got away with it.

@eric-wieser
Copy link
Member

eric-wieser commented Dec 14, 2018

@charris: Looks like we missed each other here. This patch has refcount bugs (probably only in PyPy), which need fixing.

@charris
Copy link
Member

charris commented Dec 14, 2018

@eric-wieser Thanks for checking.

@mattip mattip deleted the pypy-cleanup branch December 14, 2018 07:36
mattip added a commit to mattip/numpy that referenced this pull request Dec 14, 2018
@mattip
Copy link
Member Author

mattip commented Dec 14, 2018

xref #12544

@charris charris removed the 09 - Backport-Candidate PRs tagged should be backported label Dec 16, 2018
@charris charris removed this from the 1.16.0 release milestone Dec 16, 2018
charris added a commit that referenced this pull request Dec 17, 2018
BUG: fix refcount issue caused by #12524
charris pushed a commit to charris/numpy that referenced this pull request Dec 17, 2018
charris added a commit that referenced this pull request Dec 17, 2018
mozillazg pushed a commit to mozillazg/pypy that referenced this pull request Jul 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants