-
-
Notifications
You must be signed in to change notification settings - Fork 31.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-37194: Add a new public PyObject_CallNoArgs() function #13890
Conversation
Why not have both Would it be possible to not change the docs here but do that on #13844 instead? |
First, it's for practical reasons. Having to including an internal header is more painful, especially in C extensions built by setup.py. Second, honestly, I expect zero benefit of using a static inline function rather than a regular function, in term of performance. It's a trade-off between usability and performance ;-) IMHO usuability matters more than such "micro-optimization". Note: this PR prepares CPython code base to move new vectorcall private declarations to the internal C API (add a new pycore_abstract.h header). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
Sure: done. I wasn't aware of your other PR. Feel free to replace "return None" by "raise an exception and return NULL" in your PR ;-) |
@jdemeyer: Would you mind to review this change? Are you ok to make PyObject_CallNoArg() public? |
LGTM. |
_PyObject_CallNoArg() exists for 3 years and I don't recall any complain against its name :-D _PyObject_CallNoArg() was first added as a macro by https://bugs.python.org/issue27809 change:
|
Names of private functions don't really matter. Allow me to propose |
You can say the same thing for every single micro-optimization, but I'm afraid that at some point all those micro-optimizations add up to something significant. Note that the benefit of an inline function (or macro) is not just removing the function call overhead but also more possibilities for code optimization. |
If it's an optimization, you have to come up with a benchmark (micro-benchmark in this case) to prove your point ;-) |
Sure thing: done. I squashed my commits to be able to update the commit message. I also added the new function in What's New in Python 3.8. Sorry, English is not my first language. In French, "0 items" is spelled "0 élément". It's counter-intuitive for me to add an S to an empty set :-) |
Could you please check how much this will increase the stack consumption? |
If neither performance nor stack consumption do not matter, you could call |
Coming up with benchmark results is easy. The hard part is interpreting them. For example, assume hypothetically that this change costs 1 nanosecond on average per call of |
It wouldn't be worth it. Usually, we require at least 10% faster with an optimization. A Python function call usually takes between 50 and 100 ns. |
Results:
Use testcapi_call_no_args.patch and stack_overflow-4.py attached to https://bugs.python.org/issue37194 Usage:
PyObject_CallFunctionObjArgs() isn't really efficient in term of stack consumption :-(
where small_stack is always allocated and uses 40 bytes. Let me use https://bugs.python.org/issue30866 script to measure stack consumption. |
The private C API must no be used by third party code. Adding a public function allows third party code to benefit of PyObject_CallNoArgs() better performance and lower stack usage. I still plan to convert _PyObject_CallNoArg() to PyObject_CallNoArgs(), but I would prefer to do it in a separated PR since https://bugs.python.org/issue37194 became more messy than what I expected :-) I made many assumptions which became wrong:
|
The main argument is for better performance. But I also prefer PyObject_CallNoArgs() for readability. I dislike having to pass NULL to call a function if the function takes no argument: PyObject_CallObject(callable, NULL), PyEval_CallObjectWithKeywords(callable, NULL), PyObject_CallFunction(callable, NULL), PyEval_CallFunction(callable, NULL), PyVectorcall_Call(callable, NULL, NULL). |
Ah, I forgot to mention: my plan is to focus on C extensions built by setup.py. My long term plan for them is to use the stable API, so stop using the private API. If we could switch to the stable API without losing performance, that would be great :-) |
To be honest, I don't think that this is a realistic goal. But that's a different discussion which doesn't belong on this PR. |
So then do you agree with #13887 after all? |
Jeroen:
I merged the PR. me:
Jeroen:
If I have to trade between performance and stable API, I will pick stable API. CPython should use the stable API that it provides. But you are right that it's is off-topic :-) And again, it's a long-term goal ;-) me:
Since I merged PR #13887, my idea of moving the vectorcall private API to the internal C API is no longer relevant. So I updated this PR to with a new commit:
@jdemeyer, @serhiy-storchaka: Does the updated PR make more sense? |
@markshannon: Would you mind to have a look? |
I think there are cases where a function is available both in the limited API and the cpython API but with a different implementation. Maybe |
I prefer to use directly private _PyObject_CallNoArg() in CPython and use public PyObject_CallNoArgs() in C extensions. IMHO it's less surprising than the magic PyThreadState_GET() which gets a different implementation depending which header file is included... See pycore_pystate.h:
I would prefer to avoid such hack. By the way, I'm trying to always use _PyThreadState_GET(), to ensure that we always use the optimized macro. |
@serhiy-storchaka: Oh, I missed 3.8 deadline. So yeah, it seems like this function has to wait for Python 3.9 sadly. I updated my PR. Would you mind to review it? |
You should inform the author of
In all seriousness, this hack actually looks like a good idea to me and I assumed that these kind of hacks where meant to be used more frequently. Having a single function name simplifies the API while still allowing optimizations if no stable ABI is needed. |
(Ooops, I just removed "needs backport to 3.8" label. This change must not land into 3.8.) |
Add a new public PyObject_CallNoArgs() function to the C API: call a callable Python object without any arguments. It is the most efficient way to call a callback without any argument. On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL) allocates 960 bytes on the stack per call, whereas PyObject_CallNoArgs(func) only allocates 624 bytes per call. It is excluded from stable ABI 3.8. Replace private _PyObject_CallNoArg() with public PyObject_CallNoArgs() in C extensions: _asyncio, _datetime, _elementtree, _pickle, _tkinter and readline.
I rebased my PR and squashed commits to update the commit message. I also moved the news entry to What's New in Python 3.9. Updated commit message:
@methane: Would you mind have to have a look? |
Conflicts happen all the time, that's the game :-) I just added a few lines of code. I reverted many doc changes to let you include them in your other PR. I let you handle the conflict. |
Thanks for reviews. I elaborated the rationale in the final change to better explain why a new function is needed: "It is the most efficient way to call a callback without any argument." |
But a very unfair game being played between a core developer and a non-core developer. I am writing that documentation not for myself, but for the benefit of all CPython users and developers. I have little personal interest in getting that documentation merged. I don't want to fix conflicts all the time. And now, you even introduced a difference between 3.8 and 3.9 so even the backport won't be automatic. |
It's common that I have to rebase a change 3 to 5 times. You created your PR 12 days ago and only got one conflict. But it's up to you to decide to update this PR or not. Since you wrote that you gave up on PR #13844, I created PR #14156 to make changes that you asked me to revert in this PR. |
…13890) Add a new public PyObject_CallNoArgs() function to the C API: call a callable Python object without any arguments. It is the most efficient way to call a callback without any argument. On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL) allocates 960 bytes on the stack per call, whereas PyObject_CallNoArgs(func) only allocates 624 bytes per call. It is excluded from stable ABI 3.8. Replace private _PyObject_CallNoArg() with public PyObject_CallNoArgs() in C extensions: _asyncio, _datetime, _elementtree, _pickle, _tkinter and readline.
…13890) Add a new public PyObject_CallNoArgs() function to the C API: call a callable Python object without any arguments. It is the most efficient way to call a callback without any argument. On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL) allocates 960 bytes on the stack per call, whereas PyObject_CallNoArgs(func) only allocates 624 bytes per call. It is excluded from stable ABI 3.8. Replace private _PyObject_CallNoArg() with public PyObject_CallNoArgs() in C extensions: _asyncio, _datetime, _elementtree, _pickle, _tkinter and readline.
Use PyObject_CallNoArgs and PyObject_CallOneArg to replace PyObject_CallFunctionObjArgs in lib-rt, since the new API costs less memory according to python/cpython#13890 (comment) Also remove the macro in pythonsupport.h since the two API are already covered by pythoncapi_compat.h.
Add a new public PyObject_CallNoArg() function to the C API: call a
callable Python object without any arguments.
Convert _PyObject_CallNoArg(), which became a static inline function
in Python 3.8, into a "regular function". For example,
PyObject_CallNoArg() is exported in DLL symbols on Windows.
https://bugs.python.org/issue37194