Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-37194: Add a new public PyObject_CallNoArgs() function #13890

Merged
merged 1 commit into from
Jun 17, 2019
Merged

bpo-37194: Add a new public PyObject_CallNoArgs() function #13890

merged 1 commit into from
Jun 17, 2019

Conversation

vstinner
Copy link
Member

@vstinner vstinner commented Jun 7, 2019

Add a new public PyObject_CallNoArg() function to the C API: call a
callable Python object without any arguments.

Convert _PyObject_CallNoArg(), which became a static inline function
in Python 3.8, into a "regular function". For example,
PyObject_CallNoArg() is exported in DLL symbols on Windows.

https://bugs.python.org/issue37194

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

Why not have both PyObject_CallNoArgs and _PyObject_CallNoArgs? The latter is an inline function, which is slightly faster.

Would it be possible to not change the docs here but do that on #13844 instead?

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

Why not have both PyObject_CallNoArgs and _PyObject_CallNoArgs? The latter is an inline function, which is slightly faster.

First, it's for practical reasons. Having to including an internal header is more painful, especially in C extensions built by setup.py.

Second, honestly, I expect zero benefit of using a static inline function rather than a regular function, in term of performance.

It's a trade-off between usability and performance ;-) IMHO usuability matters more than such "micro-optimization".

Note: this PR prepares CPython code base to move new vectorcall private declarations to the internal C API (add a new pycore_abstract.h header).

Copy link
Contributor

@asvetlov asvetlov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

Would it be possible to not change the docs here but do that on #13844 instead?

Sure: done. I wasn't aware of your other PR. Feel free to replace "return None" by "raise an exception and return NULL" in your PR ;-)

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

@jdemeyer: Would you mind to review this change? Are you ok to make PyObject_CallNoArg() public?

@methane
Copy link
Member

methane commented Jun 7, 2019

LGTM.
No bike shedding is needed for naming before making it public? :)

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

No bike shedding is needed for naming before making it public? :)

_PyObject_CallNoArg() exists for 3 years and I don't recall any complain against its name :-D

_PyObject_CallNoArg() was first added as a macro by https://bugs.python.org/issue27809 change:

commit 559bb6a71399af3b1b2a0ba97230d2bcc649e993
Author: Victor Stinner <victor.stinner@gmail.com>
Date:   Mon Aug 22 22:48:54 2016 +0200

    Rename _PyObject_FastCall() to _PyObject_FastCallDict()
    
    Issue #27809:
    
    * Rename _PyObject_FastCall() function to _PyObject_FastCallDict()
    * Add _PyObject_FastCall(), _PyObject_CallNoArg() and _PyObject_CallArg1()
      macros calling _PyObject_FastCallDict()

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

_PyObject_CallNoArg() exists for 3 years and I don't recall any complain against its name

Names of private functions don't really matter.

Allow me to propose PyObject_CallNoArgs (with an s added at the end).

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

IMHO usuability matters more than such "micro-optimization".

You can say the same thing for every single micro-optimization, but I'm afraid that at some point all those micro-optimizations add up to something significant.

Note that the benefit of an inline function (or macro) is not just removing the function call overhead but also more possibilities for code optimization.

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

You can say the same thing for every single micro-optimization, but I'm afraid that at some point all those micro-optimizations add up to something significant.

If it's an optimization, you have to come up with a benchmark (micro-benchmark in this case) to prove your point ;-)

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

Allow me to propose PyObject_CallNoArgs (with an s added at the end).

Sure thing: done. I squashed my commits to be able to update the commit message. I also added the new function in What's New in Python 3.8.

Sorry, English is not my first language. In French, "0 items" is spelled "0 élément". It's counter-intuitive for me to add an S to an empty set :-)

@serhiy-storchaka
Copy link
Member

Could you please check how much this will increase the stack consumption?

@serhiy-storchaka
Copy link
Member

If neither performance nor stack consumption do not matter, you could call PyObject_CallFunctionObjArgs(callable, NULL) instead of PyObject_CallNoArgs(callable). I prefer to keep the C API minimal.

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

If it's an optimization, you have to come up with a benchmark (micro-benchmark in this case) to prove your point ;-)

Coming up with benchmark results is easy. The hard part is interpreting them. For example, assume hypothetically that this change costs 1 nanosecond on average per call of _PyObject_CallNoArg. Is that significant enough to say that the change is bad?

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

For example, assume hypothetically that this change costs 1 nanosecond on average per call of _PyObject_CallNoArg. Is that significant enough to say that the change is bad?

It wouldn't be worth it. Usually, we require at least 10% faster with an optimization. A Python function call usually takes between 50 and 100 ns.

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

If neither performance nor stack consumption do not matter, you could call PyObject_CallFunctionObjArgs(callable, NULL) instead of PyObject_CallNoArgs(callable). I prefer to keep the C API minimal.

Results:

test_python_call: 8311 calls before crash, stack: 1008 bytes/call
test_python_getitem: 10070 calls before crash, stack: 832 bytes/call
test_python_iterator: 8445 calls before crash, stack: 992 bytes/call
test_call_noargs: 13428 calls before crash, stack: 624 bytes/call
test_call_noargs2: 8728 calls before crash, stack: 960 bytes/call

=> total: 48982 calls, 4416 bytes
  • PyObject_CallNoArgs(func): 624 bytes/call
  • PyObject_CallFunctionObjArgs(func, NULL): 960 bytes/call (+ 336 bytes)

Use testcapi_call_no_args.patch and stack_overflow-4.py attached to https://bugs.python.org/issue37194 Usage:

git apply testcapi_call_no_args.patch
make
./python stack_overflow-4.py

PyObject_CallFunctionObjArgs() isn't really efficient in term of stack consumption :-(

static PyObject *
object_vacall(PyObject *base, PyObject *callable, va_list vargs)
{
    PyObject *small_stack[_PY_FASTCALL_SMALL_STACK];
    PyObject **stack;
    Py_ssize_t nargs;
    PyObject *result;
    Py_ssize_t i;
    va_list countva;

    ....
}

PyObject *
PyObject_CallFunctionObjArgs(PyObject *callable, ...)
{
    va_list vargs;
    PyObject *result;

    va_start(vargs, callable);
    result = object_vacall(NULL, callable, vargs);
    va_end(vargs);

    return result;
}

where small_stack is always allocated and uses 40 bytes.

Let me use https://bugs.python.org/issue30866 script to measure stack consumption.

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

What is the purpose of adding PyObject_CallNoArgs(), especially if we continue to use its private twin?

The private C API must no be used by third party code. Adding a public function allows third party code to benefit of PyObject_CallNoArgs() better performance and lower stack usage.

I still plan to convert _PyObject_CallNoArg() to PyObject_CallNoArgs(), but I would prefer to do it in a separated PR since https://bugs.python.org/issue37194 became more messy than what I expected :-)

I made many assumptions which became wrong:

  • static inline functions must not be used in the public C API. In fact, I already converted multiple macros into static inline functions!

  • static inline functions are the root issue of https://bugs.python.org/issue37191 Well, it seems like the main issuse is that some projects are compiled with a strange mixture of C standards: accept static inline from C99 but reject declaration after statement (C89/C90).

  • vectorcall APIs which are currently private must be internal. I read again Petr's message, and the intent seems to be to let third party code to experiment the new API through a private API: https://mail.python.org/pipermail/python-dev/2019-May/157753.html

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

Why we need yet one way?

The main argument is for better performance. But I also prefer PyObject_CallNoArgs() for readability.

I dislike having to pass NULL to call a function if the function takes no argument: PyObject_CallObject(callable, NULL), PyEval_CallObjectWithKeywords(callable, NULL), PyObject_CallFunction(callable, NULL), PyEval_CallFunction(callable, NULL), PyVectorcall_Call(callable, NULL, NULL).

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

I still plan to convert _PyObject_CallNoArg() to PyObject_CallNoArgs(), but I would prefer to do it in a separated PR since https://bugs.python.org/issue37194 became more messy than what I expected :-)

Ah, I forgot to mention: my plan is to focus on C extensions built by setup.py. My long term plan for them is to use the stable API, so stop using the private API. If we could switch to the stable API without losing performance, that would be great :-)

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

If we could switch to the stable API without losing performance, that would be great :-)

To be honest, I don't think that this is a realistic goal. But that's a different discussion which doesn't belong on this PR.

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 7, 2019

static inline functions are the root issue of https://bugs.python.org/issue37191 Well, it seems like the main issuse is that some projects are compiled with a strange mixture of C standards: accept static inline from C99 but reject declaration after statement (C89/C90).

So then do you agree with #13887 after all?

@vstinner
Copy link
Member Author

vstinner commented Jun 7, 2019

Jeroen:

So then do you agree with #13887 after all?

I merged the PR.

me:

If we could switch to the stable API without losing performance, that would be great :-)

Jeroen:

To be honest, I don't think that this is a realistic goal. But that's a different discussion which doesn't belong on this PR.

If I have to trade between performance and stable API, I will pick stable API. CPython should use the stable API that it provides. But you are right that it's is off-topic :-)

And again, it's a long-term goal ;-)

me:

Ah, I forgot to mention: my plan is to focus on C extensions built by setup.py. My long term plan for them is to use the stable API, so stop using the private API. If we could switch to the stable API without losing performance, that would be great :-)

Since I merged PR #13887, my idea of moving the vectorcall private API to the internal C API is no longer relevant. So I updated this PR to with a new commit:

    Use the public function in C extensions
    
    Replace private _PyObject_CallNoArg() with public
    PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
    _elementtree, _pickle, _tkinter and readline.

@jdemeyer, @serhiy-storchaka: Does the updated PR make more sense?

@vstinner
Copy link
Member Author

@markshannon: Would you mind to have a look?

@jdemeyer
Copy link
Contributor

I think there are cases where a function is available both in the limited API and the cpython API but with a different implementation. Maybe PyObject_CallNoArgs could be such a function? If Py_LIMITED_API is not set, then it's equivalent to the existing _PyObject_CallNoArg. If Py_LIMITED_API is set, then it becomes an actual function.

@vstinner
Copy link
Member Author

I think there are cases where a function is available both in the limited API and the cpython API but with a different implementation. Maybe PyObject_CallNoArgs could be such a function? If Py_LIMITED_API is not set, then it's equivalent to the existing _PyObject_CallNoArg. If Py_LIMITED_API is set, then it becomes an actual function.

I prefer to use directly private _PyObject_CallNoArg() in CPython and use public PyObject_CallNoArgs() in C extensions. IMHO it's less surprising than the magic PyThreadState_GET() which gets a different implementation depending which header file is included... See pycore_pystate.h:

/* Redefine PyThreadState_GET() as an alias to _PyThreadState_GET() */
#undef PyThreadState_GET
#define PyThreadState_GET() _PyThreadState_GET()

I would prefer to avoid such hack. By the way, I'm trying to always use _PyThreadState_GET(), to ensure that we always use the optimized macro.

@vstinner
Copy link
Member Author

@serhiy-storchaka: Oh, I missed 3.8 deadline. So yeah, it seems like this function has to wait for Python 3.9 sadly. I updated my PR. Would you mind to review it?

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 14, 2019

I would prefer to avoid such hack.

You should inform the author of _PyThreadState_GET() :-)

commit 50b48572d9a90c5bb36e2bef6179548ea927a35a
Author: Victor Stinner <vstinner@redhat.com>
Date:   Thu Nov 1 01:51:40 2018 +0100

    [bpo-35081](https://bugs.python.org/issue35081): Add _PyThreadState_GET() internal macro (GH-10266)

In all seriousness, this hack actually looks like a good idea to me and I assumed that these kind of hacks where meant to be used more frequently. Having a single function name simplifies the API while still allowing optimizations if no stable ABI is needed.

@vstinner
Copy link
Member Author

(Ooops, I just removed "needs backport to 3.8" label. This change must not land into 3.8.)

Add a new public PyObject_CallNoArgs() function to the C API: call a
callable Python object without any arguments.

It is the most efficient way to call a callback without any argument.
On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL)
allocates 960 bytes on the stack per call, whereas
PyObject_CallNoArgs(func) only allocates 624 bytes per call.

It is excluded from stable ABI 3.8.

Replace private _PyObject_CallNoArg() with public
PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
_elementtree, _pickle, _tkinter and readline.
@vstinner
Copy link
Member Author

vstinner commented Jun 17, 2019

I rebased my PR and squashed commits to update the commit message. I also moved the news entry to What's New in Python 3.9.

Updated commit message:

commit 39e3371bc3bdf1b2b67f5bcedd877ac24d5a78d6 (HEAD -> pyobject_callnoarg, origin/pyobject_callnoarg)
Author: Victor Stinner <vstinner@redhat.com>
Date:   Fri Jun 7 13:56:46 2019 +0200

    [bpo-37194](https://bugs.python.org/issue37194): Add a new public PyObject_CallNoArgs() function
    
    Add a new public PyObject_CallNoArgs() function to the C API: call a
    callable Python object without any arguments.
    
    It is the most efficient way to call a callback without any argument.
    On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL)
    allocates 960 bytes on the stack per call, whereas
    PyObject_CallNoArgs(func) only allocates 624 bytes per call.
    
    It is excluded from stable ABI 3.8.
    
    Replace private _PyObject_CallNoArg() with public
    PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
    _elementtree, _pickle, _tkinter and readline.

@methane: Would you mind have to have a look?

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 17, 2019

The documentation change will conflict with #13844. So I would prefer if either #13844 is merged first or the documentation change is taken out of this PR and moved to #13844 or a separate PR.

@vstinner vstinner merged commit 2ff58a2 into python:master Jun 17, 2019
@vstinner vstinner deleted the pyobject_callnoarg branch June 17, 2019 12:27
@vstinner
Copy link
Member Author

The documentation change will conflict with #13844. So I would prefer if either #13844 is merged first or the documentation change is taken out of this PR and moved to #13844 or a separate PR.

Conflicts happen all the time, that's the game :-) I just added a few lines of code. I reverted many doc changes to let you include them in your other PR. I let you handle the conflict.

@vstinner
Copy link
Member Author

Thanks for reviews. I elaborated the rationale in the final change to better explain why a new function is needed: "It is the most efficient way to call a callback without any argument."

@jdemeyer
Copy link
Contributor

jdemeyer commented Jun 17, 2019

Conflicts happen all the time, that's the game :-)

But a very unfair game being played between a core developer and a non-core developer.

I am writing that documentation not for myself, but for the benefit of all CPython users and developers. I have little personal interest in getting that documentation merged. I don't want to fix conflicts all the time. And now, you even introduced a difference between 3.8 and 3.9 so even the backport won't be automatic.

@vstinner
Copy link
Member Author

I don't want to fix conflicts all the time.

It's common that I have to rebase a change 3 to 5 times. You created your PR 12 days ago and only got one conflict. But it's up to you to decide to update this PR or not.

Since you wrote that you gave up on PR #13844, I created PR #14156 to make changes that you asked me to revert in this PR.

lisroach pushed a commit to lisroach/cpython that referenced this pull request Sep 10, 2019
…13890)

Add a new public PyObject_CallNoArgs() function to the C API: call a
callable Python object without any arguments.

It is the most efficient way to call a callback without any argument.
On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL)
allocates 960 bytes on the stack per call, whereas
PyObject_CallNoArgs(func) only allocates 624 bytes per call.

It is excluded from stable ABI 3.8.

Replace private _PyObject_CallNoArg() with public
PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
_elementtree, _pickle, _tkinter and readline.
DinoV pushed a commit to DinoV/cpython that referenced this pull request Jan 14, 2020
…13890)

Add a new public PyObject_CallNoArgs() function to the C API: call a
callable Python object without any arguments.

It is the most efficient way to call a callback without any argument.
On x86-64, for example, PyObject_CallFunctionObjArgs(func, NULL)
allocates 960 bytes on the stack per call, whereas
PyObject_CallNoArgs(func) only allocates 624 bytes per call.

It is excluded from stable ABI 3.8.

Replace private _PyObject_CallNoArg() with public
PyObject_CallNoArgs() in C extensions: _asyncio, _datetime,
_elementtree, _pickle, _tkinter and readline.
JukkaL pushed a commit to python/mypy that referenced this pull request May 25, 2022
Use PyObject_CallNoArgs and PyObject_CallOneArg to replace 
PyObject_CallFunctionObjArgs in lib-rt, since the new API costs 
less memory according to python/cpython#13890 (comment)

Also remove the macro in pythonsupport.h since the two API are 
already covered by pythoncapi_compat.h.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants