bpo-46323: Use PyObject_Vectorcall while calling ctypes callback function #31138
@erlend-aasland Can you please take a look?
Unrelated
Misc/NEWS.d/next/Core and Builtins/2022-02-05-14-46-21.bpo-46323.FC1OJg.rst
-11 ns is not really impressive :-(
@vstinner Updated!
Sorry, I don't have time until next weekend at the earliest :( (I remember doing similar stuff with some of the sqlite3 callbacks some months ago, but I discarded those changes because the added complexity did not outweigh the performance gain.)
Yeah the performance improvement itself is not impressive,
My rule is that a micro-optimization is worth it if it's at least 10% faster on a micro-benchmark. On a macro-benchmark like pyperformance, smaller speedups are worth it, but I have no global rule for that. Here I would expect to save 20 ns by avoiding the creation of a tuple for positional arguments, but it's about -11 ns. Maybe "20 ns" is what I had in mind for my laptop CPU, but on your CPU, it's closer to 11 ns?
It's not a strict rule and you're free to not follow it ;-)
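For anyone who wants to reproduce this kind of measurement, here is a rough sketch of timing a ctypes callback round trip and applying the 10% rule of thumb to a before/after pair. The callback signature, the helper names (`ns_per_call`, `is_worth_it`) and the baseline timings in the comments are assumptions made up for illustration; only the "about 11 ns saved" figure comes from this thread. It is not the benchmark used for the PR.

```python
import ctypes
import timeit

# Hypothetical callback type: int (*)(int, int). Any signature would do.
CALLBACK = ctypes.CFUNCTYPE(ctypes.c_int, ctypes.c_int, ctypes.c_int)


def add(a, b):
    return a + b


# Calling c_add goes through the C function pointer and back into Python,
# so each call exercises the ctypes callback machinery.
c_add = CALLBACK(add)


def ns_per_call(func, number=100_000, repeat=5):
    """Best-of-`repeat` cost of one func(2, 3) call, in nanoseconds."""
    best = min(timeit.repeat(lambda: func(2, 3), number=number, repeat=repeat))
    return best / number * 1e9


def is_worth_it(before_ns, after_ns, threshold=0.10):
    """Rule of thumb from the thread: at least 10% faster on a micro-benchmark."""
    return (before_ns - after_ns) / before_ns >= threshold


if __name__ == "__main__":
    print(f"callback round trip: {ns_per_call(c_add):.0f} ns per call")
    # Made-up baselines: saving 11 ns passes the rule on a 100 ns call
    # but not on a 200 ns call.
    print(is_worth_it(100.0, 89.0), is_worth_it(200.0, 189.0))
```

To compare two builds, run the script once per interpreter and feed both numbers to `is_worth_it`. A dedicated tool such as pyperf gives more stable numbers than `timeit`.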
LGTM.
LGTM (post merge 😆), good job! (I'll revisit my sqlite3 vectorcall branches now.)
/* Hm. What to return in case of error?
   For COM, 0xFFFFFFFF seems better than 0.
*/
This comment is now misleading; it should be removed. Previously, PySequence_Length could return -1 on error, but PyTuple_GET_SIZE always succeeds (or, more correctly: it does no error checking).
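The failing case can be observed from Python through `ctypes.pythonapi`; the snippet below is only a sketch of that behaviour, not code from the PR. `PySequence_Length` returns -1 with an exception set when given a non-sequence, and `ctypes.pythonapi` turns that pending exception into a raised one. `PyTuple_GET_SIZE` is a macro/inline function with no exported symbol, so it cannot be called this way and is not shown.

```python
import ctypes

# PySequence_Length is an exported C API function, reachable via pythonapi.
seq_len = ctypes.pythonapi.PySequence_Length
seq_len.argtypes = [ctypes.py_object]
seq_len.restype = ctypes.c_ssize_t

# Success path: a real sequence.
size_ok = seq_len((1, 2, 3))

# Error path: an int is not a sequence. The C function returns -1 and sets
# an exception, which pythonapi re-raises after the call.
try:
    seq_len(5)
    failed = False
except TypeError:
    failed = True
```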
https://bugs.python.org/issue46323