Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-36974: separate vectorcall functions for each calling convention #13781

Merged
merged 4 commits into from
Jul 5, 2019

Conversation

jdemeyer
Copy link
Contributor

@jdemeyer jdemeyer commented Jun 3, 2019

Implement an optimization which is mentioned in PEP 590: use a specific vectorcall function for each calling convention (METH_O, METH_FASTCALL and so on) to replace the generic _PyCFunction_Vectorcall.

This allows getting rid of _PyMethodDef_RawFastCallKeywords in Python 3.9 (we should keep it in 3.8 since Cython uses it).

CC @encukou @markshannon

https://bugs.python.org/issue36974

@encukou
Copy link
Member

encukou commented Jun 4, 2019

Thank you!

This will not go into beta1 (we're asked to not push last-minute changes).
Not sure if it'll qualify for beta2, or if it'll have to wait until 3.9. I'll ask the RM once beta1 is out.
Either way it'll need a news entry now.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Jun 4, 2019

Either way it'll need a news entry now.

Why? It's an implementation detail that doesn't affect any public API or behavior.

Apart from that, do you accept this PR for merging (at least for 3.9)?

@encukou
Copy link
Member

encukou commented Jun 4, 2019

Either way it'll need a news entry now.

Why? It's an implementation detail that doesn't affect any public API or behavior.

It does change functionality. See the GDB plugin – there are other tools like it that aren't tested in CPython's test suite.

Apart from that, do you accept this PR for merging (at least for 3.9)?

I haven't reviewed it yet, but yes, it looks good at first glance.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Jun 4, 2019

I haven't reviewed it yet, but yes, it looks good at first glance.

If so, I'll do the analogous change for PyMethodDescr.

@markshannon
Copy link
Member

Have you benchmarked this change?

@jdemeyer
Copy link
Contributor Author

Have you benchmarked this change?

Using https://github.com/jdemeyer/callbench:

Before:

VARARGS function():             Mean +- std dev: 24.7 ns +- 0.3 ns
FASTCALL function():            Mean +- std dev: 19.5 ns +- 0.4 ns
VARARGS obj.method():           Mean +- std dev: 63.0 ns +- 2.3 ns
FASTCALL obj.method():          Mean +- std dev: 56.3 ns +- 1.5 ns
VARARGS bound method():         Mean +- std dev: 25.6 ns +- 2.7 ns
FASTCALL bound method():        Mean +- std dev: 19.4 ns +- 0.4 ns
VARARGS unbound method(obj, ):  Mean +- std dev: 34.0 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 27.4 ns +- 0.4 ns
tp_call():                      Mean +- std dev: 19.9 ns +- 0.3 ns

After:

VARARGS function():             Mean +- std dev: 25.1 ns +- 1.1 ns
FASTCALL function():            Mean +- std dev: 17.2 ns +- 0.8 ns
VARARGS obj.method():           Mean +- std dev: 58.9 ns +- 0.7 ns
FASTCALL obj.method():          Mean +- std dev: 54.0 ns +- 2.2 ns
VARARGS bound method():         Mean +- std dev: 25.1 ns +- 1.3 ns
FASTCALL bound method():        Mean +- std dev: 16.9 ns +- 0.1 ns
VARARGS unbound method(obj, ):  Mean +- std dev: 29.9 ns +- 0.3 ns
FASTCALL unbound method(obj, ): Mean +- std dev: 25.0 ns +- 0.4 ns
tp_call():                      Mean +- std dev: 19.9 ns +- 0.3 ns

So this is clearly an improvement.

@jdemeyer jdemeyer force-pushed the pycfunction_vectorcall branch from 116d4b2 to 8d6661f Compare June 11, 2019 10:37
@jdemeyer
Copy link
Contributor Author

Rebased to latest master.

@jdemeyer
Copy link
Contributor Author

CC @methane

@jdemeyer jdemeyer force-pushed the pycfunction_vectorcall branch from b7ebda2 to 0f3206a Compare June 20, 2019 19:21
@jdemeyer
Copy link
Contributor Author

Rebased to latest master.

@encukou
Copy link
Member

encukou commented Jun 24, 2019

Two issues with the code:

  • PyCFunction's vectorcall implementations are only used in methodobject.c,
    so IMO they should be moved there and made static.
  • The big macros make debugging harder. Can we do without magically defined
    variables and return from macros?
    I'd prefer somewhat longer, but more straightforward code.

I made these changes, so I sent a PR to your branch. I didn't yet have time
to polish them; feel free to work change them if you agree.

@jdemeyer
Copy link
Contributor Author

Petr: what's your opinion in general about this PR? Suppose that I merge your changes and polish them, would you then be willing to merge this? I'm asking because other potential reviewers may disagree with your changes.

@encukou
Copy link
Member

encukou commented Jun 25, 2019

Yes, I'm willing to merge this (though I might take longer than some other reviewers).
If other reviewers disagree, I'd want to hear their thoughts.

@jdemeyer
Copy link
Contributor Author

jdemeyer commented Jun 26, 2019

I made some further fixes and cleanups regarding to Petr's version:

  1. Don't declare vectorcall_FOO as inline. They are called through a function pointer, inlining is meaningless.
  2. Rename vectorcall_FOO -> cfunction_vectorcall_FOO and _PyMethodDescr_Vectorcall_FOO -> method_vectorcall_FOO.
  3. For PyCFunctionObject, use existing macros PyCFunction_GET_FUNCTION and PyCFunction_GET_SELF.
  4. return -1 in case of an exception, to follow the more common CPython convention.
  5. Put _Py_CheckFunctionResult in _PyObject_Vectorcall and PyVectorcall_Call instead of in the callables. Get rid of vectorcall_end function.
  6. Merge vectorcall_begin and get_meth as one function cfunction_enter_call/method_enter_call.
  7. For method descriptors, merge vectorcall_check and check_no_kwargs as one function method_check_args.
  8. Fix reference leaks in METH_VARARGS cases.
  9. Added a testcase for dict.update() to test method_vectorcall_VARARGS_KEYWORDS with keyword arguments. This was not tested before.

@jdemeyer jdemeyer force-pushed the pycfunction_vectorcall branch from 1160ae1 to 800b8a8 Compare July 2, 2019 21:27
@jdemeyer
Copy link
Contributor Author

jdemeyer commented Jul 2, 2019

Rebased to latest master, fixing trivial merge conflict.

@encukou encukou merged commit 0d722f3 into python:master Jul 5, 2019
@jdemeyer
Copy link
Contributor Author

jdemeyer commented Jul 5, 2019

Thanks!

@jdemeyer jdemeyer deleted the pycfunction_vectorcall branch July 5, 2019 12:51
@encukou
Copy link
Member

encukou commented Jul 5, 2019

Thank you!

@jdemeyer
Copy link
Contributor Author

See #14782 for a backport to 3.8

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants