Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-111786: Turn off optimize for speed for _PyEval_EvalFrameDefault on MSVC #111794

Merged
merged 8 commits into from
Nov 9, 2023

Conversation

mdboom
Copy link
Contributor

@mdboom mdboom commented Nov 6, 2023

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @markshannon or @brandtbucher can merge if they agree.

Copy link
Member

@markshannon markshannon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3-10% faster and fixes the build. Sounds good to me 🙂
I'd like @zooba's approval though, in case we missed something.

@mdboom mdboom requested a review from zooba November 7, 2023 13:18
@zooba
Copy link
Member

zooba commented Nov 7, 2023

My only question would be whether we know that turning t off implicitly turns s (size) on, or if it leaves us with no optimisations at all.

The results seem to be okay for PGO builds, but regular builds may suffer a regression if optimisation is just turned off. (For debug builds we have it off deliberately, and would want to avoid turning it on.)

@mdboom mdboom marked this pull request as draft November 7, 2023 17:48
@mdboom
Copy link
Contributor Author

mdboom commented Nov 7, 2023

Converting to draft -- I'm seeing some non-determinacy to this fix, so we should hold off on merging until I get to the bottom of it.

@mdboom mdboom marked this pull request as ready for review November 7, 2023 18:38
@brandtbucher
Copy link
Member

I see this has been re-opened. Did you figure out the issue?

If so, it looks like it's ready to merge (assuming we aren't seeing a regression on non-PGO builds).

@mdboom
Copy link
Contributor Author

mdboom commented Nov 7, 2023

My only question would be whether we know that turning t off implicitly turns s (size) on, or if it leaves us with no optimisations at all.

I couldn't find a way to directly introspect this. However, I decided to try to forcibly turn "s" on after turning "t" off and compare the results to this PR, e.g.:

#  pragma optimize("t", off)
#  pragma optimize("s", on)

Using SizeBench, I was able to confirm that both that and this PR generate the same size code for _PyEval_EvalFrameDefault (it's in two blocks: 37,951 bytes of hot code and 28,104 bytes of cold code).

So that (might be) indirect evidence at least that we still get size optimization this way.

@mdboom
Copy link
Contributor Author

mdboom commented Nov 7, 2023

I see this has been re-opened. Did you figure out the issue?

Yes, it was entirely me not seeing the difference between optimize("", on) and optimize("", "on"). (I blame insufficient coffee).

If so, it looks like it's ready to merge (assuming we aren't seeing a regression on non-PGO builds).

I just fired off some benchmarking runs to test the effect of this on non-PGO builds. It will take a few hours to get the results back.

@brandtbucher
Copy link
Member

brandtbucher commented Nov 7, 2023

Yes, it was entirely me not seeing the difference between optimize("", on) and optimize("", "on"). (I blame insufficient coffee).

I blame "that's a rough API design".

@zooba
Copy link
Member

zooba commented Nov 7, 2023

So that (might be) indirect evidence at least that we still get size optimization this way.

Unoptimised code will be huge by comparison - 2x or more. Sounds like we're still getting the optimization.

@mdboom
Copy link
Contributor Author

mdboom commented Nov 7, 2023

Unoptimised code will be huge by comparison - 2x or more. Sounds like we're still getting the optimization.

And some level of PGO is still happening, as evidenced by the split into hot and cold paths.

If so, it looks like it's ready to merge (assuming we aren't seeing a regression on non-PGO builds).

Unfortunately, it seems this causes an 11% regression on non-PGO builds. (This is comparing non-PGO or this PR vs. a non-PGO build of its base 9f33ede).

If we care about this (I'm not sure we do -- that's a question for someone with more knowledge of the history of Windows builds). We can probably disable this #pragma when doing a non-PGO build, but I don't see an obvious way to do it. I don't think the compiler gives us a preprocessor variable with the info (at least I can't find one), so we'd probably have to inject a variable from the build system. Anyway, that's a problem for tomorrow for me.

@zooba
Copy link
Member

zooba commented Nov 7, 2023

FYI, pyproject.props is where you'll want to add it. You'll see examples of conditional PreprocessorDefinition for Py_NOGIL, and the condition should be Condition="$(SupportPGO) and ($(Configuration) == 'PGInstrument' or $(Configuration) == 'PGUpdate')".

I'd suggest _Py_USING_PGO as a name, unless there's something already being used via configure, in which case copy that.

@mdboom mdboom requested a review from a team as a code owner November 8, 2023 13:25
@mdboom
Copy link
Contributor Author

mdboom commented Nov 8, 2023

I have updated this to only disable the optimization when doing a PGO build.

@zooba's suggestion of only doing this during the PGInstrument and PGUpdate phases doesn't work. It's actually the final Release phase that fails, and having them behave differently reintroduces the original compiler error. So in this case _Py_USING_PGO is true for all of the phases of a PGO build.

EDIT: I take that back -- with a clean build this works, and is simpler.

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Let me know when you're ready for this to be merged.

@mdboom
Copy link
Contributor Author

mdboom commented Nov 8, 2023

@gvanrossum: As far as I know this is ready, but maybe @zooba should have one more quick look.

@gvanrossum
Copy link
Member

I'm just going to merge it, it looks great.

@gvanrossum gvanrossum enabled auto-merge (squash) November 9, 2023 18:30
@gvanrossum gvanrossum merged commit bc12f79 into python:main Nov 9, 2023
aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
…SVC for PGO (python#111794)

In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
Glyphack pushed a commit to Glyphack/cpython that referenced this pull request Sep 2, 2024
…SVC for PGO (python#111794)

In PGO mode, this function caused a compiler error in MSVC.
It turns out that optimizing for space only save the day, and is even faster.
However, without PGO, this is neither necessary nor slower.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants