Restore stack introspection ability on 3.12 #393
Thanks for this! Just wanted to let you know I am looking at it. I am considering dropping |
Thanks for the update! I would encourage you to do more detailed performance testing if you drop Here's a somewhat worst-case example where the greenlet stack has 100 new frames on each switch; the switch takes 2.5usec longer than it would otherwise. With 10 new frames it's 350nsec longer.
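As a rough reading of the two figures quoted above (350 nsec extra for 10 new frames, 2.5 usec extra for 100), the per-frame cost can be estimated with a two-point linear fit. This is only back-of-the-envelope arithmetic on those two numbers; the function name `estimate_extra_ns` is made up for illustration and the linear model is an assumption, not something measured here.

```python
# Extra switch time (in nanoseconds) quoted in the comment above,
# keyed by the number of new frames on the greenlet stack per switch.
extra_ns = {10: 350.0, 100: 2500.0}

# Naive per-frame cost at each data point.
per_frame_at_10 = extra_ns[10] / 10
per_frame_at_100 = extra_ns[100] / 100

# Two-point linear fit: extra = intercept + slope * frames.
# Assumption: the cost is linear in the number of new frames.
slope = (extra_ns[100] - extra_ns[10]) / (100 - 10)
intercept = extra_ns[10] - slope * 10


def estimate_extra_ns(new_frames):
    """Hypothetical helper: estimated extra switch time for `new_frames` frames."""
    return intercept + slope * new_frames


if __name__ == "__main__":
    print(f"per-frame cost at 10 frames:  {per_frame_at_10:.1f} ns")
    print(f"per-frame cost at 100 frames: {per_frame_at_100:.1f} ns")
    print(f"fit: {intercept:.0f} ns fixed + {slope:.1f} ns per new frame")
    print(f"estimate for 50 new frames: {estimate_extra_ns(50):.0f} ns")
```

Under that assumption the marginal cost comes out at roughly 24 ns per new frame plus a fixed cost of a bit over 100 ns, which fits the observation that the penalty grows with the number of new frames.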
I'm not particularly concerned about the footgun; I think use of |
I'm still running higher-level benchmarks with gevent, but at a low level, I don't think the frame manipulation is going to hurt in a significant way. Unfortunately, it's not for a great reason. I mentioned earlier that in general, we've been getting slower on newer versions of Python. Picking a set of micro-benchmarks focused on greenlet creation and switching, with one pure function call thrown in (
Python 3.9 and 3.10 perform about the same, but things start slowing down after that. The one place we get faster, "read 200 nested frames", is most likely due to interpreter changes, and should only be compared within the same version of CPython. Speaking of benchmarks that can only be compared within the same version of Python, I have a set that tests switching between greenlets of various stack depths (2, 200, 400 for shallow, deep, and deeper) --- very, very similar to the examples shown above. These overall get faster on newer interpreters, probably largely a result of reduced function call overhead (as can be seen from the second chart, which is basically a benchmark of function calls).
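To make the shape of these micro-benchmarks concrete, here is a minimal stdlib-only sketch of a "read N nested frames" measurement alongside a plain function-call baseline at the same depths (2, 200, 400). It is not the benchmark suite used above (which switches between greenlets and was presumably run under pyperf); the function names are invented for illustration, and `sys._getframe` is a CPython implementation detail.

```python
import sys
import timeit


def count_frames_at_depth(depth):
    """Recurse `depth` levels, then walk f_back from the innermost frame.

    Returns the number of frames visited, which includes whatever
    frames were already on the stack in the caller.
    """
    if depth > 1:
        return count_frames_at_depth(depth - 1)
    frame = sys._getframe()  # CPython-specific
    count = 0
    while frame is not None:
        count += 1
        frame = frame.f_back
    return count


def recurse_only(depth):
    """Baseline: the same recursion with no frame walking."""
    if depth > 1:
        return recurse_only(depth - 1)
    return 0


if __name__ == "__main__":
    loops = 200
    for depth in (2, 200, 400):  # shallow, deep, deeper
        walk = timeit.timeit(lambda: count_frames_at_depth(depth), number=loops)
        base = timeit.timeit(lambda: recurse_only(depth), number=loops)
        # Subtracting the baseline is the same rough normalization
        # discussed in this thread; treat the result as indicative only.
        print(f"depth {depth}: walk {walk / loops * 1e9:.0f} ns, "
              f"calls only {base / loops * 1e9:.0f} ns")
```

As the comment above notes, numbers like these are only meaningful when compared within a single CPython version.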
OK, given all that, how do these benchmarks behave on 3.12 when we do not, or do, manipulate the stacks on switch?
The first four rows are essentially unchanged, as they should be since they don't involve much switching. The "shallow" call stack switch is slightly slower, and the deeper the call stack gets we see a hit, which appears to be linear in the depth, also as expected. If a 200 depth call stack is towards the high end of the bell curve, yes, the relative difference looks scary, but the absolute time --- 10 microseconds --- seems reasonable. But, coming from an earlier version of Python, how much of a performance difference will there be? As mentioned above, it's hard to compare that across versions, but let's try. If we subtract the function call time from the "recur" benchmarks and compare 3.9 to 3.12, this is what we get. First, the raw numbers:
We look pretty decent in most of those! But taking out the function call time, we get:
In all except the shallowest case, we're substantially faster. And the shallowest case only makes us look bad because the function call overhead is so much higher on 3.9 than it is on 3.12 --- subtracting 208 ns from 274 ns doesn't leave much (again, I'm not sure this is a valid way to attempt to normalize comparisons). Based on the trends, though, we can guess that somewhere between 2 calls on the stack and 200 calls on the stack, Python 3.12 with this patch starts out-performing Python 3.9 and 3.10. Most real world functions do something besides call one other function, so the relative time spent switching will probably be somewhere between those two sets of results. And I haven't actually encountered a real-world application where greenlet switching time was the bottleneck, so Amdahl's law applies. These are low-level benchmarks. But unless I'm missing something, I don't see anything too concerning here, nothing that makes me rethink. And again, gevent benchmarks still to come. |
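The normalization and the Amdahl's law argument above can be written out as a few lines of arithmetic. The 274 ns and 208 ns figures are the ones quoted in the comment; the 1% switching fraction and 2x slowdown are purely hypothetical inputs chosen for illustration, and the helper names are made up.

```python
def switch_only_ns(recur_ns, call_ns):
    """Rough normalization: benchmark time minus plain function-call time."""
    return recur_ns - call_ns


def overall_slowdown(switch_fraction, switch_slowdown):
    """Amdahl-style estimate of total runtime relative to the original.

    switch_fraction: share of runtime spent switching (0..1)
    switch_slowdown: factor by which switching alone gets slower
    """
    return (1.0 - switch_fraction) + switch_fraction * switch_slowdown


# Figures quoted above for the shallowest case.
shallow_remainder = switch_only_ns(274.0, 208.0)

# Hypothetical application: 1% of time in switches, switches twice as slow.
hypothetical_total = overall_slowdown(0.01, 2.0)

if __name__ == "__main__":
    print(f"shallow case after subtracting call time: {shallow_remainder:.0f} ns")
    print(f"hypothetical overall slowdown: {hypothetical_total:.2f}x")
```

With so little left after the subtraction, small absolute differences in the shallow case turn into large relative ones, which is why that row looks worse than the deeper ones. The second calculation shows why a slower switch barely moves total runtime when switching is a small share of the work.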
This is a great suite of benchmarks, thank you! I'm curious how the higher-level benchmarks turn out, but I'm convinced that safe-by-default is the right path forward. My inclination is to still expose the option to use the faster/less-safe mode, perhaps under a name like |
Picking 30 or so of gevent's benchmarks --- only one of which is specifically designed to test switching speed, but most/all of which do involve switching --- shows no significant performance differences from always exposing the frames or not. (The variance it does show, 2--5% or so, is well within the normal variance I can see from run to run, even when run with pyperf's rigorous mode.) I'll proceed with always exposing the frames. I wanted to be comprehensive, so in addition to running the benchmarks in gevent's default mode, using all of its Cython-compiled C accelerator modules, I also ran the set in pure-Python mode, cutting the C modules down to a minimum. The results were interesting.
Using the native code libraries, there's a clear degradation of performance from 3.9 through 3.12. But without them, there's overall not much difference. So, maybe Cython is what's getting slower? To be clear, these are relative numbers; in absolute terms, the compiled benchmarks are still notably faster than the pure-Python versions. What's especially interesting is that the difference between them appears to be shrinking. This probably bears looking into, if I can ever get around to it.
Bumps [greenlet](https://github.com/python-greenlet/greenlet) from 3.0.0 to 3.0.3. <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/python-greenlet/greenlet/blob/master/CHANGES.rst">greenlet's changelog</a>.</em></p> <blockquote> <h1>3.0.3 (2023-12-21)</h1> <ul> <li>Python 3.12: Restore the full ability to walk the stack of a suspended greenlet; previously only the innermost frame was exposed. See <code>issue 388 <https://github.com/python-greenlet/greenlet/issues/388></code><em>. Fix by Joshua Oreman in <code>PR 393 <https://github.com/python-greenlet/greenlet/pull/393/></code></em>.</li> </ul> <h1>3.0.2 (2023-12-08)</h1> <ul> <li>Packaging: Add a minimal <code>pyproject.toml</code> to sdists.</li> <li>Packaging: Various updates to macOS wheels.</li> <li>Fix a test case on Arm32. Note that this is not a supported platform (there is no CI for it) and support is best effort; there may be other issues lurking. See <code>issue 385 <https://github.com/python-greenlet/greenlet/issues/385></code>_</li> </ul> <h1>3.0.1 (2023-10-25)</h1> <ul> <li>Fix a potential crash on Python 3.8 at interpreter shutdown time. This was a regression from earlier 3.0.x releases. Reported by Matt Wozniski in <code>issue 376 <https://github.com/python-greenlet/greenlet/issues/376></code>_.</li> </ul> </blockquote> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/python-greenlet/greenlet/commit/ea4bc2776c75429a577e539389fee40ad9e46707"><code>ea4bc27</code></a> Preparing release 3.0.3</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/7694880b128bfa54ea0ece59974695a06af09931"><code>7694880</code></a> Make doctests work on 3.7 again, which doesn't have importlib.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/073b1e1cf4ec25a8b90a534176bf79336d87689e"><code>073b1e1</code></a> Linting. 
Add linting to CI.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/9e73b59eadb68017c25e8fabf0b9dba2eef6f150"><code>9e73b59</code></a> Docs: Update from the old default theme to furo.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/5f4b4bbc098330b87211405ef50e276325c15a53"><code>5f4b4bb</code></a> Py3.12: Always expose greenlet frames on a switch.</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/13148f9f975924da18bef14054a558160e13369c"><code>13148f9</code></a> Update comment that was still referring to a different, less-robust approach ...</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/b5f9d232c401d8fba8710778ad85b29a41ec3924"><code>b5f9d23</code></a> Restore stack introspection ability on 3.12</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/edbdda27ab1983ac157b588dd0c04816cb31b0ea"><code>edbdda2</code></a> Back to development: 3.0.3</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/719ea473c61e9934cb75ebd04a52dda32a030863"><code>719ea47</code></a> Preparing release 3.0.2</li> <li><a href="https://github.com/python-greenlet/greenlet/commit/2c0793c1b35e785d4353d4c6ce17cde2137da1c4"><code>2c0793c</code></a> Add change note about macOS wheels.</li> <li>Additional commits viewable in <a href="https://github.com/python-greenlet/greenlet/compare/3.0.0...3.0.3">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=greenlet&package-manager=pip&previous-version=3.0.0&new-version=3.0.3)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`. 
Fixes #388. The `gr_frame` accessor now returns a frame whose `f_back` chain is valid to walk, by rewriting the frame list on demand to exclude C-stack frames and reversing this the next time the greenlet is resumed. This carries an assumption that stack walking of a suspended greenlet occurs between an access to `gr_frame` and the next resumption of the greenlet, which should be valid for most applications. For those that are doing something stranger, I added a new `gr_frames_always_exposed` attribute; if set to True it will cause the rewrite to occur every time the greenlet is suspended, providing full semantic parity with 3.11-and-earlier at some performance cost.
I did some perf testing on my laptop using
It was pretty consistently 286-287 nsec before this change and 281 nsec after. (This makes sense; it's less expensive to check the `frames_were_exposed` bool than it is to reset-and-set the topmost frame's `previous` pointer, which is what we were doing previously.)
If I set the new `gr_frames_always_exposed` attribute to True, we're back up to 288 nsec per loop (though the penalty of this would increase with the stack depth of the greenlet, and re-exposing existing frames is much less expensive than exposing new frames because the former case doesn't need to allocate any new frame objects). If I access `gr_frame` on every switch, it's 302 nsec per loop. All of this seems acceptable to me. I don't know if you have any more realistic performance tests available; it would be good to run them if so.
I marked this as 3.1.0 since it provides a new public API. Feel free to adjust based on however you think about versioning.
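The "rewrite on demand, reverse on resume" idea from the description can be illustrated with a toy linked list in pure Python. This is not greenlet's implementation (which is C++ operating on interpreter frame structures); `ToyFrame`, `on_c_stack`, `expose`, `unexpose`, and `walk` are all hypothetical names, and the sketch assumes the topmost frame is never a C-stack entry.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class ToyFrame:
    name: str
    on_c_stack: bool = False  # hypothetical marker for entries to skip
    previous: Optional["ToyFrame"] = None


def expose(top: ToyFrame) -> List[Tuple[ToyFrame, Optional[ToyFrame]]]:
    """Relink `previous` pointers so a walk skips C-stack entries.

    Returns the (frame, original_previous) pairs needed to undo the rewrite.
    """
    saved = []
    frame = top
    while frame is not None:
        nxt = frame.previous
        while nxt is not None and nxt.on_c_stack:
            nxt = nxt.previous
        if nxt is not frame.previous:
            saved.append((frame, frame.previous))
            frame.previous = nxt
        frame = nxt
    return saved


def unexpose(saved: List[Tuple[ToyFrame, Optional[ToyFrame]]]) -> None:
    """Restore the original links, as would happen on resumption."""
    for frame, original in saved:
        frame.previous = original


def walk(top: Optional[ToyFrame]) -> List[str]:
    """Follow `previous` pointers from `top`, collecting frame names."""
    names = []
    while top is not None:
        names.append(top.name)
        top = top.previous
    return names


# Build a chain: inner -> <c> -> outer -> <c> -> root
root = ToyFrame("root")
c2 = ToyFrame("<c>", on_c_stack=True, previous=root)
outer = ToyFrame("outer", previous=c2)
c1 = ToyFrame("<c>", on_c_stack=True, previous=outer)
inner = ToyFrame("inner", previous=c1)

if __name__ == "__main__":
    print("raw:      ", walk(inner))
    undo = expose(inner)
    print("exposed:  ", walk(inner))
    unexpose(undo)
    print("restored: ", walk(inner))
```

In this toy, the lazy strategy corresponds to calling `expose` only when something asks for the frame chain, while setting `gr_frames_always_exposed` corresponds to calling it on every suspension. Either way the undo list is applied when the greenlet resumes.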