The popular greenlet package implements cooperative multitasking by moving parts of the C stack around. The active greenlet has all of its stack in the expected place, but a suspended greenlet might have spilled part of its stack to the heap in order to allow the active greenlet to use the same region of stack. For many years, this has worked fine in practice because storage on the stack is generally not reachable from Python objects on the heap. The introduction of interpreter frames stored on the C stack in #96319 broke this assumption: under Python 3.12 and later, if you can get hold of a frame object from a suspended greenlet, you can crash the interpreter by following f_back links until you reach one that would traverse an entry frame. A workaround was added in python-greenlet/greenlet@40646dc, but it only protects the innermost greenlet frame (the easiest one to access, since greenlets provide a gr_frame attribute to retrieve it) by severing its link with the rest of the greenlet stack while the greenlet is suspended. This hampers the ability to understand what a suspended greenlet is doing, and it doesn't even completely resolve the crash, because there are other ways to obtain a non-innermost greenlet frame.
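For concreteness, the walk in question is the ordinary f_back traversal. The sketch below runs it on a normal, non-greenlet stack, where it is safe; the helper names (walk_stack, inner, outer) are illustrative only. The same loop, started from a frame belonging to a suspended greenlet, is the kind of traversal that can reach an entry frame whose C-stack storage is no longer in place.

```python
import sys


def walk_stack(frame):
    """Collect function names by following f_back from `frame` outward."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back  # the link that may cross an entry frame
    return names


def inner():
    # sys._getframe() returns the frame of the calling function, here inner().
    return walk_stack(sys._getframe())


def outer():
    return inner()


if __name__ == "__main__":
    print(outer()[:2])  # the two innermost frames: inner, then outer
```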
I filed python-greenlet/greenlet#388 against greenlet to discuss ways greenlet could work around the C-stack-based interpreter frames. None of the options are really palatable; they all involve taking new dependencies on CPython internals, as well as some tradeoff between unsoundness (exposing frame objects whose f_back attribute will crash the interpreter when accessed) and poor performance (needing to walk the stack on every greenlet suspend/resume). I'm wondering if there's anything that could be done on the CPython side to better support this use case.
The easiest solution from greenlet's perspective would be to just not store interpreter frames on the C stack. It appears likely feasible to store the entry frames on the per-thread frame stack instead; to maintain stack discipline, the entry frame for evaluating an owned-by-thread frame would need to be allocated before the owned-by-thread frame, but that doesn't look like a blocker (in fact both could be allocated simultaneously). Another option would be to use a single static interpreter frame object for all entry frames, and to store their previous pointers (the only portion that definitely needs to be variable from one entry frame to the next) on a new per-thread stack. Since entry frames return using a different bytecode instruction than non-entry frames, this wouldn't introduce additional branching in the eval loop, only in frame introspection (the f_back getter, etc).
Another category of potential solution would still keep entry frames on the C stack, but would store enough information in the interpreter frame object under evaluation that it could skip its entry-frame parent without accessing any portion of it. The easiest approach here would be to add a new previous_heap pointer (name for discussion purposes only) which is like previous but skips entry frames. That would increase the size of the interpreter frame structure, though, which might not be acceptable. If taking that size bump is OK, then the rest of the solution is trivial: just make f_back follow previous_heap instead of previous.
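To make the previous_heap idea concrete, here is a toy Python model of the linkage. It is only a sketch: InterpFrame, push and f_back_chain are hypothetical names, and it does not reflect the real interpreter frame layout. It assumes the extra pointer is filled in when a frame is pushed, while its parent is still guaranteed to be in place.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class InterpFrame:
    """Toy stand-in for an interpreter frame; not the real CPython struct."""
    name: str
    is_entry: bool = False                      # entry frames live on the C stack
    previous: Optional[InterpFrame] = None      # immediate parent, may be an entry frame
    previous_heap: Optional[InterpFrame] = None # hypothetical: nearest non-entry ancestor


def push(name, parent, is_entry=False):
    """Create a frame on top of `parent`, computing previous_heap at push time."""
    if parent is None:
        heap_parent = None
    elif parent.is_entry:
        # Copy the entry frame's own link now, while the parent is still readable.
        heap_parent = parent.previous_heap
    else:
        heap_parent = parent
    return InterpFrame(name, is_entry, parent, heap_parent)


def f_back_chain(frame):
    """Walk outward using only previous_heap, never reading an entry frame."""
    names = []
    frame = frame.previous_heap
    while frame is not None:
        names.append(frame.name)
        frame = frame.previous_heap
    return names


outer = push("outer", None)
entry = push("<entry>", outer, is_entry=True)
inner = push("inner", entry)
innermost = push("innermost", inner)

# Simulate a suspended greenlet: the entry frame's contents are no longer usable.
entry.previous = entry.previous_heap = None

print(f_back_chain(innermost))  # ['inner', 'outer']
```

The walk still reaches outer after the entry frame has been clobbered, because inner captured the link when it was pushed. The cost is the one noted above: an extra pointer in every frame.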
Maybe someone who's more familiar with interpreter internals than I am can come up with an option that's better than any of these. But it would be really useful for greenlet if we could somehow eliminate the recently-introduced requirement to access the C stack in the course of walking the Python stack. Thanks for your consideration.
CPython versions tested on:
3.12, 3.13, CPython main branch
Operating systems tested on:
Linux
I think it will work robustly enough, at least until the next big change to how frames are represented, but it's taking a number of dependencies on CPython internals so I'd still like to explore any possible upstream changes that would make this easier to maintain.