Closed
Description
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
assignee = 'https://github.com/markshannon'
closed_at = None
created_at = <Date 2021-03-31.16:45:03.337>
labels = ['3.11', '3.10', 'performance']
title = 'Handle generator (and coroutine) state in the bytecode.'
updated_at = <Date 2022-01-24.11:08:57.165>
user = 'https://github.com/markshannon'
bugs.python.org fields:
activity = <Date 2022-01-24.11:08:57.165>
actor = 'Mark.Shannon'
assignee = 'Mark.Shannon'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2021-03-31.16:45:03.337>
creator = 'Mark.Shannon'
dependencies = []
files = []
hgrepos = []
issue_num = 43683
keywords = ['patch']
message_count = 15.0
messages = ['389919', '390305', '390354', '404564', '405081', '405083', '405096', '407983', '407987', '407988', '408006', '408009', '409725', '409751', '411465']
nosy_count = 4.0
nosy_names = ['Mark.Shannon', 'pablogsal', 'Dennis Sweeney', 'dpgeorge']
pr_nums = ['25137', '25138', '25224', '25225', '30723']
priority = None
resolution = None
stage = 'patch review'
status = 'open'
superseder = None
type = 'performance'
url = 'https://bugs.python.org/issue43683'
versions = ['Python 3.10', 'Python 3.11']
Activity
markshannon commentedon Mar 31, 2021
Every time we send, or throw, to a generator, the C code in genobject.c needs to check what state the generator is in.
This is inefficient and couples the generator code, which should just be a thin wrapper around the interpreter, to the internals of the interpreter.
The state of the generator is known to the compiler. It should emit appropriate bytecodes to handle the different behavior for the different states.
While the main reason this is robustness and maintainability, removing the complex C code between Python caller and Python callee also opens up the possibility of some worthwhile optimizations.
There are three changes I want to make:
Add a new bytecode to handle starting a generator. This
GEN_START
bytecode would pop TOS, raising an exception if it is not None.This adds some overhead for the first call to iter()/send() but speeds up all the others.
Handle the case of exhausted generators. This is a bit more fiddly, and involves putting an infinite loop at the end of the generator. Something like:
CLEAR_FRAME
label:
GEN_RETURN (Like RETURN_VALUE None, but does not discard the frame)
JUMP label
This removes a lot of special case code for corner cases of exhausted generators and coroutines.
YIELD_FROM
. The problem here is that we need to differentiate between exceptions triggered by throw, which must call throw() on sub-generators, and exceptions propagating out of sub-generators which should be passed up the stack. By splitting the opcode into two (or more), it is clear which case is being handled in the interpreter without complicated logic in genobject.cmarkshannon commentedon Apr 6, 2021
New changeset b37181e by Mark Shannon in branch 'master':
bpo-43683: Handle generator entry in bytecode (GH-25138)
b37181e
sweeneyde commentedon Apr 6, 2021
Looks like we both opened PRs in the same minute.
The MAGIC constant didn't get updated, but perhaps that can just be included in the Minor Corrections PR.
I'd bet a CI check could be added to check that if the opcodes change then Python/importlib_external.h changes.
dpgeorge commentedon Oct 21, 2021
It looks like this change introduced a subtle, and maybe intended (?), behavioural change.
Consider (from MicroPython's test suite):
This used to work prior to commit b37181e. But after that commit it fails on the print(g.send(None)) because the generator is now stopped.
markshannon commentedon Oct 27, 2021
Damien, thanks for catching this.
The change was not intended.
There are two kind of exceptions raised by send.
(1) does not change the state of the generator, (2) does.
Sending non-None to a generator that has not started falls into kind 1, IMO. So we should revert the change.
markshannon commentedon Oct 27, 2021
Just to be clear, it is the behavior change that should be reverted, not necessarily the new bytecode.
In fact we should probably push on with (2) and add an exception handler wrapping the whole generator except the GEN_START. That way a GEN_START exception will not clear the generator.
GEN_START might need to do a bit of trickery like setting lasti back to -1.
dpgeorge commentedon Oct 27, 2021
Thanks for confirming the bug.
Sending non-None to a not-started generator could arguably be case (2), because that's exactly the semantics introduced by the commit that broke the test case :)
Honestly I don't have a strong opinion on which way this goes. But I think it would be interesting to know if there was code out there that relied on the original behaviour.
pablogsal commentedon Dec 7, 2021
Mark, is something left in this issue?
sweeneyde commentedon Dec 7, 2021
bpo-46009 about the same behavior change
pablogsal commentedon Dec 7, 2021
Unfortunately, this has not been fixed into 3.10.1 and 3.11.0a3 as this hasn't version information and therefore has missed our automatic checks for blockers.
I have marked https://bugs.python.org/issue46009? as release blocker, as well as this issue and I have marked the versions.
Please, ensure this is fixed ASAP so it doesn't go into the next bugfix release or alpha release.
23 remaining items