Description
We should insert shim frames where the C-API calls into the interpreter.
The idea is that we can simplify returns and yields, as they can assume that it is safe to just pop the current frame and continue interpretation.
The places where we enter the interpreter from the C-API are:
PyEval_EvalFrame
PyEval_EvalFrameEx
_PyEval_Vector
The interpreter is also called from gen_send_ex2, which is not directly part of the C-API, although its callers are.
Starting with the simplest case: for the C-API functions listed above (that is, excluding gen_send_ex2), we need to push a frame with a single stack entry and EXIT_INTERPRETER as its sole instruction.
This allows us to simplify RETURN_VALUE and RETURN_GENERATOR, as they no longer need to check whether the frame is an entry frame.
RETURN_VALUE goes from
PyObject *retval = POP();
_PyFrame_SetStackPointer(frame, stack_pointer);
TRACE_FUNCTION_EXIT();
DTRACE_FUNCTION_EXIT();
_Py_LeaveRecursiveCallTstate(tstate);
if (!frame->is_entry) {
    frame = cframe.current_frame = pop_frame(tstate, frame);
    _PyFrame_StackPush(frame, retval);
    goto resume_frame;
}
/* Restore previous cframe and return. */
tstate->cframe = cframe.previous;
tstate->cframe->use_tracing = cframe.use_tracing;
return retval;
to
PyObject *retval = POP();
_PyFrame_SetStackPointer(frame, stack_pointer);
TRACE_FUNCTION_EXIT();
DTRACE_FUNCTION_EXIT();
_Py_LeaveRecursiveCallTstate(tstate);
assert(!frame->is_entry);
frame = cframe.current_frame = pop_frame(tstate, frame);
_PyFrame_StackPush(frame, retval);
goto resume_frame;
The same simplification applies to RETURN_GENERATOR.
The new EXIT_INTERPRETER instruction is defined as:
PyObject *retval = POP();
/* Restore previous cframe and return. */
tstate->cframe = cframe.previous;
tstate->cframe->use_tracing = cframe.use_tracing;
return retval;
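To make the effect of the shim frame concrete, here is a minimal toy model in Python. It is not CPython code: the `Frame` class, the `eval_frame` function, and the tiny instruction set are all hypothetical stand-ins. The only point it illustrates is that, once every entry from C pushes a shim frame first, RETURN_VALUE can pop unconditionally and only EXIT_INTERPRETER ever leaves the loop.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Frame:
    """Hypothetical, much simplified stand-in for an interpreter frame."""
    code: list                      # list of (opname, arg) tuples
    stack: list = field(default_factory=list)
    pc: int = 0
    previous: Optional["Frame"] = None


# The shim's sole instruction; its single stack slot receives the result.
SHIM_CODE = [("EXIT_INTERPRETER", None)]


def eval_frame(code):
    """Toy analogue of a C-API entry point: push a shim, then interpret."""
    shim = Frame(SHIM_CODE)
    frame = Frame(code, previous=shim)
    while True:
        op, arg = frame.code[frame.pc]
        frame.pc += 1
        if op == "LOAD_CONST":
            frame.stack.append(arg)
        elif op == "ADD":
            right = frame.stack.pop()
            left = frame.stack.pop()
            frame.stack.append(left + right)
        elif op == "CALL":
            # arg is the callee's code; the callee's result lands on our stack.
            frame = Frame(arg, previous=frame)
        elif op == "RETURN_VALUE":
            # No "is this the entry frame?" check: there is always a caller,
            # because the outermost caller is the shim.
            retval = frame.stack.pop()
            frame = frame.previous
            frame.stack.append(retval)
        elif op == "EXIT_INTERPRETER":
            # Only the shim executes this, so only here do we return to "C".
            return frame.stack.pop()
        else:
            raise ValueError(f"unknown opcode: {op}")


inner = [("LOAD_CONST", 40), ("LOAD_CONST", 2), ("ADD", None), ("RETURN_VALUE", None)]
outer = [("CALL", inner), ("LOAD_CONST", 1), ("ADD", None), ("RETURN_VALUE", None)]
result = eval_frame(outer)
print(result)
```

Both the nested return (from `inner` to `outer`) and the outermost return (from `outer` to the shim) go through the same RETURN_VALUE path, which is the simplification proposed above.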
So far, so good. But things get a bit more complex with YIELD_VALUE. First, we add a yield_offset to the interpreter frame, so that yielding transfers control to a different location than RETURN_VALUE does. This should allow us to inline generator iteration and yield from in a similar way to calls.
To get this to work we will need to implement the following in bytecode: gen.__next__(), gen.send(), gen.throw(), gen.close(), coro.send(), async_gen.throw(), etc.
gen.__next__() could be implemented as follows:
LOAD_FAST 0 (self)
SETUP_FINALLY error
ENTER_GENERATOR yield_to # sets `yield_offset` to offset of yield_to label, then jumps into the generator
POP_BLOCK
LOAD_FAST 0 (self)
GEN_CLEAR
LOAD_CONST StopIteration
RAISE_VARARGS 1
yield_to:
POP_BLOCK
RETURN_VALUE
error:
MATCH StopIteration # converts StopIteration into RuntimeError
...
RERAISE
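For reference, the two outcomes that the listing above has to distinguish can be observed from Python today. This is only a check of existing `gen.__next__()` behaviour, not of the proposed bytecode; the helper name `outcome_of_second_next` is made up for the example. A generator that simply returns ends with StopIteration (the fall-through path after ENTER_GENERATOR), whereas a StopIteration raised inside the generator body is converted into RuntimeError (the `error:` handler).

```python
def returns_normally():
    yield 1
    return  # the generator finishes; the caller should see StopIteration


def leaks_stop_iteration():
    yield 1
    raise StopIteration  # raised inside the body; must not escape as-is


def outcome_of_second_next(make_gen):
    """Advance a fresh generator twice; report the exception type of step two."""
    gen = make_gen()
    gen.__next__()  # first step yields 1
    try:
        gen.__next__()
    except Exception as exc:
        return type(exc)
    return None


normal_exit = outcome_of_second_next(returns_normally)
leaked_exit = outcome_of_second_next(leaks_stop_iteration)
print(normal_exit.__name__, leaked_exit.__name__)
```

Any bytecode implementation of gen.__next__() would need to keep these two cases apart in the same way.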
The other functions are left as an exercise for the reader 🙂
We could start by implementing gen_send_ex2 as bytecode, then re-implement its callers in bytecode until we can discard gen_send_ex2.
It might also be useful to implement next() in bytecode to avoid the context swap.
next():
LOAD_FAST 0 (self)
FOR_ITER done
RETURN_VALUE
done:
LOAD_CONST StopIteration
RAISE_VARARGS 1
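As a rough guide to what this listing does, the following Python function has the same control flow. It is a sketch only: `next_sketch` is a hypothetical name, it covers just the one-argument form of next(), and a Python `for` loop calls iter() on its operand first, whereas the listing applies FOR_ITER directly to `self` and so assumes it is already an iterator.

```python
def next_sketch(iterator):
    """One-argument next(): return the first item or raise StopIteration."""
    for value in iterator:   # FOR_ITER done
        return value         # RETURN_VALUE
    raise StopIteration      # done: LOAD_CONST StopIteration; RAISE_VARARGS 1


it = iter([1, 2])
first = next_sketch(it)
second = next_sketch(it)
print(first, second)
```

Because iter() returns the iterator itself when given an iterator, repeated calls on the same object advance it, as with the builtin.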