Insert shim frames at entry points to the interpreter. #436

Closed
@markshannon

Description

We should insert shim frames where the C-API calls into the interpreter.
The idea is that we can simplify returns and yields, as they can assume that it is safe to just pop the current frame and continue interpretation.

The places where we enter the interpreter from the C-API are:
PyEval_EvalFrame
PyEval_EvalFrameEx
_PyEval_Vector
The interpreter is also called from gen_send_ex2, but that's not directly part of the C-API, although its callers are.

Starting with the simplest case, the C-API functions listed above (that is, excluding gen_send_ex2): on entry, each needs to push a shim frame with a single stack entry and EXIT_INTERPRETER as its sole instruction.

This allows us to simplify RETURN_VALUE and RETURN_GENERATOR as they no longer need to check whether the frame is an entry frame.

RETURN_VALUE goes from

    PyObject *retval = POP();
    _PyFrame_SetStackPointer(frame, stack_pointer);
    TRACE_FUNCTION_EXIT();
    DTRACE_FUNCTION_EXIT();
    _Py_LeaveRecursiveCallTstate(tstate);
    if (!frame->is_entry) {
        frame = cframe.current_frame = pop_frame(tstate, frame);
        _PyFrame_StackPush(frame, retval);
        goto resume_frame;
    }
    /* Restore previous cframe and return. */
    tstate->cframe = cframe.previous;
    tstate->cframe->use_tracing = cframe.use_tracing;
    return retval;

to

    PyObject *retval = POP();
    _PyFrame_SetStackPointer(frame, stack_pointer);
    TRACE_FUNCTION_EXIT();
    DTRACE_FUNCTION_EXIT();
    _Py_LeaveRecursiveCallTstate(tstate);
    assert(!frame->is_entry);
    frame = cframe.current_frame = pop_frame(tstate, frame);
    _PyFrame_StackPush(frame, retval);
    goto resume_frame;

Similarly for RETURN_GENERATOR.

The new EXIT_INTERPRETER instruction is defined as:

    PyObject *retval = POP();
    /* Restore previous cframe and return. */
    tstate->cframe = cframe.previous;
    tstate->cframe->use_tracing = cframe.use_tracing;
    return retval;
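
To make the idea concrete, here is a toy model of the scheme, not CPython code. Everything in it (`Instr`, `Frame`, `toy_eval`, the `OP_*` opcodes, integer-only values) is made up for illustration; only the shape is taken from the description above: the entry point pushes a shim frame whose sole instruction is EXIT_INTERPRETER, so RETURN_VALUE can always pop the current frame and keep interpreting.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these names are hypothetical, not CPython's. */
enum { OP_PUSH, OP_ADD, OP_CALL, OP_RETURN_VALUE, OP_EXIT_INTERPRETER };

typedef struct { int op; int arg; } Instr;

typedef struct Frame {
    struct Frame *previous;  /* caller; the shim for the outermost call */
    const Instr *code;
    int pc;
    int stack[8];
    int sp;
} Frame;

/* The shim's code: a single EXIT_INTERPRETER instruction. */
const Instr toy_shim_code[] = { { OP_EXIT_INTERPRETER, 0 } };

/* Sample program: function 0 calls function 1 and adds 2 to its result. */
const Instr toy_main_code[] = {
    { OP_CALL, 1 }, { OP_PUSH, 2 }, { OP_ADD, 0 }, { OP_RETURN_VALUE, 0 },
};
const Instr toy_callee_code[] = { { OP_PUSH, 40 }, { OP_RETURN_VALUE, 0 } };
const Instr *const toy_functions[] = { toy_main_code, toy_callee_code };

static void frame_init(Frame *f, Frame *previous, const Instr *code)
{
    f->previous = previous;
    f->code = code;
    f->pc = 0;
    f->sp = 0;
}

/* Stand-in for a C-API entry point. No overflow checks: it is a sketch. */
int toy_eval(const Instr *const *functions, int entry)
{
    Frame frames[16];
    int depth = 0;

    /* Push the shim first, then the frame we were asked to run. */
    Frame *shim = &frames[depth++];
    frame_init(shim, NULL, toy_shim_code);
    Frame *frame = &frames[depth++];
    frame_init(frame, shim, functions[entry]);

    for (;;) {
        Instr ins = frame->code[frame->pc++];
        switch (ins.op) {
        case OP_PUSH:
            frame->stack[frame->sp++] = ins.arg;
            break;
        case OP_ADD: {
            int b = frame->stack[--frame->sp];
            int a = frame->stack[--frame->sp];
            frame->stack[frame->sp++] = a + b;
            break;
        }
        case OP_CALL: {
            Frame *callee = &frames[depth++];
            frame_init(callee, frame, functions[ins.arg]);
            frame = callee;
            break;
        }
        case OP_RETURN_VALUE: {
            /* No "is this the entry frame?" test: there is always a caller. */
            int retval = frame->stack[--frame->sp];
            frame = frame->previous;
            depth--;
            frame->stack[frame->sp++] = retval;
            break;
        }
        case OP_EXIT_INTERPRETER:
            /* Only the shim runs this; it is the single exit from the loop. */
            return frame->stack[--frame->sp];
        }
    }
}
```

Calling `toy_eval(toy_functions, 0)` runs function 0 on top of the shim. The outermost RETURN_VALUE pushes its result onto the shim's one-entry stack, and the shim's EXIT_INTERPRETER hands it back to the C caller.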

So far, so good. But things get a bit more complex with YIELD_VALUE. First off, we add a yield_offset to the interpreter frame, so that a yield resumes the caller at a different location than a RETURN_VALUE does. This should allow us to inline generator iteration and `yield from` in a similar way to calls.

To get this to work we will need to implement the following in bytecode:
gen.__next__(), gen.send(), gen.throw(), gen.close(), coro.send(), async_gen.throw(), etc.

gen.__next__() could be implemented as follows:

    LOAD_FAST 0 (self)
    SETUP_FINALLY error
    ENTER_GENERATOR yield_to # sets `yield_offset` to offset of yield_to label, then jumps into the generator
    POP_BLOCK
    LOAD_FAST 0 (self)
    GEN_CLEAR
    LOAD_CONST StopIteration
    RAISE_VARARGS 1
yield_to:
    POP_BLOCK
    RETURN_VALUE
error:
    MATCH StopIteration # convert StopIteration into RuntimeError
    ...
    RERAISE
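
The control flow of ENTER_GENERATOR and yield_offset can be sketched with another toy model. This is not CPython code: `GInstr`, `GFrame`, `toy_next` and the `G_*` opcodes are hypothetical, values are plain ints, and an exhausted generator is signalled by returning -1 rather than raising StopIteration. The point is only that YIELD_VALUE resumes the caller at its yield_offset, while RETURN_VALUE resumes it at the next instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: opcode names echo the issue, everything else is made up. */
enum { G_PUSH, G_ENTER_GENERATOR, G_YIELD_VALUE, G_RETURN_VALUE,
       G_EXIT_INTERPRETER };

typedef struct { int op; int arg; } GInstr;

typedef struct GFrame {
    struct GFrame *previous;
    const GInstr *code;
    int pc;
    int yield_offset;  /* where a callee's YIELD_VALUE resumes this frame */
    int stack[4];
    int sp;
} GFrame;

const GInstr toy_shim_code[] = { { G_EXIT_INTERPRETER, 0 } };

/* Stand-in for gen.__next__() written in bytecode. */
const GInstr toy_next_code[] = {
    { G_ENTER_GENERATOR, 3 },  /* yield_to is offset 3 */
    { G_PUSH, -1 },            /* generator returned: stand-in for StopIteration */
    { G_RETURN_VALUE, 0 },
    { G_RETURN_VALUE, 0 },     /* yield_to: pass the yielded value on */
};

/* Two sample generator bodies. */
const GInstr toy_yielding_gen[] = { { G_PUSH, 7 }, { G_YIELD_VALUE, 0 } };
const GInstr toy_exhausted_gen[] = { { G_PUSH, 0 }, { G_RETURN_VALUE, 0 } };

static void gframe_init(GFrame *f, GFrame *previous, const GInstr *code)
{
    f->previous = previous;
    f->code = code;
    f->pc = 0;
    f->yield_offset = 0;
    f->sp = 0;
}

/* Runs toy_next_code over the given generator body, on top of a shim frame. */
int toy_next(const GInstr *gen_code)
{
    GFrame shim_frame, next_frame, gen_frame;
    gframe_init(&shim_frame, NULL, toy_shim_code);
    gframe_init(&next_frame, &shim_frame, toy_next_code);
    GFrame *frame = &next_frame;

    for (;;) {
        GInstr ins = frame->code[frame->pc++];
        switch (ins.op) {
        case G_PUSH:
            frame->stack[frame->sp++] = ins.arg;
            break;
        case G_ENTER_GENERATOR:
            /* Record where a yield should land, then jump into the generator. */
            frame->yield_offset = ins.arg;
            gframe_init(&gen_frame, frame, gen_code);
            frame = &gen_frame;
            break;
        case G_YIELD_VALUE: {
            int value = frame->stack[--frame->sp];
            frame = frame->previous;
            frame->pc = frame->yield_offset;  /* differs from a plain return */
            frame->stack[frame->sp++] = value;
            break;
        }
        case G_RETURN_VALUE: {
            int value = frame->stack[--frame->sp];
            frame = frame->previous;          /* continue after the call site */
            frame->stack[frame->sp++] = value;
            break;
        }
        case G_EXIT_INTERPRETER:
            return frame->stack[--frame->sp];
        }
    }
}
```

With these definitions, `toy_next(toy_yielding_gen)` takes the yield_to path and `toy_next(toy_exhausted_gen)` falls through to the instructions after ENTER_GENERATOR. A real implementation would also have to keep the generator's frame alive across the yield, which this sketch ignores.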

The other functions are left as an exercise for the reader 🙂

We could start by implementing gen_send_ex2 in bytecode, then re-implement its callers in bytecode until we can discard gen_send_ex2.

It might also be useful to implement next() in bytecode to avoid the context swap.

next():

    LOAD_FAST 0 (self)
    FOR_ITER done
    RETURN_VALUE
done:
    LOAD_CONST StopIteration
    RAISE_VARARGS 1
