Description
Currently we have one instruction format, plus caches.
For instrumentation, I want to combine test-and-branch instructions, e.g. COMPARE_OP; POP_JUMP_IF_FALSE would become COMPARE_AND_BRANCH.
For the register interpreter we want to have instructions with up to 4 operands, but not waste space for instructions with fewer operands.
We also want 16 bit values in the cache, which marshal does not currently support, so we need a wasteful quickening step for all code, even if it runs only once.
Changes needed
The format of an instruction is already described in bytecodes.c. The interpreter generator should output a table mapping the opcode of an instruction to its format.
Marshalling needs to know about 16 bit values, and caches. This is probably the largest change.
See python/cpython#99555
Generated code already knows the length of the instruction, so there is no change there.
The bytecode compiler, particularly the assembler, will need to understand formats, so that it emits the correct format. write_instr and computing jump offsets will get more complex, but the rest of the compiler should be unchanged.
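To illustrate why jump offsets get more complex, here is a minimal Python sketch of offset computation when instructions have different lengths. The size table, the function names, and the choice to measure a jump from the end of the jumping instruction are all assumptions made for illustration, not the actual assembler logic.

```python
# Sizes in 16-bit code units. These values are illustrative assumptions
# based on the format strings proposed in this issue, not generated data.
SIZE_IN_CODE_UNITS = {
    "LOAD_FAST": 1,           # format IB
    "COMPARE_AND_BRANCH": 4,  # format IBHC0 (hypothetical instruction)
    "RETURN_VALUE": 1,        # format IX
}


def instruction_offsets(names):
    """Return the start offset of each instruction and the total length."""
    offsets = []
    position = 0
    for name in names:
        offsets.append(position)
        position += SIZE_IN_CODE_UNITS[name]
    return offsets, position


def relative_jump(names, jump_index, target_index):
    """Distance from the end of the jump instruction to the target.

    Measuring from the end of the jumping instruction is an assumption.
    """
    offsets, _ = instruction_offsets(names)
    end_of_jump = offsets[jump_index] + SIZE_IN_CODE_UNITS[names[jump_index]]
    return offsets[target_index] - end_of_jump


code = [
    "LOAD_FAST", "LOAD_FAST", "COMPARE_AND_BRANCH",
    "LOAD_FAST", "RETURN_VALUE",
    "LOAD_FAST", "RETURN_VALUE",
]
offsets, total = instruction_offsets(code)
```

With a single fixed-size format the offset of instruction `i` is just `i`; with mixed formats the assembler has to accumulate sizes as above before it can resolve any jump target.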
What formats do we need?
Currently there is only one format, but with some instructions having caches.
If we include caches in the format, there are 6 formats with cache sizes of 0, 1, 2, 4, 5 and 9.
I would like to add 16 bit operands as well, and we will need between zero and three 8 bit operands.
Expressing formats.
- I: the instruction (opcode)
- B: 8 bit operand
- _: unused 8 bits (UPDATE: changed to X)
- H: 16 bit (one code unit) operand
- C: 16 bit first cache entry (the counter)
- 0: Zeroed 16 bit entry
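As a rough sketch of how the letters above determine an instruction's length, the following Python computes the size in code units and the number of cache entries from a format string. The function names are hypothetical, and accepting `_` as a synonym for `X` is an assumption made so that both spellings work.

```python
# Width in bytes of each format letter, following the list above.
# "_" is the older spelling of "X" and is accepted as a synonym here.
LETTER_WIDTH = {"I": 1, "B": 1, "X": 1, "_": 1, "H": 2, "C": 2, "0": 2}


def format_size(fmt):
    """Return the instruction length in 16-bit code units."""
    if not fmt.startswith("I"):
        raise ValueError(f"format must start with the opcode: {fmt!r}")
    nbytes = sum(LETTER_WIDTH[letter] for letter in fmt)
    if nbytes % 2:
        raise ValueError(f"format is not a whole number of code units: {fmt!r}")
    return nbytes // 2


def cache_size(fmt):
    """Return the number of cache entries: the counter plus zeroed entries."""
    return fmt.count("C") + fmt.count("0")
```

For example, `format_size("IBC00000000")` gives 10 code units, of which `cache_size("IBC00000000")` reports 9 as cache entries.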
Existing examples:
- RETURN_VALUE: I_ (UPDATE: IX)
- LOAD_FAST: IB
- LOAD_ATTR: IBC00000000
Hypothetical examples:
- COMPARE_AND_BRANCH: IBHC0
- Register BINARY_OP: IBBBHC
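A format-driven write_instr could look roughly like the Python sketch below, which walks a format string and emits bytes. The opcode numbers, the little-endian layout of 16 bit values, the initial counter value of zero, and the function name `encode` are all assumptions for illustration only.

```python
def encode(fmt, opcode, operands=(), counter=0):
    """Encode one instruction as bytes according to its format string.

    Byte order (little-endian) and the counter default are assumptions.
    """
    out = bytearray()
    remaining = list(operands)
    for letter in fmt:
        if letter == "I":
            out.append(opcode)            # 8 bit opcode
        elif letter == "B":
            out.append(remaining.pop(0))  # 8 bit operand
        elif letter in ("X", "_"):
            out.append(0)                 # unused 8 bits
        elif letter == "H":
            out += remaining.pop(0).to_bytes(2, "little")  # 16 bit operand
        elif letter == "C":
            out += counter.to_bytes(2, "little")           # cache counter
        elif letter == "0":
            out += b"\x00\x00"            # zeroed 16 bit cache entry
        else:
            raise ValueError(f"unknown format letter {letter!r}")
    if remaining:
        raise ValueError("more operands supplied than the format uses")
    return bytes(out)


# Opcode numbers 200 and 201 are made up for this example.
branch = encode("IBHC0", 200, [3, 0x1234])
binary_op = encode("IBBBHC", 201, [1, 2, 3, 7])
```

Because the emitter is driven entirely by the format string, adding a new format would not require a new code path in the assembler, only a new entry in whatever table maps opcodes to formats.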
Generating all formats as an enum will ensure that we get a compiler warning for any switch that misses a case.
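As a sketch of what the generator side might emit, the Python below produces C source text for a format enum and an opcode-to-format table. The identifier prefix `INSTR_FMT_`, the names `InstructionFormat` and `opcode_format`, and the input mapping are assumptions; the mapping simply reuses the example formats from this issue.

```python
# Hypothetical opcode-to-format mapping built from the examples in this issue.
OPCODE_FORMATS = {
    "RETURN_VALUE": "IX",
    "LOAD_FAST": "IB",
    "LOAD_ATTR": "IBC00000000",
    "COMPARE_AND_BRANCH": "IBHC0",
}


def emit_format_enum(opcode_formats, prefix="INSTR_FMT_"):
    """Return C source declaring one enum member per distinct format."""
    members = sorted(set(opcode_formats.values()))
    lines = ["enum InstructionFormat {"]
    lines.extend(f"    {prefix}{fmt}," for fmt in members)
    lines.append("};")
    return "\n".join(lines)


def emit_format_table(opcode_formats, prefix="INSTR_FMT_"):
    """Return C source for a table indexed by opcode giving its format."""
    lines = ["static const enum InstructionFormat opcode_format[256] = {"]
    for name, fmt in sorted(opcode_formats.items()):
        lines.append(f"    [{name}] = {prefix}{fmt},")
    lines.append("};")
    return "\n".join(lines)


enum_text = emit_format_enum(OPCODE_FORMATS)
table_text = emit_format_table(OPCODE_FORMATS)
```

A C `switch` over `enum InstructionFormat` with no `default` case would then let the compiler flag any format that is not handled, provided the relevant warning (for example `-Wswitch` in GCC and Clang) is enabled.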