Code objects have incorrect hash/equality when code is instrumented for sys.monitoring

# Bug report

### Bug description:

A thread stress test in the coverage.py test suite was failing with seemingly impossible behavior: a code object is used as a key in a dict, and then cannot be found.  I eventually tracked it down to a mismatch between how bytecodes are examined in code_hash and in code_richcompare.

I have instructions for demonstrating the problem if you like, though it's a thread race condition, so it's a bit involved: https://gist.github.com/nedbat/2fb43527d102b8bb6e7fb7f3400fa3a9

# Hash computation

When computing the hash, opcodes are checked if they are instrumented, and if so, use the original opcodes:

```c
static Py_hash_t
code_hash(PyCodeObject *co)
{
    //...
    for (int i = 0; i < Py_SIZE(co); i++) {
        int deop = _Py_GetBaseOpcode(co, i);
        SCRAMBLE_IN(deop);
        SCRAMBLE_IN(_PyCode_CODE(co)[i].op.arg);
        i += _PyOpcode_Caches[deop];
    }
```

```c
/* Get the underlying opcode, stripping instrumentation */
int _Py_GetBaseOpcode(PyCodeObject *code, int i)
{
    int opcode = _PyCode_CODE(code)[i].op.code;
    if (opcode == INSTRUMENTED_LINE) {
        opcode = code->_co_monitoring->lines[i].original_opcode;
    }
    if (opcode == INSTRUMENTED_INSTRUCTION) {
        opcode = code->_co_monitoring->per_instruction_opcodes[i];
    }
    CHECK(opcode != INSTRUMENTED_INSTRUCTION);
    CHECK(opcode != INSTRUMENTED_LINE);
    int deinstrumented = DE_INSTRUMENT[opcode];
    if (deinstrumented) {
        return deinstrumented;
    }
    return _PyOpcode_Deopt[opcode];
}
```

```
#define _PyCode_CODE(CO) _Py_RVALUE((_Py_CODEUNIT *)(CO)->co_code_adaptive)
```

# Equality

When checking equality, opcodes are not checked if they are instrumented:

```
static PyObject *
code_richcompare(PyObject *self, PyObject *other, int op)
{
    // ...
    for (int i = 0; i < Py_SIZE(co); i++) {
        _Py_CODEUNIT co_instr = _PyCode_CODE(co)[i];
        _Py_CODEUNIT cp_instr = _PyCode_CODE(cp)[i];
        co_instr.op.code = _PyOpcode_Deopt[co_instr.op.code];
        cp_instr.op.code = _PyOpcode_Deopt[cp_instr.op.code];
        eq = co_instr.cache == cp_instr.cache;
        if (!eq) {
            goto unequal;
        }
        i += _PyOpcode_Caches[co_instr.op.code];
    }
```

It seems like the hash and the equality will be well matched if the code hasn't been instrumented. But if it has been, then they don't match.


### CPython versions tested on:

3.12

### Operating systems tested on:

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Code objects have incorrect hash/equality when code is instrumented for sys.monitoring #111984

Bug report

Bug description:

Hash computation

Equality

CPython versions tested on:

Operating systems tested on:

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Code objects have incorrect hash/equality when code is instrumented for sys.monitoring #111984

Description

Bug report

Bug description:

Hash computation

Equality

CPython versions tested on:

Operating systems tested on:

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions