Global String Objects are Interned Only in the First Interpreter #106931

Closed
@ericsnowcurrently

Description

When a string object is interned via _PyUnicode_InternInPlace(), its "state.interned" field is set, and subsequent calls to _PyUnicode_InternInPlace() skip that string. The problem is that some strings may be used in multiple interpreters, each of which has its own interned dict. Such a string is shared between the interpreters, along with its "state.interned" field. That means the string is interned only in the first interpreter where _PyUnicode_InternInPlace() is called (ignoring races in the function, e.g. gh-106930).
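The behavior described above can be illustrated with a small toy model in Python. This is only a sketch, not CPython's implementation: `SharedString`, `Interpreter`, and `intern_in_place` are hypothetical names made up for the illustration, and the real function is written in C and does considerably more.

```python
class SharedString:
    """Toy stand-in for a str object shared by several interpreters."""

    def __init__(self, value):
        self.value = value
        self.interned = False  # models the per-object "state.interned" field


class Interpreter:
    """Toy stand-in for an interpreter that owns its own interned dict."""

    def __init__(self, name):
        self.name = name
        self.interned = {}


def intern_in_place(interp, s):
    # Models the early exit: a string already flagged as interned is skipped,
    # no matter which interpreter set the flag.
    if s.interned:
        return
    interp.interned.setdefault(s.value, s)
    s.interned = True


main, sub = Interpreter("main"), Interpreter("sub")
shared = SharedString("spam")

intern_in_place(main, shared)  # first interpreter: the string is recorded
intern_in_place(sub, shared)   # second interpreter: skipped because of the flag

print("spam" in main.interned)  # True
print("spam" in sub.interned)   # False
```

In this model the second interpreter's interned dict never learns about the shared string, which mirrors the problem reported here.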

We need to fix it so that one of the following is true:

  • there is one global interned "dict" shared by all interpreters (we tried this already and it is very tricky)
  • strings are always interned in every interpreter, regardless of the "state.interned" value
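The two options can be sketched with the same kind of toy model. Everything here is hypothetical (`SharedString`, `intern_global`, `intern_per_interpreter` are illustrative names, not CPython APIs), and the sketch ignores the locking and object-lifetime concerns that make the global-dict approach tricky in practice.

```python
class SharedString:
    """Toy stand-in for a str object shared by several interpreters."""

    def __init__(self, value):
        self.value = value
        self.interned = False  # models the per-object "state.interned" field


# Option 1 (sketch): a single interned dict shared by all interpreters.
global_interned = {}


def intern_global(s):
    canonical = global_interned.setdefault(s.value, s)
    s.interned = True
    return canonical


# Option 2 (sketch): always record the string in the calling interpreter's
# dict, without consulting the shared flag first.
def intern_per_interpreter(interned_dict, s):
    canonical = interned_dict.setdefault(s.value, s)
    s.interned = True
    return canonical


main_dict, sub_dict = {}, {}
shared = SharedString("spam")

intern_per_interpreter(main_dict, shared)
intern_per_interpreter(sub_dict, shared)

print("spam" in main_dict, "spam" in sub_dict)  # True True
print(intern_global(shared) is shared)          # True
```

With option 2 the flag no longer decides whether the per-interpreter dict is updated, so every interpreter that interns the string ends up with its own entry.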

Labels: 3.12 (only security fixes), 3.13 (bugs and security fixes), topic-subinterpreters, type-bug (an unexpected behavior, bug, or error)

Project status: Done
