Accessing attributes of a lazily-loaded module is not thread-safe #114763

Closed
@effigies

Bug report

Bug description:

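Accessing an attribute of a lazily-loaded module resets the module's __class__ before its attributes have been populated, so a concurrent attribute access from another thread can fail. The script below reproduces this: run it with the argument "single" for one access, or with no argument for ten concurrent threads.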

import importlib.util
import sys
import threading
import time

# Lazy load http
spec = importlib.util.find_spec("http")
module = importlib.util.module_from_spec(spec)
http = sys.modules["http"] = module

loader = importlib.util.LazyLoader(spec.loader)
loader.exec_module(module)

def check():
    time.sleep(0.2)
    return http.HTTPStatus.ACCEPTED == 202

def multicheck():
    for _ in range(10):
        threading.Thread(target=check).start()

if sys.argv[1:] == ["single"]:
    check()
else:
    multicheck()

The issue is here:

class _LazyModule(types.ModuleType):

    """A subclass of the module type which triggers loading upon attribute access."""

    def __getattribute__(self, attr):
        """Trigger the load of the module and return the attribute."""
        # All module metadata must be garnered from __spec__ in order to avoid
        # using mutated values.
        # Stop triggering this method.
        self.__class__ = types.ModuleType

When an attribute is accessed, __class__ is reset first and the module's __dict__ is only updated afterwards. If another thread accesses an attribute between these two points, it sees a plain module whose attributes have not been populated yet, and the lookup can fail.
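To make the window easier to see, here is a minimal sketch that forces the interleaving deterministically. `RacyLazy`, the two events and the `answer` attribute are hypothetical stand-ins I made up for illustration; this is not the importlib code, it only copies the "reset __class__ first, populate afterwards" order.

```python
import threading
import types

class_reset = threading.Event()  # set once __class__ has been reset
reader_done = threading.Event()  # set once the second thread has looked


class RacyLazy(types.ModuleType):
    """Hypothetical toy stand-in for _LazyModule, not the importlib code."""

    def __getattribute__(self, attr):
        # Same order as the excerpt: reset the class first ...
        self.__class__ = types.ModuleType
        # ... then hold the window open so the other thread can look inside.
        class_reset.set()
        reader_done.wait(timeout=5)
        # Only now populate the attributes (stands in for the real load).
        self.__dict__.update({"answer": 42})
        return getattr(self, attr)


mod = types.ModuleType("toy")
mod.__class__ = RacyLazy
results = {}


def first():
    # Triggers the "load" and is paused inside the window.
    results["first"] = mod.answer


def second():
    # Runs while the first thread sits between the reset and the update.
    class_reset.wait(timeout=5)
    try:
        results["second"] = mod.answer
    except AttributeError as exc:
        results["second"] = exc
    finally:
        reader_done.set()


threads = [threading.Thread(target=first), threading.Thread(target=second)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results["first"])
print(type(results["second"]).__name__)
```

The first thread gets its value, while the second one hits a plain module with an unpopulated __dict__ and ends up with an AttributeError.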

Assuming this is considered a bug, the two fixes I can think of are:

  1. A module-scoped lock that is used to protect __getattribute__'s critical section. The self.__class__ = types.ModuleType assignment would need to be moved below __dict__.update(), which in turn would mean that the self.__spec__ and self.__dict__ lookups would need to change to object.__getattribute__(self, ...) to avoid recursion.
  2. A module-scoped dictionary of locks, one-per-_LazyModule. Here, additional work would be needed to remove no-longer-needed locks without creating another critical section where a thread enters _LazyModule.__getattribute__ but looks up its lock after it is removed by the first thread.

My suspicion is that one lock is enough, so I would suggest going with 1.
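A rough sketch of what option 1 could look like, on a toy class rather than the real _LazyModule. `LockedLazy`, `_load_lock`, `_pending` and `_fake_load` are names invented for this example, and the "load" is just a function returning a dict. It also ignores re-entrant attribute access from the loading thread, which a real fix would probably have to handle.

```python
import threading
import time
import types

_load_lock = threading.Lock()  # hypothetical single module-scoped lock


class LockedLazy(types.ModuleType):
    """Hypothetical toy lazy module; `_pending` stands in for the real load."""

    def __getattribute__(self, attr):
        with _load_lock:
            # Re-check under the lock: another thread may have finished loading
            # while this one was waiting.
            if type(self) is LockedLazy:
                # Bypass this method to avoid recursion.
                namespace = object.__getattribute__(self, "__dict__")
                load = namespace.pop("_pending")
                namespace.update(load())
                # Reset the class only after the attributes are in place.
                self.__class__ = types.ModuleType
        return getattr(self, attr)


calls = []


def _fake_load():
    # Stand-in for executing the module; slow enough for threads to pile up.
    calls.append(1)
    time.sleep(0.05)
    return {"answer": 42}


mod = types.ModuleType("toy")
mod.__dict__["_pending"] = _fake_load
mod.__class__ = LockedLazy

results = []


def check():
    results.append(mod.answer)


threads = [threading.Thread(target=check) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results, len(calls))
```

Threads that entered __getattribute__ before the class was reset block on the lock, find the class already reset once they get it, and fall through to a normal lookup, so the load runs once and every thread sees the populated module.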

CPython versions tested on:

3.8, 3.10, 3.11, 3.12

Operating systems tested on:

Linux
