Description
Feature or enhancement
Currently, there is no good way to collect test line coverage of the standard library.
`python -m test -T` uses `sys.settrace` through the `trace` module, which slows things down significantly. Moreover, tracing is set up relatively late, after a ton of standard library modules have already been imported, which makes it hard to collect full coverage for them. What's more, this way of collecting coverage doesn't work with `-j`, and we want to switch to `-j0` by default in the test runner, since the battle for fully idempotent tests is impossible to win.
coverage.py?
We had a hack in coverage.py called "fullcoverage", which mimicked the `encodings` module in order to install itself early, but it was recently removed because it didn't work in 3.13 and we haven't been tracking coverage in CI since 2021.
Use sys.monitoring
I want to use PEP 669 for this because it's faster and works across threads, which will become relevant with PEP 703. The idea is to gather partial coverage in each regrtest worker subprocess and then combine the results through the main process when tests are done.
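The combining step could look roughly like the sketch below. It assumes each worker reports its partial coverage as a mapping from file name to the line numbers it executed; that format and the name `merge_coverage` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def merge_coverage(partials):
    """Union per-worker coverage mappings of {filename: iterable of line numbers}."""
    merged = defaultdict(set)
    for partial in partials:
        for filename, lines in partial.items():
            merged[filename].update(lines)
    return dict(merged)


# Two hypothetical worker results that overlap on one file.
worker_a = {"Lib/json/decoder.py": [10, 11, 12]}
worker_b = {"Lib/json/decoder.py": [12, 40], "Lib/json/encoder.py": [5]}
combined = merge_coverage([worker_a, worker_b])
```

Because a line counts as covered if any worker ran it, a plain set union is enough and the order in which workers finish doesn't matter.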
Set up monitoring before site.py
To gather coverage, we need to register a callback for the line event with `sys.monitoring`. Moreover, for the data to be reliable, gathering needs to start very early in interpreter startup, before `site.py` is executed, as that triggers a large number of standard library imports.
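As a rough illustration of registering a line callback, here is a minimal sketch that requires Python 3.12 or newer. The names `hits`, `on_line`, `start`, and `stop` are made up for this example and are not part of any proposed implementation.

```python
import sys

hits = set()  # (filename, line number) pairs seen so far


def on_line(code, line_number):
    """LINE callback: record the line, then stop monitoring that location."""
    hits.add((code.co_filename, line_number))
    # One hit per line is enough for line coverage, so disable further
    # events for this location to keep the overhead low.
    return sys.monitoring.DISABLE


def start():
    mon = sys.monitoring
    # COVERAGE_ID is the tool slot that sys.monitoring predefines for coverage tools.
    mon.use_tool_id(mon.COVERAGE_ID, "coverage-sketch")
    mon.register_callback(mon.COVERAGE_ID, mon.events.LINE, on_line)
    mon.set_events(mon.COVERAGE_ID, mon.events.LINE)


def stop():
    mon = sys.monitoring
    mon.set_events(mon.COVERAGE_ID, mon.events.NO_EVENTS)
    mon.register_callback(mon.COVERAGE_ID, mon.events.LINE, None)
    mon.free_tool_id(mon.COVERAGE_ID)
```

Only code that runs between `start()` and `stop()` is recorded, which is why the point at which `start()` gets called during startup matters so much.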
This will need a small addition to the interpreter: an environment variable called `PYTHONPRESITE`, which takes a module name. Setting it makes the interpreter attempt to import this module before `site.py` is executed and before the `__main__` module is created. To allow for input/output, the module is imported after `encodings` has been loaded, stdout/stderr/stdin have been set up, and `open()` has been exposed. The module is not considered `__main__`, and it stays imported after loading, which makes it possible for tooling loaded later to use the data it gathered.
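Since the module stays imported, later tooling could find it through `sys.modules`. The sketch below assumes the pre-site module stores its data in a module-level attribute called `hits`; that attribute name and the helper `presite_data` are hypothetical.

```python
import os
import sys


def presite_data(env_var="PYTHONPRESITE", attr="hits"):
    """Return data left behind by the pre-site module, or None if unavailable."""
    name = os.environ.get(env_var)
    # The module is looked up, never imported here: if it was not loaded
    # at startup, there is nothing to collect.
    module = sys.modules.get(name) if name else None
    return getattr(module, attr, None)
```

Returning `None` rather than raising lets a test runner fall back gracefully when the variable is unset or the interpreter did not load the module.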
Why even gather coverage?
I don't plan to beat people with a stick because line coverage fell from 81% to 79%.
There's been a lot of churn lately in terms of test refactors, and it's not entirely clear while reviewing those changes whether we are still running the same checks as before. In effect, I don't feel very confident approving such large changes with rearranged TestCase base classes. In those cases, I run `coverage.py` with some hacks to confirm that we're not losing coverage with the change, but ultimately this is slow, painful to do, and still generates data that isn't trustworthy.
An easy way to get simple test line coverage stats without installing anything would solve my issue, and hopefully, more people would then use it.