Bug report
A significant performance regression in Sphinx caused by changes in CPython 3.13
Here is a script that does the following things:
- Replaces the contents of all CPython documentation files except Doc/library/typing.rst with simply "foo"
- Creates a virtual environment
- Installs our doc dependencies into the environment (making sure that we use pure-Python versions for all doc dependencies rather than built wheels that might include C extensions)
- Times how long it takes to build the docs using that environment
- Restores all the modified docs files and deletes the virtual environment again
The script
import contextlib
import shutil
import subprocess
import time
import venv
from pathlib import Path


def run(args):
    try:
        subprocess.run(args, check=True, capture_output=True, text=True)
    except subprocess.CalledProcessError as e:
        print(e.stdout)
        print(e.stderr)
        raise


with contextlib.chdir("Doc"):
    try:
        for path in Path(".").iterdir():
            if path.is_dir() and not str(path).startswith("."):
                for doc_path in path.rglob("*.rst"):
                    if doc_path != Path("library/typing.rst"):
                        doc_path.write_text("foo")

        venv.create(".venv", with_pip=True)

        run([
            ".venv/bin/python",
            "-m",
            "pip",
            "install",
            "-r",
            "requirements.txt",
            "--no-binary=':all:'",
        ])

        start = time.perf_counter()

        run([
            ".venv/bin/python",
            "-m",
            "sphinx",
            "-b",
            "html",
            ".",
            "build/html",
            "library/typing.rst",
        ])

        print(time.perf_counter() - start)
        shutil.rmtree(".venv")
        shutil.rmtree("build")

    finally:
        subprocess.run(["git", "restore", "."], check=True, capture_output=True)
Using a PGO-optimized build with LTO enabled, the script reports a significant performance regression in Sphinx's parsing and building of library/typing.rst between v3.13.0a1 and ede1504:
- On v3.13.0a1, the script reports a Sphinx build time of between 1.27s and 1.29s (I ran the script several times).
- On ede1504, the script reports a Sphinx build time of between 1.76s and 1.82s (a roughly 40% regression).
A similar regression is reported by this (much slower) variation of the script, which builds the entire set of CPython's documentation rather than just library/typing.rst.
More comprehensive variation of the script
import contextlib
import shutil
import subprocess
import time
import venv


def run(args):
    subprocess.run(args, check=True, text=True)


with contextlib.chdir("Doc"):
    venv.create(".venv", with_pip=True)

    run([
        ".venv/bin/python",
        "-m",
        "pip",
        "install",
        "-r",
        "requirements.txt",
        "--no-binary=':all:'",
    ])

    start = time.perf_counter()

    run([
        ".venv/bin/python",
        "-m",
        "sphinx",
        "-b",
        "html",
        ".",
        "build/html",
    ])

    print(time.perf_counter() - start)
    shutil.rmtree(".venv")
    shutil.rmtree("build")
The PGO-optimized timings for building the entire CPython documentation are as follows:
- v3.13.0a1: 45.5s
- ede1504: 62.7s
This indicates a 38% performance regression for building the entire set of CPython's documentation.
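For reference, the percentages can be reproduced from the raw timings with a few lines of Python. This is only a sketch of the arithmetic: the helper name regression_percent is made up for illustration, and pairing the low ends and the high ends of the two typing.rst ranges with each other is an assumption about how to compare them.

```python
def regression_percent(before, after):
    """Percentage increase in build time going from `before` to `after`."""
    return (after / before - 1) * 100


# Timings in seconds, copied from the measurements reported in this issue.
typing_only = {"v3.13.0a1": (1.27, 1.29), "ede1504": (1.76, 1.82)}
full_docs = {"v3.13.0a1": 45.5, "ede1504": 62.7}

# Assumption: compare fastest run with fastest run, slowest with slowest.
low = regression_percent(typing_only["v3.13.0a1"][0], typing_only["ede1504"][0])
high = regression_percent(typing_only["v3.13.0a1"][1], typing_only["ede1504"][1])
full = regression_percent(full_docs["v3.13.0a1"], full_docs["ede1504"])

print(f"typing.rst only: {low:.1f}% to {high:.1f}%")  # 38.6% to 41.1%
print(f"full docs: {full:.1f}%")  # 37.8%
```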
Cause of the performance regression
This performance regression was initially discovered in #118891. In our own CI, we use a fresh build of CPython in our Doctest CI workflow (since otherwise we wouldn't be testing the tip of the main branch), and it was observed that the CI job was taking significantly longer on the 3.13 branch than on the 3.12 branch. In the context of our CI, the performance regression is even worse, because our Doctest CI workflow uses a debug build rather than a PGO-optimized build, and the regression is more pronounced in a debug build.
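Since the size of the regression depends on the build type, it can help to confirm which kind of interpreter a timing came from. The following is a rough sketch, assuming a POSIX build where sysconfig exposes Py_DEBUG and CONFIG_ARGS (on other platforms these may be missing, in which case every flag reports False); describe_build is a hypothetical helper name.

```python
import sys
import sysconfig


def describe_build():
    """Summarize how the running interpreter was configured."""
    # CONFIG_ARGS holds the ./configure arguments on POSIX builds;
    # fall back to an empty string where it is unavailable.
    config_args = sysconfig.get_config_var("CONFIG_ARGS") or ""
    return {
        "version": sys.version.split()[0],
        "debug": bool(sysconfig.get_config_var("Py_DEBUG")),
        "pgo": "--enable-optimizations" in config_args,
        "lto": "--with-lto" in config_args,
    }


print(describe_build())
```

Running this with the interpreter inside the virtual environment, before timing the Sphinx build, makes it harder to accidentally compare a debug build against an optimized one.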
Using a debug build, I used the first script posted above to bisect the performance regression to commit 1530932 (below), which seemed to cause a performance regression of around 300% in a debug build:
15309329b65a285cb7b3071f0f08ac964b61411b is the first bad commit
commit 15309329b65a285cb7b3071f0f08ac964b61411b
Author: Mark Shannon <mark@hotpy.org>
Date: Wed Mar 20 08:54:42 2024 +0000
GH-108362: Incremental Cycle GC (GH-116206)
Doc/whatsnew/3.13.rst | 30 +
Include/internal/pycore_gc.h | 41 +-
Include/internal/pycore_object.h | 18 +-
Include/internal/pycore_runtime_init.h | 8 +-
Lib/test/test_gc.py | 72 +-
.../2024-01-07-04-22-51.gh-issue-108362.oB9Gcf.rst | 12 +
Modules/gcmodule.c | 25 +-
Objects/object.c | 21 +
Objects/structseq.c | 5 +-
Python/gc.c | 806 +++++++++++++--------
Python/gc_free_threading.c | 23 +-
Python/import.c | 2 +-
Python/optimizer.c | 2 +-
Tools/gdb/libpython.py | 7 +-
14 files changed, 684 insertions(+), 388 deletions(-)
create mode 100644 Misc/NEWS.d/next/Core and Builtins/2024-01-07-04-22-51.gh-issue-108362.oB9Gcf.rst
Performance was then significantly improved by commit e28477f (below), but it's unfortunately still the case that Sphinx is far slower on Python 3.13 than on Python 3.12:
commit e28477f214276db941e715eebc8cdfb96c1207d9
Author: Mark Shannon <mark@hotpy.org>
Date: Fri Mar 22 18:43:25 2024 +0000
GH-117108: Change the size of the GC increment to about 1% of the total heap size. (GH-117120)
Include/internal/pycore_gc.h | 3 +-
Lib/test/test_gc.py | 35 +++++++++++++++-------
.../2024-03-21-12-10-11.gh-issue-117108._6jIrB.rst | 3 ++
Modules/gcmodule.c | 2 +-
Python/gc.c | 30 +++++++++----------
Python/gc_free_threading.c | 2 +-
6 files changed, 47 insertions(+), 28 deletions(-)
create mode 100644 Misc/NEWS.d/next/Core and Builtins/2024-03-21-12-10-11.gh-issue-117108._6jIrB.rst
See #118891 (comment) for more details on the bisection results.
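A timing-based bisection like this can be automated with git bisect run, which treats exit code 0 as "good", 125 as "skip this commit", and other codes from 1 to 127 as "bad". The following is a minimal sketch of such a predicate, not the exact procedure used here: the helper names and the threshold are hypothetical, and a real run would also need to rebuild CPython at each step before timing anything.

```python
import subprocess
import sys
import time

# Exit codes understood by `git bisect run`.
GOOD, BAD, SKIP = 0, 1, 125


def timed_run(cmd):
    """Return the elapsed seconds for `cmd`, or None if it failed."""
    start = time.perf_counter()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.perf_counter() - start
    return elapsed if result.returncode == 0 else None


def verdict(elapsed, threshold):
    """Map a timing onto a `git bisect run` exit code."""
    if elapsed is None:
        # The command could not run at this commit, so it cannot be judged.
        return SKIP
    return GOOD if elapsed <= threshold else BAD


# Trivial demonstration with a command that always succeeds quickly.
demo = timed_run([sys.executable, "-c", "pass"])
print(verdict(demo, threshold=60.0))
```

In a bisection script, the demonstration at the end would be replaced by timing the Sphinx build and calling sys.exit(verdict(elapsed, threshold)), with the threshold chosen somewhere between the known-good and known-bad timings.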
Profiling by @nascheme in #118891 (comment) and #118891 (comment) also confirms that Sphinx spends a significant amount of time in the GC, so it seems very likely that the changes to introduce an incremental GC in Python 3.13 are the cause of this performance regression.
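One lightweight way to estimate how much wall-clock time a workload spends in the cyclic GC is to hook gc.callbacks, which the interpreter calls with a "start" and a "stop" phase around each collection. The sketch below is not necessarily how the profiling linked above was done; GCTimer and make_cycles are hypothetical names, and make_cycles is only a stand-in for a real workload such as a Sphinx build.

```python
import gc
import time


class GCTimer:
    """Accumulate wall-clock time spent inside cyclic GC passes."""

    def __init__(self):
        self.total = 0.0
        self.collections = 0
        self._start = None

    def _callback(self, phase, info):
        # The interpreter invokes this with phase "start" before a
        # collection and "stop" after it.
        if phase == "start":
            self._start = time.perf_counter()
        elif phase == "stop" and self._start is not None:
            self.total += time.perf_counter() - self._start
            self.collections += 1
            self._start = None

    def __enter__(self):
        gc.callbacks.append(self._callback)
        return self

    def __exit__(self, *exc_info):
        gc.callbacks.remove(self._callback)
        return False


def make_cycles(n):
    """Stand-in workload: allocate n small reference cycles."""
    for _ in range(n):
        a = []
        b = [a]
        a.append(b)


with GCTimer() as timer:
    make_cycles(200_000)
    gc.collect()  # guarantee at least one collection is observed

print(f"{timer.collections} collections, {timer.total:.4f}s in GC")
```

Comparing the reported total against the overall run time on 3.12 and 3.13 gives a rough indication of whether the extra time is going to the collector.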
Cc. @markshannon for expertise on the new incremental GC, and cc. @hugovk / @AA-Turner for Sphinx expertise.
CPython versions tested on:
3.12, 3.13, CPython main branch
Operating systems tested on:
macOS