-
-
Notifications
You must be signed in to change notification settings - Fork 2.8k
fix #8991 and enhance collection performance #9125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great @elbehery95, thanks a lot for the contribution and detailed summary.
Besides the minor comments I left, we should also add a changelog/9125.improvement.rst
file, with something like:
Improve collection performance by caching the internal hook proxy objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @elbehery95!
One thing I forgot to mention: feel free to add yourself to AUTHORS
. 👍
I'd like to review this as well, when I get a bit of free time (but if I don't review it in a few days, feel free to merge, I don't intend to block things indefinitely...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a lovely enhancement
I am getting different results when trying to import
|
@elbehery95 indeed, its a "private class" its ok to select based on where its available for now, we should make a issue report to pluggy for making it (or a protocol for it) avaliable as type) |
Added the missing |
(Not a review yet, just a comment) I'm trying to understand the pre-caching line profile you've shown in the PR description. Do I understand correctly that 64.3543 s is spent in |
Probably yes; this is the base implementation with different format Total time: 65.7615 s
File: pytest/src/_pytest/main.py
Function: gethookproxy at line 543
Line # Hits Time Per Hit % Time Line Contents
==============================================================
543 @profile
544 def gethookproxy(self, fspath: "os.PathLike[str]"):
545 # Check if we have the common case of running
546 # hooks with all conftest.py files.
547 205108 853594.0 4.2 1.3 pm = self.config.pluginmanager
548 205108 39292528.0 191.6 59.8 p = Path(fspath)
549 205108 2775737.0 13.5 4.2 import_mode = self.config.getoption("importmode")
550 205108 1111320.0 5.4 1.7 root_path = self.config.rootpath
551 205108 14185275.0 69.2 21.6 my_conftestmodules = pm._getconftestmodules(p, import_mode, rootpath=root_path)
552 205108 1414525.0 6.9 2.2 remove_mods = pm._conftest_plugins.difference(my_conftestmodules)
553 205108 703070.0 3.4 1.1 if remove_mods:
554 # One or more conftests are not in use at this fspath.
555 205096 2740815.0 13.4 4.2 from .config.compat import PathAwareHookProxy
556
557 205096 2007707.0 9.8 3.1 proxy = PathAwareHookProxy(FSHookProxy(pm, remove_mods))
558 else:
559 # All plugins are active for this fspath.
560 12 47.0 3.9 0.0 proxy = self.config.hook
561 205108 676901.0 3.3 1.0 return proxy Other usage for Timer unit: 1e-06 s
Total time: 11.6957 s
File: pytest/src/_pytest/config/__init__.py
Function: _getconftestmodules at line 525
Line # Hits Time Per Hit % Time Line Contents
==============================================================
525 @lru_cache(maxsize=128)
526 @profile
527 def _getconftestmodules(
528 self, path: Path, importmode: Union[str, ImportMode], rootpath: Path 529 ) -> List[types.ModuleType]:
530 2242 9181.0 4.1 0.1 if self._noconftest:
531 return []
532
533 2242 90943.0 40.6 0.8 if path.is_file():
534 1957 47316.0 24.2 0.4 directory = path.parent
535 else:
536 285 911.0 3.2 0.0 directory = path
537
538 # XXX these days we may rather want to use config.rootpath
539 # and allow users to opt into looking into the rootdir parent
540 # directories instead of requiring to specify confcutdir.
541 2242 6647.0 3.0 0.1 clist = []
542 23086 575779.0 24.9 4.9 for parent in reversed((directory, *directory.parents)):
543 20844 3180991.0 152.6 27.2 if self._confcutdir and parent in self._confcutdir.parents:
544 8968 27306.0 3.0 0.2 continue
545 11876 880322.0 74.1 7.5 conftestpath = parent / "conftest.py"
546 11876 562783.0 47.4 4.8 if conftestpath.is_file():
547 4169 6242670.0 1497.4 53.4 mod = self._importconftest(conftestpath, importmode, rootpath)
548 4169 16294.0 3.9 0.1 clist.append(mod)
549 2242 48137.0 21.5 0.4 self._dirpath2confmods[directory] = clist
550 2242 6397.0 2.9 0.1 return clist where looking up if parent in parents takes about 30% from the total time ; I keep thinking that using >>> from pathlib import Path
>>> a = Path() ; b = Path()/'src/config/__init__.py'
>>> timeit.timeit(lambda: a in b.parents, number=30000)
0.8800809980020858
>>> timeit.timeit(lambda: str(b).startswith(str(a)), number=30000)
0.06638047900923993 |
@elbehery95 great find, i believe it may be sensible to check the parts attributes for quick checking |
@elbehery95 I've made some changes in Sorry for the hassle. The reason I'm hesitant to add caches is that they have memory overhead and invalidation issues and generally make the code harder to understand and refactor. Especially in this case which deals with imports and side effects. So I always try to optimize the underlying operation first. |
Branch is rebased. Below is the comparison between current main (with path workaround and re-based gethookproxy with cache) about 20% gain hyperfine --warmup=1 --min-runs 4 'venv_vanila/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count' 'venv_modified/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count'
Benchmark #1: venv_vanila/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count
Time (mean ± σ): 118.802 s ± 0.257 s [User: 116.185 s, System: 2.993 s]
Range (min … max): 118.515 s … 119.140 s 4 runs
Benchmark #2: venv_modified/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count
Time (mean ± σ): 97.289 s ± 2.239 s [User: 95.390 s, System: 2.190 s]
Range (min … max): 95.295 s … 100.213 s 4 runs
Summary
'venv_modified/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count' ran
1.22 ± 0.03 times faster than 'venv_vanila/bin/pytest --collect-only -q pandas/pandas/tests -k test_nans_count' |
I am seeing "Changes requested" but not able to spot what are those changes as i marked all of them as resolved. Let me know if anything is required from my side |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my changes requested was not resolved, so i just approved now :)
return proxy | ||
|
||
def pytest_plugin_registered(self, plugin): | ||
self._fspath_hookproxy_cache.clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
im wondering if we could limit the invalidations to conftest registrations (and possibly do partial invalidations later
this is a followup idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still a bit suspicious of this cache, but I guess we can try it.
I left a few comments.
# Check if we have the common case of running | ||
# hooks with all conftest.py files. | ||
|
||
pm = self.config.pluginmanager | ||
proxy: Optional[PathAwareHookProxy] = self._fspath_hookproxy_cache.get(fspath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cache should be keyed off path
, not fspath
, this way it's normalized to a single unsurprising type.
# Cache of hookproxy objects, which speeds up collection. | ||
# The cache is cleared whenever a new plugin got registered. | ||
# See #9125 for a detailed summary of the gains. | ||
self._fspath_hookproxy_cache: Dict[ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the fspath_
part is redundant, just _hookproxy_cache
is enough and matches gethookproxy
.
# Check if we have the common case of running | ||
# hooks with all conftest.py files. | ||
|
||
pm = self.config.pluginmanager | ||
proxy: Optional[PathAwareHookProxy] = self._fspath_hookproxy_cache.get(fspath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the annotation? It looks like it (incorrectly) gets rid of the _HookCaller
part of the union.
return proxy | ||
|
||
def pytest_plugin_registered(self, plugin): | ||
self._fspath_hookproxy_cache.clear() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the argument that this is sufficient for the cache invalidation here? (would be good as a code comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i believe this could need a note on conftests,
i suspect it may be acceptable to do the invalidation only on the registration of conftests
I'm excited to see potential improvements in collection performance. A project I work on spends much longer on collection than I would expect (6s collection time, 30s total test time for a project of ~150 files in ~3000 tests). It looks like this work has stalled. @elbehery95 do you plan to continue working on this PR? |
The commit introduces few performance improvements for
_pytest.main.Session.gethookproxy
.As per #8991 it seems that caching
gethookproxy
can buy some collection time since almost samefspath
will yield the sameproxy
however due to the fact that plugin manger updates_conftest_plugins
as it collect; usinglru_cache
will regress some of the tests.The fix tries to implement a simple cache dict that is cleared every time the number of seen collected conftest plugin is changed.
Below are results when tested against pandas test suite about 16% less collection time.
And below are line profiler comparison
Before caching
After caching
Appreciate your thoughts and feedback