In #5695 (comment), we discovered that pandas was always being imported on Nanny workers because, to check version compatibility, all these modules are imported:
`distributed/distributed/versions.py`, lines 15 to 29 in 9a4e0e2:

```python
required_packages = [
    ("dask", lambda p: p.__version__),
    ("distributed", lambda p: p.__version__),
    ("msgpack", lambda p: ".".join([str(v) for v in p.version])),
    ("cloudpickle", lambda p: p.__version__),
    ("tornado", lambda p: p.version),
    ("toolz", lambda p: p.__version__),
]
optional_packages = [
    ("numpy", lambda p: p.__version__),
    ("pandas", lambda p: p.__version__),
    ("lz4", lambda p: p.__version__),
    ("blosc", lambda p: p.__version__),
]
```
Instead of importing, we could use importlib.metadata.version to get the distribution's version number without importing the package. This could speed up worker startup time and reduce memory footprint a little.
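A rough sketch of what that could look like, not a proposed patch. The helper names `get_installed_version` and `get_package_versions` are hypothetical, and it assumes each distribution name matches the import name in the lists above, which would need checking per package:

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: distribution names are the same as the import names
# currently listed in versions.py.
required_packages = ["dask", "distributed", "msgpack", "cloudpickle", "tornado", "toolz"]
optional_packages = ["numpy", "pandas", "lz4", "blosc"]


def get_installed_version(dist_name):
    """Return the installed version string, or None if the distribution is missing.

    This reads the distribution's metadata and does not import the package.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def get_package_versions(names):
    """Map each distribution name to its version string (or None)."""
    return {name: get_installed_version(name) for name in names}
```

Calling `get_package_versions(optional_packages)` would report a version for pandas if it is installed, without pandas being imported as a side effect. One thing to watch: the reported string comes from the installed metadata, so it may not be formatted identically to what the current lambdas produce (e.g. the joined `msgpack` version tuple).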
cc @crusaderky