Don't import packages just for version compatibility check #5723

@gjoseph92

In #5695 (comment), we discovered that pandas was always being imported on Nanny workers because, to check version compatibility, all these modules are imported:

required_packages = [
    ("dask", lambda p: p.__version__),
    ("distributed", lambda p: p.__version__),
    ("msgpack", lambda p: ".".join([str(v) for v in p.version])),
    ("cloudpickle", lambda p: p.__version__),
    ("tornado", lambda p: p.version),
    ("toolz", lambda p: p.__version__),
]
optional_packages = [
    ("numpy", lambda p: p.__version__),
    ("pandas", lambda p: p.__version__),
    ("lz4", lambda p: p.__version__),
    ("blosc", lambda p: p.__version__),
]

Instead of importing, we could use importlib.metadata.version to get the distribution's version number without importing the package. This could speed up worker startup time and reduce memory footprint a little.
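
A rough sketch of what that could look like, not a proposed patch. The helper name `installed_versions` and the two list names are hypothetical. It also assumes each distribution name matches the import name used above, which may not hold for every package. The version string read from metadata may also be formatted differently from what the module attribute reports today.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: distribution names are the same as the import names listed above.
REQUIRED = ["dask", "distributed", "msgpack", "cloudpickle", "tornado", "toolz"]
OPTIONAL = ["numpy", "pandas", "lz4", "blosc"]


def installed_versions(distributions):
    """Map each distribution name to its version string, or None if not installed.

    Reads the installed distribution metadata only; the package itself is
    never imported, so nothing is added to sys.modules for it.
    """
    versions = {}
    for name in distributions:
        try:
            versions[name] = version(name)
        except PackageNotFoundError:
            # No metadata found for this distribution: report it as missing.
            versions[name] = None
    return versions


if __name__ == "__main__":
    print(installed_versions(REQUIRED + OPTIONAL))
```

A missing optional package shows up as `None` instead of raising, which roughly mirrors how an import failure would be handled for the optional list.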

cc @crusaderky
