Unexpected behavior of memory specification between AutoExecutor and SlurmExecutor #1761

Open
@mshvartsman

Description

I recently switched from using AutoExecutor to subclassing SlurmExecutor directly in order to use a custom entrypoint, and had to modify the executor parameters. It's a bit awkward that the parameters have different names (e.g. mem vs mem_gb), but this is relatively easy to catch, since using the wrong name raises an error that prints all known names.

However, what is more challenging is that the units for memory also change, and this isn't obvious. I think it's because the logic that appends MB or GB to the memory amount (https://github.com/facebookincubator/submitit/blob/main/submitit/slurm/slurm.py#L525) only gets called when the parameter names are also converted via AutoExecutor.

I think that the correct behavior is either to always convert the memory in the same way, or at least warn somewhere if no units are provided but are expected. Happy to PR a fix if there is agreement on how to proceed!
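
To make the two options concrete, here is a minimal sketch of what each could look like. It is not submitit's actual code: `mem_gb_to_slurm` and `check_mem` are hypothetical names, the GB/MB formatting only approximates the conversion linked above, and the warning text assumes Slurm's documented default of megabytes when `--mem` has no suffix.

```python
import warnings


def mem_gb_to_slurm(mem_gb: float) -> str:
    """Hypothetical helper: format a gigabyte amount with an explicit unit suffix."""
    if mem_gb == int(mem_gb):
        # Whole gigabytes keep the GB suffix.
        return f"{int(mem_gb)}GB"
    # Fractional gigabytes are expressed in whole megabytes instead.
    return f"{int(mem_gb * 1024)}MB"


def check_mem(mem) -> str:
    """Hypothetical helper: return mem as a string, warning if it has no unit suffix."""
    text = str(mem).strip()
    if text and text[-1].isdigit():
        # Assumption: a bare number is read by Slurm as megabytes.
        warnings.warn(
            f"mem={text!r} has no unit suffix; Slurm will read it as megabytes",
            stacklevel=2,
        )
    return text
```

With something like this, `mem_gb_to_slurm(16)` gives `"16GB"`, while `check_mem(16)` passes the value through unchanged but emits a warning, which is roughly the "at least warn" option.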
