Unexpected behavior of memory specification between AutoExecutor
and SlurmExecutor
#1761
Description
I recently switched from using AutoExecutor
to subclassing from SlurmExecutor
directly to use a custom entrypoint, and had to modify executor parameters. It's a bit awkward that the parameters have different names (e.g. mem
vs mem_gb
) but this is relatively easy to catch since using the wrong name causes an error which prints all known names.
However, what is more challenging is that the units for memory also change and this isn't very obvious. I think it's because the logic which appends MB or GB to the memory amount (https://github.com/facebookincubator/submitit/blob/main/submitit/slurm/slurm.py#L525) only gets called when also converting the parameter names via AutoExecutor
.
I think that the correct behavior is either to always convert the memory in the same way, or at least warn somewhere if no units are provided but are expected. Happy to PR a fix if there is agreement on how to proceed!