## Description
These are just a few modifications/enhancements I've been thinking about, in large part because of a few issues that have come up recently, including firedrakeproject/firedrake#4101 (which has also come up before on Slack) and someone trying to run on an HPC where the MPI launcher was `srun` rather than `mpiexec`. I'll lay out a few things that (I think) could be improved, and then suggest some possible solutions.
## Restrictions/problems with the current implementation
- If you try to use MPI "on the inside" with an MPI distribution that doesn't support nested inits, then you just get the cryptic error from MPI, rather than a nice message from `mpi-pytest` telling you what's wrong and how to fix it. @connorjward has a WIP fix for this here: WIP Emit a helpful error message when trying to run in forking mode #14
- If you try to use MPI "on the outside" and there's a mismatch between the number of MPI ranks and the `nprocs` argument, then you get the helpful error message `_pytest.config.exceptions.UsageError: Attempting to run parallel tests inside an mpiexec call where the requested and provided process counts do not match`, but it means you have to repeat yourself when running parallel tests. Sometimes it would be nice to opt in to the matching tests being selected automatically. I often find myself doing something a bit unwieldy like `N=4; mpiexec -n $N pytest -m "parallel[$N]" tests`.
- If you try to use MPI "on the inside" with an MPI distribution that doesn't use `mpiexec` (say it has `srun` or something), then the parallel callback here will just fail, with no way of modifying it.
- If you use MPI "on the inside" and you want ranks other than rank 0 to produce more detailed output, you will always be thwarted, because the quiet arguments are added after the user's arguments here and so always take precedence.
- The default `nprocs` is hardcoded here, so there's no way for it to differ between projects, invocations, etc.
- Each test has to explicitly declare in the code how many processes it can run with. Sometimes it would be useful to specify that a particular test can run with any number of processes, for example if the test doesn't rely on a specific number of ranks and just checks that something runs successfully in parallel. (The current form of this declaration is sketched just after this list.)
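
To make that last point concrete, this is roughly what the per-test declaration looks like today (the `nprocs` values are just for illustration):

```python
import pytest
from mpi4py import MPI


# Every parallel test states up front exactly which process counts it supports.
@pytest.mark.parallel(nprocs=2)
def test_needs_exactly_two_ranks():
    assert MPI.COMM_WORLD.size == 2


# A list of counts works too, but "any number of ranks" cannot be expressed.
@pytest.mark.parallel(nprocs=[2, 3, 4])
def test_runs_on_a_few_sizes():
    assert MPI.COMM_WORLD.size > 1
```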
## Possible solutions
- See Connor's PR.
- We could add a command line argument for running MPI "on the outside" that tells `mpi-pytest` to either fail or skip if it sees tests that request a different number of processes from the size of `COMM_WORLD`. For example, if I run with 2 cores and select `skip`, only the parallel tests with `nprocs=2` (or that have 2 as one of the options) will be run, and the others will be skipped: `mpiexec -np 2 pytest --nprocs-mismatch=<skip,fail>`. (A conftest-style sketch of how this and the following options might hang together is given after this list.)
- We could add an `--mpi-executable=` command line argument to pytest to specify which launcher to use. We might also need something to specify which flag propagates environment variables, in case the launcher doesn't use the `-genv` argument like MPICH's `mpiexec`; e.g. for `srun` it would be something like `pytest -m "parallel[2]" --mpi-executable=srun --mpi-env-flag=--export`.
- This could be as simple as adding a command line argument, something like `--mpi-quiet=<none,nice,priority>`, where `none` means don't add any quiet arguments, `nice` means add them before the user's arguments so they don't take priority, and `priority` means add them after so they override any conflicts (what we do now).
- The default `nprocs` could be made modifiable, either through an environment variable read by `mpi-pytest` or through another command line argument like `--nprocs-default=4`.
- This one might be a bit tricky/contentious. The `parallel` mark could also allow `parallel(nprocs=2, any=True)` so that a test runs no matter the size of `COMM_WORLD`. The `nprocs` argument would still be needed so that running with MPI "on the inside" knows what to do. I'm not quite sure what the best way to implement this logic would be, but it's something to discuss: e.g. if you specify `-m parallel[4]`, would a test with `parallel(nprocs=2, any=True)` be run or not?
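
To make the command line suggestions above a bit more concrete, here is a rough, non-authoritative sketch of how `--nprocs-mismatch`, `--mpi-executable`, `--mpi-env-flag`, `--mpi-quiet`, `--nprocs-default` and the `any=True` mark could hang together. The hook bodies, helper names (`build_launch_command`, `quiet_args`) and defaults are all placeholders I've made up for illustration, not existing `mpi-pytest` API:

```python
# conftest.py-style sketch; option names follow the suggestions above.
import pytest
from mpi4py import MPI


def pytest_addoption(parser):
    parser.addoption("--nprocs-mismatch", choices=("skip", "fail"), default=None,
                     help="What to do with parallel tests whose nprocs does not "
                          "match the size of COMM_WORLD when running 'on the outside'.")
    parser.addoption("--mpi-executable", default="mpiexec",
                     help="Launcher to fork when running 'on the inside' (e.g. srun).")
    parser.addoption("--mpi-env-flag", default="-genv",
                     help="Flag the launcher uses to forward environment variables.")
    parser.addoption("--mpi-quiet", choices=("none", "nice", "priority"), default="priority",
                     help="Where the plugin's quietening arguments are placed.")
    parser.addoption("--nprocs-default", type=int, default=3,
                     help="nprocs used when the parallel mark does not give one.")


def pytest_collection_modifyitems(config, items):
    # Only relevant when pytest itself is already running under MPI ("on the outside").
    action = config.getoption("--nprocs-mismatch")
    world = MPI.COMM_WORLD.size
    if action is None or world == 1:
        return
    for item in items:
        mark = item.get_closest_marker("parallel")
        if mark is None:
            continue
        if mark.kwargs.get("any", False):
            continue  # a parallel(nprocs=..., any=True) test runs at any size
        nprocs = mark.kwargs.get("nprocs", config.getoption("--nprocs-default"))
        allowed = nprocs if isinstance(nprocs, (list, tuple)) else (nprocs,)
        if world not in allowed:
            if action == "fail":
                raise pytest.UsageError(
                    f"{item.nodeid} wants nprocs={nprocs}, but COMM_WORLD has {world} ranks")
            item.add_marker(pytest.mark.skip(
                reason=f"needs nprocs={nprocs}, running on {world} ranks"))


def build_launch_command(config, nprocs, pytest_args, env_vars):
    # How the forking-mode callback might build its command line instead of
    # hard-coding "mpiexec ... -genv ...".  (srun's --export=ALL,NAME=value
    # syntax differs slightly; glossed over here.)
    cmd = [config.getoption("--mpi-executable"), "-n", str(nprocs)]
    for name, value in env_vars.items():
        cmd += [config.getoption("--mpi-env-flag"), f"{name}={value}"]
    return cmd + ["pytest"] + list(pytest_args)


def quiet_args(config, user_args, plugin_quiet=("-q",)):
    # Ordering decides whether the user's flags or the plugin's quietening
    # flags win on the non-root ranks ("-q" is just a stand-in here).
    mode = config.getoption("--mpi-quiet")
    if mode == "none":
        return list(user_args)
    if mode == "nice":
        return list(plugin_quiet) + list(user_args)  # user's flags come last and win
    return list(user_args) + list(plugin_quiet)      # current behaviour: plugin wins
```

With something along these lines, `mpiexec -np 2 pytest --nprocs-mismatch=skip tests` would run only the `nprocs=2` variants and skip the rest, and `pytest -m "parallel[2]" --mpi-executable=srun --mpi-env-flag=--export` would cover the `srun` case without touching the plugin source.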
P.S. I'm not attached to the names of anything I've suggested here; happy to bikeshed any that we decide we do want.