pytest-plugin: support openmpi #17
Conversation
Thanks for this! I'm away from my computer until next Wednesday so can't give a proper review until then. I have some initial comments:
- I don't think there needs to be a new module and global variable for the distribution type. Just having a free function that we call in the first appropriate pytest hook in plugin.py would be fine.
- I prefer enums to strings for this sort of thing.
I'm also a little surprised that this works. Do you have a real case where the "outer" process does not call MPI_Init?
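For illustration, a minimal sketch of that free-function-in-a-hook arrangement (the hook choice, names, and where the result is stored are assumptions, not part of this PR):

```python
# plugin.py -- hypothetical sketch, not the PR's actual code

def detect_mpi_implementation():
    """Work out which MPI distribution is available (details elided here)."""
    ...


def pytest_configure(config):
    # An early pytest hook: detect once and stash the result, instead of
    # keeping a global variable in a separate module.
    config._mpi_implementation = detect_mpi_implementation()
```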
I am a newbie at writing Python modules. This PR is more a request for OpenMPI compatibility than a polished change. I use this patched mpi-pytest plugin to run the pytest check of the Firedrake project (compiled with OpenMPI). The check command is:

How can I save the mpiexec implementation type as a static variable? Because running ...
Ah. In that case I don't think that this will actually work. To run with OpenMPI you need to run with pytest "on the inside". For example:
This is detailed in the README. Have you tested that your PR works? I would be very surprised (but happy) if it actually runs.
A short terminal capture:
Interesting... I'll have to try this myself next week. I assume if you run with ...
Adding a code snippet to the pytest unit test tests/firedrake/regression/test_adv_diff.py:
I can confirm that the tests do run in parallel; the output looks like this:
I have a CI server that runs the full pytestCheck of the Nix-built Firedrake (compiled with OpenMPI). The log is available here. There are, however, two failures. Can you reproduce these failures on GitHub CI?
We currently run our test suite using OpenMPI and aren't seeing those errors. I think I have seen the interpolation failure before but didn't have the time to fix it. Please note that Firedrake does not use mpi-pytest with mpiexec "on the inside". We instead call it on the outside like so. Whilst it is cool that OpenMPI seems to work on the inside, I wouldn't necessarily recommend that you do it that way. Our test suite is enormous and known to be quite fragile to very subtle global state/caching issues, so I would recommend trying to emulate what we do as much as possible.
I have tested this locally and it runs, which is pretty astonishing! I would be very happy to merge this code subject to some changes.
```diff
@@ -1 +1,4 @@
 from .parallel_assert import parallel_assert  # noqa: F401
+from .detect_mpiexec_implementation import detect_mpiexec_implementation
+
+impl = detect_mpiexec_implementation()
```
I think we should avoid doing this at import time. If someone doesn't have MPI installed but they do have this plugin (unlikely, I know) then this could break their system. Also, some systems don't call `mpiexec` and use `srun` instead. I think a lazy global variable inside `plugin.py` would be better:
```python
_mpi_implementation = None


def _set_parallel_callback(...):
    global _mpi_implementation
    if _mpi_implementation is None:
        _mpi_implementation = sniff_mpi_implementation()
    ...
```
if "open mpi" in output or "open-rte" in output: | ||
return "Open MPI" | ||
elif "mpich" in output: | ||
return "MPICH" |
Please use an enum instead:
```python
class MPIImplementation(enum.Enum):
    OPENMPI = enum.auto()
    MPICH = enum.auto()
    UNKNOWN = enum.auto()
```
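For illustration, roughly how the detection function might look if it returned members of such an enum rather than strings (a sketch only, assuming the PR sniffs the `mpiexec --version` banner; the actual implementation details may differ):

```python
import enum
import subprocess


class MPIImplementation(enum.Enum):
    OPENMPI = enum.auto()
    MPICH = enum.auto()
    UNKNOWN = enum.auto()


def detect_mpiexec_implementation() -> MPIImplementation:
    try:
        result = subprocess.run(
            ["mpiexec", "--version"], capture_output=True, text=True
        )
    except FileNotFoundError:
        # Discussed below: erroring here may be too strict on systems
        # that only provide srun/aprun.
        raise FileNotFoundError("mpiexec not found in PATH")
    # Some launchers print the version banner to stderr, so check both.
    output = (result.stdout + result.stderr).lower()
    if "open mpi" in output or "open-rte" in output:
        return MPIImplementation.OPENMPI
    elif "mpich" in output:
        return MPIImplementation.MPICH
    else:
        return MPIImplementation.UNKNOWN
```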
Thanks for contributing!
It's interesting that forking mode now seems to work with OpenMPI. Do we know what is different vs what we were doing before?
```python
    else:
        return "Unknown MPI implementation"
except FileNotFoundError:
    raise FileNotFoundError("mpiexec not found in PATH")
```
Suggested change:

```diff
-    raise FileNotFoundError("mpiexec not found in PATH")
+    raise FileNotFoundError("mpiexec not found in PATH. Please run in non-forking mode to specify a different MPI executable.")
```
The message should tell the user how to correct the error.
I'm hesitant to just error in this case because it will cause incorrect failures in the case that someone has an MPI distribution without `mpiexec`. I'm thinking specifically of HPC systems, where they are likely to have `srun` or `aprun`, or something else depending on the scheduling manager. There's also `mpirun`, but I think that's only MPICH and OpenMPI and is deprecated anyway.
The proper fix should probably go in another PR, but I wanted to just draw attention to it here.
I think the right thing to do here is allow the user to pass the MPI executable, as in the third point in #16, and only error if no executable is given and we cannot find `mpiexec`. It's probably unlikely that someone would be running "on the inside" on an HPC because you have way less control, but users are users.
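For context, a rough sketch of what letting the user pass the launcher might look like in a follow-up (the option name and wiring here are assumptions, not an agreed design):

```python
import shutil


def pytest_addoption(parser):
    # Hypothetical option; the real name would be decided in a later PR.
    parser.addoption(
        "--mpi-executable",
        default=None,
        help="MPI launcher to use in forking mode (e.g. mpiexec, srun, aprun)",
    )


def _resolve_mpi_executable(config):
    exe = config.getoption("--mpi-executable")
    if exe is not None:
        return exe
    if shutil.which("mpiexec") is not None:
        return "mpiexec"
    # Only error when no executable was given and mpiexec cannot be found.
    raise FileNotFoundError(
        "mpiexec not found in PATH; pass --mpi-executable to use a different launcher"
    )
```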
I agree that we should allow the user to specify a custom `mpiexec`-equivalent command, but I don't think that needs to be thought about in this PR. Bear in mind that HPC systems would be expected to run with MPI on the outside, at least initially, and so they won't hit this.
I am happy to follow the suggestions.
Absolutely no idea. I definitely observed the no-forking issue myself at some point in the past. This should make things more pleasant for devs who want to be able to run ...
@qbisi are you able to address these comments? If you no longer have the time to push this through I would be happy to take over the small amount of cleanup.
Sorry, I don't have the time or capability to address these comments. Glad you can take over.
OK. Thanks for your contribution regardless! This would never have been done otherwise, because we all thought it was impossible.
This commit adds compatibility for running pytest with the Open MPI implementation.