Description
Background information
I use Open MPI and its Portals4 implementation to evaluate the performance of a different library I'm working on.
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Open MPI v5.0.5
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Open MPI was installed from a source/distribution tarball.
Please describe the system on which you are running
OS: Rocky Linux 8.9 (Green Obsidian)
CPU: Intel(R) Xeon(R) Gold 5122 CPU @ 3.60GHz
Network type: BXIv2
Details of the problem
When I try to compile Open MPI with the --with-portals4 option, I get a compiler error about the comparison of two Portals4 handles, for example in this code snippet (opal/mca/btl/portals4/btl_portals4.c +490):
if (frag->me_h != PTL_INVALID_HANDLE) {
    frag->me_h = PTL_INVALID_HANDLE;
}
The Portals4 specification proposes using PtlHandleIsEqual() to compare two handles.
Since I'm working with the BXIv2 interconnect, which is more or less a hardware implementation of Portals4, I'm not sure if this error also arises when using the Portals4 InfiniBand reference implementation from Sandia.
I ran into the same issue in several places in the Portals4 code. I have modified all of them so that they conform to the Portals4 specification.
The fix for the snippet above would look like this:
if (!PtlHandleIsEqual(frag->me_h, PTL_INVALID_HANDLE)) {
    frag->me_h = PTL_INVALID_HANDLE;
}
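A possible explanation for the error, stated as an assumption rather than something checked against the BXIv2 or Sandia headers: if an implementation defines its handle types as structs instead of integers, C rejects `!=` on them, whereas an integer-based implementation would compile the original comparison without complaint. The sketch below illustrates this with made-up names (demo_handle_t, DEMO_INVALID_HANDLE, demo_handle_is_equal, demo_release); none of them are part of Portals4 or Open MPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical opaque handle implemented as a struct.
 * This is NOT the real BXIv2 or Sandia definition. */
typedef struct {
    uint32_t index;
    uint32_t generation;
} demo_handle_t;

/* Hypothetical stand-in for PTL_INVALID_HANDLE. */
static const demo_handle_t DEMO_INVALID_HANDLE = { UINT32_MAX, UINT32_MAX };

/* Hypothetical stand-in for PtlHandleIsEqual(): member-wise comparison. */
static bool demo_handle_is_equal(demo_handle_t a, demo_handle_t b)
{
    return a.index == b.index && a.generation == b.generation;
}

/* Mirrors the shape of the fixed snippet: invalidate the handle only if
 * it is still valid. Returns true if the handle was reset. */
static bool demo_release(demo_handle_t *h)
{
    /* Writing `*h != DEMO_INVALID_HANDLE` here would not compile,
     * because C defines no == or != operators for struct operands. */
    if (!demo_handle_is_equal(*h, DEMO_INVALID_HANDLE)) {
        *h = DEMO_INVALID_HANDLE;
        return true;
    }
    return false;
}
```

Calling demo_release() on a valid handle resets it and returns true; a second call on the same handle returns false. Going through a comparison function keeps the calling code independent of how the handle type is represented.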
After my modifications, I was able to compile Open MPI and observed decent performance numbers with the OSU Micro-Benchmarks (OMB).
Could someone comment on my observation and the suggested fix?