Description
Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
master and v4.1.x
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
'--with-libevent=internal' '--enable-mpi1-compatibility' '--without-xpmem' '--with-cuda=/hpc/local/oss/cuda11.1.1' '--with-slurm' '--with-platform=contrib/platform/mellanox/optimized' '--with-hcoll=/build-result/hpcx-gcc-redhat7.6/hcoll' '--with-ucx=/build-result/hpcx-gcc-redhat7.6/ucx
Please describe the system on which you are running
- Operating system/version: RHEL 7.6
- Computer hardware: Intel x86_64
- Network type: not used
Details of the problem
According to the MPI standard: "message-passing calls do not modify the value of the error code field of status variables. This field may be updated only by the functions in Section 3.7.5 which return multiple statuses. The field is updated if and only if such function returns with an error code of MPI_ERR_IN_STATUS." (MPI 3.1, page 30, line 39)
However, it looks like Open MPI always updates the error code field of status parameters, even in calls that are not supposed to modify it.
The error can be seen with the mprobe test from the MPICH test suite:
mpirun -n 2 -mca pml ob1 -mca btl self,vader ./mpich/test/mpi/pt2pt/mprobe
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 224
check failed: (s2.MPI_ERROR == MPI_ERR_TOPOLOGY), line 240
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 259
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 301
check failed: (s2.MPI_ERROR == MPI_ERR_TOPOLOGY), line 315
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 336
s1.error = 0
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 79
check failed: (s2.MPI_ERROR == MPI_ERR_TOPOLOGY), line 93
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 114
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 156
check failed: (s2.MPI_ERROR == MPI_ERR_TOPOLOGY), line 169
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 194
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 224
check failed: (s2.MPI_ERROR == MPI_ERR_TOPOLOGY), line 240
check failed: (s1.MPI_ERROR == MPI_ERR_DIMS), line 259
found 28 errors
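
The failing checks above appear to follow a sentinel pattern: the status error field is preloaded with a value such as MPI_ERR_DIMS, a single-status call is made, and the field is expected to be unchanged afterwards. The sketch below models that pattern in plain C without MPI, so it compiles anywhere. Everything in it is hypothetical: `toy_status` stands in for `MPI_Status`, the `TOY_*` constants are arbitrary values rather than the real `MPI_ERR_*` codes, and the two probe functions are illustrations, not Open MPI code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins. Real code would use MPI_Status and the
 * MPI_ERR_* constants from <mpi.h>; these values are arbitrary. */
enum { TOY_SUCCESS = 0, TOY_ERR_DIMS = 11 };

typedef struct {
    int source;
    int tag;
    int error;  /* plays the role of the MPI_ERROR field */
} toy_status;

/* Models a conforming single-status call: it fills in source and tag
 * but leaves the error field untouched. */
static int toy_probe_conforming(toy_status *st)
{
    assert(st != NULL);
    st->source = 0;
    st->tag = 7;
    return TOY_SUCCESS;
}

/* Models the behavior reported in this issue: the error field is
 * overwritten even though the call succeeds. */
static int toy_probe_overwriting(toy_status *st)
{
    assert(st != NULL);
    st->source = 0;
    st->tag = 7;
    st->error = TOY_SUCCESS;
    return TOY_SUCCESS;
}

/* Sentinel check: preload the error field, make the call, and report
 * whether the sentinel survived (1) or was overwritten (0). */
static int error_field_preserved(int (*probe)(toy_status *))
{
    toy_status st = { -1, -1, TOY_ERR_DIMS };
    probe(&st);
    return st.error == TOY_ERR_DIMS;
}
```

In this model, `error_field_preserved(toy_probe_conforming)` returns 1 and `error_field_preserved(toy_probe_overwriting)` returns 0. The second case corresponds to the `s1.error = 0` line in the output above.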