ompi/group: mpi group operations fails in multithreaded apps #8546

@AboorvaDevarajan

Background information

ompi/group: MPI group operations fail in multithreaded applications

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

OMPI master

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

$ git submodule status
fefaed5 3rd-party/openpmix (v1.1.3-2832-gfefaed5)
477894f4720d822b15cab56eee7665107832921c 3rd-party/prrte (dev-30928-g477894f)

Please describe the system on which you are running

  • Operating system/version: RHEL8
  • Computer hardware: ppc64le
  • Network type: IB

Details of the problem

Please describe, in detail, the problem that you are having, including the behavior you expect to see, the actual behavior that you are seeing, steps to reproduce the problem, etc. It is most helpful if you can attach a small program that a developer can use to reproduce your problem.

Build UCX (multithreaded release configuration), then build Open MPI against it:

git clone --recursive https://github.com/openucx/ucx.git
cd ucx
./autogen.sh
./contrib/configure-release-mt --without-java --prefix=$shared_dir/ucx-install --with-cuda=/usr/local/cuda 
make -j40 install

git clone --recursive https://github.com/open-mpi/ompi.git ompi
cd ompi
./autogen.pl
./configure --disable-man-pages --enable-mca-no-build=btl-uct --enable-mpi1-compatibility --prefix $shared_dir/install \
            --with-cuda=/usr/local/cuda --with-ucx=$shared_dir/ucx-install
make -j40 install

Minimal test to recreate the issue:

https://raw.githubusercontent.com/AboorvaDevarajan/mpi-tests/main/group_mt.c

mpicc group_mt.c -o group_mt
mpirun -np 80 -host host1:40,host2:40 ./group_mt

Observed output:

not ident : 3
not ident : 3
not ident : 3

Proposed fix that resolves the issue in my testing:
#8547
