MPI_Info_create and sessions #12854

Closed
@mpokorny

Description

Background information

What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)

Open MPI v5.0.5

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

Installed using Spack

Please describe the system on which you are running

  • Operating system/version: Pop!_OS 22.04 LTS
  • Computer hardware: 12th Gen Intel(R) Core(TM) i7-1255U
  • Network type: none

Details of the problem

The following program:

#include <mpi.h>
int
main(int argc, char** argv) {

  MPI_Info info;
  MPI_Info_create(&info);
  MPI_Session s1, s2;
  MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &s1);
  MPI_Session_finalize(&s1);
  MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &s2);
  MPI_Session_finalize(&s2);
}

fails at runtime, as follows:

$ mpirun -np 1 src/Test 
free(): double free detected in tcache 2
[aeolus:740193] *** Process received signal ***
[aeolus:740193] Signal: Aborted (6)
[aeolus:740193] Signal code:  (-6)
[aeolus:740193] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x74c71b842520]
[aeolus:740193] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x74c71b8969fc]
[aeolus:740193] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x74c71b842476]
[aeolus:740193] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x74c71b8287f3]
[aeolus:740193] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x89676)[0x74c71b889676]
[aeolus:740193] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0xa0cfc)[0x74c71b8a0cfc]
[aeolus:740193] [ 6] /lib/x86_64-linux-gnu/libc.so.6(+0xa30ab)[0x74c71b8a30ab]
[aeolus:740193] [ 7] /lib/x86_64-linux-gnu/libc.so.6(free+0x73)[0x74c71b8a5453]
[aeolus:740193] [ 8] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libopen-pal.so.80(+0xe8c49)[0x74c71bba7c49]
[aeolus:740193] [ 9] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(+0x8dab9)[0x74c71c08dab9]
[aeolus:740193] [10] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(ompi_mpi_instance_finalize+0xad)[0x74c71c08eecd]
[aeolus:740193] [11] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(MPI_Session_finalize+0x4c)[0x74c71c0c7b0c]
[aeolus:740193] [12] src/Test[0x4011a5]
[aeolus:740193] [13] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x74c71b829d90]
[aeolus:740193] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x74c71b829e40]
[aeolus:740193] [15] src/ray/Test[0x401085]
[aeolus:740193] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 740193 on node aeolus exited on
signal 6 (Aborted).
--------------------------------------------------------------------------

Removing either the call to MPI_Info_create() or the second call to MPI_Session_finalize() allows the program to complete. Another workaround I've found is to add a call to MPI_Init() prior to the call to MPI_Info_create().
