Description
Background information
What version of Open MPI are you using? (e.g., v4.1.6, v5.0.1, git branch name and hash, etc.)
Open MPI v5.0.5
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Installed using Spack
Please describe the system on which you are running
- Operating system/version: Pop!_OS 22.04 LTS
- Computer hardware: 12th Gen Intel(R) Core(TM) i7-1255U
- Network type: none
Details of the problem
The following program:
#include <mpi.h>

int
main(int argc, char** argv) {
    MPI_Info info;
    MPI_Info_create(&info);

    MPI_Session s1, s2;
    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &s1);
    MPI_Session_finalize(&s1);
    MPI_Session_init(MPI_INFO_NULL, MPI_ERRORS_RETURN, &s2);
    MPI_Session_finalize(&s2);
}
fails at runtime, as follows:
$ mpirun -np 1 src/Test
free(): double free detected in tcache 2
[aeolus:740193] *** Process received signal ***
[aeolus:740193] Signal: Aborted (6)
[aeolus:740193] Signal code: (-6)
[aeolus:740193] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x74c71b842520]
[aeolus:740193] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x74c71b8969fc]
[aeolus:740193] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x74c71b842476]
[aeolus:740193] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x74c71b8287f3]
[aeolus:740193] [ 4] /lib/x86_64-linux-gnu/libc.so.6(+0x89676)[0x74c71b889676]
[aeolus:740193] [ 5] /lib/x86_64-linux-gnu/libc.so.6(+0xa0cfc)[0x74c71b8a0cfc]
[aeolus:740193] [ 6] /lib/x86_64-linux-gnu/libc.so.6(+0xa30ab)[0x74c71b8a30ab]
[aeolus:740193] [ 7] /lib/x86_64-linux-gnu/libc.so.6(free+0x73)[0x74c71b8a5453]
[aeolus:740193] [ 8] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libopen-pal.so.80(+0xe8c49)[0x74c71bba7c49]
[aeolus:740193] [ 9] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(+0x8dab9)[0x74c71c08dab9]
[aeolus:740193] [10] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(ompi_mpi_instance_finalize+0xad)[0x74c71c08eecd]
[aeolus:740193] [11] /home/martin/spack/opt/spack/linux-pop22-skylake/gcc-14.1.0/openmpi-5.0.5-dkgact6ph5rgu6fnp5tcfeejp754i7pv/lib/libmpi.so.40(MPI_Session_finalize+0x4c)[0x74c71c0c7b0c]
[aeolus:740193] [12] src/Test[0x4011a5]
[aeolus:740193] [13] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x74c71b829d90]
[aeolus:740193] [14] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x74c71b829e40]
[aeolus:740193] [15] src/ray/Test[0x401085]
[aeolus:740193] *** End of error message ***
--------------------------------------------------------------------------
prterun noticed that process rank 0 with PID 740193 on node aeolus exited on
signal 6 (Aborted).
--------------------------------------------------------------------------
Removing either the call to MPI_Info_create() or the second call to MPI_Session_finalize() allows the program to complete. Another workaround I've found is to add a call to MPI_Init() prior to the call to MPI_Info_create().