Closed
Description
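At least one code path that internally duplicates a communicator (in this case on behalf of MPI_Comm_spawn) fails in the newly added memkind-related code. The backtrace shows ompi_comm_idup_internal passing info=0x0 to ompi_info_memkind_copy_or_set, which hands it on to opal_info_get: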
#0 0x00007ffff6d7e5c9 in info_find_key (info=0x0, key=0x7ffff7a302b6 "mpi_memory_alloc_kinds") at info.c:441
#1 0x00007ffff6d7d02d in opal_info_get_nolock (info=0x0, key=0x7ffff7a302b6 "mpi_memory_alloc_kinds", value=0x7fffffffb440, flag=0x7fffffffb434) at info.c:112
#2 0x00007ffff6d7daae in opal_info_get (info=0x0, key=0x7ffff7a302b6 "mpi_memory_alloc_kinds", value=0x7fffffffb440, flag=0x7fffffffb434) at info.c:256
#3 0x00007ffff76bc43a in ompi_info_memkind_copy_or_set (parent=0x607480 <ompi_mpi_comm_world>, child=0x820dd0, info=0x0, type=0x7fffffffb4c4) at info/info_memkind.c:559
#4 0x00007ffff76837fc in ompi_comm_idup_internal (comm=0x607480 <ompi_mpi_comm_world>, group=0x1124980, remote_group=0x0, info=0x0, newcomm=0xe81c70, req=0x7fffffffb610)
at communicator/comm.c:1462
#5 0x00007ffff7680be6 in ompi_comm_set_nb (ncomm=0x7fffffffb9f8, oldcomm=0x607480 <ompi_mpi_comm_world>, local_size=1, local_ranks=0x0, remote_size=3, remote_ranks=0x0, attr=0x0,
errh=0x7ffff7d7bd60 <ompi_mpi_errors_are_fatal>, local_group=0x1124980, remote_group=0xe18b20, flags=0, req=0x7fffffffb610) at communicator/comm.c:275
#6 0x00007ffff76806bc in ompi_comm_set (ncomm=0x7fffffffb9f8, oldcomm=0x607480 <ompi_mpi_comm_world>, local_size=1, local_ranks=0x0, remote_size=3, remote_ranks=0x0, attr=0x0,
errh=0x7ffff7d7bd60 <ompi_mpi_errors_are_fatal>, local_group=0x1124980, remote_group=0xe18b20, flags=0) at communicator/comm.c:170
#7 0x00007ffff769b6f7 in ompi_dpm_connect_accept (comm=0x607480 <ompi_mpi_comm_world>, root=0, port_string=0x7fffffffc760 "2562457601.0:540482672", send_first=false,
newcomm=0x7fffffffcb60) at dpm/dpm.c:505
#8 0x00007ffff76f92ae in PMPI_Comm_spawn (command=0x4055cc "./disconnect_reconnect", argv=0x7fffffffd120, maxprocs=3, info=0x607a80 <ompi_mpi_info_null>, root=0,
comm=0x607480 <ompi_mpi_comm_world>, intercomm=0x7fffffffcbe8, array_of_errcodes=0x0) at comm_spawn.c:157
#9 0x0000000000402526 in main (argc=<optimized out>, argv=<optimized out>) at disconnect_reconnect.c:83
A fix is coming shortly.
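For illustration only, here is a minimal sketch of the kind of guard that avoids looking up a key on a NULL info object. It is not the actual Open MPI fix: `toy_info_t`, `toy_info_get`, and `toy_memkind_copy_or_set` are hypothetical stand-ins, and the real `opal_info_t` and `ompi_info_memkind_copy_or_set` have different types and signatures. The assumed behavior is that a missing info object means "no request", so the child inherits the parent's setting.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an info object holding a single key/value pair.
 * This is NOT Open MPI's opal_info_t. */
typedef struct {
    const char *key;
    const char *value;
} toy_info_t;

/* Look up `key` in `info`. Returns 1 and sets *value if found, else returns 0.
 * A NULL `info` is treated as "key not present" instead of being dereferenced. */
static int toy_info_get(const toy_info_t *info, const char *key,
                        const char **value)
{
    if (NULL == info || NULL == info->key || 0 != strcmp(info->key, key)) {
        *value = NULL;
        return 0;
    }
    *value = info->value;
    return 1;
}

/* Hypothetical copy-or-set: use the kinds requested via `info` if present,
 * otherwise fall back to the parent's kinds (assumed inheritance rule). */
static const char *toy_memkind_copy_or_set(const char *parent_kinds,
                                           const toy_info_t *info)
{
    const char *requested = NULL;

    if (toy_info_get(info, "mpi_memory_alloc_kinds", &requested)) {
        return requested;
    }
    return parent_kinds;
}
```

With this guard, calling `toy_memkind_copy_or_set(parent_kinds, NULL)` returns the parent's value, which mirrors the internal-dup case in the backtrace where no info object is supplied.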