MPI_Win_create/MPI_Win_free leaks about 728 bytes per call pair when acoll component is active #13070

Closed
@hppritcha

A customer is investigating a memory leak when using MPI RMA in both Open MPI and MPICH. I've been looking into the problem on the Open MPI side, and one thing valgrind finds is that the acoll component leaks about 728 bytes per MPI_Win_create/MPI_Win_free pair of calls:

==56433== 7,280 bytes in 10 blocks are definitely lost in loss record 334 of 347
==56433==    at 0x48388E4: malloc (vg_replace_malloc.c:446)
==56433==    by 0x4A53B1D: opal_obj_new (opal_object.h:495)
==56433==    by 0x4A53A22: opal_obj_new_debug (opal_object.h:256)
==56433==    by 0x4A53BCC: mca_coll_acoll_comm_query (coll_acoll_module.c:63)
==56433==    by 0x49D1219: query_3_0_0 (coll_base_comm_select.c:588)
==56433==    by 0x49D11DD: query (coll_base_comm_select.c:571)
==56433==    by 0x49D10D0: check_one_component (coll_base_comm_select.c:536)
==56433==    by 0x49D0C85: check_components (coll_base_comm_select.c:456)
==56433==    by 0x49CFB04: mca_coll_base_comm_select (coll_base_comm_select.c:233)
==56433==    by 0x48CD150: ompi_comm_activate_complete (comm_cid.c:923)
==56433==    by 0x48CE02C: ompi_comm_activate_nb_complete (comm_cid.c:1119)
==56433==    by 0x48D1903: ompi_comm_request_progress (comm_request.c:154)
==56433== 

The acoll module destructor is being invoked, but it is fairly complex, and something is probably not being freed.

I was using the share/openmpi/openmpi-valgrind.supp suppression file.
