
Need a way to limit vma tree size (specifically for rgpusm if that's possible) #5172

Open
@Akshay-Venkatesh

Background information

In the smcuda BTL, when MPI sends occur over recycled virtual addresses (for example, in a loop of cudaMalloc, MPI_Isend, MPI_Irecv, MPI_Waitall, cudaFree), GPU memory leaks because stale cuIpcMemHandles on the receiver side are not closed quickly enough (i.e., they are not closed until finalize). Each open cuIpcMemHandle entry tends to consume 4MB of GPU memory, so as the number of stale entries grows, the memory available for other use shrinks quickly. One way of avoiding this situation is to limit the size of the rcache vma tree. However, Open MPI 3.0.x seems to have removed the way to control this size through --mca rcache_base_vma_tree_items_min $min_size --mca rcache_base_vma_tree_items_max $max_size
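To make the growth concrete, here is a rough back-of-the-envelope model of the behaviour described above. It is only a toy sketch, not Open MPI's rcache code: the `HandleCache` class, the `run_loop` helper, and the oldest-first eviction policy are all hypothetical, and the only number taken from this report is the roughly 4MB held per open handle entry.

```python
from collections import OrderedDict

MB_PER_OPEN_HANDLE = 4  # approximate figure reported in this issue


class HandleCache:
    """Hypothetical stand-in for a receiver-side cache of opened IPC handles."""

    def __init__(self, max_items=None):
        self.max_items = max_items        # None models an unbounded cache
        self.open_handles = OrderedDict() # insertion order = age of entry
        self.closed = 0                   # handles closed before "finalize"

    def register(self, handle_id):
        # Each loop iteration produces a new handle, so the old entry is stale.
        self.open_handles[handle_id] = True
        if self.max_items is not None:
            # Assumed policy: evict (and close) the oldest entries first.
            while len(self.open_handles) > self.max_items:
                self.open_handles.popitem(last=False)
                self.closed += 1

    def gpu_mb_held(self):
        return len(self.open_handles) * MB_PER_OPEN_HANDLE


def run_loop(cache, iterations):
    """Model `iterations` rounds of the malloc/send/recv/wait/free loop."""
    for i in range(iterations):
        cache.register(i)
    return cache.gpu_mb_held()


unbounded_mb = run_loop(HandleCache(), 1000)
bounded_mb = run_loop(HandleCache(max_items=128), 1000)
print(unbounded_mb, bounded_mb)
```

Under these assumptions, 1000 iterations leave about 4000MB pinned when nothing is evicted, versus about 512MB when the cache is capped at 128 entries, which is why a size limit is attractive here.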

What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

2.0.x, 3.0.x

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone
Configured with CUDA support using the --with-cuda flag


Details of the problem

Is it possible to:

  1. Enable runtime parameters that allow controlling the vma tree size?
  2. Set them to smaller values by default, so that this memory blow-up does not occur as often?
  3. Provide custom parameters that control only the rgpusm cache vma tree size?

For 2.0.x, we were able to control the size in the following way:

mpirun -np 2 --mca btl_openib_warn_default_gid_prefix 0 \
    --mca rcache_base_vma_tree_items_min 64  \
    --mca rcache_base_vma_tree_items_max 128 ./mpi_bug
