
Synchronize CUDA-aware MPI streams #7733

Open
@saareliad

Description

Background information

  • Open MPI v4.0.3

  • Installed from source (tar)

  • CUDA-aware MPI

  • CUDA 10.2

  • This is not a system-specific problem but a suspected behavior/implementation issue in CUDA-aware MPI; it will happen on all systems.


Details of the problem

Inside the CUDA-aware MPI implementation (here), asynchronous CUDA streams are used to send messages. However, the user's program runs on other streams. Therefore, the streams inside the CUDA-aware implementation should be able to wait for the completion of work on the user's streams; otherwise, programs become incorrect, or users are forced to fully synchronize their streams before every MPI call.
See the PyTorch discussion on the matter.
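
A minimal sketch (not taken from the issue itself) of what this forces on users today, assuming a placeholder user kernel `produce` that writes a device buffer which is then sent with CUDA-aware MPI; the host-side `cudaStreamSynchronize` is the full synchronization referred to above:

```cuda
/* Sketch: the user produces data on their own CUDA stream and then hands the
 * device buffer to CUDA-aware MPI.  Because MPI's internal streams do not wait
 * for the user's stream, the only safe option today is to fully synchronize
 * that stream on the host before the MPI call. */
#include <mpi.h>
#include <cuda_runtime.h>

__global__ void produce(float *buf, int n)          /* placeholder user kernel */
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) buf[i] = (float)i;
}

void send_result(float *d_buf, int n, int peer, cudaStream_t user_stream)
{
    /* Enqueue the producer kernel on the user's stream. */
    produce<<<(n + 255) / 256, 256, 0, user_stream>>>(d_buf, n);

    /* Required today: block the host until the user's stream has drained,
     * otherwise MPI may read d_buf on its own internal stream before
     * `produce` has finished writing it. */
    cudaStreamSynchronize(user_stream);

    /* CUDA-aware MPI: the device pointer is passed directly. */
    MPI_Send(d_buf, n, MPI_FLOAT, peer, /*tag=*/0, MPI_COMM_WORLD);
}
```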

Possible solutions: expose the internal streams to the user, or (preferably) let the user allocate and manage them.
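
For illustration only, a hedged sketch of how either option could remove the host-side synchronization: the user records a CUDA event on their stream, and the library orders its internal stream after that event with `cudaStreamWaitEvent`. The library-side function name below is hypothetical, not an existing Open MPI interface:

```cuda
/* Sketch of stream ordering via events instead of host-side synchronization.
 * Only the CUDA runtime calls here are real; the transport-side function is a
 * hypothetical stand-in for whatever interface the library would expose. */
#include <cuda_runtime.h>

/* User side: mark completion of their work as an event on their own stream;
 * the host does not block. */
void user_mark_ready(cudaStream_t user_stream, cudaEvent_t done)
{
    cudaEventRecord(done, user_stream);
}

/* Library side (hypothetical): before touching the buffer on its internal
 * stream, the transport makes that stream wait for the user's event. */
void transport_enqueue_send(cudaStream_t internal_stream, cudaEvent_t done)
{
    cudaStreamWaitEvent(internal_stream, done, 0);
    /* ... enqueue copies / sends on internal_stream here ... */
}
```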
