Setting up a new machine (with MPITrampoline)

Gabriele Bozzola edited this page Mar 14, 2024 · 1 revision

While Clima codes should work with any implementation of MPI, we found that MPITrampoline provides the most stable results on clusters. This how-to guide describes how to set up a new machine using MPITrampoline. Every machine is different, so treat this as a general guide; following it might not give the best performance.

What is needed

  • An MPI implementation (e.g., OpenMPI)
  • CUDA
  • A C/Fortran toolchain
  • Julia

Sketch of steps

  1. If you already have an MPI implementation, check that it is CUDA-aware. [TODO: Add how to] Note that if your MPI implementation uses UCX, UCX has to be CUDA-aware too.
  2. If you don't have an MPI implementation, download one and compile it, making sure that you build it with CUDA support.
  3. Download MPIwrapper and compile it against your MPI implementation.
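
As a rough sketch of the check in step 1, assuming the MPI implementation is Open MPI: its ompi_info tool reports a build-time parameter named mpi_built_with_cuda_support. The helper below (the name cuda_support_from_ompi_info is hypothetical, chosen for illustration) extracts that value from the parsable output. Other MPI implementations need a different check, and a value of true only reflects how the library was built, not whether GPU transfers work on a given node.

```shell
#!/bin/sh
# Extract the CUDA build flag from `ompi_info --parsable --all` output.
# Reads stdin, so the parsing can be tried without Open MPI installed.
# The helper name is hypothetical; the relevant ompi_info line looks like:
#   mca:mpi:base:param:mpi_built_with_cuda_support:value:true
cuda_support_from_ompi_info() {
    awk -F: '/mpi_built_with_cuda_support:value/ { print $NF; found = 1 }
             END { if (!found) print "unknown" }'
}

# Only meaningful for Open MPI; skip quietly on other setups.
if command -v ompi_info >/dev/null 2>&1; then
    echo "Open MPI CUDA support: $(ompi_info --parsable --all | cuda_support_from_ompi_info)"
else
    echo "ompi_info not found; this check only applies to Open MPI"
fi
```

If the MPI library uses UCX, running ucx_info -v prints the flags UCX was configured with, which can be inspected for --with-cuda in the same spirit.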

When you use MPITrampoline, you can run your code as follows, replacing <np> with the number of MPI processes:

$MPITRAMPOLINE_MPIEXEC -n <np> julia ... 

Settings

JULIA_MPI_HAS_CUDA="true"
MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"

Here, mpi_trampoline_root is the path where you installed MPIwrapper.
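
The settings above can be collected in a small shell snippet that is sourced before launching Julia. This is a minimal sketch under stated assumptions: the default prefix $HOME/mpiwrapper and the helper check_mpitrampoline_env are hypothetical, and some systems install the library under lib rather than lib64, so adjust the paths to match your installation.

```shell
#!/bin/sh
# Hypothetical default prefix; set mpi_trampoline_root to the directory
# where MPIwrapper was actually installed.
mpi_trampoline_root="${mpi_trampoline_root:-$HOME/mpiwrapper}"

export JULIA_MPI_HAS_CUDA="true"
# Assumption: the library lives in lib64; on some systems it is in lib.
export MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
export MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"

# Hypothetical helper: return 0 if both paths look usable, 1 otherwise.
check_mpitrampoline_env() {
    status=0
    if [ ! -f "$MPITRAMPOLINE_LIB" ]; then
        echo "missing library: $MPITRAMPOLINE_LIB" >&2
        status=1
    fi
    if [ ! -x "$MPITRAMPOLINE_MPIEXEC" ]; then
        echo "missing or non-executable launcher: $MPITRAMPOLINE_MPIEXEC" >&2
        status=1
    fi
    return "$status"
}
```

Calling check_mpitrampoline_env before invoking $MPITRAMPOLINE_MPIEXEC catches a wrong prefix early, instead of failing later inside Julia when MPI is first loaded.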

Your Julia preferences should contain:

[preferences.MPIPreferences]
_format = "1.0"
binary = "MPItrampoline_jll"