Setting up a new machine (with MPITrampoline)
Gabriele Bozzola edited this page Mar 14, 2024 · 1 revision
While Clima codes should work with any implementation of MPI, we found that MPITrampoline provides the most stable results on clusters. This how-to guide describes how to set up a new machine using MPITrampoline. Note that each machine is unique, so this is a general guide and it might not lead to the best performance.
Requirements:
- An MPI implementation (e.g., OpenMPI)
- CUDA
- A C/Fortran toolchain
- Julia
Steps:
- If you already have an MPI implementation, check that it is CUDA-aware. [TODO: Add how to] Note that if your MPI implementation uses UCX, UCX has to be CUDA-aware too.
- If you don't have an MPI implementation, download and compile one, making sure that you build a CUDA-aware implementation.
- Download MPIwrapper and compile it against your implementation of MPI.
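One possible way to check CUDA-awareness, sketched here for Open MPI and UCX only, is to inspect what the `ompi_info` and `ucx_info` tools report. The helper names `ompi_reports_cuda` and `ucx_reports_cuda` are made up for this example, and the UCX check is only a heuristic based on the configure flags printed by `ucx_info -v`. Other MPI implementations need a different check.

```shell
# Succeeds if the `ompi_info --parsable --all` output on stdin reports CUDA support.
ompi_reports_cuda() {
    grep -q 'mpi_built_with_cuda_support:value:true'
}

# Heuristic: succeeds if the `ucx_info -v` output on stdin lists a CUDA configure flag.
ucx_reports_cuda() {
    grep -q -e '--with-cuda'
}

# Run each check only when the corresponding tool is on PATH.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info --parsable --all | ompi_reports_cuda; then
        echo "Open MPI: CUDA support reported"
    else
        echo "Open MPI: no CUDA support reported"
    fi
fi

if command -v ucx_info >/dev/null 2>&1; then
    if ucx_info -v | ucx_reports_cuda; then
        echo "UCX: configured with CUDA"
    else
        echo "UCX: no CUDA configure flag found"
    fi
fi
```

If either tool is missing from your PATH, the corresponding check is silently skipped, so no output does not mean that CUDA support is present.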
When you use MPITrampoline, you can run your code as
$MPITRAMPOLINE_MPIEXEC -n <np> julia ...
with the following environment variables set:
JULIA_MPI_HAS_CUDA="true"
MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"
where mpi_trampoline_root is the path where you installed MPIwrapper.
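The variables can be collected in a small shell helper that you source from your job script or shell profile. This is a minimal sketch: the function name `setup_mpitrampoline`, the `MPIWRAPPER_ROOT` variable, and the `$HOME/mpiwrapper` default are assumptions made for illustration, not part of MPITrampoline or MPIwrapper.

```shell
# Hypothetical helper: export the MPITrampoline variables for a given
# MPIwrapper install prefix (passed as the first argument).
setup_mpitrampoline() {
    mpi_trampoline_root="$1"
    export JULIA_MPI_HAS_CUDA="true"
    export MPITRAMPOLINE_LIB="$mpi_trampoline_root/lib64/libmpiwrapper.so"
    export MPITRAMPOLINE_MPIEXEC="$mpi_trampoline_root/bin/mpiwrapperexec"

    # Warn instead of failing, since install layouts differ between machines
    # (for example, some systems install into lib rather than lib64).
    [ -f "$MPITRAMPOLINE_LIB" ] || echo "warning: $MPITRAMPOLINE_LIB not found" >&2
    [ -x "$MPITRAMPOLINE_MPIEXEC" ] || echo "warning: $MPITRAMPOLINE_MPIEXEC not executable" >&2
    return 0
}

# MPIWRAPPER_ROOT and the $HOME/mpiwrapper fallback are placeholders;
# point them at your actual MPIwrapper install prefix.
setup_mpitrampoline "${MPIWRAPPER_ROOT:-$HOME/mpiwrapper}"
```

After sourcing this, a run would look like `"$MPITRAMPOLINE_MPIEXEC" -n 4 julia --project script.jl`, where `script.jl` and the process count are placeholders.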
Your Julia preferences should contain:
[preferences.MPIPreferences]
_format = "1.0"
binary = "MPItrampoline_jll"
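As a quick sanity check, you can confirm that a preferences file selects the MPITrampoline binary before launching a job. The sketch below assumes the preference is stored in a TOML file whose path you pass in. The function name `uses_mpitrampoline` is made up for this example, and it only looks for the `binary` entry without parsing the TOML properly.

```shell
# Hypothetical helper: succeeds if the given TOML file contains
# binary = "MPItrampoline_jll" (whitespace around "=" is tolerated).
uses_mpitrampoline() {
    grep -Eq '^[[:space:]]*binary[[:space:]]*=[[:space:]]*"MPItrampoline_jll"' "$1"
}

# Example use (the file name depends on where your project keeps its preferences):
#   uses_mpitrampoline path/to/preferences.toml && echo "MPItrampoline_jll selected"
```

Rather than editing the file by hand, the preference can also be written from Julia with `MPIPreferences.use_jll_binary("MPItrampoline_jll")`, provided the MPIPreferences package is available in the active project.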