Description
Is your feature request related to a problem? Please describe.
The problem is described in issue #12225. Users whose applications are written in heterogeneous programming languages, where every translation unit is, e.g., a .cu
CUDA file (or a HIP or SYCL file), often find that they cannot compile their application with any of the provided wrappers. They typically try, and struggle, to compile it with the C++ wrapper, for example:
OMPI_CXX=nvcc mpicxx .... main.cu
Heterogeneous compilers often invoke other compilers themselves. For example, nvcc
splits the source code into device code, compiled with a device-only compiler, and a host C++ file that is then compiled with a host C++ MPI compiler wrapper like mpicxx in the case of a CUDA C++ MPI application
(and with a plain host C++ compiler like g++ or clang++ otherwise).
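To illustrate that two-stage flow, one common workaround today (not part of this proposal; the file name is just a placeholder) is to point nvcc at the MPI wrapper as its host compiler:

```sh
# Hypothetical single-file example; "main.cu" is a placeholder name.
# nvcc compiles the device code itself and forwards the generated
# host C++ code to the compiler given via -ccbin, here the MPI wrapper.
nvcc -ccbin mpicxx -o hello_world main.cu
```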
The feedback I've gotten from users multiple times is that they struggle with this. They spend time fiddling with compiler wrapper options and environment variables, end up modifying their application (e.g., splitting the code that uses an accelerator from the code that initializes the program, just to simplify compilation), or have to pull in a complex build system like CMake to compile a single-file "MPI + CUDA C++ hello world", because CMake correctly queries the include/link flags and the preferred compiler from the wrapper and passes them to the heterogeneous compiler.
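For reference, the manual flag-extraction workaround looks roughly like the sketch below, using Open MPI's --showme wrapper options (the file name is a placeholder):

```sh
# Hypothetical example; "main.cu" is a placeholder.
# Query the compile and link flags from the Open MPI C++ wrapper
# and hand them to nvcc directly instead of going through mpicxx.
CFLAGS=$(mpicxx --showme:compile)
LDFLAGS=$(mpicxx --showme:link)
nvcc $CFLAGS $LDFLAGS -o hello_world main.cu
```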
Describe the solution you'd like
Compiling an application that mixes MPI with a heterogeneous language (like CUDA C++, HIP, etc.) should be as easy as:
mpiacc hello_world.cu
Compiling multiple translation units should be as easy as compiling each of them with mpiacc
and then linking them together.
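A multi-file workflow could then look like the following sketch (mpiacc is the wrapper name proposed above; the file names are placeholders and the exact interface is still to be decided):

```sh
# Hypothetical usage of the proposed wrapper; file names are placeholders.
mpiacc -c solver.cu              # translation unit with device kernels and MPI calls
mpiacc -c main.cu                # translation unit that initializes MPI
mpiacc solver.o main.o -o app    # link step through the same wrapper
```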
Describe alternatives you've considered
See above. There are many workarounds, but none of them provides a smooth experience for beginner MPI programmers who want to extend a single-GPU application to multiple GPUs.
Additional context
This proposal was discussed in this week's MPICH developer call, and there is an issue tracking it there: pmodels/mpich#6867.
It would be best for users if the MPI wrapper for heterogeneous compilers had a similar interface in both implementations.