@gsitaram
Owner

This PR adds support for offloading to AMD GPUs through the par_unseq execution policy of the C++ standard parallel algorithms. To enable GPU offload of these algorithms, pass the --hipstdpar flag at compile time. For GPU targets other than the current default, gfx906, also pass the --offload-arch=<arch_string> option at compile time.
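
As a rough illustration of the kind of call this affects, here is a minimal sketch of a triad-style kernel written with std::transform and the par_unseq policy. The function name triad and its signature are hypothetical and are not taken from this PR's diff; the sketch also assumes that c has at least as many elements as b. The source itself is plain standard C++, so it is the --hipstdpar flag at compile time, not anything in the code, that makes the call a candidate for GPU offload.

```cpp
#include <algorithm>
#include <execution>
#include <vector>

// Hypothetical helper for illustration only (not code from this PR):
// computes a[i] = b[i] + scalar * c[i] for every element of b.
// Assumes c.size() >= b.size().
std::vector<double> triad(const std::vector<double>& b,
                          const std::vector<double>& c,
                          double scalar) {
    std::vector<double> a(b.size());
    // The par_unseq policy marks the algorithm as safe to parallelize and
    // vectorize. Built with --hipstdpar, a call like this is eligible for
    // offload to the GPU; built without it, the call runs on the host.
    std::transform(std::execution::par_unseq,
                   b.begin(), b.end(), c.begin(), a.begin(),
                   [scalar](double bi, double ci) { return bi + scalar * ci; });
    return a;
}
```

Calling triad(b, c, 0.5) with b = {1, 2, 3} and c = {10, 20, 30} yields {6, 12, 18} on either path, since the flag changes where the algorithm runs and not what it computes.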

With ROCm 6.1.0, for instance, the build commands for an AMD Instinct MI200 series GPU (gfx90a) may look like the following:

cmake -Bbuild -H. -DMODEL=std-data -DCMAKE_CXX_COMPILER=hipcc -DCLANG_OFFLOAD=gfx90a
cmake --build build

Please let me know if you have any questions.

@gsitaram gsitaram requested a review from afanfa April 29, 2024 21:12
@afanfa afanfa merged commit 6bd658c into main Apr 30, 2024

3 participants