Support HIP/ROCm backends for GPUs #101

benson31 · 2020-05-05T21:32:36Z

Most of the GPU calls have been factored out into clean, separate HIP and CUDA backends. Even though HIP is a thin layer over CUDA on NVIDIA platforms, we leave large portions of the API unused, so it seemed advantageous to keep the two separate. (Additionally, I don't want CUDA users to have to install ROCm on systems that don't need it just to get back to CUDA.) Moreover, I could envision an optimization for one platform being neutral or even harmful for the other, so isolating the two keeps the optimization paths independent.

Similarly, cuBLAS and rocBLAS have been separated as the two have surprisingly divergent APIs. This is abstracted behind the gpu_blas namespace.
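As a rough illustration of the kind of divergence a gpu_blas-style facade can hide, here is a minimal sketch, not the code in this PR. The `Op` enum and `NativeOpName` function are hypothetical names; only the constant spellings (`CUBLAS_OP_N`, `rocblas_operation_none`, and so on) come from the real libraries. The sketch assumes it falls back to the cuBLAS spellings when `HYDROGEN_HAVE_ROCM` is not defined, and it returns the spelling as a string so that it compiles without either toolkit installed.

```cpp
#include <cassert>
#include <string>

namespace gpu_blas
{

// Backend-neutral transpose mode (hypothetical name, for illustration only).
enum class Op { Normal, Transpose, ConjTranspose };

// Returns the spelling of the native constant the selected backend uses.
// A real facade would return the native enum value itself; strings are used
// here only so the sketch builds without cuBLAS or rocBLAS headers.
inline const char* NativeOpName(Op op)
{
#if defined(HYDROGEN_HAVE_ROCM)
    switch (op)
    {
    case Op::Normal:        return "rocblas_operation_none";
    case Op::Transpose:     return "rocblas_operation_transpose";
    case Op::ConjTranspose: return "rocblas_operation_conjugate_transpose";
    }
#else // Assumption: default to the cuBLAS spellings in this sketch.
    switch (op)
    {
    case Op::Normal:        return "CUBLAS_OP_N";
    case Op::Transpose:     return "CUBLAS_OP_T";
    case Op::ConjTranspose: return "CUBLAS_OP_C";
    }
#endif
    assert(false && "unknown Op value");
    return "";
}

} // namespace gpu_blas
```

Callers only ever name `gpu_blas::Op`, so backend-specific spellings stay confined to one translation layer.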

This port supports Aluminum in both CUDA mode and HIP mode.

~~I have not tested with hipCUB yet, but this support should work.~~ hipCUB support has been added and seems fine.

I have not tested with GPU half types under HIP; please review this PR as-is and I will work on that functionality. If it makes it in before this merges, super. Otherwise, I can do a follow-on PR.

As part of this refactor, the preprocessor macros have changed slightly. HYDROGEN_HAVE_GPU should now be used to protect any GPU code that is backend-agnostic. HYDROGEN_HAVE_CUDA and HYDROGEN_HAVE_ROCM should be used to protect code that is specific to one GPU backend, CUDA and HIP/ROCm, respectively. Cleaning this up accounts for a large portion of the changes in this PR.
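A minimal sketch of how the three macros might be used at a call site. The functions `backend_name` and `have_gpu` are hypothetical helpers invented for this example; the sketch only reads the macros and assumes the build system (not the source file) is what defines them.

```cpp
#include <cassert>
#include <string>

// Backend-specific code: guarded by exactly one of the backend macros.
inline std::string backend_name()
{
#if defined(HYDROGEN_HAVE_CUDA)
    return "cuda";   // CUDA-only code path
#elif defined(HYDROGEN_HAVE_ROCM)
    return "rocm";   // HIP/ROCm-only code path
#else
    return "cpu";    // no GPU backend configured
#endif
}

// Generic GPU code: guarded by HYDROGEN_HAVE_GPU, regardless of backend.
inline bool have_gpu()
{
#ifdef HYDROGEN_HAVE_GPU
    return true;
#else
    return false;
#endif
}

// Sanity check for the case where no macro is defined at all.
inline void check_cpu_only_build()
{
#if !defined(HYDROGEN_HAVE_GPU) && !defined(HYDROGEN_HAVE_CUDA) \
    && !defined(HYDROGEN_HAVE_ROCM)
    assert(backend_name() == "cpu");
    assert(!have_gpu());
#endif
}
```

With none of the macros defined, `backend_name()` yields `"cpu"` and `have_gpu()` is `false`.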

As a final note here: the SyncInfo object changed slightly. Instead of a struct with public event and stream, this is now a class that uses Event() and Stream() to access the event and stream handles, respectively. This is another large portion of the changes in the PR.
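The shape of that change can be sketched as follows. This is an illustration under stated assumptions, not the class from this PR: `StreamHandle` and `EventHandle` are stand-in aliases (the real handles would be the CUDA or HIP stream and event types), and the names `OldSyncInfo`, `NewSyncInfo`, `stream_`, and `event_` are hypothetical. Only the accessor names `Stream()` and `Event()` come from the description above.

```cpp
#include <cassert>

// Stand-in handle types so the sketch compiles without a GPU toolchain.
using StreamHandle = int;
using EventHandle = int;

// Before: a plain struct with public members.
struct OldSyncInfo
{
    StreamHandle stream;
    EventHandle event;
};

// After: a class that exposes the handles through accessors.
class NewSyncInfo
{
public:
    NewSyncInfo(StreamHandle stream, EventHandle event) noexcept
        : stream_{stream}, event_{event}
    {}

    StreamHandle Stream() const noexcept { return stream_; }
    EventHandle Event() const noexcept { return event_; }

private:
    StreamHandle stream_;
    EventHandle event_;
};

// Call sites change from member access to accessor calls.
inline bool same_handles(OldSyncInfo const& before, NewSyncInfo const& after)
{
    return before.stream == after.Stream() && before.event == after.Event();
}
```

In downstream code the migration is mechanical: `si.stream` becomes `si.Stream()` and `si.event` becomes `si.Event()`.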

ndryden

Most comments are on find/replace errors. Overall I don't see any big issues.

include/El/blas_like/level1/Copy/util.hpp

include/El/blas_like/level1/Hadamard.hpp

include/El/core/Element/impl.hpp

include/El/core/Matrix/impl.hpp

include/El/core/Memory/decl.hpp

src/core/imports/mpi/Gather.hpp

src/hydrogen/device/ROCm.cpp

tests/blas_like/Gemm.cpp

timmoon10

Currently at 140/179. So far looks good.

include/El/core/Memory/impl.hpp

include/hydrogen/device/gpu/cuda/CUDALaunchKernel.hpp

include/hydrogen/device/gpu/cuda/SyncInfo.hpp

timmoon10

158/179. One correctness error in include/hydrogen/meta/MetaUtilities.hpp.

include/hydrogen/device/gpu/rocm/SyncInfo.hpp

include/hydrogen/device/gpu/rocm/rocBLAS_API.hpp

include/hydrogen/meta/MetaUtilities.hpp

timmoon10

Overall looks good to me. My comments are all nitpicks.

tests/core/DistMatrix.cpp

src/hydrogen/device/cuBLAS.cpp

src/hydrogen/device/rocBLAS.cpp

bvanessen

LGTM. I see that there are still some unaddressed comments from @timmoon10

benson31 added 20 commits June 13, 2019 10:43

Add hydrogen error handling mechanisms

bae39e6

new cuda management infrastructure

609a151

everything in rocm compiles i think. linker issues pending

ac94d61

remove override decoration from Element/BlockMatrix functions

d0f9dd8

Merge branch 'hydrogen' of https://github.com/llnl/elemental into feature-rocm-port

a2746b7

patch for finding rocblas; not sure if this is strictly necessary any more

aa545c5

forward kernel arguments by reference

6957e09

a few tweaks to the CMakeLists

9a42bad

Make sure ROCm and CUDA aren't enabled at the same time.

4144df4

correct a discrepancy in hipMemcpy2DAsync semantics

b824bbb

clean up HAVE_CUDA macro usage; streamline copy syntax

6512c9e

use nonblocking stream; clean up the mempool

f654343

straggler HAVE_CUDA use in include tree

e2a887f

preprocessor macro cleanup in blaslike tests

bb4db99

Remove debugging print statements

1887636

add short-circuit returns to copy/fill routines when size is zero

a4967af

some cleanup

bc2737a

a variety of fixes

a7d49d9

fix some new rocm issues

3e52a5d

update aluminum version number

0839f83

benson31 added enhancement review requested hip rocm Things related to HIP/ROCm support. labels May 5, 2020

benson31 requested review from ndryden, timmoon10 and bvanessen May 5, 2020 21:32

benson31 self-assigned this May 5, 2020

update version number

ebabe95

benson31 force-pushed the feature-rocm-port branch from cddce8d to ebabe95 Compare May 5, 2020 21:48

remove some unneeded CMake

8982b49

benson31 added 3 commits May 6, 2020 09:47

revert changes related to the hip override bug

1bcad07

add support for hipCUB and generalize cublas tensor option

9817919

fix annoying clang warnings (that GCC _should_ throw, too, but it doesn't)

dc8ea50

ndryden reviewed May 6, 2020

benson31 added 5 commits May 6, 2020 12:53

address some review comments

808a6cd

fix use of streams that should have been SyncInfos

5f9d0fe

Clean up device library functions

b08feb4

cleanup timer nonsense in Gemm test

0080781

fix some hipCUB linkage

caf00e8

ndryden approved these changes May 7, 2020

timmoon10 reviewed May 8, 2020

timmoon10 self-requested a review May 8, 2020 23:59

timmoon10 requested changes May 9, 2020

timmoon10 self-requested a review May 9, 2020 01:00

timmoon10 approved these changes May 10, 2020

bvanessen approved these changes Jun 4, 2020

benson31 and others added 6 commits June 4, 2020 14:58

Apply suggestions from code review

8be571a

Co-authored-by: Tim Moon <moon13@llnl.gov>

Merge branch 'hydrogen' of https://github.com/llnl/elemental into feature-rocm-port

0b96cd2

Apply suggestions from code review

da98b67

Co-authored-by: Tim Moon <moon13@llnl.gov>

Merge branch 'feature-rocm-port' of https://github.com/benson31/Elemental into feature-rocm-port

675d91e

remove unneeded metafunction. DiHydrogen has a cleaner implementation anyway.

bae1cc7

Merge branch 'hydrogen' of https://github.com/llnl/elemental into feature-rocm-port

178601b

benson31 merged commit d2feee8 into LLNL:hydrogen Jun 5, 2020

benson31 deleted the feature-rocm-port branch June 5, 2020 22:37
