Support HIP/ROCm backends for GPUs #101
Most comments are on find/replace errors. Overall I don't see any big issues.
Currently at 140/179. So far looks good.
158/179. One correctness error in `include/hydrogen/meta/MetaUtilities.hpp`.
Overall looks good to me. My comments are all nitpicks.
LGTM. I see that there are still some unaddressed comments from @timmoon10.
Co-authored-by: Tim Moon <moon13@llnl.gov>
Most of the GPU calls have been factored out into clean HIP vs CUDA backends. Even though HIP is a thin layer over CUDA on NVIDIA platforms, we don't use massive portions of the API and it seemed that there could be an advantage to having the two separate. (Additionally, I don't want CUDA users to have to install ROCm on systems that don't need it just to get back to CUDA.) Moreover, I could envision an optimization for one platform being neutral or even bad for the other, so keeping the two isolated will keep the optimization paths independent.
Similarly, cuBLAS and rocBLAS have been separated, as the two have surprisingly divergent APIs. This is abstracted behind the `gpu_blas` namespace.
This port supports Aluminum in both CUDA mode and HIP mode.
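As a rough illustration of the `gpu_blas` idea, here is a minimal sketch of a common namespace that forwards to one of two backend namespaces chosen at compile time. Everything in it is hypothetical and not taken from this PR: the `sketch`, `cublas_like`, and `rocblas_like` namespaces, the `Axpy` signature, and the `SKETCH_USE_ROCM` macro are all made up for illustration, and the "backends" are plain host loops so that the example compiles without CUDA or ROCm installed.

```cpp
#include <cstddef>

namespace sketch {

// Stand-in for a cuBLAS-backed implementation (hypothetical; a real
// backend would call into cuBLAS instead of looping on the host).
namespace cublas_like {
inline void Axpy(std::size_t n, float alpha, float const* x, float* y)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
} // namespace cublas_like

// Stand-in for a rocBLAS-backed implementation (hypothetical; a real
// backend would call into rocBLAS instead of looping on the host).
namespace rocblas_like {
inline void Axpy(std::size_t n, float alpha, float const* x, float* y)
{
    for (std::size_t i = 0; i < n; ++i)
        y[i] += alpha * x[i];
}
} // namespace rocblas_like

// Callers only ever see gpu_blas; the backend is picked once, here.
namespace gpu_blas {
#if defined(SKETCH_USE_ROCM) // hypothetical selection macro
namespace impl = rocblas_like;
#else
namespace impl = cublas_like;
#endif

// y := alpha * x + y, forwarded to whichever backend was selected.
inline void Axpy(std::size_t n, float alpha, float const* x, float* y)
{
    impl::Axpy(n, alpha, x, y);
}
} // namespace gpu_blas

} // namespace sketch
```

Calling code would then use `sketch::gpu_blas::Axpy(n, alpha, x, y)` regardless of backend, which keeps the divergent vendor APIs out of the generic code paths.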
I have not tested with hipCUB yet, but this support should work. Update: hipCUB support has been added and seems fine.
I have not tested with GPU half types under HIP; please review this PR as-is and I will work on that functionality. If it makes it in before this merges, super. Otherwise, I can do a follow-on PR.
As part of this refactor, the preprocessing macros have changed slightly. `HYDROGEN_HAVE_GPU` should now be used to protect any generic GPU-specific code. `HYDROGEN_HAVE_CUDA` and `HYDROGEN_HAVE_ROCM` should be used to protect code that is GPU-backend specific, for CUDA and HIP/ROCm, respectively. Cleaning this up accounts for a large portion of the changes in this PR.
As a final note here: the `SyncInfo` object changed slightly. Instead of a struct with public event and stream, this is now a class that uses `Event()` and `Stream()` to access the event and stream handles, respectively. This is another large portion of the changes in the PR.
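To make the two conventions above concrete, here is a minimal sketch of how downstream code might look after the change. Only the macro names and the `Event()`/`Stream()` accessor names come from this PR description; the `sketch` namespace, the `FakeStream` and `FakeEvent` handle types, the constructor, and the helper functions are assumptions made so that the example compiles without a GPU toolchain. The real `SyncInfo` may well differ (for instance, it may be templated on a device type).

```cpp
#include <string>

namespace sketch {

// Hypothetical stand-ins for the backend's stream and event handles.
struct FakeStream { int id; };
struct FakeEvent { int id; };

// Before (as described): a struct with public `stream` and `event` members.
// After (as described): a class with accessor functions.
class SyncInfo
{
public:
    SyncInfo(FakeStream stream, FakeEvent event)
        : stream_{stream}, event_{event}
    {}

    FakeStream Stream() const noexcept { return stream_; }
    FakeEvent Event() const noexcept { return event_; }

private:
    FakeStream stream_;
    FakeEvent event_;
};

// Generic GPU code is guarded by HYDROGEN_HAVE_GPU.
inline bool HaveGpu()
{
#if defined(HYDROGEN_HAVE_GPU)
    return true;
#else
    return false;
#endif
}

// Backend-specific code is guarded by HYDROGEN_HAVE_CUDA or
// HYDROGEN_HAVE_ROCM.
inline std::string BackendName()
{
#if defined(HYDROGEN_HAVE_CUDA)
    return "CUDA";
#elif defined(HYDROGEN_HAVE_ROCM)
    return "ROCm";
#else
    return "none";
#endif
}

} // namespace sketch
```

Under this sketch, code that previously read `si.stream` or `si.event` would instead call `si.Stream()` or `si.Event()`.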