Support multiple GPU backends #1566
Conversation
Thanks @jpsamaroo! Looking forward to this!
We should think about whether we want users to have to deal with glue overloading like in the OP. Seems like we should inherit the type from the call to … I'm thinking we might want to obviate the need for that particular package, but if we need it for other, more complex handling, then that's a separate discussion.
Can't we do something Preferences.jl-based nowadays instead of creating additional packages? That would avoid bloating load time without complicating development, CI, etc.
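For reference, a minimal sketch of what a Preferences.jl-based switch could look like (the "gpu_backend" key and everything around it are illustrative, not an existing Flux API):

    using Preferences

    # Read the configured backend at load time; default to CUDA if unset.
    const GPU_BACKEND = @load_preference("gpu_backend", "CUDA")

    @static if GPU_BACKEND == "CUDA"
        using CUDA
    elseif GPU_BACKEND == "AMDGPU"
        using AMDGPU
    end

    # Users would switch with, e.g.: set_preferences!(Flux, "gpu_backend" => "AMDGPU")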
That would still require a Project.toml dependency, which can lead to resolver errors for mismatched dependencies (because I don't develop AMDGPU.jl at nearly the pace you develop CUDA.jl 😄).
Took the liberty to continue some work on this: rebased, and moved all CUDA functionality to https://github.com/FluxML/FluxCUDA.jl. I did add a dep on GPUArrays so that vendor-neutral GPU functionality can be moved back from FluxCUDA.jl using the …
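For context, GPUArrays provides the AbstractGPUArray supertype that CuArray, ROCArray, and friends subtype, so vendor-neutral methods can dispatch on it. A minimal sketch (ongpu is an illustrative name, not an actual Flux function):

    using GPUArrays: AbstractGPUArray

    # One method covers every backend's array type at once, without Flux
    # depending on CUDA.jl or AMDGPU.jl directly.
    ongpu(::AbstractArray) = false
    ongpu(::AbstractGPUArray) = true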
The last two commits illustrate the two main approaches towards where we should host this code: in a separate repo, or as a subpackage. I'm leaning towards the latter: the CI recipe isn't really more complicated, and it's easy enough for aspiring contributors to …
Avoids a needless listing in test/Project.toml, making it possible to pick up development versions.
Any thoughts here? This seems to work fine using subpackages.
The refactor here should probably also tackle #1597.
Recap of a discussion with @DhairyaLGandhi, who had two concerns about this: (1) complexity of development with the GPU-specific stuff moved into subpackages, and (2) a degraded default experience now that the CUDA functionality isn't available by default again. I think we can help (1) by sufficiently documenting the development workflow, and FWIW it didn't turn out difficult to …
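On (1), a sketch of the subpackage dev workflow for reference, assuming the CUDA glue lives under lib/FluxCUDA in the monorepo (the path and layout are assumptions):

    using Pkg

    # From a local checkout of Flux.jl: develop both the parent package and the
    # subpackage so that edits to either are picked up without registering anything.
    Pkg.develop(path = ".")
    Pkg.develop(path = "lib/FluxCUDA")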
One edge case this might address is users who have unsupported GPUs and encounter CUDA.jl errors because of it. Granted, that only takes an environment-variable tweak to address, but it would be nice to have it work as expected OOTB.
Idea: FluxCUDA's …
Now that we have Metal.jl, OneAPI.jl, and ROCM.jl support, it would be nice to revisit this.
I think at this point, we can push almost all GPU backend code into NNlib subpackages instead of creating Flux subpackages. That's the easy part though, because there remain two main blockers for making Flux actually useful for non-CUDA backends:
@KristofferC What's the best way forward for now to conditionally load packages?
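For reference, a minimal Requires.jl-style sketch of conditional glue loading, assuming the CUDA-specific code sits in a file like cuda_glue.jl (the UUID should be checked against CUDA.jl's Project.toml):

    using Requires

    function __init__()
        # Load the CUDA-specific glue only if the user loads CUDA.jl themselves.
        @require CUDA="052768ef-5323-5732-b1bb-66c8b64840ba" include("cuda_glue.jl")
    end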
There's also https://github.com/JuliaNeuralGraphics/NNlibROC.jl with initial support for some things. While we're at it, I also want to suggest the idea of using KernelAbstractions.jl for handwritten kernels at some point.
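To make the KernelAbstractions.jl idea concrete, here is a minimal sketch of a vendor-neutral kernel written against the KA 0.9 API (relu_ka! is an illustrative name, not an NNlib function):

    using KernelAbstractions

    @kernel function relu_kernel!(y, @Const(x))
        i = @index(Global)
        y[i] = max(x[i], zero(eltype(y)))
    end

    # The same kernel runs on CPU(), CUDABackend(), ROCBackend(), ...,
    # with the backend chosen from the input array.
    function relu_ka!(y, x)
        backend = get_backend(x)
        relu_kernel!(backend)(y, x; ndrange = length(x))
        KernelAbstractions.synchronize(backend)
        return y
    end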
I have opened a KA issue for Metal support.
It would be great to be able to use KA in NNlib. The biggest blocker right now is that it only supports Julia 1.7+, whereas FluxML supports the 1.6 LTS. Not sure how to square this circle.
I "kind of" worked around that by allowing KA 0.7 (which supports 1.6) and manually adding compat functions. See https://github.com/avik-pal/Lux.jl/blob/main/lib/LuxLib/src/utils.jl#L4-L16
KA 0.7 doesn't work for existing NNlibCUDA kernels AIUI because many of them rely on atomics, which were only added in v0.8. With big changes on the horizon for 0.9, I'd also be wary of targeting such an outdated KA version. If it were possible to make KA 0.9/1.0 work on 1.6 again, just with degraded performance and/or a few non-vital QoL features disabled, that might be a path forward.
Why not put 1.6 support on a different minor version branch, and do development on head with Julia 1.7? Julia 1.6 also has problems with using multiple GPU backends at once due to limited method overlay support, making it a bad choice of version to support for GPU-supporting packages.
Great point!
Interesting. Is there an example of other packages which do this? With this approach, is there a way to avoid maintaining two copies of each kernel (one legacy for 1.6, one KA for 1.7+) for any kernels we want to move to KA? My main worry is that we're barely able to maintain a single copy of stuff like the …
I believe CUDA did this at some point, but I'm not aware of any packages actively doing this (I think most packages which need advanced atomics just gave up on 1.6). My thought is that, while 1.6 is a supported upstream version, it doesn't mean that NNlib needs to add new features for 1.6 users; only bug fixes are expected for the 1.6 LTS anyway. So NNlib can just do the same and only provide bugfixes on that 1.6-supporting branch. And my personal opinion: holding back NNlib's feature development and not enhancing maintainability just because some people won't (or can't) update to a newer stable Julia version hurts more people than it helps.
That sounds reasonable enough to me, thanks.
For better or worse, the main thing holding back NNlib feature development (IMO) is an acute lack of people with the know-how and interest to work on optimized low-level routines. So if there's a real possibility this could help with that, then I'm on board. cc @FluxML/committers though for additional thoughts on this.
Bump now that oneAPI has hit 1.0.
Abstracts out some functionality to support an external support package, like https://github.com/jpsamaroo/FluxAMDGPU.jl. This is better than adding AMDGPU as a direct dependency, which would keep bloating Flux load time and make it harder to iterate quickly on an arguably experimental backend.
Once this is complete, AMD GPU owners should be able to do
using Flux, FluxAMDGPU, AMDGPU
and get access to basically everything they need to do Flux things.
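For illustration, the glue overloading mentioned above might look roughly like this inside FluxAMDGPU.jl (a sketch only; the amd function name and the exact conversion are assumptions, not FluxAMDGPU's actual API):

    using Flux, AMDGPU

    # Move a model's arrays onto the AMD GPU by walking its structure with fmap,
    # mirroring how Flux's CUDA path converts arrays to CuArray.
    amd(m) = Flux.fmap(x -> x isa AbstractArray ? ROCArray(x) : x, m)

    # Usage: model = amd(Chain(Dense(10, 5, relu), Dense(5, 2)))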