
copied the old rnn.jl->rnncompat.jl for Flux compatibility #738

Merged
maleadt merged 1 commit into JuliaGPU:compat on Mar 1, 2021

Conversation

denizyuret
Contributor

No description provided.

@maleadt maleadt changed the base branch from master to compat March 1, 2021 07:06
@maleadt maleadt changed the base branch from compat to master March 1, 2021 07:07
@maleadt maleadt changed the base branch from master to compat March 1, 2021 07:09
@maleadt
Member

maleadt commented Mar 1, 2021

Rebased on top of the compat branch; this is for 1.5 only, right? master will be a breaking release (3.0) anyway.

@maleadt maleadt removed the backport label Mar 1, 2021
@denizyuret
Contributor Author

Rebased on top of the compat branch; this is for 1.5 only, right? master will be a breaking release (3.0) anyway.

This is up to the Flux people; Knet does not use the interface in rnncompat.jl. I'd rather move Flux-specific code out of CUDA/lib/cudnn into the respective packages.

@codecov

codecov bot commented Mar 1, 2021

Codecov Report

Merging #738 (5c0c942) into compat (d69be98) will decrease coverage by 1.15%.
The diff coverage is 0.00%.


@@            Coverage Diff             @@
##           compat     #738      +/-   ##
==========================================
- Coverage   79.10%   77.94%   -1.16%     
==========================================
  Files         123      124       +1     
  Lines        6705     6807     +102     
==========================================
+ Hits         5304     5306       +2     
- Misses       1401     1501     +100     
Impacted Files           Coverage          Δ
lib/cudnn/CUDNN.jl       44.18% <ø>        (ø)
lib/cudnn/rnncompat.jl   0.00% <0.00%>     (ø)
src/pool/binned.jl       89.56% <0.00%>    (+0.86%) ⬆️
lib/cudadrv/devices.jl   97.36% <0.00%>    (+2.63%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d69be98...5c0c942. Read the comment docs.

@maleadt maleadt merged commit b30ac5f into JuliaGPU:compat Mar 1, 2021
@CarloLucibello
Contributor

yes compat is enough, thanks @denizyuret

@maleadt
Member

maleadt commented Mar 2, 2021

IIUC the compat branch should now be ready for a release. @CarloLucibello, could you test?
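
For reference, running Flux's test suite against the unreleased compat branch can be done with the stock Pkg REPL commands below (a sketch; it assumes Flux is already in the active environment):

  # From the Julia REPL, press ] to enter the Pkg REPL, then:
  pkg> add CUDA#compat    # track CUDA.jl's compat branch
  pkg> test Flux          # run Flux's test suite against it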

@denizyuret
Contributor Author

yes compat is enough, thanks @denizyuret

No problem @CarloLucibello. There are a few compat files in CUDA/lib/cudnn that will probably be deprecated in the next major CUDA.jl release - if you want to keep using them in NNlib/Flux please copy them there:

  • rnncompat.jl (to be replaced with the new rnn interface in rnn.jl)
  • batchnorm.jl (to be replaced with the new normalization.jl interface)
  • nnlib.jl (a better place for this could be NNlib or an intermediate NNlibCUDA package)

@maleadt
Member

maleadt commented Mar 3, 2021

There are a few compat files in CUDA/lib/cudnn that will probably be deprecated in the next major CUDA.jl release

FWIW, the next release is bound to be a breaking one, CUDA.jl 3.0. So maybe now is the time to remove those? The alternative is to put out another breaking release some time after that, but that may make people unhappy too.

@CarloLucibello
Contributor

It makes sense to remove the NNlib dependency already in 3.0. We'll create an NNlibCUDA.jl package as a subpackage of NNlib.
@DhairyaLGandhi @darsnack @ToucheSir
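
For illustration, the skeleton of such a glue package might look roughly like the sketch below; the module layout and the commented-out method are assumptions for illustration, not the actual NNlibCUDA.jl design:

  # Hypothetical NNlibCUDA.jl skeleton (a sketch, not the real package):
  module NNlibCUDA

  using NNlib, CUDA

  # CUDA-specific methods for NNlib primitives would live here, dispatching
  # on CuArray and forwarding to the CUBLAS/CUDNN wrappers, e.g.:
  # NNlib.conv!(y::CuArray, x::CuArray, w::CuArray, cdims::DenseConvDims) = ...

  end # module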

@DhairyaLGandhi
Member

We get a number of benefits from testing with the CUDA system, and checking against cuDNN has been helpful as well.

@darsnack

darsnack commented Mar 3, 2021

Is it not possible to do a similar test setup with an NNlibCUDA.jl package?

We've been talking a bit on Zulip about this with @jpsamaroo. Seems like separating out backend code and glue code into contained packages might be nice when we add support for AMD GPUs. Would also be nicer for dependency management and bug fix releases.

@CarloLucibello
Contributor

@denizyuret I ran Flux's tests on the compat branch here and got this error:

R = RNN: Error During Test at /home/carlo/.julia/packages/Flux/goUGu/test/cuda/curnn.jl:4
  Got exception outside of a @test
  UndefVarError: DropoutDesc not defined
  Stacktrace:
   [1] CUDA.CUDNN.RNNDesc{Float32}(::CUDA.CUDNN.cudnnRNNMode_t, ::Int64, ::Int64; layers::Int64) at /home/carlo/.julia/packages/CUDA/omtO5/lib/cudnn/rnncompat.jl:46
   [2] RNNDesc at /home/carlo/.julia/packages/CUDA/omtO5/lib/cudnn/rnncompat.jl:43 [inlined]
   [3] RNNDesc at /home/carlo/.julia/packages/Flux/goUGu/src/cuda/curnn.jl:13 [inlined]
   [4] desc(::Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}) at /home/carlo/.julia/packages/Flux/goUGu/src/cuda/curnn.jl:20
   [5] adjoint at /home/carlo/.julia/packages/Flux/goUGu/src/cuda/curnn.jl:70 [inlined]
   [6] _pullback at /home/carlo/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:57 [inlined]
   [7] adjoint at /home/carlo/.julia/packages/Zygote/KpME9/src/lib/lib.jl:188 [inlined]
   [8] adjoint(::Zygote.Context, ::typeof(Core._apply_iterate), ::typeof(iterate), ::Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}, ::Tuple{CuArray{Float32,1}}, ::Tuple{CuArray{Float32,1}}) at ./none:0
   [9] _pullback at /home/carlo/.julia/packages/ZygoteRules/OjfTt/src/adjoint.jl:57 [inlined]
   [10] Recur at /home/carlo/.julia/packages/Flux/goUGu/src/layers/recurrent.jl:36 [inlined]
   [11] _pullback(::Zygote.Context, ::Flux.Recur{Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}}, ::CuArray{Float32,1}) at /home/carlo/.julia/packages/Zygote/KpME9/src/compiler/interface2.jl:0
   [12] #166 at /home/carlo/.julia/packages/Flux/goUGu/test/cuda/curnn.jl:7 [inlined]
   [13] _pullback(::Zygote.Context, ::var"#166#168"{CuArray{Float32,1}}, ::Flux.Recur{Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}}) at /home/carlo/.julia/packages/Zygote/KpME9/src/compiler/interface2.jl:0
   [14] _pullback(::Function, ::Flux.Recur{Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}}) at /home/carlo/.julia/packages/Zygote/KpME9/src/compiler/interface.jl:33
   [15] pullback(::Function, ::Flux.Recur{Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}}) at /home/carlo/.julia/packages/Zygote/KpME9/src/compiler/interface.jl:39
   [16] gradient(::Function, ::Flux.Recur{Flux.RNNCell{typeof(tanh),CuArray{Float32,2},CuArray{Float32,1}}}) at /home/carlo/.julia/packages/Zygote/KpME9/src/compiler/interface.jl:48
   [17] top-level scope at /home/carlo/.julia/packages/Flux/goUGu/test/cuda/curnn.jl:7
   [18] top-level scope at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1190
   [19] include(::String) at ./client.jl:457
   [20] top-level scope at /home/carlo/.julia/packages/Flux/goUGu/test/cuda/runtests.jl:13
   [21] include(::String) at ./client.jl:457
   [22] top-level scope at /home/carlo/.julia/packages/Flux/goUGu/test/runtests.jl:39
   [23] top-level scope at /build/julia/src/julia-1.5.3/usr/share/julia/stdlib/v1.5/Test/src/Test.jl:1115
   [24] top-level scope at /home/carlo/.julia/packages/Flux/goUGu/test/runtests.jl:38
   [25] include(::String) at ./client.jl:457
   [26] top-level scope at none:6

@denizyuret
Contributor Author

This may be a simple fix: try copying the old https://github.com/JuliaGPU/CUDA.jl/blob/v2.6.1/lib/cudnn/dropout.jl into rnncompat.jl.
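
For reference, a rough sketch of what that DropoutDesc shim looks like, based on the old CUDA.jl 2.x lib/cudnn/dropout.jl. This is an approximation and assumes the code lives inside the CUDA.CUDNN module, where handle() and the cudnn* wrappers are defined; the linked file is the authoritative version:

  # Approximate shape of the old DropoutDesc shim (copy the linked dropout.jl for the real code):
  mutable struct DropoutDesc
    ptr::Ptr{Nothing}
    states::CuVector{UInt8}
  end

  Base.unsafe_convert(::Type{Ptr{Nothing}}, dd::DropoutDesc) = dd.ptr

  function DropoutDesc(ρ::Real; seed::Integer = 0)
    d = [C_NULL]
    s = Csize_t[0]
    cudnnCreateDropoutDescriptor(d)          # allocate a cuDNN dropout descriptor
    cudnnDropoutGetStatesSize(handle(), s)   # query the RNG state buffer size
    states = CuArray{UInt8}(undef, s[])      # state buffer kept alive with the descriptor
    desc = DropoutDesc(d[], states)
    cudnnSetDropoutDescriptor(desc, handle(), ρ, states, length(states), seed)
    finalizer(x -> cudnnDestroyDropoutDescriptor(x), desc)
    return desc
  end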

maleadt added a commit that referenced this pull request Mar 16, 2021
copied the old rnn.jl->rnncompat.jl for Flux compatibility