BatchNorm fails on GPU #385
Comments
Same problem for me. I read the source code and checked the field types:

```julia
using Flux
using CuArrays

bn = BatchNorm(3) |> gpu
@show typeof(bn.β)
@show typeof(bn.γ)
@show typeof(bn.λ)
@show typeof(bn.μ)
@show typeof(bn.σ)
@show typeof(bn.ϵ)
@show typeof(bn.momentum)
```

From the result, I thought there was some inconsistency between Float32 and Float64 (or CuArray{Float32,1}) and that some operation mixing them causes the error. I tried to set ϵ and momentum to Float32 like this:

```julia
bn = BatchNorm(3, ϵ = Float32(1e-7), momentum = Float32(0.1)) |> gpu
@show typeof(bn.ϵ)        # should be Float32
@show typeof(bn.momentum) # should be Float32
```

But this prescription has no effect and the same error occurs.
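A quick way to confirm the failure is GPU-specific is to run the same layer on the CPU first; a minimal sketch, with an arbitrary channel count and batch size:

```julia
using Flux

bn = BatchNorm(3)            # same layer, kept on the CPU
x  = rand(Float32, 3, 16)    # 3 channels, batch of 16
bn(x)                        # works on the CPU; the failure only appears after |> gpu
```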
In the function, changing

```julia
let λ = BN.λ
  λ.(reshape(γ, affine_shape...) .* ((x .- μ) ./ σ) .+ reshape(β, affine_shape...)) # error
end
```

to

```julia
let λ = BN.λ
  # λ.(reshape(γ, affine_shape...) .* ((x .- μ) ./ σ) .+ reshape(β, affine_shape...))
  foo = reshape(γ, affine_shape...) .* ((x .- μ) ./ σ)
  bar = reshape(β, affine_shape...)
  λ.(foo .+ bar)
end
```

works for me, but I don't understand why. Though ϵ (in the arguments) and momentum may be Float64, they are converted to the type T.
Anybody found a solution to this?
It looks like it might be some kind of inference failure, as breaking up the computation like that fixes things. @avik-pal I do not think this is fixed on
It appears that breaking up any part of the computation causes things to no longer break:
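For instance, any split along these lines seems to avoid it; a sketch reusing the internal variables from the workaround above (the particular sub-expression pulled out is illustrative):

```julia
# sketch of the modified body of (BN::BatchNorm)(x), assuming the same local
# variables (γ, β, μ, σ, x, affine_shape) as in the workaround above
let λ = BN.λ
  scale = reshape(γ, affine_shape...)   # pulling just this term out is already enough
  λ.(scale .* ((x .- μ) ./ σ) .+ reshape(β, affine_shape...))
end
```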
@maleadt do you have any idea why this would be happening? I tried using
I ran into this problem too, and I'm looking forward to the fix.
BatchNorm is still buggy with the current master branches of Flux and CuArrays. See the MWE below:

```julia
julia> using Flux;
julia> x = rand(10, 5) |> gpu;
julia> f = BatchNorm(10) |> gpu;
julia> f(x)
ERROR: ArgumentError: cannot take the CPU address of a CuArrays.CuArray{Float32,1}
Stacktrace:
[1] cconvert(::Type{Ptr{Nothing}}, ::CuArrays.CuArray{Float32,1}) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/array.jl:152
[2] macro expansion at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/dnn/error.jl:17 [inlined]
[3] cudnnBatchNormalizationForwardInference(::Ptr{Nothing}, ::CuArrays.CUDNN.cudnnBatchNormMode_t, ::Base.RefValue{Float32}, ::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,4}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,4}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,1}, ::Float32) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/dnn/libcudnn.jl:1016
[4] #cudnnBNForward!#120(::Nothing, ::Int64, ::Int64, ::Float32, ::Bool, ::typeof(CuArrays.CUDNN.cudnnBNForward!), ::CuArrays.CuArray{Float32,4}, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,4}, ::CuArrays.CuArray{Float32,1}, ::CuArrays.CuArray{Float32,1}, ::Float32) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/dnn/batchnorm.jl:62
[5] #cudnnBNForward! at ./none:0 [inlined]
[6] #batchnorm#119 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/dnn/batchnorm.jl:26 [inlined]
[7] #batchnorm at ./none:0 [inlined]
[8] #batchnorm#118 at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/CuArrays/nhhwE/src/dnn/batchnorm.jl:17 [inlined]
[9] #batchnorm at ./none:0 [inlined]
[10] (::BatchNorm{typeof(identity),CuArrays.CuArray{Float32,1},CuArrays.CuArray{Float32,1},Float32})(::CuArrays.CuArray{Float32,2}, ::Nothing) at /afs/inf.ed.ac.uk/user/s16/s1672897/.julia/packages/Flux/E2wBe/src/cuda/cudnn.jl:4 (repeats 2 times)
[11] top-level scope at REPL[4]:1
```

I looked into the functions in the stack trace but didn't find anything obvious to me. I'm happy to dig into this, as I need to use it. Any ideas? @MikeInnes @maleadt
PR in CuArrays to fix this: JuliaGPU/CuArrays.jl#464
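Once that fix is merged and tagged, updating CuArrays should pick it up; a sketch of the update step (assuming a release containing the fix is available):

```julia
using Pkg
Pkg.update("CuArrays")
# or, to try an unreleased fix directly, track the master branch:
# Pkg.add(PackageSpec(name = "CuArrays", rev = "master"))
```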
Another error with BatchNorm on GPU:

```julia
julia> using Flux
julia> m = Chain(
Dense(28^2, 64),
BatchNorm(64, relu),
Dense(64, 10),
BatchNorm(10),
softmax)
Chain(Dense(784, 64), BatchNorm(64, λ = relu), Dense(64, 10), BatchNorm(10), softmax)
julia> gpu(m)(gpu(rand(28^2, 10)))
ERROR: CUDNNError(code CUDNN_STATUS_BAD_PARAM, CUDNN_STATUS_BAD_PARAM)
Stacktrace:
[1] cudnnBatchNormalizationForwardInference(::Ptr{Nothing}, ::CuArrays.CUDNN.cudnnBatchNormMode_t, ::Base.RefValue{Float32}, ::Base.RefValue{Float32}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,4,CuArrays.CuArray{Float32,2,Nothing}}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CUDNN.TensorDesc, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::Float32) at C:\Users\shi\.julia\packages\CuArrays\ZYCpV\src\dnn\error.jl:19
[2] #cudnnBNForward!#344(::Nothing, ::Int64, ::Int64, ::Float32, ::Bool, ::typeof(CuArrays.CUDNN.cudnnBNForward!), ::CuArrays.CuArray{Float32,4,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,4,CuArrays.CuArray{Float32,2,Nothing}}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::Float32) at C:\Users\shi\.julia\packages\CuArrays\ZYCpV\src\dnn\batchnorm.jl:62
[3] #cudnnBNForward! at .\none:0 [inlined]
[4] #batchnorm#343 at C:\Users\shi\.julia\packages\CuArrays\ZYCpV\src\dnn\batchnorm.jl:26 [inlined]
[5] #batchnorm at .\none:0 [inlined]
[6] #batchnorm#342(::Nothing, ::Int64, ::Int64, ::Float32, ::Bool, ::typeof(CuArrays.CUDNN.batchnorm), ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,2,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::CuArrays.CuArray{Float32,1,Nothing}, ::Float32) at C:\Users\shi\.julia\packages\CuArrays\ZYCpV\src\dnn\batchnorm.jl:17
[7] BatchNorm at .\none:0 [inlined]
[8] BatchNorm at C:\Users\shi\.julia\packages\Flux\oX9Pi\src\cuda\cudnn.jl:4 [inlined]
[9] applychain at C:\Users\shi\.julia\packages\Flux\oX9Pi\src\layers\basic.jl:30 [inlined] (repeats 2 times)
[10] (::Chain{Tuple{Dense{typeof(identity),CuArrays.CuArray{Float32,2,Nothing},CuArrays.CuArray{Float32,1,Nothing}},BatchNorm{typeof(relu),CuArrays.CuArray{Float32,1,Nothing},CuArrays.CuArray{Float32,1,Nothing},Float32},Dense{typeof(identity),CuArrays.CuArray{Float32,2,Nothing},CuArrays.CuArray{Float32,1,Nothing}},BatchNorm{typeof(identity),CuArrays.CuArray{Float32,1,Nothing},CuArrays.CuArray{Float32,1,Nothing},Float32},typeof(softmax)}})(::CuArrays.CuArray{Float32,2,Nothing}) at C:\Users\shi\.julia\packages\Flux\oX9Pi\src\layers\basic.jl:32
[11] top-level scope at REPL[5]:1
```
The problem is fixed in the latest release (v0.11):

```julia
julia> using Flux
julia> m = Chain(
Dense(28^2, 64),
BatchNorm(64, relu),
Dense(64, 10),
BatchNorm(10),
softmax) |> gpu
Chain(Dense(784, 64), BatchNorm(64, λ = relu), Dense(64, 10), BatchNorm(10), softmax)
julia>
julia> m(gpu(rand(Float32, 28^2,10)))
10×10 Array{Float32,2}:
0.0152502 0.0234013 0.0166635 … 0.0192587 0.0273978 0.0432131
0.127888 0.223488 0.17404 0.130349 0.163127 0.181742
0.0216491 0.0193224 0.0131356 0.0190182 0.0254123 0.0227821
0.0560169 0.0569448 0.0406374 0.0340359 0.0550951 0.0430422
0.141302 0.154263 0.18579 0.240547 0.137368 0.166967
0.116726 0.0991394 0.14152 … 0.0637739 0.0638257 0.0526835
0.095143 0.136513 0.0597864 0.114565 0.234256 0.0559822
0.161634 0.132457 0.137533 0.166024 0.101083 0.101018
0.0762043 0.0652032 0.127047 0.0997341 0.0523317 0.17962
0.188187 0.0892672 0.103847 0.112695 0.140104 0.152949
julia> m(gpu(rand(Float64, 28^2,10)))
10×10 Array{Float32,2}:
0.035228 0.0292133 0.0406775 … 0.0225865 0.0204292 0.0347961
0.185115 0.2619 0.105639 0.297724 0.0867104 0.169069
0.031076 0.0267011 0.0238993 0.026509 0.0134919 0.0347641
0.070163 0.0277017 0.0405046 0.0581834 0.0222371 0.0426887
0.277579 0.135808 0.195828 0.118733 0.225566 0.169406
0.0576468 0.0849154 0.140865 … 0.0701037 0.0580264 0.0819516
0.112284 0.0472886 0.0543785 0.0754303 0.074338 0.100449
0.0742676 0.0656631 0.126945 0.104323 0.228286 0.103572
0.0760779 0.235139 0.157998 0.146371 0.15624 0.075284
0.0805631 0.0856701 0.113265 0.0800346 0.114675 0.18802
```
Thank you!!
When trying to run BatchNorm on GPUs, the following error occurs:
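A minimal way to trigger it, mirroring the MWE earlier in the thread (a sketch assuming Flux, CuArrays, and a working CUDA setup):

```julia
using Flux, CuArrays

x  = rand(10, 5) |> gpu     # random input moved to the GPU
bn = BatchNorm(10) |> gpu   # BatchNorm layer moved to the GPU
bn(x)                       # errors on the Flux/CuArrays versions discussed above
```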