DepthwiseConv does not run on GPU #459
Sorry, I should have written it out. The following code is what I wanted to say:
using Flux: DepthwiseConv
num_ch = 10
test_input = rand(224, 224, num_ch, 2)  # WHCB layout
depthwiseconv = DepthwiseConv((3, 3), num_ch, stride=1, pad=1)
@show size(depthwiseconv(test_input))  # works fine on the CPU
using Flux
using CuArrays
depthwiseconv = depthwiseconv |> gpu
test_input = test_input |> gpu
@show size(depthwiseconv(test_input))  # Oops. EDITED: this raises the following error.
I guess we may not have an implementation in CuArrays yet; cc @avik-pal
Yes, that is the case. Since the CPU code works by calling im2col (that's where the conversion-to-pointer error comes from) and the corresponding functions, the implementation does not work on the GPU. However, this can be fixed easily by using cuDNN (I recently learned about this feature).
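For reference, the operation itself is simple: a depthwise convolution applies one small kernel to each input channel independently. A minimal pure-Julia sketch (not the NNlib im2col implementation; `depthwise_naive` is a name made up here for illustration):

```julia
# Naive reference: one kernel per channel, valid padding, stride 1.
# x is W×H×C×B, w is kW×kH×1×C (one filter per input channel).
function depthwise_naive(x::Array{T,4}, w::Array{T,4}) where T
    W, H, C, B = size(x)
    kW, kH, _, _ = size(w)
    out = zeros(T, W - kW + 1, H - kH + 1, C, B)
    for b in 1:B, c in 1:C, j in axes(out, 2), i in axes(out, 1)
        acc = zero(T)
        for q in 1:kH, p in 1:kW
            # each output channel c only ever reads input channel c
            acc += x[i + p - 1, j + q - 1, c, b] * w[p, q, 1, c]
        end
        out[i, j, c, b] = acc
    end
    return out
end
```

The per-channel independence is exactly why it maps onto cuDNN's grouped convolution with the group count equal to the channel count.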
What's the status on this issue, @avik-pal? My current research depends on it. Unfortunately, I'm not really familiar with CUDNN/CuArrays.jl "under the hood", but if there's anything I'm able to assist with, I'm happy to help...
I second Sleort's comment. I'm highly interested in getting this available on the GPU. CPU-only performance is prohibitively slow for all but the tiniest problems.
Admittedly, at this point this is somewhat out of my depth. I gather you are referring to https://github.com/JuliaGPU/CuArrays.jl/blob/master/src/dnn/libcudnn.jl ? Is the linked function the only thing you require to be wrapped? Where do you think this should go? Essentially libcudnn would be the appropriate place, right? Tell me where, and I'll do the PR if so desired.
function cudnnSetConvolutionGroupCount(convDesc, groupCount)
    @check ccall((:cudnnSetConvolutionGroupCount, libcudnn),
                 cudnnStatus_t,
                 (cudnnConvolutionDescriptor_t, Cint),
                 convDesc, groupCount)
end

function cudnnGetConvolutionGroupCount(convDesc, groupCount)
    @check ccall((:cudnnGetConvolutionGroupCount, libcudnn),
                 cudnnStatus_t,
                 (cudnnConvolutionDescriptor_t, Ptr{Cint}),
                 convDesc, groupCount)
end
Would be great to see this supported in Flux, as it's an extremely useful feature for various architectures. It would then, as you said regarding grouped convolutions, also resolve #330.
Yes, this is the only function that needs to be wrapped. You should open the PR in CuArrays.jl and we can discuss it there. The next step would be to modify the ConvDesc call here to call the wrapped function first.
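That change could look roughly like the following. This is a hypothetical sketch only: the real ConvDesc constructor in CuArrays.jl takes different arguments, and the `groupcount` keyword is an assumption; only `cudnnCreateConvolutionDescriptor` and the `cudnnSetConvolutionGroupCount` wrapper proposed above are taken from the existing/planned code.

```julia
# Hypothetical sketch: set the group count on the descriptor before
# the existing Nd-descriptor setup runs.
function ConvDesc(T, cdims; groupcount = 1)
    cd = Ref{cudnnConvolutionDescriptor_t}()
    cudnnCreateConvolutionDescriptor(cd)
    # new: groupcount == number of channels gives a depthwise convolution
    groupcount > 1 && cudnnSetConvolutionGroupCount(cd[], groupcount)
    # ... the existing cudnnSetConvolutionNdDescriptor call stays as-is ...
    return cd[]
end
```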
Great, I opened the PR, as you've probably been notified. Happy to have a crack at the other required changes as well (albeit probably with some help :)).
Hello, any news?
The PRs to follow are #948 and FluxML/NNlib.jl#146. If you're interested in getting this through, contributions are always welcome ;)
Looks like the last comments on this are pretty old; is there currently any workaround to run DepthwiseConv on the GPU? Do you need help with testing?
Yes please, this should be usable now.
Any progress on this?
You can use depthwise convolutions on the GPU today by simply using a regular Conv with groups set to the number of input channels.
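For example (assuming a recent Flux where `Conv` accepts the `groups` keyword):

```julia
using Flux

num_ch = 10
# groups == number of input channels makes this a depthwise convolution
layer = Conv((3, 3), num_ch => num_ch; groups = num_ch, pad = 1)
x = rand(Float32, 224, 224, num_ch, 2)  # WHCB layout, as above
@show size(layer(x))  # (224, 224, 10, 2), same shape as DepthwiseConv
# layer, x = gpu(layer), gpu(x)  # and this path runs on the GPU
```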
Great! Thanks :)
We could deprecate DepthwiseConv or rewire it to Conv. Marking this as a decision to be made for the next release.
So we just need this, and delete it all?
I think so
We need to set up forwarding to the CPU depthwise kernels in NNlib based on the group count. That's the tricky bit, and I'm not sure how best to do it.
Are you thinking that these are more efficient than the generic grouped implementation?
Choosing the most performant implementation based on the group count (if there is any need for that) should be handled by NNlib itself.
The main differences to note are https://github.com/FluxML/NNlib.jl/blob/master/src/impl/depthwiseconv_im2col.jl#L27-L46 vs https://github.com/FluxML/NNlib.jl/blob/master/src/impl/conv_im2col.jl#L45-L58, https://github.com/FluxML/NNlib.jl/blob/master/src/conv.jl#L266-L270 vs https://github.com/FluxML/NNlib.jl/blob/master/src/conv.jl#L186-L200, and https://github.com/FluxML/NNlib.jl/blob/master/src/dim_helpers/DepthwiseConvDims.jl. Unsure whether it would be easier to try unifying the dims code or the implementations first.
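One possible shape for that forwarding, purely as an illustration — every name below (`conv_dispatch`, `groupcount`, `channels_in`, `as_depthwise`) is made up here and is not NNlib's actual API:

```julia
# Purely illustrative: NNlib could pick the kernel from the group count.
function conv_dispatch(x, w, cdims)
    if groupcount(cdims) == channels_in(cdims)
        # groups == C: forward to the specialized depthwise im2col kernels
        depthwiseconv(x, w, as_depthwise(cdims))  # hypothetical dims conversion
    else
        conv(x, w, cdims)  # generic (grouped) convolution path
    end
end
```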
I'm happy to hear "Adds support for Depthwise Convolutions" (#279) is merged into master. I updated to the latest Flux, i.e. pkg> add Flux#master. Here is a sample code I tested. This works fine, but once I copy the depthwiseconv layer onto the GPU via the gpu function, the code above raises an error with the following message: