Added the SkipConnection layer and constructor #446
Conversation
Oh, there's also a nomenclature issue. When people think of DenseBlocks, they are mostly thinking of densely connected convolutional blocks. I suppose we could call this something else, like `SkipConnection`.
This seems like a good implementation to me, but I wonder if it isn't a bit specific. Maybe others can chime in if this seems like it'd be useful to them. Or perhaps there's some precedent in other frameworks? I like the idea of calling it `SkipConnection`.
Skip connections and deep supervision are quite commonplace nowadays, but I get the point. I'm not sure about other frameworks; I'll investigate. Flux already has a few domain-specific methods, though, and skip connections are everywhere in computer vision and segmentation in general.
It would be nice if this could be written so that new arrays don't have to be allocated on every forward pass. Models used for real-time inference would benefit.
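For illustration, an inference-only variant along the lines of that suggestion could preallocate the buffer for the skip branch. This is a sketch only; `BufferedSkip` is a hypothetical name, not something in Flux or in this PR:

```julia
# Hypothetical inference-only skip block (illustrative, not in Flux or this
# PR): reuse a preallocated buffer so the combine step allocates nothing.
struct BufferedSkip{T,B}
    layers::T
    buffer::B   # preallocated array, same shape as the wrapped output
end

function (s::BufferedSkip)(x)
    copyto!(s.buffer, x)   # stash the identity branch without allocating
    y = s.layers(x)        # wrapped layers still allocate their own output
    y .+= s.buffer         # in-place elementwise add for the skip sum
    return y
end
```

In-place mutation like this generally doesn't play well with Flux's reverse-mode AD, which is why it would only suit the real-time inference use case mentioned above.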
I'm on board with adding this, with the tests and docs you mentioned. I'd like to remove the special casing of `cat`, though.
Oh yeah, you're completely right about that, @MikeInnes. When I started with Flux I had only just picked up Julia and wasn't that familiar with it at all; I can do better now. Would it be better to do it with multiple dispatch (still special-casing, but cleaner in my eyes) or with a simple keyword argument defaulting to `cat`?
I'm thinking just keep it as a positional argument but without a default, so you can do `SkipConnection(layers, cat)`.
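For reference, the design this discussion converged on looks roughly like the sketch below. It is hedged against API drift: the PR-era code used Flux's `@treelike`, which later Flux versions replaced with `@functor`; the field names match the commits further down.

```julia
using Flux: @functor   # the PR itself used the older @treelike macro

# The connection is a plain positional argument with no default:
# SkipConnection(layers, connection).
struct SkipConnection{T,F}
    layers::T
    connection::F   # any callable combining (layer_output, input)
end

@functor SkipConnection

(s::SkipConnection)(x) = s.connection(s.layers(x), x)
```

A ResNet-style block is then just `SkipConnection(layer, +)`, with no special casing of any particular connection inside the layer.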
@r3tex Thanks for the suggestion! My Julia-fu is weak; is the new, streamlined version enough? I could look into types as well, I guess. Here's proof that gradients propagate correctly:
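A minimal sketch of such a gradient check, written against the current Flux/Zygote `gradient` API rather than the PR-era one (the names `m` and `x` are illustrative):

```julia
using Flux

# Hypothetical check that gradients flow through a SkipConnection.
m = SkipConnection(Dense(10 => 10, relu), +)
x = rand(Float32, 10, 4)

grads = Flux.gradient(m) do model
    sum(abs2, model(x))
end

# A non-trivial weight gradient means the loss signal reaches the wrapped
# Dense layer through the skip connection.
grads[1].layers.weight
```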
Also, is the current …
Added some tests; they're still superficial, and I could add more if asked. Is that okay, @MikeInnes?
This is looking nice! A couple of small things: remove the …
- Added missing export
- Corrected channel placement: dimension 4 cannot be assumed to always be the channel dimension
- Deprecation of `treelike`: code now makes use of the `@treelike` macro instead of the deprecated `treelike` function (it worked on my end because I'm on Julia 0.7, while Julia 1.0 deprecated it)
- Update basic.jl
- Renaming to SkipConnection (Update Flux.jl, Update basic.jl)
- Updated `SkipConnection` with a `connection` field: I'm pretty sure I broke something now, but this PR should follow along these lines
- `cat` needs special treatment: the user can declare their own `concatenate` connection, but I foresee it's going to be used often, so we can simply define special treatment
- Forgot to remove some rebasing text
- Forgot to remove some more rebasing text
- Removed local copy and default `cat` method from the function calls
- Adjusted some more types for inference; could improve on this as well
- Re-placed some left-over spaces
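To make the `cat`-related commits concrete, here is how the two common connections would look with the resulting API. This is a hedged example; the layer sizes are arbitrary:

```julia
using Flux

# ResNet-style block: elementwise addition as the connection.
res_block = SkipConnection(Chain(Dense(64 => 64, relu), Dense(64 => 64)), +)

# DenseNet-style block: a user-declared concatenation along the channel
# dimension (dims = 3 for W×H×C×N inputs) instead of a baked-in `cat`.
dense_block = SkipConnection(Conv((3, 3), 16 => 16, relu; pad = 1),
                             (mx, x) -> cat(mx, x; dims = 3))
```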
@MikeInnes I added a line to docs/src/models/layers.md, under …
Looks perfect, thanks @bhvieira! bors r+
446: Added the SkipConnection layer and constructor r=MikeInnes a=bhvieira

I added a DenseBlock constructor, which allows one to train DenseNets (you can train ResNets and MixNets with this as well; you only need to change the connection, which is concatenation for DenseNets).

Disclaimer: I created the block for a 3D U-Net, so the assumption here is that whatever layer is inside the block, its output has the same spatial dimensions (i.e. all array dimensions excluding the channel and minibatch dimensions) as the input; otherwise the connection wouldn't match. I'm not sure this matches the topology of every DenseNet out there, but I suppose it's a good starting point.

No tests yet; I will add them as the PR evolves.

I'm open to suggestions! :)

Co-authored-by: Bruno Hebling Vieira <bruno.hebling.vieira@usp.br>
Co-authored-by: Mike J Innes <mike.j.innes@gmail.com>
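The disclaimer about spatial dimensions can be checked directly: with a concatenating connection, the wrapped layers must preserve every dimension except the channel one. A hedged sketch with arbitrary sizes:

```julia
using Flux

x = rand(Float32, 32, 32, 16, 8)   # W×H×C×N minibatch

# pad = 1 keeps the 32×32 spatial dims, so only the channel counts differ
# between the branch output (8 channels) and the input (16 channels).
block = SkipConnection(Conv((3, 3), 16 => 8, relu; pad = 1),
                       (mx, x) -> cat(mx, x; dims = 3))

size(block(x))   # (32, 32, 24, 8): 8 + 16 channels, spatial dims intact
```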
Build succeeded