add Upsample and PixelShuffle layers #1468
Conversation
I didn't know you could do that. I'd like to be cautious and keep it as a mandatory argument, at least for some time, because the default mode for pytorch and keras is nearest neighbor, which we don't have yet.
I'm not really worked up about what other frameworks do, and don't think we necessarily need to follow them.
Reproducibility is really important in deep learning research (but also production). Current research and production are using either pytorch or tensorflow, so we shouldn't make it crazy hard for people to figure out why they are not able to reproduce published results or their own python code using flux.
I don't see how it is hard, nor do I intend to force other frameworks' design onto ours.
I guess if there's a default we can't later change it, although I'm not sure linear is a bad default. It could always be printed to make this more visible. If it must be typed, then perhaps it should be
If someone tries to translate pytorch's
I don't think it's unreasonable to have both; that was exactly what had us add the multiple constructors for Conv with kwargs, which were later removed for not being used enough (how did we determine this?). I don't think it's 'unnatural' in any way to omit a keyword, given that users are moving to a different framework in a different language with different semantics, with convenience constructors on top for the 'natural' transition and later moving to the more language/framework specific implementation.
I don't understand why the symbol is not enough. Btw, we could have
I'm not saying it is unnatural to use a positional arg instead of a keyword, I'm saying it is natural when porting to omit the

Anyways, we can implement right now the following:

```julia
Upsample(scale, mode)
# or
# Upsample(scale, mode=:linear)

Upsample(; scale, mode)
# or
# Upsample(; scale, mode=:linear)
```

Later probably change the keyword constructor to

```julia
Upsample(; scale=nothing, outsize=nothing, mode)  # require inside that either scale or outsize != nothing
# or
# Upsample(; scale=nothing, outsize=nothing, mode=:linear)
```
Another option, which corresponds to the direction taken by @maxfreu in FluxML/NNlib.jl#266, is to have just one constructor `Upsample(scale=1; outsize=nothing, mode)` where
We can consider that at a later point in time.
FluxML/NNlib.jl#269 adds

I guess that using identical notation and very different defaults is likely to trip people up unnecessarily. If we want different defaults, then perhaps we should pick a different name -- perhaps

We could allow positional
The complaint is that it's too much -- options you have to type out a word for, and remember how to spell, aren't great. My suggestion was "1 for linear" (the order of the polynomial involved), whether this is bilinear (for 2D images) or linear (for 1D) or trilinear (if that's a thing people say?). Is it too obscure? Or does it exclude options one may wish to add later?
Hi, I would keep scale as a positional argument, because it is most often used in DL (and most often it is 2), so writing it out every time would be too verbose. The rest should be keywords. Nearest is a reasonable default mode because it is fastest and already the default in other frameworks (pytorch & tf). I would keep the spelled-out symbol form and document the options well. Furthermore, scale should not be restricted to int here, as the implementations will support reals at some point, so error at a lower level.

Edit: Hmmm, when we add 1D and 3D support, linear interpolation is all more or less the same algorithmically, so it might make sense to dispatch on the array dimension and just give the interpolation order, indeed. Sounds fancy, can't decide...
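The "dispatch on the array dimension" idea could look roughly like the sketch below: a single `:linear` mode whose kernel is picked from `ndims(x)`. The kernel names here are placeholders standing in for whatever NNlib ends up exporting, not the actual API:

```julia
# Placeholder kernels; these stand in for real NNlib upsampling
# functions, which are NOT assumed to have these names.
upsample_linear_1d(x, scale)   = x  # 1D linear kernel stand-in
upsample_bilinear_2d(x, scale) = x  # 2D (bilinear) kernel stand-in
upsample_trilinear_3d(x, scale) = x # 3D (trilinear) kernel stand-in

# Flux layout convention: 1D data is (W, C, N), 2D is (W, H, C, N),
# 3D is (W, H, D, C, N) -- so ndims determines the interpolation kernel.
linear_kernel(x::AbstractArray{<:Any,3}) = upsample_linear_1d
linear_kernel(x::AbstractArray{<:Any,4}) = upsample_bilinear_2d
linear_kernel(x::AbstractArray{<:Any,5}) = upsample_trilinear_3d
```

With this, a user would only ever write `mode=:linear` and the layer would route to the right kernel based on the input's dimensionality.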
Needs GPU tests
@maxfreu what about having these 2 constructors?

```julia
Upsample(scale; mode=:nearest)
Upsample(; scale=nothing, outsize=nothing, mode=:nearest)
```

btw, nearest is now available: FluxML/NNlib.jl#269
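For concreteness, a minimal sketch of those two proposed constructors (this is an illustration of the shape of the API under discussion, not the final Flux implementation; the mode is stored as a type parameter so later code can dispatch on it):

```julia
# Sketch only: mode as a type parameter, scale/outsize as fields.
struct Upsample{Mode,S,O}
    scale::S
    outsize::O
end

# Positional scale, keyword mode, defaulting to :nearest.
Upsample(scale; mode=:nearest) =
    Upsample{mode, typeof(scale), Nothing}(scale, nothing)

# All-keyword form: exactly one of scale / outsize must be given.
function Upsample(; scale=nothing, outsize=nothing, mode=:nearest)
    (scale === nothing) == (outsize === nothing) &&
        error("Give exactly one of `scale` or `outsize`")
    Upsample{mode, typeof(scale), typeof(outsize)}(scale, outsize)
end
```

Usage would then be `Upsample(2)` for the common case, or `Upsample(outsize=(64, 64), mode=:bilinear)` when targeting a fixed output size.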
Having all-keyword constructors is fine, but if we don't support multiple modes etc., let's not make it an option and rather document it properly. Also, we should let users tweak the outsize in the first constructor as well, and check for compatible settings.
I think it's fine. I'll have to wait for your PR in NNlib to be merged before adapting the wrapper here, though.
Best not to create ambiguity when reading the code anyway.
So you opt for
My reading was that supplying both the size and the scale was the ambiguity in question? Anyway, maybe we should merge this & return to further methods in another PR. My reservation about `mode`, one last time, is that it seems a little weird to need to spell the dimension out (in Latin!), although perhaps it's never a problem in practice:
If we do spell it out, I also wonder a bit whether it need be a keyword, since
For info: I changed
As names go, yes I would pick this over size. Depending on the case,
I agree, and I think that's the API we should adopt for this layer, and as said earlier, having options with kwargs is fine. There's no reason to remove them if some people find them useful. The same goes for the conv constructors imo.
Yes, I think that if we allow more options than

The obvious precedent for
Right, but if we use the function there, there's presumably an existing constructor, so I think that's doable. Ultimately I feel relying on Interpolations.jl is the better long-term approach, rather than redoing work, so hopefully we can bring that into our ecosystem.
Yeah, +1 for storing the interpolating function. It simplifies the code significantly, and I think any issues with `show` can be resolved since these aren't arbitrary functions. The constructor API should still be symbol-based.
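A rough sketch of that design, with a symbol-based constructor that resolves the mode to a function once and stores the function in the layer (the kernel names and the `UpsampleFn` struct name are placeholders, not the real NNlib/Flux identifiers):

```julia
# Placeholder kernels standing in for the real NNlib functions.
upsample_nearest(x, scale)  = x
upsample_bilinear(x, scale) = x

# Resolve the user-facing Symbol to a concrete function at construction.
const UPSAMPLE_MODES = Dict(:nearest  => upsample_nearest,
                            :bilinear => upsample_bilinear)

struct UpsampleFn{F,S}
    mode::F   # the interpolation function itself, not a Symbol
    scale::S
end

function UpsampleFn(scale; mode::Symbol=:nearest)
    haskey(UPSAMPLE_MODES, mode) || error("unsupported mode: $mode")
    UpsampleFn(UPSAMPLE_MODES[mode], scale)
end

# Printing stays tidy because the stored functions are known named
# functions, so nameof recovers a readable mode name.
Base.show(io::IO, u::UpsampleFn) =
    print(io, "Upsample(:", nameof(u.mode), ", scale = ", u.scale, ")")
```

The constructor surface stays symbolic (`mode=:bilinear`), while the forward pass can call `u.mode(x, u.scale)` directly with no Symbol dispatch at runtime.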
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
I'll change the implementation to store the function internally. For the record, I think this change is just exploring the "who cares?" design manifold, adding zero value to users and to us developers and wasting my time, but I'll soldier on and do it.
Despite having suggested that, I agree it's not a big deal either way. And it doesn't change the user interface, so it could be done later by anyone who does care.
Also, it's not clear how I should change the show method:

```julia
function Base.show(io::IO, u::Upsample{mode}) where {mode}
  print(io, "Upsample(")
  print(io, ":", mode)
  u.scale !== nothing && print(io, ", scale = $(u.scale)")
  u.size !== nothing && print(io, ", size = $(u.size)")
  println(io, ")")
end
```

If
You could do:

```julia
function Base.show(io::IO, u::Upsample)
  print(io, "Upsample(")
  print(io, ":", nameof(u.mode))
  u.scale !== nothing && print(io, ", scale = $(u.scale)")
  u.size !== nothing && print(io, ", size = $(u.size)")
  println(io, ")")
end
```
I also think that this change can easily be done in a separate PR. It doesn't change the desired API in any way. I'd be fine merging as is.
It's alright to show
05a9a2d
I'm slightly confused, are we adding the positional scale constructor here already?
yes, I added it since I had to add a few changes from comments in any case
need another approval
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
bors r+
Build succeeded:
PR Checklist
@dhairyagandhi96
(for API changes).