
add Upsample and PixelShuffle layers #1468

Merged
merged 19 commits into from
Feb 11, 2021
Conversation

CarloLucibello
Member

@CarloLucibello CarloLucibello commented Jan 16, 2021

PR Checklist

  • Tests are added
  • Entry in NEWS.md
  • Documentation, if applicable
  • Final review from @dhairyagandhi96 (for API changes).

@CarloLucibello CarloLucibello mentioned this pull request Jan 16, 2021
92 tasks
@CarloLucibello
Member Author

My main point here is that I don't think there's much gained by storing Val(mode) in the type; we can just put the symbol there directly. It should have a default.

I didn't know you could do that. I'd like to be cautious and keep it as a mandatory argument, at least for some time, because the default mode for PyTorch and Keras is nearest neighbor, which we don't have yet.

@DhairyaLGandhi
Member

I'm not really worked up about what other frameworks do, nor convinced we necessarily need to follow them.

@CarloLucibello
Member Author

I'm not really worked up about what other frameworks do, nor convinced we necessarily need to follow them.

Reproducibility is really important in deep learning research (but also in production). Current research and production code uses either PyTorch or TensorFlow, so we shouldn't make it crazy hard for people to figure out why they are not able to reproduce published results, or their own Python code, using Flux.

@DhairyaLGandhi
Member

I don't see how it is hard, nor do I intend to force other frameworks' design onto ours.

@mcabbott
Member

default mode for PyTorch and Keras is nearest neighbor, which we don't have yet

I guess if there's a default we can't later change it, although I'm not sure linear is a bad default. It could always be printed, to make this more visible.

If it must be typed, then perhaps it should be order=1 for linear, 0 for nearest-neighbour?

@CarloLucibello
Member Author

If someone tries to translate PyTorch's Upsample(scale_factor=(2,2)) to Flux's Upsample(scale=(2,2)), as may feel natural, they would be doing a wrong port, and the mistake in the upsampling layer would be really hard to spot in a model containing many other layers. A mandatory argument forces you to think and compare. I'm not saying we shouldn't have a default; I'm saying we should stick with conventions, since most of the time they are there for a good reason, and depart only if there are other good reasons to, and I see none here. I'm just being cautious since we are introducing these layers for the first time: adding a default in the future is a non-breaking change, while the other way around is breaking.

@DhairyaLGandhi
Member

I don't think it's unreasonable to have both; that was exactly what had us add the multiple keyword constructors for Conv, which were later removed for not being used enough (how did we determine this?).

I don't think it's 'unnatural' in any way to omit a keyword, given that users are moving to a different framework, in a different language, with different semantics, with convenience constructors on top for the 'natural' transition, from which they can later move to the more language/framework-specific form.

@CarloLucibello
Member Author

CarloLucibello commented Jan 17, 2021

If it must be typed, then perhaps it should be order=1 for linear, 0 for nearest-neighbour?

I don't understand: why is the symbol not enough?

Btw, we could have :linear represent the 1D linear case as well as the bilinear and trilinear ones, letting dispatch in the forward pass pick the appropriate NNlib call (or unify those too). We could still allow passing the :bilinear and :trilinear modes explicitly, since they are commonly used names.

I don't think it's 'unnatural' in any way to omit a keyword, given that users are moving to a different framework, in a different language, with different semantics, with convenience constructors on top for the 'natural' transition, from which they can later move to the more language/framework-specific form.

I'm not saying it is unnatural to use a positional arg instead of a keyword; I'm saying that, when porting, it is natural to omit the mode argument if it is not present in the original code, and that would lead to an incorrect port if the defaults differ.

Anyways, we can implement the following right now:

Upsample(scale, mode)
# or
# Upsample(scale, mode=:linear)

Upsample(; scale, mode)
# or
# Upsample(; scale, mode=:linear)

Later we could change the keyword constructor to:

Upsample(; scale=nothing, outsize=nothing, mode) # require inside the constructor that either scale or outsize is != nothing
# or
# Upsample(; scale=nothing, outsize=nothing, mode=:linear)
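As a sketch of how that last keyword constructor could check its arguments, something like the following would work. This is purely hypothetical, following the names used in this thread (scale, outsize, mode), not a final Flux API:

```julia
# Hypothetical sketch of the constructor discussed above; the mode symbol is
# stored as a type parameter, matching the Val(mode)-in-type discussion.
struct Upsample{M,S,O}
    scale::S
    outsize::O
end

function Upsample(; scale=nothing, outsize=nothing, mode::Symbol=:linear)
    # require exactly one of scale / outsize to be given
    (scale === nothing) == (outsize === nothing) &&
        error("Upsample: provide exactly one of scale or outsize")
    return Upsample{mode, typeof(scale), typeof(outsize)}(scale, outsize)
end

# positional-scale convenience constructor
Upsample(scale; mode::Symbol=:linear) = Upsample(; scale=scale, mode=mode)
```

With this, Upsample(2) and Upsample(outsize=(8,8), mode=:nearest) both construct, while Upsample() with neither argument errors.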

@CarloLucibello
Member Author

Another option, which corresponds to the direction taken by @maxfreu in FluxML/NNlib.jl#266, is to have just one constructor

Upsample(scale=1; outsize=nothing, mode)

where scale is ignored if outsize is given.

@DhairyaLGandhi
Member

We can consider that at a later point in time

@mcabbott
Member

mcabbott commented Jan 17, 2021

FluxML/NNlib.jl#269 adds upsample_nearest.

I guess that using identical notation and very different defaults is likely to trip people up unnecessarily. If we want different defaults, then perhaps we should pick a different name -- perhaps Resize or Resample are possibilities. Or UpsampleLinear and UpsampleNearest.

We could allow positional Upsample(2) now and Upsample(size=(100,100)) later (default to nothing & check in the body, or whatever). [Edit -- as you said, somehow I lost that. And:]

If it must be typed, then perhaps it should be order=1 for linear, 0 for nearest-neighbour?

I don't understand, why the symbol is not enough?

The complaint is that it's too much -- options you have to type out a word for, and remember how to spell, aren't great. My suggestion was "1 for linear" (the order of the polynomial involved), whether this is bilinear (for 2D images), linear (for 1D), or trilinear (if that's a thing people say?). Is that too obscure? Or does it exclude options one may wish to add later?
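The order idea could be sketched as a tiny helper that maps an integer order plus the number of spatial dimensions back to the spelled-out mode. This is illustrative only; the function name and the idea of recovering a symbol are assumptions, not Flux API:

```julia
# Hypothetical sketch: a single integer `order` (0 = nearest, 1 = linear)
# replaces the spelled-out mode, and the array's spatial dimensionality
# picks linear vs bilinear vs trilinear.
function mode_from_order(order::Int, nspatial::Int)
    order == 0 && return :nearest
    order == 1 && return (:linear, :bilinear, :trilinear)[nspatial]
    error("unsupported interpolation order $order")
end
```

So mode_from_order(1, 2) yields :bilinear, and the user never types the Latin prefix.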

@maxfreu

maxfreu commented Jan 18, 2021

Hi, I would keep scale as a positional argument, because it is the option most often used in DL (and most often it is 2), so writing it out every time would be too verbose. The rest should be keywords. Nearest is a reasonable default mode because it is fastest and already the default in other frameworks (PyTorch & TF). I would keep the spelled-out symbol form and document the options well. Furthermore, scale should not be restricted to Int here, as the implementations will support reals at some point; better to error at a lower level.

Edit: Hmmm, once we add 1D and 3D support, linear interpolation is more or less the same algorithmically in every dimension, so it might indeed make sense to dispatch on the array dimension and just take the interpolation order. Sounds fancy, can't decide...
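The "nearest is fastest" point above can be seen from a pure-Julia sketch: for integer scales, nearest-neighbour upsampling of a WHCN array is just a repeat of each pixel, with no interpolation arithmetic. NNlib's upsample_nearest is the real implementation; this toy version only illustrates the cost model:

```julia
# Pure-Julia sketch of integer-scale nearest-neighbour upsampling for
# 4D WHCN arrays: each spatial element is duplicated `scale` times.
function upsample_nearest_sketch(x::AbstractArray{T,4}, scale::NTuple{2,Int}) where T
    # repeat only the two spatial dims; channel and batch dims untouched
    return repeat(x, inner=(scale[1], scale[2], 1, 1))
end

x = reshape(Float32.(1:4), 2, 2, 1, 1)
y = upsample_nearest_sketch(x, (2, 2))
size(y)  # (4, 4, 1, 1)
```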

@DhairyaLGandhi
Member

Needs GPU tests

@CarloLucibello
Member Author

CarloLucibello commented Jan 20, 2021

@maxfreu what about having these 2 constructors

Upsample(scale; mode=:nearest)
Upsample(; scale=nothing, outsize=nothing, mode=:nearest)

?

btw, nearest is now available: FluxML/NNlib.jl#269

@DhairyaLGandhi
Member

Having all-keyword constructors is fine, but if we don't support multiple modes etc., let's not make it an option, and rather document that properly.

Also, we should let users tweak the outsize in the first constructor as well, and check for compatible settings

@maxfreu

maxfreu commented Jan 20, 2021

@maxfreu what about having these 2 constructors

Looks good to me! @mcabbott suggested to use size as a keyword and Base.size where needed. What do you think?

@CarloLucibello
Member Author

Looks good to me! @mcabbott suggested to use size as a keyword and Base.size where needed. What do you think?

I think it's fine. I'll have to wait for your PR in NNlib to be merged before adapting the wrapper here, though.

@DhairyaLGandhi
Member

Best not to create ambiguity when reading the code anyway

@maxfreu

maxfreu commented Jan 21, 2021

Best not to create ambiguity when reading the code anyway

So you opt for outsize, right?

@mcabbott
Member

mcabbott commented Jan 21, 2021

Best not to create ambiguity when reading the code anyway

My reading was that supplying both the size and the scale was the ambiguity in question?

Anyway maybe we should merge this & return to further methods in another PR.

My reservation about mode, one last time, is that it seems a little weird to need to spell the dimension out (in Latin!), although perhaps it's never a problem in practice:

Chain(Upsample(2, mode=:linear), Conv((3,), 5=>7), ...)
Chain(Upsample(2, mode=:bilinear), Conv((3,3), 5=>7), ...)
Chain(Upsample(2, mode=:trilinear), Conv((3,3,3), 5=>7), ...)

If we do spell it out, I also wonder a bit whether it needs to be a keyword, since Upsample(2, :bilinear) seems hard to misinterpret.

@maxfreu

maxfreu commented Jan 21, 2021

For info: I changed outsize to size in my PRs.

@DhairyaLGandhi
Member

DhairyaLGandhi commented Jan 21, 2021

So you opt for outsize, right?

As names go, yes, I would pick this over size. Depending on the case, dims and axis work too, and maybe sz as a shorthand.

If we do spell it out, I also wonder a bit whether it need be a keyword, since Upsample(2, :blinear) seems hard to misinterpret.

I agree, and I think that's the API we should adopt for this layer; as said earlier, having options via kwargs is fine. There's no reason to remove them if some people find them useful. The same goes for the Conv constructors, imo.

@mcabbott
Member

Yes, I think that once we allow more options than scale and mode, that's when keywords start to be needed. Doesn't hurt to allow them from the start, I guess.

The obvious precedent for size as a keyword is that Base does range(1, 2, length=100) and discourages things like range(1, 2, len=100) (partial word) or range(1, 2, outlength=100) (no added clarity, longer & harder to remember).

@DhairyaLGandhi
Member

which includes printing something copy-pastable.

Right, but if we use the function there, there's presumably an existing constructor, so I think that's doable.

Ultimately I feel relying on Interpolations.jl is the better longer-term approach, rather than redoing work, so hopefully we can bring that into our ecosystem.

@darsnack
Member

darsnack commented Feb 10, 2021

Yeah +1 for storing the interpolating function. It simplifies the code significantly, and I think any issues with show can be resolved since these aren't arbitrary functions. The constructor API should still be symbol based.

Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
@CarloLucibello
Member Author

I'll change the implementation to store the function internally. For the record, I think this change is just exploring the "who cares?" region of the design manifold, adding zero value to users and to us developers and wasting my time, but I'll soldier on and do it.

mcabbott
mcabbott previously approved these changes Feb 10, 2021
Member

@mcabbott mcabbott left a comment


Despite having suggested that, I agree it's not a big deal either way. And it doesn't change the user interface, so it could be done later by anyone who does care.

@CarloLucibello
Member Author

CarloLucibello commented Feb 10, 2021

Also, it's not clear how I should change the show method:

function Base.show(io::IO, u::Upsample{mode}) where {mode}
  print(io, "Upsample(")
  print(io, ":", mode)
  u.scale !== nothing && print(io, ", scale = $(u.scale)")
  u.size !== nothing && print(io, ", size = $(u.size)")
  println(io, ")")
end

If mode is now the function type instead of the symbol, then when printing I have to convert it back to the symbol, which is quite ugly.

@darsnack
Member

darsnack commented Feb 10, 2021

You could do:

function Base.show(io::IO, u::Upsample)
  print(io, "Upsample(")
  print(io, ":", nameof(u.mode))
  u.scale !== nothing && print(io, ", scale = $(u.scale)")
  u.size !== nothing && print(io, ", size = $(u.size)")
  println(io, ")")
end

I'll soldier on and do it

I also think that this change can easily be done in a separate PR. It doesn't change the desired API in any way. I'd be fine merging as is.

@DhairyaLGandhi
Member

DhairyaLGandhi commented Feb 10, 2021

It's alright to show typeof(func), but I agree you don't have to do it if you don't care to.

DhairyaLGandhi
DhairyaLGandhi previously approved these changes Feb 10, 2021
@DhairyaLGandhi
Member

I'm slightly confused, are we adding the positional scale constructor here already?

@CarloLucibello
Member Author

typeof(func) or related would yield upsample_nearest, but we want :nearest here, to obtain a string that represents valid code.

I'm slightly confused, are we adding the positional scale constructor here already?

yes, I added it, since I had to add a few changes from the comments in any case
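One way to get :nearest back from a stored function, so that show prints valid constructor code, is a small function-to-symbol lookup. This is a sketch with stub functions standing in for the NNlib kernels, and a hypothetical struct name to avoid implying it is the Flux implementation:

```julia
# Stubs standing in for NNlib's upsample kernels (illustrative only).
upsample_nearest(x, s) = x
upsample_bilinear(x, s) = x

# Illustrative lookup mapping the stored function back to its mode symbol.
const MODE_SYMBOL = Dict(upsample_nearest => :nearest,
                         upsample_bilinear => :bilinear)

struct UpsampleSketch{F}
    mode::F
    scale::Any
end

function Base.show(io::IO, u::UpsampleSketch)
    print(io, "Upsample(:", MODE_SYMBOL[u.mode])  # prints valid code, e.g. :nearest
    u.scale !== nothing && print(io, ", scale = ", u.scale)
    print(io, ")")
end

sprint(show, UpsampleSketch(upsample_nearest, 2))  # "Upsample(:nearest, scale = 2)"
```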

@CarloLucibello
Member Author

need another approval

mcabbott
mcabbott previously approved these changes Feb 11, 2021
Co-authored-by: Michael Abbott <32575566+mcabbott@users.noreply.github.com>
@DhairyaLGandhi
Member

bors r+

@bors
Contributor

bors bot commented Feb 11, 2021

Build succeeded:

@bors bors bot merged commit ddb5e9c into master Feb 11, 2021
@CarloLucibello CarloLucibello deleted the cl/upsample branch April 7, 2022 07:03