
Fix #1556 #1557

Merged: 18 commits into master, Mar 31, 2021
Conversation

@DhairyaLGandhi (Member) commented Mar 31, 2021

Reverts some Dense changes since those caused issues downstream. cc @ChrisRackauckas @darsnack @CarloLucibello

@DhairyaLGandhi (Member, Author)

It needs some fixes to the tests to make sure we are getting the expected results. This is technically a breaking change now.

@darsnack (Member) left a comment

I don't think this is the correct way to fix #1556. It is not necessary to revert the constructor work. Beyond necessitating a breaking change, most of the stuff being reverted here is actually good work that should be kept. Let's not make drastic changes to the codebase for bug fixes if it is not necessary. If there's a desire to do a design refactor, then that should be done in a separate PR. Happy for us to have a design discussion, but in this case, we can release the appropriate fix without a design overhaul.

A proper fix for #1556 will involve just deleting the eltype check from create_bias.
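
The suggested fix can be sketched as follows. This is an illustrative, simplified version of a create_bias-style helper — the name and signatures here are assumptions, not Flux's exact internals — with the eltype check simply omitted, so a user-supplied bias vector is returned unchanged:

```julia
# Illustrative sketch only: a simplified create_bias-style helper with the
# eltype check deleted, as suggested. Names and signatures are assumptions,
# not Flux's actual internals.
function make_bias(weights::AbstractArray, bias::Bool, dims::Integer...)
    # `true` allocates a zero bias whose array type matches the weights;
    # `false` means "no bias" and is passed through as-is.
    bias ? fill!(similar(weights, dims...), 0) : false
end

function make_bias(weights::AbstractArray, bias::AbstractArray, dims::Integer...)
    size(bias) == dims || throw(DimensionMismatch("expected bias of size $dims, got $(size(bias))"))
    bias  # returned unchanged: no eltype(weights) check or conversion
end
```

With this shape, a Float64 bias paired with Float32 weights is simply accepted, which is the downstream-friendly behaviour #1556 asks for.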

@darsnack (Member)

To be clear, it isn't create_bias or the Dense changes that caused this issue. create_bias was added in #1379, and the erroneous eltype check was added in #1440. All the changes in #1440 can be kept without the unnecessary check, and create_bias is reverted to its prior behavior.

@DhairyaLGandhi (Member, Author)

I'll do a more minimal change. What specific changes would you like retained?

I would also prefer some changes to our design to simplify its implementation. Removing create_bias from here doesn't seem too disruptive to me. Good call on fixing the bad check too.

@darsnack (Member)

My preference would be to keep everything except delete this check code and just return bias. We could remove create_bias, but doing so in a non-breaking way that keeps the current API means adding a bias isa Bool check to all the constructors that take a bias. create_bias should really be an internal utility, because that's how it is used.
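
To make the trade-off concrete, here is a rough sketch (with hypothetical names — this is not Flux's actual code) of the bias isa Bool handling that would have to be repeated in every layer constructor if create_bias were removed while keeping the bias = true/false/vector API:

```julia
# Hypothetical sketch: the branching each constructor would need to absorb
# if the create_bias utility were removed. Not Flux's actual code.
struct ToyDense{M<:AbstractMatrix, B}
    weight::M
    bias::B
end

function ToyDense(W::AbstractMatrix, bias = true)
    b = bias isa Bool ?
        (bias ? fill!(similar(W, size(W, 1)), 0) : false) :  # true => zeros matching W; false => no bias
        bias                                                  # a user-supplied vector passes through
    ToyDense(W, b)
end
```

Duplicating this branch across Dense, Conv, RNN, etc. is the maintenance cost that keeping a single internal utility avoids.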

One other possible change is the specification of W in Dense to be <:AbstractMatrix. My gut says this constraint should be lifted and allowed to be <:AbstractArray. But I haven't hunted down where and why that change happened, so maybe it is necessary. Possibly @CarloLucibello or @mcabbott can shed some light.

I do agree with you that we can/should think about a better design that doesn't require an internal utility to handle the bias argument. But my feeling is that it would involve a discussion on Flux.Zeros, so that's why I am suggesting opening a separate issue/PR for that refactor.

@darsnack (Member)

It's possible there is more to revert than the change I suggested, but can we approach the PR from the opposite direction? I.e., restrict the changes to what I suggested, then slowly revert more stuff as necessary?

@mcabbott (Member)

No time today but agree that if trying to fix something quickly, the minimal change is best:

> My preference would be to keep everything except delete this check code and just return bias.

That seems to be what's actually causing problems, right?

> One other possible change is the specification of W in Dense to be <:AbstractMatrix. My gut says this constraint should be lifted and allowed to be <:AbstractArray.

Dense calls W * x, are there other ndims(W) this can sensibly work with?

@DhairyaLGandhi (Member, Author)

The thinking is thus:

  1. remove the check in create_bias
  2. remove the conversion

Removing the function from here basically means we don't have to worry about its behaviour in a corner case, and the package becomes more reliable for downstream users. We aren't losing features here either.

@DhairyaLGandhi (Member, Author)

@darsnack the layer is back to where it was, and the tests are back to their earlier state as well.

@darsnack (Member) left a comment

Looks close to me. Can you test the snippet @ChrisRackauckas provided in #1556 to see if it is resolved?

> Dense calls W * x, are there other ndims(W) this can sensibly work with?

Probably not, but I guess our preference is to not pre-constrain. I think the change to <:AbstractArray makes sense. W * x will throw an error on the forward pass, or the output passed into the next layer will throw a dimension mismatch. As long as we don't have a silent error, I think it is better to leave it open.
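
The "fails loudly" point is easy to check in plain Julia: with the type constraint lifted, an ill-shaped weight array still errors on the forward pass, because there is no * method for, e.g., a 3-d array times a vector:

```julia
# Checking the "no silent error" claim: a 3-d weight array has no `*`
# method against a vector, so the forward pass throws a MethodError
# rather than silently mis-computing.
W3 = rand(2, 3, 4)
x  = rand(3)
failed_loudly = try
    W3 * x
    false
catch err
    err isa MethodError
end
```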

(3 resolved review comments on src/layers/basic.jl, now outdated)
@DhairyaLGandhi (Member, Author)

The MWE from #1556 works.

@darsnack (Member) left a comment

Looks correct to me.

(1 resolved review comment on test/utils.jl, now outdated)
@DhairyaLGandhi (Member, Author)

bors r+

bors bot added a commit that referenced this pull request Mar 31, 2021
1557: Fix #1556 r=DhairyaLGandhi a=DhairyaLGandhi

Reverts some Dense changes since those caused issues downstream. cc @ChrisRackauckas @darsnack @CarloLucibello 

Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>
@DhairyaLGandhi (Member, Author)

bors ping

@bors (Contributor) commented Mar 31, 2021

pong

@DhairyaLGandhi DhairyaLGandhi merged commit f352374 into master Mar 31, 2021
@mcabbott (Member)

IMO this should be reverted. Changes orthogonal to fixing a pressing issue don't need to be rushed in.

@bors (Contributor) commented Mar 31, 2021

This PR was included in a batch that successfully built, but then failed to merge into master (it was a non-fast-forward update). It will be automatically retried.

darsnack added a commit to darsnack/Flux.jl that referenced this pull request Mar 31, 2021
@darsnack darsnack mentioned this pull request Mar 31, 2021
```diff
@@ -40,16 +40,17 @@ import Flux: activations
 @test Dense(rand(Float16, 100,10), true).bias isa Vector{Float16} # creates matching type
 @test_skip Dense(rand(Float16, 100,10), rand(100)).bias isa Vector{Float16} # converts to match

-@test Dense(3,4; init=Base.randn, bias=true).bias isa Vector{Float64}
+@test_skip Dense(3,4; init=Base.randn, bias=true).bias isa Vector{Float64}
```
Member

Is this now producing a Float32 bias? This is no good.

Member Author

Since there is no indication of the type of the bias, it makes sense to produce a single-precision bias. This is also addressed in #1559, to restore backwards compatibility with the previous behaviour.
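
A small illustration of the behaviour under discussion, in plain Julia rather than Flux's actual code: a bias allocated from the weight array via similar inherits the eltype of a Float64 init like Base.randn, whereas a bias created independently with a single-precision default does not:

```julia
# Illustration only (not Flux's code): eltype propagation from the weights
# versus an independent single-precision default.
W = randn(4, 3)                        # Float64 weights, as with init=Base.randn
b_matching = fill!(similar(W, 4), 0)   # derived from W: Vector{Float64}
b_default  = zeros(Float32, 4)         # independent default: Vector{Float32}
```

The skipped test in the diff above expects the first behaviour; the merged code produces the second.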

@DhairyaLGandhi DhairyaLGandhi mentioned this pull request Apr 1, 2021
bors bot added a commit that referenced this pull request Apr 1, 2021
1561: Reverts #1557 r=DhairyaLGandhi a=DhairyaLGandhi

Reverts #1557 

Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>
@CarloLucibello CarloLucibello deleted the dg/den branch April 7, 2022 07:03