Fix #1556 #1557
Conversation
It needs some fixes to the tests to make sure we are getting the expected results. This is now technically a breaking change.
I don't think this is the correct way to fix #1556. It is not necessary to revert the constructor work. Beyond necessitating a breaking change, most of the stuff being reverted here is actually good work that should be kept. Let's not do drastic changes to the codebase for bug fixes if it is not necessary. If there's a desire to do a design refactor, then that should be done in a separate PR. Happy for us to have a design discussion, but in this case, we can release the appropriate fix without a design overhaul.
A proper fix for #1556 will involve just deleting the eltype check from create_bias.
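For context, a minimal sketch of what such a helper and the offending check might look like, assuming roughly the shape discussed above; this is a paraphrase for illustration, not the exact Flux source:

```julia
# Paraphrased sketch of a create_bias-style helper (not the exact Flux source).
# `bias` can be `true`/`false` or a user-supplied vector; `dims` is the expected size.
function create_bias(weights::AbstractArray, bias, dims::Integer...)
    bias === true  && return fill!(similar(weights, dims...), 0)  # trainable bias matching the weights
    bias === false && return false                                # no bias
    size(bias) == dims ||
        throw(DimensionMismatch("expected bias of size $(dims), got size $(size(bias))"))
    # The check under discussion: rejecting a bias whose eltype differs from the weights.
    # Deleting it is the "proper fix" suggested above.
    eltype(bias) == eltype(weights) ||
        throw(ArgumentError("eltype of bias does not match eltype of weights"))
    return bias
end
```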
I'll do a more minimal change. What specific changes would you like retained? I would also prefer some changes to our design to simplify its implementation. Removing …
My preference would be to keep everything except this check, which should just be deleted. One other possible change is the specification of … I do agree with you that we can/should think about a better design that doesn't require an internal utility to handle the bias argument, but my feeling is that it would involve a discussion on …
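For reference (and building on the sketch above), a rough illustration of how the bias keyword gets threaded from a Dense-style constructor into that internal utility; `MyDense` and the `init` default are made up for the example, and it reuses the `create_bias` sketch from the previous comment rather than Flux's own constructor:

```julia
# Simplified sketch of how a Dense-style keyword constructor threads `bias`
# through an internal helper (illustrative only; not the exact Flux constructor).
struct MyDense{W,B,F}
    weight::W
    bias::B
    σ::F
end

function MyDense(in::Integer, out::Integer, σ = identity;
                 init = (dims...) -> randn(Float32, dims...), bias = true)
    W = init(out, in)
    b = create_bias(W, bias, out)   # normalises true/false/vector into the stored field
    return MyDense(W, b, σ)
end

MyDense(3, 4; bias = true).bias     # 4-element Vector{Float32} of zeros
```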
It's possible there is more to revert than the change I suggested, but can we approach the PR from the opposite direction? i.e. restrict the changes to what I suggested, then slowly revert more as necessary?
No time today, but I agree that if we're trying to fix something quickly, the minimal change is best: that seems to be what's actually causing problems, right?
Dense calls W * x, are there other ndims(W) this can sensibly work with?
The thinking is this: removing the function from here basically means we don't have to worry about its behaviour in a corner case, and the package becomes more reliable for downstream users. We aren't losing features here either.
@darsnack the layer is back to where it was, and the tests are back to their earlier state as well.
Looks close to me. Can you test the snippet @ChrisRackauckas provided in #1556 to see if it is resolved?
> Dense calls W * x, are there other ndims(W) this can sensibly work with?

Probably not, but I guess our preference is to not pre-constrain. I think the change to <:AbstractArray makes sense. W * x will throw an error on the forward pass, or the output passed into the next layer will throw a dimension mismatch. As long as we don't have a silent error, then I think it might be better to leave it open.
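To illustrate the "no silent error" point, a small sketch (sizes made up for the example) showing that a mismatched matrix-vector product already fails loudly on its own, so nothing needs to be pre-constrained at construction time:

```julia
# Plain Julia: a mismatched matrix-vector product fails loudly rather than silently.
W = rand(Float32, 4, 3)   # 4×3 weight matrix
x = rand(Float32, 5)      # input of the wrong length

try
    W * x                 # what a Dense-style forward pass computes
catch err
    @show typeof(err)     # DimensionMismatch
end
```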
Co-authored-by: Kyle Daruwalla <daruwalla.k.public@icloud.com>
The MWE works.
Looks correct to me.
bors r+
1557: Fix #1556 r=DhairyaLGandhi a=DhairyaLGandhi

Reverts some Dense changes since those caused issues downstream. cc @ChrisRackauckas @darsnack @CarloLucibello

Co-authored-by: Dhairya Gandhi <dhairya@juliacomputing.com>
bors ping
pong
IMO this should be reverted. Changes orthogonal to fixing a pressing issue don't need to be rushed in.
This PR was included in a batch that successfully built, but then failed to merge into master (it was a non-fast-forward update). It will be automatically retried.
@@ -40,16 +40,17 @@ import Flux: activations
     @test Dense(rand(Float16, 100,10), true).bias isa Vector{Float16} # creates matching type
     @test_skip Dense(rand(Float16, 100,10), rand(100)).bias isa Vector{Float16} # converts to match

-    @test Dense(3,4; init=Base.randn, bias=true).bias isa Vector{Float64}
+    @test_skip Dense(3,4; init=Base.randn, bias=true).bias isa Vector{Float64}
Is this now producing a Float32 bias? That's no good.
Since there is no indication of the type of the bias, it makes sense to produce a single-precision bias. Restoring backwards compatibility with the old behaviour is also addressed in #1559.
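A short REPL-style sketch of the behaviour under discussion; the exact eltypes depend on the Flux version, and the Float32 bias reflects the state after this PR (hence the @test_skip in the diff above):

```julia
using Flux

# Bias built from `true` follows the weight eltype when a weight matrix is given:
d1 = Dense(rand(Float16, 100, 10), true)
eltype(d1.bias)    # Float16

# With only an init function there is no explicit bias type; after this PR the
# bias comes out as Float32 even though Base.randn yields Float64 weights.
# #1559 is expected to restore the old Float64 behaviour.
d2 = Dense(3, 4; init=Base.randn, bias=true)
eltype(d2.weight)  # Float64
eltype(d2.bias)    # Float32 (behaviour under discussion)
```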
Reverts some Dense changes since those caused issues downstream. cc @ChrisRackauckas @darsnack @CarloLucibello