Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Glorot initialization, add He initialization #937

Merged
merged 6 commits into from
Nov 26, 2019

Conversation

Sleort
Copy link
Contributor

@Sleort Sleort commented Nov 19, 2019

Should fix #442 .
Adds He weight initialization as a bonus :-)

Should fix the issue reported at FluxML#442 .
Adds He weight initialization as a bonus :-)
Should fix FluxML#442 .
Adds He weight initialization as a bonus :-)
@MikeInnes
Copy link
Member

Best to split this into two PRs to make it easier to review and merge.

@Sleort
Copy link
Contributor Author

Sleort commented Nov 20, 2019

I couldn't figure out how to split the PR into two, so I deleted the He initialization stuff (for now).

(And sorry for the mess. I'm not yet completely familiar/comfortable with how Git(hub) works...)

@MikeInnes
Copy link
Member

No worries. Splitting the commit is possible but it's usually easier just to delete code and/or copy-paste it somewhere else for the he PR, so that's the right thing to do.

@Sleort
Copy link
Contributor Author

Sleort commented Nov 22, 2019

Anything else I should do to get this PR reviewed / merged?

@MikeInnes
Copy link
Member

I'm a bit uncomfortable with the heuristic this uses, but at least it has good precedent in other libraries. We can always redesign in future if needed.

Thanks @Sleort!

bors r+

bors bot added a commit that referenced this pull request Nov 26, 2019
937: Fix Glorot initialization, add He initialization r=MikeInnes a=Sleort

Should fix #442 .
Adds He weight initialization as a bonus :-)

Co-authored-by: Troels Arnfred Bojesen <tr-ab@online.no>
@bors
Copy link
Contributor

bors bot commented Nov 26, 2019

Build succeeded

@bors bors bot merged commit 3f97701 into FluxML:master Nov 26, 2019
@Sleort
Copy link
Contributor Author

Sleort commented Nov 27, 2019

I'm a bit uncomfortable with the heuristic this uses, but at least it has good precedent in other libraries. We can always redesign in future if needed.

I agree (see the discussion at #442). A more "proper" way of treating this would, however, require breaking changes, since we would have to dispatch on layer type in addition to dimensionality. Actually, ideally we should maybe make the initialization depend on the activation function as well, such that the "right" initialization could be automatically selected (Glorot for tanh, He for relu, etc.)...?

@MikeInnes
Copy link
Member

We don't generally consider changing a layer's default initialisation to be breaking, so we could make the defaults smarter at any time, even if it isn't exposed as an API. A more general user-accessible initialisation API would be nice to have though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Glorot initialization not correctly implemented
2 participants