
Generic weight dropout #735

Closed
albertz opened this issue Nov 5, 2021 · 1 comment


albertz commented Nov 5, 2021

Currently NativeLstm2 supports the special options rec_weight_dropout and rec_weight_dropout_shape, which can be passed via unit_opts to RecLayer.

There should be a generic way to do this for every layer.

Related is the param_variational_noise option, which is both a layer option and a global option.

Probably it should just be added in the same way, i.e. as a param_dropout layer option or similar.
However, that would not give exactly the same behavior: rec_weight_dropout is currently applied only to the recurrent weight matrix, not to all params of the RecLayer (so not to the linear input transform matrix, and also not to the bias).
So we need to make it a bit more generic still.

Related is also the gradient_noise option, which is a global config option and can also be passed per layer via updater_opts.
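
For reference, a minimal net-dict sketch of the current mechanism. The layer name, dims and dropout value below are just placeholders for illustration.

```python
# Current, NativeLstm2-specific mechanism: rec_weight_dropout is passed
# through unit_opts of the RecLayer and only affects the recurrent weight matrix.
network = {
    "lstm": {
        "class": "rec",
        "unit": "nativelstm2",
        "unit_opts": {"rec_weight_dropout": 0.1},  # placeholder dropout rate
        "n_out": 512,  # placeholder dim
        "from": "data",
    },
}
```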


albertz commented Mar 4, 2022

Note that all layers now allow passing variables as arguments, or there is a simple alternative (e.g. DotLayer instead of LinearLayer). Then this becomes straightforward.

This is also how it is done on the returnn-common side, and this is how generic weight dropout would be implemented there.

I'm not sure that we really need another new feature on RETURNN side for this.
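
To illustrate what such an explicit construction computes (this is not RETURNN API, just a plain TensorFlow sketch of the operation one would wire up with a variable layer, dropout and DotLayer, or with returnn-common code):

```python
import tensorflow as tf


def matmul_with_weight_dropout(x, weight, rate, training):
    """Dropout on the weight matrix itself (not on the activations), followed by
    the linear transform. With this explicit construction, which params get
    dropped (recurrent weight matrix only, input transform, bias, ...) is fully
    under the user's control."""
    if training and rate:
        weight = tf.nn.dropout(weight, rate=rate)  # drops individual weight entries
    return tf.matmul(x, weight)
```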
