Generic weight dropout #735

albertz · 2021-11-05T09:57:12Z

Currently, NativeLstm2 supports the special options rec_weight_dropout and rec_weight_dropout_shape (which can be passed via unit_opts to RecLayer).
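For reference, a minimal sketch of how this is passed in a network dict today. The layer name, n_out and the dropout rate are placeholders chosen for illustration; rec_weight_dropout_shape is left out because its expected format is not described here.

```python
# Sketch of a RETURNN network dict entry; layer name and values are placeholders.
network = {
    "lstm": {
        "class": "rec",
        "unit": "nativelstm2",
        "from": "data",
        "n_out": 512,  # hypothetical layer size
        # Options forwarded to the NativeLstm2 unit:
        "unit_opts": {"rec_weight_dropout": 0.3},  # hypothetical dropout rate
    },
}
```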

There should be a generic way to do this for every layer.

Related is the param_variational_noise option, which is both a layer option and a global option.

Probably it should just be added in the same way, i.e. as a param_dropout layer option or similar.
However, that would not be exactly the same behavior, as rec_weight_dropout is currently applied only to the recurrent weight matrix and not to all params of the RecLayer (so not to the linear input transform matrix, and also not to the bias).
So it needs to be a bit more generic than that, e.g. with a way to select which params it applies to.
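To illustrate the difference between "all params" and "only the recurrent weight matrix", here is a small framework-free sketch. The names weight_dropout, apply_param_dropout, select, W_in, W_rec and bias are hypothetical and not RETURNN API, and the rescaling by 1 / keep_prob (inverted dropout) is an assumption about the intended behavior.

```python
import random


def weight_dropout(weights, drop_prob, rng, train=True):
    """Zero each weight with probability drop_prob and rescale the kept ones."""
    if not train or drop_prob == 0.0:
        return [row[:] for row in weights]  # unchanged copy outside training
    keep_prob = 1.0 - drop_prob
    return [
        [w / keep_prob if rng.random() < keep_prob else 0.0 for w in row]
        for row in weights
    ]


def apply_param_dropout(params, drop_prob, rng, select=lambda name: True, train=True):
    """Apply weight dropout only to the params whose name passes `select`."""
    return {
        name: weight_dropout(value, drop_prob, rng, train) if select(name) else value
        for name, value in params.items()
    }


# Hypothetical parameter names for a recurrent layer (matrices as lists of rows).
params = {
    "W_in": [[0.1, 0.2], [0.3, 0.4]],   # linear input transform
    "W_rec": [[0.5, 0.6], [0.7, 0.8]],  # recurrent weights
    "bias": [[0.0, 0.0]],
}
rng = random.Random(0)

# Mimics the current rec_weight_dropout scope: only the recurrent matrix is touched.
dropped = apply_param_dropout(params, 0.5, rng, select=lambda name: name == "W_rec")
```

With the default select, every param would be affected, which is what a plain param_dropout option in the style of param_variational_noise would presumably do.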

Related is also the gradient_noise option, which is a global config option and can also be passed per layer via updater_opts.
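As a point of comparison, the two related options could appear in a config roughly as follows. This is only a sketch: the layer name and all numeric values are placeholders, and how the per-layer and global values interact is not shown here.

```python
# Global config options (values are placeholders):
param_variational_noise = 0.075
gradient_noise = 0.1

network = {
    "output": {
        "class": "linear",
        "from": "data",
        "n_out": 512,  # hypothetical layer size
        "activation": None,
        # Per-layer counterparts of the global options above:
        "param_variational_noise": 0.05,
        "updater_opts": {"gradient_noise": 0.0},
    },
}
```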

albertz · 2022-03-04T09:00:32Z

Note that all layers now either allow passing variables as arguments or have a simple alternative that does (e.g. DotLayer instead of LinearLayer). With that, this becomes straightforward.

This is also how it is done on the returnn-common side, and this is how generic weight dropout would be implemented there.

I'm not sure that we really need another new feature on the RETURNN side for this.
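A rough sketch of what this could look like in a network dict: the weight is an explicit variable layer, dropout is applied to it like to any other layer output, and a DotLayer replaces the LinearLayer. Layer names, the shape and the dropout rate are placeholders, and the axis options of the dot layer are deliberately omitted since they depend on the RETURNN version and on the input dimensions.

```python
# Sketch only: names, shape and rate are hypothetical.
network = {
    # Explicit weight matrix instead of the implicit param of a "linear" layer.
    "W": {"class": "variable", "shape": [40, 512]},
    # Weight dropout is just regular dropout applied to the variable.
    "W_dropped": {"class": "dropout", "from": "W", "dropout": 0.1},
    # Matrix product of the input with the dropped weights.
    "output": {
        "class": "dot",
        "from": ["data", "W_dropped"],
        # Options selecting the axis to reduce over go here (omitted).
    },
}
```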

albertz mentioned this issue Nov 5, 2021

How to define the API for parameter initialization, regularization (L2, weight dropout, etc), maybe updater opts per-param rwth-i6/returnn_common#59

Closed

albertz closed this as completed Mar 4, 2022

albertz mentioned this issue May 24, 2024

RF weight dropout and variational noise #1518

Closed
