Currently, `NativeLstm2` supports the special options `rec_weight_dropout` and `rec_weight_dropout_shape` (they can be passed via `unit_opts` to `RecLayer`).
There should be a generic way to do this for every layer.
Related is the `param_variational_noise` option, which is both a layer option and a global option.
Weight dropout could probably be added in the same way, e.g. as a `param_dropout` layer option.
However, that would not be exactly the same behavior, as `rec_weight_dropout` is currently applied only to the recurrent weight matrix and not to all params of the `RecLayer` (so not to the linear input transform matrix, and not to the bias).
So the new option needs to be somewhat more generic than that.
Also related is the `gradient_noise` option, which is a global config option and can also be passed per layer via `updater_opts`.
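To make the comparison concrete, here is a rough sketch of how these options might sit next to each other in a network dict. The layer names, dimensions and numeric values are made up for illustration, and `param_dropout` is only the proposal from this issue, not an existing option.

```python
# Illustrative network dict; layer names, sizes and values are assumptions.
network = {
    "lstm": {
        "class": "rec",
        "unit": "nativelstm2",
        "n_out": 512,
        "from": "data",
        # Existing option, specific to NativeLstm2, passed through unit_opts.
        "unit_opts": {"rec_weight_dropout": 0.1},
        # Existing generic per-layer options mentioned above.
        "param_variational_noise": 0.075,
        "updater_opts": {"gradient_noise": 0.1},
    },
    "output": {
        "class": "linear",
        "n_out": 512,
        "from": "lstm",
        # PROPOSED in this issue, does not exist yet (hypothetical name).
        "param_dropout": 0.1,
    },
}
```

The point of the sketch is only that `param_dropout` would sit at the layer level like `param_variational_noise`, rather than being hidden inside `unit_opts` of one specific unit.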
Note that all layers now allow passing variables as arguments, or there is a simple alternative (e.g. `DotLayer` instead of `LinearLayer`). With that, this becomes straightforward.
This is also how it is done on the returnn-common side, and it is how generic weight dropout would be implemented there.
I'm not sure we really need another new feature on the RETURNN side for this.
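As a framework-independent sketch of that idea: the weight is an explicit variable, dropout is applied to the variable itself, and the result is fed into a plain dot product. The code below uses pure Python rather than RETURNN or returnn-common, all function names are hypothetical, and the inverted-dropout scaling (dividing kept entries by the keep probability) is an assumption about the desired behavior.

```python
import random


def weight_dropout(weights, rate, rng, train=True):
    """Zero each weight entry with probability `rate`, rescaling kept entries.

    Hypothetical helper: it acts on the parameter, not on the activations.
    """
    if not train or rate == 0.0:
        return [row[:] for row in weights]
    keep = 1.0 - rate
    return [[w / keep if rng.random() < keep else 0.0 for w in row]
            for row in weights]


def dot(inputs, weights):
    """Plain matrix product: inputs [batch, n_in] times weights [n_in, n_out]."""
    return [[sum(x * w for x, w in zip(row, col)) for col in zip(*weights)]
            for row in inputs]


rng = random.Random(0)
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # explicit variable, [n_in, n_out]
inputs = [[1.0, 1.0, 1.0]]

# Training: the dot product sees a randomly masked copy of the variable.
train_out = dot(inputs, weight_dropout(weights, 0.5, rng))
# Evaluation: the variable is used unchanged.
eval_out = dot(inputs, weight_dropout(weights, 0.5, rng, train=False))
```

Since the dropout is applied to whichever variable is passed in, the same pattern would cover input transform matrices or biases without a dedicated option per layer.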