Weight normalization (Salimans & Kingma, 2016) reparameterizes each weight vector as `w = g * v / ‖v‖`, so that the magnitude `g` and direction `v` are optimized instead of the weights directly. The authors provide a reference implementation in Python using Theano, and the functionality is available in PyTorch as a hook (`torch.nn.utils.weight_norm`).
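Concretely, the reparameterization is just a row-wise rescaling; a minimal sketch in Julia (the helper names here are mine, not from the paper's code):

```julia
# Row-wise weight normalization: each row of v gives the direction of one
# output unit's weight vector, and g gives its magnitude.
row_norms(v) = sqrt.(sum(abs2, v; dims = 2))      # ‖v‖ per row, shape (out, 1)
effective_weight(g, v) = g .* v ./ row_norms(v)   # w = g * v / ‖v‖
```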
I've toyed just a tiny bit with implementing this, but I think my knowledge of Flux's internals is insufficient. I was looking at implementing it as a layer that wraps another layer, whose trainable params are the `g` and `v` parameterization of the wrapped layer's weights (see the sketch below). Perhaps there's a better way.
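For concreteness, here is a rough sketch of that wrapper idea specialized to `Dense`, assuming a recent Flux where `Dense` has `weight`/`bias`/`σ` fields; `WeightNormDense` and its layout are hypothetical, not an existing Flux API, and a general version would need to handle other layer types:

```julia
using Flux

# Hypothetical wrapper (not an existing Flux API): stores a Dense layer's
# weight in its (g, v) parameterization and rebuilds w = g * v / ‖v‖
# row-wise on every forward pass, so gradients flow to g and v.
struct WeightNormDense{M,V,B,F}
    v::M      # unnormalized directions, same shape as the weight matrix
    g::V      # one magnitude per output unit (per row of v)
    bias::B
    σ::F
end

Flux.@functor WeightNormDense   # expose v, g, and bias as parameters

# Initialize from an existing Dense so the effective weight initially
# equals the original weight (i.e. g = ‖v‖ row-wise).
function WeightNormDense(d::Dense)
    v = copy(d.weight)
    g = vec(sqrt.(sum(abs2, v; dims = 2)))
    WeightNormDense(v, g, copy(d.bias), d.σ)
end

function (l::WeightNormDense)(x)
    w = l.g .* l.v ./ sqrt.(sum(abs2, l.v; dims = 2))  # reconstruct w
    l.σ.(w * x .+ l.bias)
end
```

With this, gradient calls would see `g`, `v`, and the bias instead of the original weight, and `WeightNormDense(Dense(3 => 2))` would behave identically to the wrapped layer at initialization.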