Description
In MMCLS, we use `permute` + `F.layer_norm` to implement `LayerNorm2d`.
https://github.com/open-mmlab/mmclassification/blob/d2e505415040bf5329ab218bb6fe3d899f176cd5/mmcls/models/backbones/convnext.py#L35-L40
However, the official ConvNeXt repo uses a more intuitive implementation that computes the mean and variance over the channel dimension directly:
    elif self.data_format == "channels_first":
        u = x.mean(1, keepdim=True)
        s = (x - u).pow(2).mean(1, keepdim=True)
        x = (x - u) / torch.sqrt(s + self.eps)
        x = self.weight[:, None, None] * x + self.bias[:, None, None]
        return x
We need a speed comparison between the two implementations.
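A possible starting point for that comparison is sketched below. It is not taken from either repo: the function names `layer_norm_permute` and `layer_norm_manual`, the helper `benchmark`, the `eps=1e-6` default, and the input shape are all assumptions made for illustration. The first function follows the permute + `F.layer_norm` approach described above, and the second mirrors the channels_first snippet.

```python
import time

import torch
import torch.nn.functional as F


def layer_norm_permute(x, weight, bias, eps=1e-6):
    # (N, C, H, W) -> (N, H, W, C), normalize over C, then permute back.
    x = x.permute(0, 2, 3, 1)
    x = F.layer_norm(x, (x.shape[-1],), weight, bias, eps)
    return x.permute(0, 3, 1, 2)


def layer_norm_manual(x, weight, bias, eps=1e-6):
    # Mean and biased variance over the channel dimension (dim 1).
    u = x.mean(1, keepdim=True)
    s = (x - u).pow(2).mean(1, keepdim=True)
    x = (x - u) / torch.sqrt(s + eps)
    return weight[:, None, None] * x + bias[:, None, None]


def benchmark(fn, *args, warmup=3, iters=20):
    # Hypothetical helper: average wall-clock seconds per call.
    for _ in range(warmup):
        fn(*args)
    if args[0].is_cuda:
        torch.cuda.synchronize()  # wait for queued GPU kernels
    start = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if args[0].is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters


torch.manual_seed(0)
device = "cuda" if torch.cuda.is_available() else "cpu"
# Assumed input shape; adjust to the feature map sizes of interest.
x = torch.randn(8, 96, 56, 56, device=device)
weight = torch.randn(96, device=device)
bias = torch.randn(96, device=device)

out_permute = layer_norm_permute(x, weight, bias)
out_manual = layer_norm_manual(x, weight, bias)
print("max abs diff:", (out_permute - out_manual).abs().max().item())

with torch.no_grad():
    print("permute + F.layer_norm:", benchmark(layer_norm_permute, x, weight, bias))
    print("manual mean/var:       ", benchmark(layer_norm_manual, x, weight, bias))
```

This only times the forward pass. A fuller comparison would probably also cover the backward pass, several input shapes, and both CPU and GPU, since the relative speed may differ between them.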