[Performance] Speed of different LayerNorm2d implementations

In MMCLS, we use `permute` + `F.layer_norm` to implement `LayerNorm2d`.
https://github.com/open-mmlab/mmclassification/blob/d2e505415040bf5329ab218bb6fe3d899f176cd5/mmcls/models/backbones/convnext.py#L35-L40
However, the official ConvNeXt repo uses a more intuitive implementation that computes the mean and variance over the channel dimension directly.
```python
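        # Excerpt: channels_first branch of the official LayerNorm.forward
        elif self.data_format == "channels_first":
            u = x.mean(1, keepdim=True)  # mean over the channel dim
            s = (x - u).pow(2).mean(1, keepdim=True)  # biased variance over the channel dim
            x = (x - u) / torch.sqrt(s + self.eps)
            x = self.weight[:, None, None] * x + self.bias[:, None, None]  # per-channel affine
            return x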
```
We need a speed comparison between the two implementations.
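As a starting point, here is a minimal sketch of such a comparison. It is not the MMCLS or ConvNeXt code itself: `layer_norm2d_permute`, `layer_norm2d_manual` and `benchmark` are hypothetical standalone functions written for this issue, and the input shape `(8, 96, 56, 56)` and iteration counts are arbitrary assumptions. It only times the forward pass; backward speed and memory usage would need separate measurements.

```python
import time

import torch
import torch.nn.functional as F


def layer_norm2d_permute(x, weight, bias, eps=1e-6):
    """permute + F.layer_norm on an (N, C, H, W) tensor."""
    x = x.permute(0, 2, 3, 1)  # (N, C, H, W) -> (N, H, W, C)
    x = F.layer_norm(x, (x.shape[-1],), weight, bias, eps)
    return x.permute(0, 3, 1, 2)  # back to (N, C, H, W)


def layer_norm2d_manual(x, weight, bias, eps=1e-6):
    """Explicit mean/variance over the channel dim, as in the snippet above."""
    u = x.mean(1, keepdim=True)
    s = (x - u).pow(2).mean(1, keepdim=True)
    x = (x - u) / torch.sqrt(s + eps)
    return weight[:, None, None] * x + bias[:, None, None]


def benchmark(fn, x, weight, bias, warmup=3, iters=10):
    """Return the average forward time in milliseconds."""
    with torch.no_grad():
        for _ in range(warmup):
            fn(x, weight, bias)
        if x.is_cuda:
            torch.cuda.synchronize()  # wait for queued kernels before timing
        start = time.perf_counter()
        for _ in range(iters):
            fn(x, weight, bias)
        if x.is_cuda:
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters * 1e3


device = "cuda" if torch.cuda.is_available() else "cpu"
torch.manual_seed(0)
x = torch.randn(8, 96, 56, 56, device=device)  # assumed input shape
weight = torch.randn(96, device=device)
bias = torch.randn(96, device=device)

# Both variants should agree numerically before their speed is compared.
out_permute = layer_norm2d_permute(x, weight, bias)
out_manual = layer_norm2d_manual(x, weight, bias)
print("max abs diff:", (out_permute - out_manual).abs().max().item())

for name, fn in [("permute + F.layer_norm", layer_norm2d_permute),
                 ("manual mean/var", layer_norm2d_manual)]:
    print(f"{name}: {benchmark(fn, x, weight, bias):.3f} ms / forward")
```

The results will likely depend on device, PyTorch version, dtype and input shape, so it is worth running this over the feature-map sizes of each ConvNeXt stage rather than a single shape.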

[Performance] Speed of different LayerNorm2d implementations #931
