LayerList usage and gradient anomaly #69174

@johnyanccer

Description

Please ask your question

[Figure "grad": per-layer parameter gradient L2 norm over training steps]

The vertical axis is the L2 norm of each parameter's gradient, and the horizontal axis is the training step. The input passes through layer0 to layer9 in sequence and then through a classifier to produce the output, from which the loss is computed; no parameters are shared.

As the figure shows, layers close to the input have large gradients while layers close to the output have small gradients. I suspect my LayerList usage is wrong, but following the documentation and the transformer.py style gives the same result.
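For reference, a curve like the one in the figure can be produced by logging the L2 norm of each layer's gradient after every backward pass. The following is only a minimal sketch of that measurement; the stand-in model, data, and hyperparameters (StackedLinear, width 10, SGD, random labels) are illustrative assumptions, not taken from this issue.

import paddle

class StackedLinear(paddle.nn.Layer):
    """Stand-in for the model described above: layer0..layer9, then a classifier."""

    def __init__(self, width=10, num_classes=2):
        super().__init__()
        self.layers = paddle.nn.LayerList(
            [paddle.nn.Linear(width, width) for _ in range(10)])
        self.classifier = paddle.nn.Linear(width, num_classes)

    def forward(self, x):
        for layer in self.layers:  # plain sequential pass, no shared parameters
            x = layer(x)
        return self.classifier(x)

model = StackedLinear()
opt = paddle.optimizer.SGD(learning_rate=0.1, parameters=model.parameters())
loss_fn = paddle.nn.CrossEntropyLoss()

for step in range(100):
    x = paddle.randn([32, 10])
    y = paddle.randint(0, 2, [32])
    loss = loss_fn(model(x), y)
    loss.backward()
    # The figure's vertical axis: per-layer gradient L2 norm at this step.
    for i, layer in enumerate(model.layers):
        print(f"step {step} layer{i} grad_l2 "
              f"{paddle.linalg.norm(layer.weight.grad).item():.4f}")
    opt.step()
    opt.clear_grad()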

I am currently following the LayerList documentation example:

import paddle

class MyLayer(paddle.nn.Layer):
    def __init__(self):
        super().__init__()
        self.linears = paddle.nn.LayerList(
            [paddle.nn.Linear(10, 10) for i in range(10)])

    def forward(self, x):
        # LayerList can act as an iterable, or be indexed using ints
        for i, l in enumerate(self.linears):
            x = self.linears[i // 2](x) + l(x)
        return x
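The forward in this documentation example deliberately mixes indexing and iteration (self.linears[i // 2](x) + l(x)) just to illustrate LayerList access; the LayerList construction itself only registers the sublayers, and gradient magnitudes are determined by the forward computation and the loss. A quick sanity check (an assumption for illustration, not from the issue), using the MyLayer class above:

import paddle

m = MyLayer()
# All ten Linear sublayers are registered under the LayerList.
print([name for name, _ in m.named_sublayers()])  # 'linears', 'linears.0', ..., 'linears.9'
# One weight and one bias per Linear, and none of them are shared.
print(len(m.parameters()))  # 20
assert len({id(p) for p in m.parameters()}) == len(m.parameters())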

However, nn/layer/transformer.py writes it as follows:

    def __init__(self, encoder_layer, num_layers, norm=None):
        super().__init__()
        self.layers = LayerList(
            [
                (
                    encoder_layer
                    if i == 0
                    else type(encoder_layer)(**encoder_layer._config)
                )
                for i in range(num_layers)
            ]
        )
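This construction reuses encoder_layer as layer 0 and builds the remaining layers as fresh instances from the same config, so it ends up equivalent to the documentation style: every layer owns its own parameters. As a hedged usage sketch (the dimensions and layer count below are arbitrary assumptions, not from the issue), the public paddle.nn.TransformerEncoder, which is built this way internally, shows no shared parameters either:

import paddle

# Build a 4-layer encoder the same way transformer.py does internally.
enc_layer = paddle.nn.TransformerEncoderLayer(d_model=16, nhead=2, dim_feedforward=32)
encoder = paddle.nn.TransformerEncoder(enc_layer, num_layers=4)

# Every layer has its own parameters; nothing is shared between layers.
ids = [id(p) for p in encoder.parameters()]
assert len(set(ids)) == len(ids)
print(len(ids), "independent parameters across 4 encoder layers")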
