
Better initialization support #670

Closed
@domluna

Description


This is very nice, thanks!

It would be useful to open an issue to discuss the need for the Linear layer here. Hopefully we can make the builtins more flexible so this kind of thing is less necessary.

Originally posted by @MikeInnes in FluxML/model-zoo#115 (comment)

The primary need for a new Linear type was that the bias initializer only takes the output dimension. That is intuitive, but problematic, since some bias initializations rely on more than the output dimension. For example, PyTorch's default nn.Linear layer scales the bias initialization by the input dimension. Relevant code:

def reset_parameters(self):
    init.kaiming_uniform_(self.weight, a=math.sqrt(5))
    if self.bias is not None:
        fan_in, _ = init._calculate_fan_in_and_fan_out(self.weight)
        bound = 1 / math.sqrt(fan_in)
        init.uniform_(self.bias, -bound, bound)
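To make the dependency concrete, here is a standalone NumPy sketch (not PyTorch itself; the function name and shapes are illustrative) of the same scheme: a Kaiming-uniform weight with a = sqrt(5), and a bias drawn from U(-1/sqrt(fan_in), 1/sqrt(fan_in)). Note both bounds need fan_in, i.e. the *input* dimension, which an initializer receiving only the output dimension cannot compute.

```python
import math
import numpy as np

def reset_parameters(out_features, in_features, rng=np.random.default_rng(0)):
    """Sketch of PyTorch-style Linear init using NumPy (illustrative only)."""
    # Kaiming uniform with a = sqrt(5), as in reset_parameters above:
    # gain = sqrt(2 / (1 + a^2)); bound = gain * sqrt(3 / fan_in)
    a = math.sqrt(5)
    gain = math.sqrt(2.0 / (1.0 + a ** 2))
    w_bound = gain * math.sqrt(3.0 / in_features)
    weight = rng.uniform(-w_bound, w_bound, size=(out_features, in_features))
    # The bias bound depends on fan_in, the input dimension -- this is the
    # information a bias init taking only the output dimension lacks.
    b_bound = 1.0 / math.sqrt(in_features)
    bias = rng.uniform(-b_bound, b_bound, size=out_features)
    return weight, bias

weight, bias = reset_parameters(4, 16)
```

With in_features = 16, every bias entry lands in [-0.25, 0.25], a range that changes as the input dimension changes, independent of the output dimension.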
