
Adjoints for regularizers? #1575

Closed
@caseykneale


I'm not sure if this is the right way to go about it, so I'd like to ask what you all think. Would it make sense to add some adjoints for regularizers, and/or attach them to specific layers? For example:

```julia
using Zygote: @adjoint

# Add the L1 subgradient term λ·sign(x), and zero the gradient for
# weights already below the threshold (a crude hard-threshold mask).
L1(Δ, x, λ) = (Δ .+ λ .* sign.(x)) .* (abs.(x) .> λ)

# Identity on the forward pass; injects the L1 term on the backward pass.
L1hook(x, λ) = x
@adjoint L1hook(x, λ) = x, Δ -> (L1(Δ, x, λ), nothing)
```

Note: this isn't a perfect lasso representation; I think it's missing a term based on the optimizer's learning rate, but it's a quick demonstrative hack and works if one is mindful of the magnitude of their independent variables.

```julia
# Sketch of a Dense-like layer whose weight gradient passes through the hook:
function L1Dense(...)
    σ.(L1hook(W, λ) * x .+ b)
end
```

For L2 it's not as big of a deal, but maybe there are other cases where baking this kind of capability into some layers is worthwhile? Open to feedback, and willing to make a PR if it's deemed a reasonable suggestion.
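For reference, the missing learning-rate term mentioned above is what the exact proximal form of the lasso supplies: after the plain gradient step, each weight is shrunk toward zero by `lr * λ` (soft-thresholding). A minimal NumPy sketch of that idea, with all names illustrative rather than any Flux API:

```python
import numpy as np

def soft_threshold(w, t):
    """Proximal operator of t * ||w||_1: shrink each weight toward zero by t."""
    return np.sign(w) * np.maximum(np.abs(w) - t, 0.0)

def proximal_sgd_step(w, grad, lr, lam):
    """One proximal gradient step for loss + lam * ||w||_1.

    The gradient step uses only the smooth loss gradient; the L1 part is
    handled exactly by the shrinkage, whose threshold lr * lam is where the
    learning rate enters.
    """
    return soft_threshold(w - lr * grad, lr * lam)

w = np.array([0.5, -0.03, 0.2, -0.8])
g = np.zeros_like(w)  # zero loss gradient, so only the shrinkage acts
w_new = proximal_sgd_step(w, g, lr=0.1, lam=0.5)
# weights with |w| <= lr*lam (= 0.05) land exactly at zero
```

Unlike the hard-threshold mask in the hook above, this produces exact zeros without relying on the weight magnitudes staying well-scaled.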
