Skip to content

to_vec(::Diagonal): densify or not? #186

Closed
@mzgubic

Description

@mzgubic

This issue is just putting down some of my notes on why I think the current status is ok.

The question is whether to_vec(::Diagonal) should return only the vector along the diagonal, or the whole densified matrix. I.e. should it do

julia> v, b = to_vec(Diagonal([1, 2])); v # struct-like
2-element Vector{Int64}:
 1
 2

or

julia> v, b = to_vec(Diagonal([1, 2])); v # array-like
4-element Vector{Int64}:
 1
 0
 0
 2

in other words, do we treat it as a struct or as an array? At present, we densify it, which feels wrong in the sense that the off-diagonal zeros are structural, i.e. they can't (conceptually) be perturbed. Practically, when we perturb the off-diagonal elements, we just get no effect and we waste computation.

On the other hand, it is practical. Consider test_rrule(Symmetric, randn(3, 3), :U), and we want to test whether the rule accepts Diagonal tangents as well, which we do by test_rrule(Symmetric, randn(3, 3), :U; check_inferred=false, output_tangent=Diagonal(rand(3, 3))). If we do not densify the Diagonal, the test fails. The reason is that the finite differences jacobian is computed for the output type (Symmetric) vector (length 9), while the non-densified vector for a 3by3 Diagonal has length 3, which causes an awkward DimensionMismatch("second dimension of A, 9, does not match length of x, 3") error, which is highly likely confusing to a newcomer.

So the tradeoff seems to be sacrificing convenience or wasting computational power. In the 21st century convenience is sacred and computation is cheap, so keeping Diagonal densified is ok with me.


Densifying used to cause different (and quite common) problems when testing rules in ChainRulesTestUtils, e.g. for d=Diagonal(rand(3)); test_rrule(*, d, d) because the rand_tangent would make a Tangent{Diagonal}(diag=[1., 2., 3.]) (vector of length 3), whereas the primal was a Diagonal (vector of length 9). However these have been mostly fixed by adding the ProjectTo into rand_tangent in ChainRulesCore 1.0.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions