Matrix multiplication logic poor for ForwardDiff.Dual #513

Open
@ryanelandt

StaticArrays uses heuristics to decide which code to generate for matrix multiplication. There seems to be one heuristic for BlasFloat and another for everything else (i.e. Any). The present Any heuristic makes bad choices for Dual: in the 4×4 benchmark below, the default `A * B` takes 1.376 μs, the same as `mul_unrolled` (1.382 μs), while `mul_loop` takes 614 ns. I think it would be straightforward to create a better heuristic for ForwardDiff.Dual by including the number of partials in the heuristic. The obvious issue is that this would require ForwardDiff to be a dependency of StaticArrays. Is there a good way to get this performance issue fixed?

```julia
using BenchmarkTools
using ForwardDiff
using StaticArrays

Type_Dual = ForwardDiff.Dual{Float64,Float64,26}

A = rand(SMatrix{4,4,Type_Dual,16})
B = rand(SMatrix{4,4,Type_Dual,16})

@btime $A * $B  # DEFAULT
# 1.376 μs (0 allocations: 0 bytes)

@btime StaticArrays.mul_loop($(Size(A)),$(Size(B)),$A,$B)
# 614.142 ns (0 allocations: 0 bytes)

@btime StaticArrays.mul_unrolled_chunks($(Size(A)),$(Size(B)),$A,$B)
# 688.962 ns (0 allocations: 0 bytes)

@btime StaticArrays.mul_unrolled($(Size(A)),$(Size(B)),$A,$B)
# 1.382 μs (0 allocations: 0 bytes)
```
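
One possible way to include the number of partials without making ForwardDiff a dependency would be a small trait function that element types can overload. The sketch below is only an illustration under assumptions: `scalar_width`, `pick_mul`, `FakeDual`, and the threshold of 256 are all hypothetical names and values, not part of StaticArrays or ForwardDiff, and `FakeDual` merely stands in for `ForwardDiff.Dual` so the snippet runs without any packages.

```julia
# Hypothetical trait: how many underlying scalars one element of type T carries.
# Neither StaticArrays nor ForwardDiff defines this; the name is made up here.
scalar_width(::Type) = 1

# Stand-in for ForwardDiff.Dual{T,V,N} so that the sketch needs no packages.
struct FakeDual{N}
    value::Float64
    partials::NTuple{N,Float64}
end

# A package such as ForwardDiff could add a method like this one on its side.
scalar_width(::Type{FakeDual{N}}) where {N} = N + 1  # value plus N partials

# Hypothetical selection rule for an (m x k) * (k x n) product.
# The threshold 256 is an assumption for illustration, not a tuned value.
function pick_mul(::Type{T}, m::Int, k::Int, n::Int) where {T}
    work = m * k * n * scalar_width(T)
    return work > 256 ? :mul_loop : :mul_unrolled
end

pick_mul(Float64, 4, 4, 4)       # 64 * 1 = 64, so :mul_unrolled
pick_mul(FakeDual{26}, 4, 4, 4)  # 64 * 27 = 1728, so :mul_loop
```

With a hook like this, the heuristic would see a 4×4 matrix of 26-partial duals as far more work than a 4×4 matrix of Float64, which is the distinction the benchmark above suggests it is currently missing.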
