Complicated lazy broadcasting slower than equivalent single broadcast #56629

Open
@mcabbott

This example from Discourse shows a slowdown when a moderately complicated expression is broadcast operation by operation, compared with broadcasting a single function containing the same expression:

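using BenchmarkTools  # provides @btime

# The whole expression written as one fused broadcast via @.
arrayfun!(C, A, B) = @. C = A^2 + B^2 + A * B + A / B - A * B - A / B + A * B + A / B - A * B - A / B
# The same expression as a scalar function, to be broadcast in a single call
scalarfun(A::Real, B::Real) = A^2 + B^2 + A * B + A / B - A * B - A / B + A * B + A / B - A * B - A / B

let N = 151
    A, B, C1, C2 = (rand(N,N,N).+1 for _ in 1:4)
    @btime arrayfun!($C1, $A, $B)
    @btime $C2 .= scalarfun.($A, $B)
    C1 ≈ C2  # both versions should agree
end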
#  17.306 ms (11 allocations: 352 bytes)
#   5.900 ms (0 allocations: 0 bytes)
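
To make the structural difference between the two forms concrete, here is a minimal sketch that builds the lazy broadcast objects by hand. It assumes a shortened, hypothetical kernel `f` and uses `abs2` in place of `^2` to keep the tree simple, so it only approximates what the dot syntax lowers to. It is not a diagnosis of the slowdown.

```julia
using Base.Broadcast: Broadcasted, broadcasted, materialize

# Hypothetical shortened kernel, standing in for scalarfun above.
f(a, b) = a^2 + b^2 + a * b

A = rand(4, 4) .+ 1
B = rand(4, 4) .+ 1

# Operation-by-operation broadcasting yields a tree of lazy Broadcasted nodes
# (abs2 is used here instead of ^2 purely for illustration).
nested = broadcasted(+, broadcasted(abs2, A), broadcasted(abs2, B), broadcasted(*, A, B))

# Broadcasting one function yields a single node whose arguments are the arrays.
single = broadcasted(f, A, B)

println(nested.args[1] isa Broadcasted)          # inner node is itself lazy
println(all(x -> x isa Array, single.args))      # leaves are plain arrays
println(materialize(nested) ≈ materialize(single))
```

Both objects compute the same values when materialized; they differ only in how deeply the lazy tree is nested.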

The effect seems fairly robust: it's not particular to 3D arrays, nor to A^2.
Replacing @. with explicit .+ etc. helps a bit: the allocations disappear, though the time is essentially unchanged (according to #29120, this avoids n-ary +, here n<=4):

arrayfun!(C, A, B) = C .= A.^2 .+ B.^2 .+ A .* B .+ A ./ B .- A .* B .- A ./ B .+ A .* B .+ A ./ B .- A .* B .- A ./ B
#  17.345 ms (0 allocations: 0 bytes)
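
As a small illustration of the n-ary point, the sketch below compares how the parser treats chained `+` and chained `.+`. The helper `nary_args` is hypothetical and exists only for this example, and the exact parse may depend on the Julia version.

```julia
# Chained `+` is parsed as one call with many arguments; `@.` then dots that call.
ex_nary = :(a + b + c + d)
# Chained `.+` is parsed as nested two-argument calls.
ex_dots = :(a .+ b .+ c .+ d)

# Hypothetical helper: number of operands of the outermost call.
nary_args(ex::Expr) = length(ex.args) - 1

println(nary_args(ex_nary))
println(nary_args(ex_dots))
```

The first count is the n of the n-ary call; the second stays at two because each `.+` takes only a left and a right operand.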

Simpler expressions show the same slowdown, but without the allocations:

arrayfun!(C, A, B) = @. C = A^2 + B^2 + A * B + A / B
scalarfun(A::Real, B::Real) = A^2 + B^2 + A * B + A / B
#  3.148 ms (0 allocations: 0 bytes)
#  971.000 μs (0 allocations: 0 bytes)

Even simpler expressions like arrayfun!(C, A, B) = @. C = A^2 + B^2 show no slowdown at all.

Labels: broadcast, performance
