
Fusing failing when used on Varargs + kwargs #57910

@miguelborrero5


The version used to run this example is:

Julia Version 1.11.3
Commit d63aded (2025-01-21 19:42 UTC)
Build Info:
Official https://julialang.org/ release
Platform Info:
OS: macOS (arm64-apple-darwin24.0.0)
CPU: 8 × Apple M1 Pro
WORD_SIZE: 64
LLVM: libLLVM-16.0.6 (ORCJIT, apple-m1)
Threads: 1 default, 0 interactive, 1 GC (on 6 virtual cores)

The issue is the following: consider the function

using LinearAlgebra
using BenchmarkTools

M = [1.0  2.0;
     3.0  4.0]
n = size(M, 1)
I_n = Matrix{Float64}(I, n, n)
A_temp = zeros(n, n)
diff_temp = zeros(n, n)

function f_standard(args, M, I_n, A_temp, diff)
    # Copy the scalar inputs into the preallocated matrix A_temp
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    # Compute the difference A*M - I in place
    mul!(diff, A_temp, M)
    @. diff -= I_n
    # Return the sum of the squared entries
    return mapreduce(x -> x^2, +, diff)
end

As expected, no allocations.

@btime f_standard($(rand(4)), $(M), $(I_n), $(A_temp), $(diff_temp))
  21.314 ns (0 allocations: 0 bytes)
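
As a cross-check that does not rely on BenchmarkTools, the allocation count can also be read with `@allocated` from inside a wrapper function, after a warm-up call so that compilation is not measured. This is only a sketch: `count_allocs` is a hypothetical helper name that is not part of the original report, and the number it returns may differ between Julia versions.

```julia
using LinearAlgebra

# Same function as in the report above.
function f_standard(args, M, I_n, A_temp, diff)
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    mul!(diff, A_temp, M)
    @. diff -= I_n
    return mapreduce(x -> x^2, +, diff)
end

# Hypothetical helper: measure bytes allocated by a single call.
function count_allocs(args, M, I_n, A_temp, diff)
    # Warm-up call so that compilation is not included in the measurement
    f_standard(args, M, I_n, A_temp, diff)
    return @allocated f_standard(args, M, I_n, A_temp, diff)
end

M = [1.0 2.0; 3.0 4.0]
n = size(M, 1)
I_n = Matrix{Float64}(I, n, n)
A_temp = zeros(n, n)
diff_temp = zeros(n, n)

bytes = count_allocs(rand(4), M, I_n, A_temp, diff_temp)
println("bytes allocated: ", bytes)
```

Wrapping the measurement in a function avoids counting allocations that come from accessing non-constant globals at top level.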

However, consider the following modification of passing some of the arguments as kwargs and changing the first argument into Varargs:

function f_varargs_kwargs(args...; M, I_n, A_temp, diff)
    # Copy the scalar inputs into the preallocated matrix A_temp
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    # Compute the difference A*M - I in place
    mul!(diff, A_temp, M)
    @. diff -= I_n
    return mapreduce(x -> x^2, +, diff)
end

This version does allocate:

julia> @btime f_varargs_kwargs($(rand()), $(rand()), $(rand()), $(rand()); M = $(M), I_n = $(I_n), A_temp = $(A_temp), diff = $(diff_temp))
  36.374 ns (5 allocations: 80 bytes)

However, the compiler seems to be specializing over all arguments:

m = @which f_varargs_kwargs(rand(), rand(), rand(), rand(); M = M, I_n = I_n, A_temp = A_temp, diff = diff_temp)
m.specializations

gives:
svec(MethodInstance for Core.kwcall(::@NamedTuple{M::Matrix{Float64}, I_n::Matrix{Float64}, A_temp::Matrix{Float64}, diff::Matrix{Float64}}, ::typeof(f_varargs_kwargs), ::Float64, ::Vararg{Float64}), MethodInstance for Core.kwcall(::@NamedTuple{M::Matrix{Float64}, I_n::Matrix{Float64}, A_temp::Matrix{Float64}, diff::Matrix{Float64}}, ::typeof(f_varargs_kwargs), ::Float64, ::Float64, ::Float64, ::Float64), nothing, nothing, nothing, nothing, nothing)

Also, strangely, the allocations seem to come from the line
@. diff -= I_n
since commenting out this line

function f_varargs_kwargs_2(args...; M, I_n, A_temp, diff)
    # Copy the scalar inputs into the preallocated matrix A_temp
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    # Compute the product A*M in place (the subtraction of I is disabled)
    mul!(diff, A_temp, M)
    # @. diff -= I_n
    return mapreduce(x -> x^2, +, diff)
end
@btime f_varargs_kwargs_2($(rand()), $(rand()), $(rand()), $(rand()); M = $(M), I_n = $(I_n), A_temp = $(A_temp), diff = $(diff_temp))

gives no allocations:
12.929 ns (0 allocations: 0 bytes)
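
For reference, `@. diff -= I_n` expands to the in-place fused broadcast `diff .= diff .- I_n`. One way to narrow this down further might be to replace that line with an explicit loop that performs the same update and benchmark the result again. A minimal sketch follows; `f_varargs_kwargs_loop` is a hypothetical name introduced here for illustration, and no timing for it was measured in this report.

```julia
using LinearAlgebra

# Hypothetical variant of f_varargs_kwargs: the broadcast is replaced by a loop.
function f_varargs_kwargs_loop(args...; M, I_n, A_temp, diff)
    # Copy the scalar inputs into the preallocated matrix A_temp
    for i in eachindex(A_temp)
        A_temp[i] = args[i]
    end
    # Compute A*M in place
    mul!(diff, A_temp, M)
    # Same elementwise update as `diff .= diff .- I_n`, written as a loop
    for i in eachindex(diff, I_n)
        diff[i] -= I_n[i]
    end
    return mapreduce(x -> x^2, +, diff)
end

M = [1.0 2.0; 3.0 4.0]
n = size(M, 1)
I_n = Matrix{Float64}(I, n, n)
A_temp = zeros(n, n)
diff_temp = zeros(n, n)

x = (0.1, 0.2, 0.3, 0.4)
val = f_varargs_kwargs_loop(x...; M = M, I_n = I_n, A_temp = A_temp, diff = diff_temp)

# Reference value computed with ordinary (allocating) matrix arithmetic
ref = sum(abs2, reshape(collect(x), n, n) * M - I)
println(val, " ≈ ", ref)
```

If this variant turns out not to allocate under the same `@btime` call, that would point at the broadcast machinery rather than at the Varargs or keyword handling on their own.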
I posted this on the Julia Discourse, and people there were also confused, so I decided to open an issue here about broadcast fusion failing. @code_warntype and @code_lowered also indicate full specialization in the allocating case.

Thanks in advance.

Labels: broadcast (Applying a function over a collection), keyword arguments (f(x; keyword=arguments)), performance (Must go faster)
