Description
I raised this issue on Discourse, where the consensus seems to be that this is at least very confusing, and possibly a bug.
If I use a kwarg for a function that was passed as an argument to another function, Julia does not specialize the latter function. If I don't use the kwarg for that very same function, it does specialize. Below, I'll include a less trivial example that's closer to my actual use case. But schematically, the idea is this:
```julia
f1(a; b=10) = a + b
f2(c, f) = c + f(20)
f3(d, f) = d + f(30; b=40)
```
Calling, for example, `f2(5, f1)` will result in a specialized `f2`; calling `f3(5, f1)` will not result in specialization. I imagine Julia is clever enough to optimize the problem away in this schematic. But in my code, I was seeing slowdowns of ~100x and allocations of multiple GiB on each call to my core computation function.
As pointed out on Discourse, it's possible to manually trigger specialization by adding a type parameter. But the failure to automatically specialize is a problem for a few reasons:
- It's surprising. The performance tip on specialization says "Julia will always specialize when the argument is used within the method, but not if the argument is just passed through to another function." In this case, I did use the argument (the function with the kwarg). From the discussion on Discourse, it looks like the problem is that Julia immediately lowers the call so that the function is just passed through to `Core.kwfunc`. So technically the argument "is just passed through to another function", but not by the programmer. (Gotta love passive voice!)
- It's very hard to diagnose. None of the usual tools (profiling, allocation tracking, `@code_warntype`, JET, Traceur) pointed out any problem with the use of kwargs. In fact, profiling and allocation tracking actively focused my attention on other parts of the code that were not at all the source of the problem. Even the `(@which f(...)).specializations` trick from that section of the performance tips seemed to say the function was being specialized for my arguments. (See below.)
- It seems to contradict the docs. If the goal when designing this heuristic is to detect when a function is "just passed through" so that it will "usually [have] no performance impact at runtime", surely the decision of how to arrange parameters in a function definition should not affect the result.
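For reference, here is the type-parameter workaround from the performance tips applied to the schematic example above (the `f3_specialized` name is mine, just to keep the two versions distinct):

```julia
f1(a; b=10) = a + b

# Adding a type parameter for the function argument forces specialization,
# even though f is only "passed through" to the keyword-call machinery.
f3_specialized(d, f::F) where {F} = d + f(30; b=40)

f3_specialized(5, f1)  # 5 + (30 + 40) = 75
```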
So, at the very least, I would think this is a documentation bug, because the kwarg wrinkle should be noted in that performance tip — rather than requiring the user to mentally combine disparate arcana from the most cryptic parts of the docs. It would also be nice if some standard tools could point toward the source of the problem. But maybe this is truly a bug in Julia, which should actually specialize even when a kwarg is used?
For reference, here's a working example that's complicated enough that Julia doesn't just optimize the problem away, while still being a greatly simplified version of my actual use case:
```julia
using Profile

function index(n, mp, m; n_max=n)
    n + mp + m + n_max
end

function inplace!(a, n_max, index_func)
    i1 = index_func(1, 2, 3; n_max=n_max)  # Using this version leads to allocations below
    # i1 = index_func(1, 2, 3)             # Using this version leads to 0 allocations
    i2 = size(a, 1) - 2i1
    for i in 1:i2                # Allocates 3182688 B if using kwarg above
        a[i + i1] = a[i + i1 - 1]   # Allocates 9573120 B if using kwarg above
    end
    for i in 3:i2-4              # Allocates 3182576 B if using kwarg above
        a[i + i1] -= a[i + i1 - 2]  # Allocates 12771408 B if using kwarg above
    end
end

function compute_a(n_max::Int64)
    a = randn(Float64, 100_000)
    inplace!(a, n_max, index)
    Profile.clear_malloc_data()
    inplace!(a, n_max, index)
end

compute_a(10)
```
And yes, there are plenty of ways to improve the performance of this simplified code with function barriers and such. But my actual code is too complicated for that, with the kwarg func being used multiple times inside some loops.
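For completeness, here is a minimal sketch of the type-parameter workaround applied to this example, assuming the only change needed is the `::F where F` annotation on the function argument (the `inplace_spec!` name is mine, to distinguish it from the original):

```julia
function index(n, mp, m; n_max=n)
    n + mp + m + n_max
end

# Same body as inplace! above, but with a type parameter on the function
# argument, which forces Julia to specialize despite the kwarg call.
function inplace_spec!(a, n_max, index_func::F) where {F}
    i1 = index_func(1, 2, 3; n_max=n_max)
    i2 = size(a, 1) - 2i1
    for i in 1:i2
        a[i + i1] = a[i + i1 - 1]
    end
    for i in 3:i2-4
        a[i + i1] -= a[i + i1 - 2]
    end
end

a = randn(Float64, 100_000)
inplace_spec!(a, 10, index)
# With the specialization forced, the loops above should no longer allocate.
```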
If I look at the specializations of `inplace!(a, n_max, index)`, I get

```
svec(MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::Function), MethodInstance for inplace!(::Vector{Float64}, ::Int64, ::typeof(index)), nothing, nothing, nothing, nothing, nothing, nothing)
```

That second element really looks to me like it specialized for my particular `index` function.
Here's my `versioninfo`:
```
julia> versioninfo()
Julia Version 1.7.2
Commit bf53498635 (2022-02-06 15:21 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin19.5.0)
  CPU: Intel(R) Core(TM) i7-4770HQ CPU @ 2.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-12.0.1 (ORCJIT, haswell)
```