After approximately the zillionth time seeing people get confusing or incorrect benchmark results because they did:

```julia
@benchmark f(x)
```

instead of

```julia
@benchmark f($x)
```
I started wondering if maybe we could do something to avoid forcing this cognitive burden on users.
As inspiration, I've used the following macro in the unit tests to measure "real" allocations from a single execution of a function:
```julia
macro wrappedallocs(expr)
    argnames = [gensym() for a in expr.args]
    quote
        function g($(argnames...))
            @allocated $(Expr(expr.head, argnames...))
        end
        $(Expr(:call, :g, [esc(a) for a in expr.args]...))
    end
end
```
`@wrappedallocs f(x)` turns `@allocated f(x)` into something more like:

```julia
function g(_y)
    @allocated f(_y)
end
g(y)
```
which does the same computation but measures the allocations inside the wrapped function instead of at global scope.
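For concreteness, a hypothetical usage might look like this (the vector `v` and the call to `sum` are illustrative, not from the macro itself):

```julia
# Illustrative only: with the @wrappedallocs macro above defined, this
# measures the allocations of a single call to sum, with v passed into
# the generated wrapper function as an argument rather than looked up
# as a non-const global inside @allocated.
v = rand(1000)
@wrappedallocs sum(v)
```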
It might be possible to do something like this for benchmarking. This particular implementation is wrong, because `@wrappedallocs f(g(x))` will only measure the allocations of `f()`, not `g()`, but a similar approach, involving walking the expression to collect all the symbols and then passing those symbols through a new outer function, might work.
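A minimal sketch of that symbol-collection step might look like the following (the name `collect_symbols!` and the details are purely illustrative, not part of any existing implementation):

```julia
# Recursively walk an expression, collecting every symbol that would
# need to be passed as an argument to the generated wrapper function.
function collect_symbols!(syms::Vector{Symbol}, ex)
    if ex isa Symbol
        push!(syms, ex)
    elseif ex isa Expr
        for arg in ex.args
            collect_symbols!(syms, arg)
        end
    end
    return syms
end

collect_symbols!(Symbol[], :(f(g(y), x)))  # collects f, g, y, x
```

A real implementation would need to be smarter about things like keyword arguments and literals, but the basic traversal is this simple.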
The result would be that

```julia
@benchmark f(g(y), x)
```

would turn into something like
```julia
function _f(_f, _g, _y, _x)
    @_benchmark _f(_g(_y), _x)
end
_f(f, g, y, x)
```
where `@_benchmark` does basically what the regular `@benchmark` does right now. Passing `_f` and `_g` as arguments is not necessary if they're regular functions, but it is necessary if they're arbitrary callable objects.
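To illustrate the callable-object case (this example is illustrative, not from the proposal): a callable struct instance is a local value with state, not a globally resolvable function name, so it must flow through the wrapper as an argument.

```julia
# A callable struct: instances can be called like functions, but each
# instance carries its own state, so it has to be passed into the
# generated wrapper function rather than referenced by name.
struct Scaler
    factor::Float64
end
(s::Scaler)(x) = s.factor * x

s = Scaler(2.0)
s(3.0)  # 6.0
```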
The question is: is this a good idea? This makes BenchmarkTools more complicated, and might involve too much magic. I also haven't thought through how to integrate this with the `setup` arguments. I'm mostly just interested in seeing if this is something that's worth spending time on.
One particular concern I have is if the user tries to benchmark a big block of code, we may end up with the wrapper function taking a ridiculous number of arguments, which I suspect is likely to be handled badly by Julia. Fortunately, the macro can at least detect that case and demand that the user manually splice in their arguments.