Closed
Hey all.
As it stands, calling AD.derivative with the FiniteDifferences and ForwardDiff back-ends first computes the full Jacobian and then flattens it into the derivative. In some edge cases, such as a scalar function of a single input, this is significantly slower than calling the method directly:
using FiniteDifferences, BenchmarkTools
import AbstractDifferentiation as AD

fdm = central_fdm(2, 1; adapt=0)
fd = AD.FiniteDifferencesBackend(fdm)

with_AD(x) = AD.derivative(fd, sin, x)         # goes through the Jacobian
without_AD(x) = fdm(sin, x)                    # calls the method directly
blame_the_jacobian(x) = jacobian(fdm, sin, x)  # the Jacobian alone

@benchmark with_AD(1.0)
@benchmark without_AD(1.0)
@benchmark blame_the_jacobian(1.0)
# @benchmark with_AD(1.0)
BenchmarkTools.Trial: 10000 samples with 10 evaluations.
 Range (min … max):  1.070 μs … 552.240 μs  ┊ GC (min … max): 0.00% … 99.30%
 Time  (median):     1.160 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   1.327 μs ±   5.524 μs  ┊ GC (mean ± σ):  4.13% ±  0.99%
 Memory estimate: 944 bytes, allocs estimate: 17.
# @benchmark without_AD(1.0)
BenchmarkTools.Trial: 10000 samples with 961 evaluations.
 Range (min … max):  85.640 ns …  2.145 μs  ┊ GC (min … max): 0.00% … 93.60%
 Time  (median):     88.658 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   97.445 ns ± 46.875 ns  ┊ GC (mean ± σ):  0.98% ±  2.08%
 Memory estimate: 32 bytes, allocs estimate: 2.
# @benchmark blame_the_jacobian(1.0)
BenchmarkTools.Trial: 10000 samples with 111 evaluations.
 Range (min … max):  774.775 ns … 47.623 μs  ┊ GC (min … max): 0.00% … 97.59%
 Time  (median):     819.820 ns              ┊ GC (median):    0.00%
 Time  (mean ± σ):   950.669 ns ±  1.825 μs  ┊ GC (mean ± σ):  7.82% ±  4.01%
 Memory estimate: 864 bytes, allocs estimate: 14.
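For context on why the direct call is so much cheaper: a second-order central difference of a scalar function needs only two evaluations of f and no array machinery at all. A minimal sketch of the idea (this is not FiniteDifferences' actual implementation, and it omits the package's adaptive error control; the cbrt(eps) step size is just the textbook heuristic):

```julia
# Second-order central difference for a scalar function.
# h ~ cbrt(eps) roughly balances truncation against rounding error.
central_diff(f, x; h=cbrt(eps(typeof(x)))) = (f(x + h) - f(x - h)) / (2h)

central_diff(sin, 1.0)  # ≈ cos(1.0) ≈ 0.5403
```

Nothing here allocates, which is consistent with the 2-allocation, 32-byte figure for without_AD above, versus the Jacobian path's 14+ allocations.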
This is also the case for other, less contrived examples, such as a small neural network with a single input. Is there a reason for not implementing the derivative directly? Something along the lines of:
function AD.derivative(ba::AD.FiniteDifferencesBackend, f, xs...)
    return (ba.method(f, xs...),)
end
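One wrinkle: an fdm method is applied at a single point, so the snippet above really covers the single-argument case. For multiple arguments, each partial could be taken separately with the other arguments held fixed, still without ever forming a Jacobian. A hedged sketch (derivative_direct is a hypothetical name, not the package API, and this assumes the convention of returning one partial per argument):

```julia
using FiniteDifferences
import AbstractDifferentiation as AD

# Hypothetical direct derivative: differentiate in each argument
# separately, holding the others fixed, without building a Jacobian.
function derivative_direct(ba::AD.FiniteDifferencesBackend, f, xs...)
    return ntuple(length(xs)) do i
        # Base.setindex replaces the i-th element of the tuple xs with x
        ba.method(x -> f(Base.setindex(xs, x, i)...), xs[i])
    end
end
```

For example, derivative_direct(fd, *, 2.0, 3.0) would give approximately (3.0, 2.0), matching the partials of multiplication.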