make Chunk(::StaticArray) type stable by default #707
Conversation
As you say, chunk size = length does seem to be the present behaviour. If you explicitly make a `GradientConfig`:

```julia
julia> f1(x) = @show x[1];

julia> ForwardDiff.Chunk(@SMatrix rand(1,100))
ForwardDiff.Chunk{12}()

julia> ForwardDiff.GradientConfig(f1, @SMatrix rand(1,100)).seeds
(Partials(1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0), Partials(0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0))

julia> ForwardDiff.gradient(f1, @SArray(rand(1,100)), ForwardDiff.GradientConfig(f1, @SArray rand(1,100)))
x[1] = Dual{ForwardDiff.Tag{typeof(f1), Float64}}(0.6424264074844214,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0)
1×100 SMatrix{1, 100, Float64, 100} with indices SOneTo(1)×SOneTo(100):
 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  …  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
```

Using chunk size 100 might be sub-optimal. I would guess that whatever compiler limitations StaticArrays runs into for long arrays likely apply to long tuples of partials too, so this thing with …
I have attempted an alternate scheme based partly on @avik-pal's suggestion. It seems to work when I test it locally, though it is certainly rather precarious and scary-looking, as it assumes inference can follow a bunch of logic without breaking. I think what I have here now would be considered non-breaking, though. For whatever it's worth, were I to do this from scratch, I'd probably split …
Codecov Report: all modified and coverable lines are covered by tests ✅

```
@@            Coverage Diff             @@
##           master     #707      +/-   ##
==========================================
+ Coverage   89.57%   89.59%   +0.01%
==========================================
  Files          11       11
  Lines         969      980      +11
==========================================
+ Hits          868      878      +10
- Misses        101      102       +1
```
```diff
 # Constrained to `N <= threshold`, minimize (in order of priority):
 # 1. the number of chunks that need to be computed
 # 2. the number of "left over" perturbations in the final chunk
-function pickchunksize(input_length, threshold = DEFAULT_CHUNK_THRESHOLD)
+function pickchunksize(::Val{input_length}, ::Val{threshold} = DEFAULT_CHUNK_THRESHOLD) where {input_length, threshold}
```
This is used in downstream packages, e.g., in SciML: https://github.com/search?q=org%3ASciML+pickchunksize&type=code So even though technically it might not be breaking, it will affect many downstream packages. Similarly, `Chunk` is used by quite a few downstream packages, e.g., SciML or also LogDensityProblemsAD (and hence, e.g., Turing indirectly).

Since the next release will be breaking anyway, maybe that's not a major issue. I think it would be good to keep the API simple though, i.e., to not support both integers and `Val`s.
For me, the following diff is enough:

```diff
-const CHUNKS = [Chunk{i}() for i in 1:DEFAULT_CHUNK_THRESHOLD]
-
 function Chunk(input_length::Integer, threshold::Integer = DEFAULT_CHUNK_THRESHOLD)
     N = pickchunksize(input_length, threshold)
-    0 < N <= DEFAULT_CHUNK_THRESHOLD && return CHUNKS[N]
+    0 < N <= DEFAULT_CHUNK_THRESHOLD && return Chunk{N}()
     return Chunk{N}()
 end
```

The reason the … This should be benchmarked again with this change though: #373
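To illustrate the point of that diff: removing the runtime `CHUNKS` vector means `Chunk(::Integer)` builds `Chunk{N}()` directly, so when the input length is a compile-time constant the whole call can constant-fold. The following is a hypothetical check (not part of the PR); `make_chunk` is a name invented here for illustration, and whether `@inferred` passes depends on the change above being applied.

```julia
# Hypothetical inference check, assuming the diff above is applied.
using ForwardDiff: Chunk
using Test: @inferred

# A literal length is a candidate for constant propagation: with no runtime
# table lookup in the way, the compiler can resolve Chunk{N}() at compile time.
make_chunk() = Chunk(100)

# Expected to infer a concrete Chunk{N} type once the lookup table is gone.
@inferred make_chunk()
```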
This prevents constant propagation of the length of a static array. Alternative to #707
Made an alternative PR at #708
FYI @ExpandingMan, the idea of DifferentiationInterface is to prepare the operator once and then reuse the preparation several times. For gradient and Jacobian, the preparation phase does involve a type-unstable call to `Chunk`, but the resulting extras object then allows for type-stable application of the operator. In this spirit, it's okay for preparation to be type unstable (and currently it has to be with ForwardDiff), but the code you put in hot loops should be the prepared version of the operator.
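A minimal sketch of the prepare-once / reuse pattern described above (the exact DifferentiationInterface.jl function names and signatures have varied between versions, so treat them as assumptions rather than the definitive API):

```julia
using DifferentiationInterface
import ForwardDiff

backend = AutoForwardDiff()
f(x) = sum(abs2, x)
x = rand(10)

# Preparation may be type unstable (it calls Chunk internally) --
# do it once, outside any hot loop.
prep = prepare_gradient(f, backend, x)

# The prepared call is the one intended to be type stable and cheap,
# so this is what belongs inside hot loops.
g = gradient(f, prep, backend, x)
```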
Thanks @KristofferC, I have confirmed that #708 works, so that seems obviously preferable to this; I will close this.
This prevents constant propagation of the length of a static array. Alternative to #707 Co-authored-by: David Widmann <devmotion@users.noreply.github.com>
The function `Chunk(x)` is manifestly type unstable. In many cases, when the value is not directly used, the compiler is clever enough to elide this, but in some cases its mere presence can cause significant performance penalties.

In particular, I encountered this when trying to get `DifferentiationInterface.jl` to behave nicely with static arrays. That package calls `GradientConfig` (or `JacobianConfig`, etc.), and for whatever reason the compiler gets upset about the calls to `Chunk` and the whole thing is wildly inefficient.

This PR attempts to address that issue by defining a `StaticArray` method for `Chunk` and setting the chunk size to the array's length. It is not entirely clear to me that this is the desired behavior, but `dualize` already seems to do this. Also, this ignores the global chunk size limit setting, but again, `dualize` appears to already do this.

Note that I have explicitly confirmed that the `@inferred` tests I have added fail without this PR but succeed with it.

(btw, I am getting a failure in one of the tests for allocations during `seed!` when I run on my end, but it looks totally unrelated to this, so I'm not sure what's going on there. Maybe it'll pass in CI/CD, who knows.)
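For readers skimming the thread, the kind of method the PR describes might look roughly like the following. This is a simplified sketch, not the actual patch (the real change also has to thread the static length through the config types); `static_chunk` is a name invented here to avoid pirating `ForwardDiff.Chunk` itself.

```julia
# Rough sketch, assuming ForwardDiff and StaticArrays are loaded.
using StaticArrays
using ForwardDiff: Chunk

# For a StaticArray the length is encoded in the type, so `length(x)` is a
# compile-time constant and the returned Chunk{N}() can be fully inferred.
static_chunk(x::StaticArray) = Chunk{length(x)}()
```

Note this is exactly the "chunk size = length" behavior debated above: it ignores `DEFAULT_CHUNK_THRESHOLD`, which `dualize` reportedly already does, but which the earlier comment suggests may be sub-optimal for long static arrays.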