-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aliased differentials and Inplace Accumulation #350
Comments
@willtebbutt and @mcabbott have both proposed doing something that looks somewhat like a thunk. The other option is something that gathers up a long chain of all attempts to accumulate against it, and then at the last minute when it has no choice calls |
Yes, I think it's not quite But it's similar in that I think it could be handled by the code interfacing to The simplest version would just copy on the first attempt to write. My guess is that would get you most of the benefit. The repeated scalar indexing cases of FluxML/Zygote.jl#962 still wouldn't involve any extra copies, as they didn't call the adjoint of Delaying this as you suggest might mean that all other views of the same data have been discarded by the time you want to write. If the un-wrapping (and re-wrapping if |
If a pullback is something like
dx->(dx, dx)
fordx
being a reference type, e.g. like the pullback for+(::Array, ::Array)
, then the two outputs are the same object.They are aliased.
Even though they are actually differentials for (usually) distinct (nonaliased) primal values.
This causes inplace accumulation to act wrong.
See discussion here FluxML/Zygote.jl#962 (comment)
where @mcabbott was just adding inplace accumulation for
getindex
.I am wondering if we need to require that if the primals are not aliased the differentials also need to not be aliased.
This extra copies is slow if not inplace accumulation is not actually done though.
we might want to make it configurable depending on the AD system.
(this related to the config needs for #68)
I think the ideal solution would be copy-on-write objects.
But doing that without language support is suffering.
The text was updated successfully, but these errors were encountered: