Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Check that if only one adjoint exists, it is not thunked #53

Merged
merged 4 commits into from
Jul 13, 2020

Conversation

sethaxen
Copy link
Member

@sethaxen sethaxen commented Jul 12, 2020

Currently, the (unenforced) standard is that single-argument functions do not thunk in the pullback. On JuliaDiff/ChainRules.jl#182, @willtebbutt and I were discussing whether we should extend that so that if there is only a single meaningful argument, then that argument's adjoint should not be thunked. This PR takes that opinion and adds a conservative test to rrule_test: namely, for functions whose pullbacks only produce one adjoint that is not DoesNotExist, that single adjoint is not thunked.

This is more to generate discussion. As you can see in the integration test, several pullbacks in ChainRules break this proposed rule:

  • sum
  • mean
  • tr
  • diag
  • Symmetric
  • Hermitian
  • Adjoint
  • adjoint
  • Transpose
  • transpose
  • UpperTriangular
  • LowerTriangular
  • triu
  • tril
  • asum

@nickrobinson251
Copy link
Contributor

I'd have thought that whether or not to thunk some computation is a question only of whether or not creating the thunk is less work then evaluating the computation in the first place, because even in the single arg case, we could encounter a * ::AbstractZero, and thanks to thunk save ever running the expensive computation.

@oxinabox ?

@willtebbutt
Copy link
Member

willtebbutt commented Jul 13, 2020

Multiplication between differentials is undefined, and doesn't happen during AD. It also shouldn't happen inside pullbacks, because they operate linearly on their input (cotangent of output of primal), so there shouldn't be any internal multiply-by-AbstractZero stuff going on with differentials.

edit: if we have docs that suggest this might happen, they should be amended. I think we probably thought that this was possible before we really tied down our definition of differentials.

@oxinabox
Copy link
Member

Zero() and (even more so) One() do still kind of serve purpose as scalars / basis elements as well as as actual differentials.
And in that role they can show up in a multiplication with a AbstractThunk and the Zero and indeed eleminate the thunk.

But even for that case, which we are not so much in-love with right now,
I don't think it is generally worth it, in the tradeoff for the extra machinery that thunking brings out.

we probably do still want to allow AbstractThunks to be the only nontrivial return so that we can still do InplaceThunk as I still haven't imagined any more efficient way to do optionally inplace accumulation.

@willtebbutt
Copy link
Member

we probably do still want to allow AbstractThunks to be the only nontrivial return so that we can still do InplaceThunk as I still haven't imagined any more efficient way to do optionally inplace accumulation.

I think this is reasonable, but if this is the only thing that we want to do then what @sethaxen has actually implemented is fine -- it doesn't make sense for it to be a Thunk, but it does make sense for it to be an InplaceThunk.

@oxinabox
Copy link
Member

Indeed

@willtebbutt
Copy link
Member

Great. The question is now whether or not to view this as a breaking change. I think we probably should call it a breaking change because it will cause test failures that wouldn't previously have occurred.

The only argument in favour of non-breaking that I can think of is that anything that fails these tests would have been considered a bug previously (kind of).

So if no one objects I'm going to suggest a minor version bump.

@oxinabox
Copy link
Member

yes, this is breaking

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, subject to minor version bump. We expect ChainRules tests to break, so CI failing for them is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants