Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix Zygote on 1.6, fix #851 #848

Merged
merged 7 commits into from
Dec 21, 2020

Conversation

simeonschaub
Copy link
Member

@simeonschaub simeonschaub commented Dec 7, 2020

Relies on FluxML/ZygoteRules.jl#13 FluxML/ZygoteRules.jl#15. The test I marked as broken doesn't fail if run in an interpreter, so this might be a Julia bug. Other than that, I still see failures related to LoopVectorization, might be due to llvmcall. @chriselrod, do you have any ideas?

fixes #851

@simeonschaub simeonschaub marked this pull request as draft December 7, 2020 22:03
@simeonschaub
Copy link
Member Author

Ok, besides LoopVectorization, the remaining failures should be fixed by JuliaLang/julia#38764. @mcabbott What's the status of #842? With that merged, we would be pretty close to getting all tests to pass on 1.6.

@mcabbott
Copy link
Member

mcabbott commented Dec 8, 2020

#842 is fine, I think, but it wants JuliaSIMD/LoopVectorization.jl#167.

If the version of Zygote which has #842 is 0.6, then it need not wait --- anyone needing vmap should not upgrade until that's ready. That sounds simplest.

@simeonschaub
Copy link
Member Author

Ah, I see! We should coordinate that with #824 then, since that needs a breaking release as well.

@CarloLucibello CarloLucibello added this to the v0.6 milestone Dec 8, 2020
@simeonschaub
Copy link
Member Author

Note that this doesn't need to hold up a 0.6 release, the changes in this PR should not be breaking.

@simeonschaub simeonschaub marked this pull request as ready for review December 8, 2020 22:02
@simeonschaub
Copy link
Member Author

OK, I was still missing some things, but I have now verified this on top of JuliaLang/julia#38764 and all tests pass! 🎉

test/cuda.jl Outdated Show resolved Hide resolved
@CarloLucibello CarloLucibello removed this from the v0.6 milestone Dec 9, 2020
Relies on ZygoteRules.jl#13. The test I marked as broken doesn't fail if run in an interpreter, so this might be a Julia bug. Other than that, I still see failures related to LoopVectorization, might be due to llvmcall. @chriselrod, do you have any ideas?
@CarloLucibello
Copy link
Member

CarloLucibello commented Dec 12, 2020

Once the compat issues here are fixed, you should also comment out the continue-on-error lines in .github/workflow/ci.yml so that we can check any regression on nightly

@simeonschaub simeonschaub changed the title WIP: fix Zygote on 1.6 fix Zygote on 1.6 Dec 12, 2020
@simeonschaub
Copy link
Member Author

What's still holding this up is that FluxML/ZygoteRules.jl#13 hasn't been released yet, but I know @oxinabox had some doubts about the approach there, so maybe we don't want to rush this.

@oxinabox
Copy link
Member

oxinabox commented Dec 12, 2020

What's still holding this up is that FluxML/ZygoteRules.jl#13 hasn't been released yet, but I know @oxinabox had some doubts about the approach there, so maybe we don't want to rush this.

Yeah, i think that it is confusing and inconsistent with all other uses of adjoint.
Futher it will make #603 harder because now some inputs don't have sensitivities in the pullback.
which the current code in FluxML/ZygoteRules.jl#10 depends on.
I don't understand the logic of that being less likely to break things than what I proposed in #851
so i think i am missing something

@simeonschaub
Copy link
Member Author

I don't understand the logic of that being less likely to break things than what I proposed in #851

My thought process was that someone might have defined an adjoint for literal_getproperty with the intention of overloading getfield, but the more I think about it, the more I think that case might now be broken with my solution as well, so I would be in favor of introducing a literal_getfield instead.

@oxinabox
Copy link
Member

According to JuliaHub code search the things to test/check are:

I think it should only matter if they were directly using getfield in code (sincde using . would be come getproperty).
But i think the case that it matters for is broken enough that they would have openned an issue.

@simeonschaub
Copy link
Member Author

simeonschaub commented Dec 12, 2020

OK, I updated this PR accordingly and tested that this still works. Since master is now at 0.6 anyways, I think this should be fine.

@simeonschaub simeonschaub changed the title fix Zygote on 1.6 fix Zygote on 1.6, fix #851 Dec 14, 2020
@test_broken begin
a3, pb3 = Zygote.pullback(h, 1)
((1,),) == pb3(1)
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is broken here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really not sure. That's the one I mentioned in the description, which doesn't occur when run with JuliaInterpreter. I didn't have time yet to really look into it, but this might be a Julia bug.

Copy link
Member

@oxinabox oxinabox Dec 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nested AD is hard.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Diffractor.jl to the rescue! :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not entirely.
But a but.
The code will be less gross and a bit faster.
But the problem is just legit hard.

@CarloLucibello
Copy link
Member

what is left to do to make CI happy?

@simeonschaub
Copy link
Member Author

simeonschaub commented Dec 15, 2020

what is left to do to make CI happy?

Merging FluxML/ZygoteRules.jl#15 and tagging a new ZygoteRules release should make tests pass. I don't know whether all the CUDA fixes are released yet, but IIUC, those tests should only get run by bors.

@simeonschaub
Copy link
Member Author

Ok, this should be good to go now. It wouldn't hurt to have another set of eyes look this over, but not sure who would be best to do this.

@CarloLucibello
Copy link
Member

Maybe @oxinabox or @willtebbutt could approve this?

Copy link
Member

@willtebbutt willtebbutt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Broadly LGTM, just a couple of queries I would like addressed before I approve.

.github/workflows/ci.yml Show resolved Hide resolved
src/forward/lib.jl Outdated Show resolved Hide resolved
src/lib/lib.jl Outdated Show resolved Hide resolved
src/compiler/reverse.jl Show resolved Hide resolved

@adjoint function literal_getproperty(x, ::Val{f}) where f
val = getproperty(x, f)
@adjoint function literal_getfield(x, ::Val{f}) where f
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be cleaner to have separate adjoints for literal_getproperty and literal_getfield

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you explain what you mean here? We now don't define a default adjoint for literal_getproperty, but instead fall back to recursing into getproperty

Copy link
Member

@oxinabox oxinabox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM barring on thing with the inconsistency.
Once that is fixed feel free to merge.

@simeonschaub
Copy link
Member Author

@oxinabox Did I address your comments?

@oxinabox
Copy link
Member

You did looks great

@CarloLucibello CarloLucibello merged commit b3edf99 into FluxML:master Dec 21, 2020
@simeonschaub simeonschaub deleted the sds/fix_on_master branch December 21, 2020 13:52
simeonschaub added a commit to simeonschaub/Zygote.jl that referenced this pull request Feb 23, 2021
This has regressed quite a bit due to FluxML#848. With this PR, we should be able to get back the same performance as before, assuming there is no custom implementation or pullback for `getproperty`. Still need to add tests.
simeonschaub added a commit to simeonschaub/Zygote.jl that referenced this pull request Oct 5, 2021
This has regressed quite a bit due to FluxML#848. With this PR, we should be able to get back the same performance as before, assuming there is no custom implementation or pullback for `getproperty`. Still need to add tests.
bors bot added a commit that referenced this pull request Oct 14, 2021
909: improve inference for getproperty r=willtebbutt a=simeonschaub

This has regressed quite a bit due to #848. With this PR, we should be able to get back the same performance as before, assuming there is no custom implementation or pullback for `getproperty`. Still need to add tests.

Co-authored-by: Simeon David Schaub <schaub@mit.edu>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Forward pass of getfield acts like getproperty in _pullback
6 participants