
Complex numbers alla Flux 1776 #47

Merged · 10 commits · Feb 5, 2022
Conversation

mcabbott (Member) commented on Jan 30, 2022

This copies FluxML/Flux.jl#1776 but doesn't work. Why?

Xref #38

Edit: Now this is on top of #48. The original tests pass, but ADADelta only just passes: only the last few digits of the loss decrease. Not sure if that's true in Flux too. Might be a bug somewhere, or it might just be that the parameters chosen are strange ones.

[plot: loss vs. step for each optimisation rule]
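For anyone landing here without context: the change is essentially about making the moment accumulators safe for complex parameters. A minimal hand-rolled sketch of the usual approach, accumulating `abs2.(dx)` so the second moment stays real; this is only an illustration, not the rule definitions in this PR:

```julia
# Illustrative only: a hand-rolled Adam-style step that tolerates complex
# parameters. The key point is accumulating abs2.(dx), which is real-valued
# even when dx is complex. Bias correction is omitted for brevity.
function adam_like_step(x, dx, mt, vt; η = 1e-3, β1 = 0.9, β2 = 0.999, ϵ = 1e-8)
    mt = β1 .* mt .+ (1 - β1) .* dx          # first moment may stay complex
    vt = β2 .* vt .+ (1 - β2) .* abs2.(dx)   # second moment stays real
    x  = x .- η .* mt ./ (sqrt.(vt) .+ ϵ)
    return x, mt, vt
end

x, dx = randn(ComplexF64, 3), randn(ComplexF64, 3)
x, mt, vt = adam_like_step(x, dx, zero(x), zeros(real(eltype(x)), 3))
```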

mcabbott force-pushed the complex branch 2 times, most recently from 284714f to 309e606 on January 31, 2022
cossio (Contributor) commented on Feb 1, 2022

What's the loss function you are using in the plot?

mcabbott (Member, author) commented on Feb 1, 2022

It's the one from the tests, here https://github.com/FluxML/Optimisers.jl/pull/47/files#diff-9f183063573ccf8f01dd29629bad9986f3faa3ce2fdeeb3f0855eaa1f6450715R214 and the plot commands are in comments there. The same loss is used in the Flux PR's tests.

Comment on lines +207 to +185
# The Flux PR had 1e-2 for all. But ADADelta(ρ) needs ρ≈0.9 not small. And it helps to make ε not too small too:
ADAM(1e-2), RMSProp(1e-2), RADAM(1e-2), OADAM(1e-2), ADAGrad(1e-2), ADADelta(0.9, 1e-5), NADAM(1e-2), AdaBelief(1e-2),
mcabbott (Member, author):
This is the fix, BTW.
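For later readers: in ADADelta, ρ is the decay applied to both running averages (of squared gradients and of squared updates), so conventional values are near 0.9–0.95 rather than the 1e-2 learning rates used elsewhere in this list. A rough hand-written sketch of the rule, just to show where ρ and ε enter (not the Optimisers.jl implementation):

```julia
# Rough ADADelta sketch (illustrative, not the Optimisers.jl code).
# ρ decays both accumulators; ε keeps the ratio of square roots well-defined.
function adadelta_step(x, dx, acc, Δacc; ρ = 0.9, ϵ = 1e-5)
    acc  = ρ .* acc .+ (1 - ρ) .* abs2.(dx)
    Δx   = dx .* sqrt.(Δacc .+ ϵ) ./ sqrt.(acc .+ ϵ)
    Δacc = ρ .* Δacc .+ (1 - ρ) .* abs2.(Δx)
    return x .- Δx, acc, Δacc
end
```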

mcabbott (Member, author) commented on Feb 5, 2022

Good to go?

test/rules.jl Outdated
Comment on lines 22 to 28
LOG = Dict()

loggradient(o) = (f, xs...) -> begin
y, dxs = Zygote.withgradient(f, xs...)
push!(get!(() -> Float32[], LOG, name(o)), y)
dxs # save the loss, return the gradient
end
Reviewer (Member):
This (LOG) doesn't seem to be in use anywhere, is it still necessary?

mcabbott (Member, author):
It just makes debugging easier if you can plot things from the tests you just ran. It's not strictly necessary, but it also doesn't really get in the way, I think.
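For example (hypothetical snippet, not part of the PR, assuming Plots.jl is available in the environment), after a test run one could do:

```julia
using Plots  # not a test dependency, purely for interactive debugging

plt = plot(; yscale = :log10, xlabel = "step", ylabel = "loss")
for (rule, losses) in LOG
    plot!(plt, losses; label = string(rule))  # one loss curve per optimisation rule
end
display(plt)
```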

Reviewer (Member):
Fair enough. A comment with what you just described would help then.

mcabbott merged commit adc0e85 into FluxML:master on Feb 5, 2022
mcabbott deleted the complex branch on February 5, 2022