[WIP] [RFC!] Refactoring... #266

Merged: 20 commits from pkm/refactor into master on Sep 26, 2016

Conversation

pkofod
Member

@pkofod pkofod commented Aug 16, 2016

A first attempt, proof of concept, work in progress, and whatever other words you might want to use for temporary or unfinished.

I didn't want to go through the entire package without any comments, so I would love to hear everyone's input.

There are some weird things in there; specifically, a few new types are simply invented for dispatch reasons. They will go away again.

Right now, I'm seeing state. here, state. there, state.s everywhere! Should we use something like the macros in Parameters such that we can omit the state.s?

I'll keep on working on it, but criticism is very welcome.

TODO

  • Add all multivariate solvers
  • handle generic fields (includes removing the method_name in favour of a default field in the types)
  • cleanup (whitespace, Refactored, etc.)
  • move callback and uncomment
  • state.s, state.s everywhere!

@tbreloff

I'm not as familiar with the codebase as others, but from my quick review I like what I see. 👍

@KristofferC
Contributor

KristofferC commented Aug 16, 2016

More removed lines than added? Ship it!

But seriously. I also think that something like Parameters would be good, either manually or just use the package. The state. is a bit distracting.
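
For reference, a rough sketch of what dropping the state. prefix with Parameters.jl's @unpack could look like (the toy type and function here are made up purely for illustration, not anything in Optim):

using Parameters   # provides the @unpack macro

# toy stand-in for a solver state, just to show the unpacking pattern
type ToyState{T}
    x::Vector{T}
    g::Vector{T}
    alpha::T
end

function toy_step!(state::ToyState)
    @unpack x, g, alpha = state      # locals; no state. prefix in the body
    for i in 1:length(x)
        x[i] -= alpha * g[i]         # arrays are mutated in place
    end
    state.alpha = alpha / 2          # rebound scalars must be written back
end

s = ToyState([1.0, 2.0], [0.1, 0.2], 0.5)
toy_step!(s)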

@pkofod
Member Author

pkofod commented Aug 16, 2016

@KristofferC I also sneaked in the transition from macros to functions in trace that we discussed a while ago.

alpha::T
mayterminate::Bool
f_calls::Int64
g_calls::Int64
Contributor

@Evizero Evizero Aug 16, 2016

Things like f_calls etc. seem generic enough not to belong in each solver-specific state.
Alternatively, one could create a little macro that adds these common fields to the current type:

type BFGSState{T}
    bfgs_specific_variable::Int64

    @add_generic_fields()
end

(I think)
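
For illustration, here is a minimal sketch of what such a field-injecting macro could look like (Julia 0.5 syntax; the particular field names are only a guess, not necessarily what this PR uses):

macro add_generic_fields()
    # esc so the injected declarations still see the enclosing type's T parameter
    esc(quote
        n::Int64          # problem dimension
        x::Array{T}       # current iterate
        f_x::T            # current objective value
        f_calls::Int64
        g_calls::Int64
    end)
end

Used as in the snippet above, the quoted block of field declarations simply gets spliced into the type body.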

Member Author

Sure! At least all the fields for MultivariateOptimizationOptions will be the same.

@pkofod pkofod force-pushed the pkm/refactor branch 2 times, most recently from 700c597 to 8e53c71 on August 16, 2016 20:10
@codecov-io

codecov-io commented Aug 16, 2016

Current coverage is 86.75% (diff: 91.16%)

Merging #266 into master will increase coverage by 0.76%

@@             master       #266   diff @@
==========================================
  Files            31         32     +1   
  Lines          2348       2144   -204   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits           2019       1860   -159   
+ Misses          329        284    -45   
  Partials          0          0          

Powered by Codecov. Last update f0f5c18...b3dff26

@pkofod
Member Author

pkofod commented Aug 18, 2016

Added ParticleSwarm. I just need SimulatedAnnealing and the two Newton methods.

Tests are running longer now, mostly because we recently added more tests, I think. One of the NelderMead tests might also be part of it. This means that I've had to restart the tests a few times due to timeouts. Let's see if it continues to be a problem; if so, we might have to do something about it.

mayterminate::Bool
f_calls::Int64
g_calls::Int64
lsr
Contributor

@Evizero Evizero Aug 18, 2016

Is the type of this not known? Only asking since you went through the trouble of using leaf types everywhere else; a little further down it is initialized with LineSearchResults(T).

EDIT: actually, no, I am wrong; I don't think Array{T} would be a leaf type.

Member Author

I may have forgotten some annotations in the process :) thanks for reading the code so thoroughly.

@pkofod pkofod force-pushed the pkm/refactor branch 4 times, most recently from ac3f5dc to e496672 on August 26, 2016 08:20
@pkofod
Member Author

pkofod commented Aug 26, 2016

Fails because NelderMead is slow; maybe I should drop that test again.

@pkofod
Member Author

pkofod commented Aug 27, 2016

The fields that are in all states are now added with a macro. This is what you meant, right @Evizero? What do you think @johnmyleswhite (and others)? Is it a few lines fewer at the cost of a decrease in readability?

@Evizero
Contributor

Evizero commented Aug 27, 2016

That was what I meant, yes. If it is worth the reduction of code repetition I can't say, but it seems worth considering.

@pkofod
Member Author

pkofod commented Aug 29, 2016

That was what I meant, yes. If it is worth the reduction of code repetition I can't say, but it seems worth considering.

With the general refactoring + the macros (the initialize_linesearch could be a function I guess) there's a pretty nice +/- line ratio, at least.

clear!(lsr)
push!(lsr, zero(T), f_x, dphi0)
type AcceleratedGradientDescentState{T}
@add_generic_fields()
Contributor

Can I suggest we use composition rather than macros? In that approach, you'd create a new type called GenericOptimizationState and place that type inside of each of these types.

Member Author

We could do that. If I had to argue against it, it would probably be that it adds a layer of dots (state.generic.x), but I guess this can be handled with "unpack"-like functionality as in Parameters.
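
A rough sketch of the composition variant (type and field names hypothetical, not necessarily what this PR would end up with):

# the shared part lives in its own type...
type GenericOptimizationState{T}
    x::Array{T}          # iterate
    f_x::T               # objective value
    f_calls::Int64
    g_calls::Int64
end

# ...and each solver state wraps it alongside its own fields
type SomeSolverState{T}
    generic::GenericOptimizationState{T}
    s::Array{T}          # solver-specific search direction
end

# access gains a dot, state.generic.x, which an @unpack-style macro
# could flatten again inside each solver's update!:
# @unpack x, f_x = state.generic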

Member Author

Any comments on my comment, @johnmyleswhite? At least I think we should do this with the line search part, so

method.linesearch!(d, state.x, state.s, state.x_ls, state.g_ls, state.lsr,
                     state.alpha, state.mayterminate)

becomes

method.linesearch!(d, state.x, state.s, state.linesearch_buffers)

or whatever name the field might get.
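
For concreteness, a sketch of what the bundled type could look like (the name and exact contents are hypothetical):

using Optim   # for Optim.LineSearchResults

type LineSearchBuffers{T}
    x_ls::Array{T,1}                   # trial-point buffer
    g_ls::Array{T,1}                   # gradient buffer
    lsr::Optim.LineSearchResults{T}    # history used by the line searches
    alpha::T
    mayterminate::Bool
end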

@ahwillia
Contributor

ahwillia commented Sep 3, 2016

Still really excited about this! One thing I'd like to try is to build on top of this to optimize problems with constraints.

I'm not sure if this will end up being a dead end, but I decided to start experimenting by doing projected gradient descent to solve a non-negative least squares problem (a terrible way to solve it, but a useful toy problem nonetheless).

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)

iterations = 100  # fixed iteration budget; any number works for the experiment
for iter = 1:iterations
    Optim.update!(d, s, gd)
    project!(s.x)
end

Right now I get this error:

ERROR: LoadError: AssertionError: c > 0
 in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:186
 in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:176 (repeats 4 times)
 in update!(::Optim.DifferentiableFunction, ::Optim.GradientDescentState{Float64}, ::Optim.GradientDescent{Void}) at /Users/alex/.julia/v0.5/Optim/src/gradient_descent.jl:52
 in macro expansion; at /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl:23 [inlined]
 in macro expansion; at /Users/alex/.julia/v0.5/ProgressMeter/src/ProgressMeter.jl:473 [inlined]
 in anonymous at ./<missing>:?
 in include_from_node1(::String) at ./loading.jl:426
while loading /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl, in expression starting on line 22

I'll keep playing around with this, but am curious to hear feedback on whether this will mesh well with the intended approach here.

(I also want to mention #254 so that earlier discussion is linked to this PR)

@pkofod
Member Author

pkofod commented Sep 3, 2016

Great to see a simple example of how this could be useful outside of Optim.

Playing around with your example, I actually found a place in NelderMead where I hadn't added the state. prefix :) (apparently none of our tests causes a shrink!)

I'll see if I can investigate it further later.

@ahwillia
Contributor

ahwillia commented Sep 3, 2016

Changing the inner loop like this seems to work (though there is a lot of unnecessary overhead):

s = Optim.initial_state(gd, o, d, x0)
for iter = 1:iterations
    Optim.update!(d, s, gd)
    s = Optim.initial_state(gd, o, d, project!(s.x))
end

Looks like the problem is that update! calculates the gradient at the end of the function so I need to manually recalculate the gradient before moving to the next loop iteration. I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

@johnmyleswhite
Contributor

I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

I think this could be made to work if we want to make update! idempotent in the sense that it will not modify the current state if it detects convergence. But you also need to decide if you want to check for convergence at all.

@pkofod
Member Author

pkofod commented Sep 7, 2016

Changing the inner loop like this seems to work (though there is a lot of unnecessary overhead):

s = Optim.initial_state(gd, o, d, x0)
for iter = 1:iterations
    Optim.update!(d, s, gd)
    s = Optim.initial_state(gd, o, d, project!(s.x))
end

Looks like the problem is that update! calculates the gradient at the end of the function so I need to manually recalculate the gradient before moving to the next loop iteration. I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

Again, I think your example really shows why this refactoring is worth it, but we need to think about how to do this the best way.

We could move the gradient calls to the top. What is the issue with that? Well, then we have a superfluous gradient calculation in the first iteration, as we have already calculated the gradient for the pre-iteration convergence check. If we skip that check, we revive the "line search error if the initial point is a stationary point" problem. Instead, we could remove the initial convergence check and turn the line search error into a warning: when the warning is thrown, a flag is set to true, and as soon as update! is over, the major loop terminates.

This would then require some form of #263 to be implemented. I think that might work.
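
To make the flag idea concrete, a toy illustration of the control flow (all names are made up; the real change would live inside update! and the optimize loop):

type ToyFlagState
    x::Float64
    ls_failed::Bool
end

function toy_update!(state::ToyFlagState)
    ls_success = state.x > 1e-8        # stand-in for the line search outcome
    if !ls_success
        warn("Line search failed; terminating.")
        state.ls_failed = true         # record the failure instead of throwing
        return
    end
    state.x /= 10                      # stand-in for an actual step
end

state = ToyFlagState(1.0, false)
for iter in 1:100
    toy_update!(state)
    state.ls_failed && break           # the major loop stops cleanly after the warning
end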

@anriseth anriseth mentioned this pull request Sep 7, 2016
@pkofod
Member Author

pkofod commented Sep 19, 2016

Still really excited about this! One thing I'd like to try is to build on top of this to optimize problems with constraints.

I'm not sure if this will end up being a dead end, but I decided to start experimenting by doing projected gradient descent to solve a non-negative least squares problem (a terrible way to solve it, but a useful toy problem nonetheless).

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)

iterations = 100  # fixed iteration budget; any number works for the experiment
for iter = 1:iterations
    Optim.update!(d, s, gd)
    project!(s.x)
end

Right now I get this error:

ERROR: LoadError: AssertionError: c > 0
in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:186
in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:176 (repeats 4 times)
in update!(::Optim.DifferentiableFunction, ::Optim.GradientDescentState{Float64}, ::Optim.GradientDescent{Void}) at /Users/alex/.julia/v0.5/Optim/src/gradient_descent.jl:52
in macro expansion; at /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl:23 [inlined]
in macro expansion; at /Users/alex/.julia/v0.5/ProgressMeter/src/ProgressMeter.jl:473 [inlined]
in anonymous at ./<missing>:?
in include_from_node1(::String) at ./loading.jl:426
while loading /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl, in expression starting on line 22

I'll keep playing around with this, but am curious to hear feedback on whether this will mesh well with the intended approach here.

(I also want to mention #254 so that earlier discussion is linked to this PR)

This now works (as far as I can tell)

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)
iterations = 100
for iter = 1:iterations
    Optim.update_state!(d, s, gd)
    project!(s.x)
    Optim.update_g!(d, s, gd)
end

@pkofod
Member Author

pkofod commented Sep 19, 2016

att: @ahwillia

(btw, if I edit a post and add a mention, are people then pinged?)

@pkofod
Member Author

pkofod commented Sep 19, 2016

I believe most of the work in the PR is done by now. The only problem right now is the appearance of state. everywhere. We may want to unpack them à la Parameters.jl, but that is more of a prettifying thing. If doing it now would block work such as the linesearch PR, then I think it can wait. Comments are still very welcome.

@ahwillia
Contributor

👏 nice. I will play around with this later this week. Let me know if there is anything specific you'd like feedback on.

@pkofod
Member Author

pkofod commented Sep 26, 2016

Alright, I'll go ahead and merge this, but this doesn't have to be the final say. We can test it out, and tweak it before the next tag. Thanks for all the comments!

@pkofod pkofod merged commit 051641e into master Sep 26, 2016
@pkofod
Member Author

pkofod commented Sep 29, 2016

I'm not sure if that's right either. For example, this LLVM output looks pretty dire for keyword arguments in extremely simple functions: https://gist.github.com/johnmyleswhite/7484870a0f3656c735ae38bd1d5a0268

I know this is closed now, but @johnmyleswhite we were not the first to be surprised/confused by the wording in the docs JuliaLang/julia#9551 (comment)

@pkofod pkofod deleted the pkm/refactor branch October 17, 2016 20:33