[WIP] [RFC!] Refactoring... #266

Merged: 20 commits from pkm/refactor into master on Sep 26, 2016

Conversation

pkofod
Member

@pkofod pkofod commented Aug 16, 2016

A first attempt, proof of concept, work in progress, and whatever other words you might want to use for temporary or unfinished.

I didn't want to go through the entire package without any comments, so I would love to hear everyone's input.

There are some weird things in there; specifically, a few new types are simply invented for dispatch reasons. They will go away again.

Right now, I'm seeing state. here, state. there, state.s everywhere! Should we use something like the macros in Parameters such that we can omit the state.s?

I'll keep on working on it, but criticism is very welcome.

TODO

  • Add all multivariate solvers
  • handle generic fields (includes removing the method_name in favour of a default field in the types)
  • cleanup (whitespace, Refactored, etc.)
  • move callback and uncomment
  • state.s, state.s everywhere!

@tbreloff

I'm not as familiar with the codebase as others, but from my quick review I like what I see. 👍

@KristofferC
Contributor

KristofferC commented Aug 16, 2016

More removed lines than added? Ship it!

But seriously. I also think that something like Parameters would be good, either manually or just use the package. The state. is a bit distracting.
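
For reference, a rough sketch of what dropping the state. prefix with Parameters.jl's @unpack could look like (the toy type and function here are made up purely for illustration, not anything in Optim):

using Parameters   # provides the @unpack macro

# toy stand-in for a solver state, just to show the unpacking pattern
type ToyState{T}
    x::Vector{T}
    g::Vector{T}
    alpha::T
end

function toy_step!(state::ToyState)
    @unpack x, g, alpha = state      # locals; no state. prefix in the body
    for i in 1:length(x)
        x[i] -= alpha * g[i]         # arrays are mutated in place
    end
    state.alpha = alpha / 2          # rebound scalars must be written back
end

s = ToyState([1.0, 2.0], [0.1, 0.2], 0.5)
toy_step!(s)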

@pkofod
Member Author

pkofod commented Aug 16, 2016

@KristofferC I also sneaked in the transition from macros to functions in trace that we discussed a while ago.

alpha::T
mayterminate::Bool
f_calls::Int64
g_calls::Int64
Contributor

@Evizero Evizero Aug 16, 2016

Things like f_calls etc. seem generic enough not to belong in each solver-specific state.
Alternatively, one could create a little macro that adds these common fields to the current type:

type BFGSState{T}
    bfgs_specific_variable::Int64

    @add_generic_fields()
end

(I think)
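
For illustration, here is a minimal sketch of what such a field-injecting macro could look like (Julia 0.5 syntax; the particular field names are only a guess, not necessarily what this PR uses):

macro add_generic_fields()
    # esc so the injected declarations still see the enclosing type's T parameter
    esc(quote
        n::Int64          # problem dimension
        x::Array{T}       # current iterate
        f_x::T            # current objective value
        f_calls::Int64
        g_calls::Int64
    end)
end

Used as in the snippet above, the quoted block of field declarations simply gets spliced into the type body.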

Member Author

Sure! At least all the fields for MultivariateOptimizationOptions will be the same.

@pkofod pkofod force-pushed the pkm/refactor branch 2 times, most recently from 700c597 to 8e53c71 on August 16, 2016 20:10
@codecov-io

codecov-io commented Aug 16, 2016

Current coverage is 86.75% (diff: 91.16%)

Merging #266 into master will increase coverage by 0.76%

@@             master       #266   diff @@
==========================================
  Files            31         32     +1   
  Lines          2348       2144   -204   
  Methods           0          0          
  Messages          0          0          
  Branches          0          0          
==========================================
- Hits           2019       1860   -159   
+ Misses          329        284    -45   
  Partials          0          0          

Powered by Codecov. Last update f0f5c18...b3dff26

@pkofod
Member Author

pkofod commented Aug 18, 2016

Added ParticleSwarm. I just need SimulatedAnnealing and the two Newton methods.

Tests are running longer now, mostly because we recently added more tests, I think. One of the NelderMead tests might also be part of it. This means that I've had to restart the tests a few times due to timeouts. Let's see if it continues to be a problem; if so, we might have to do something about it.

mayterminate::Bool
f_calls::Int64
g_calls::Int64
lsr
Contributor

@Evizero Evizero Aug 18, 2016

Is the type of this not known? Only asking since you went through the trouble of using leaf types everywhere else; a little further down it is initialized with LineSearchResults(T).

EDIT: actually, no, I am wrong; I don't think Array{T} would be a leaf type.

Member Author

I may have forgotten some annotations in the process :) thanks for reading the code so thoroughly.

@pkofod pkofod force-pushed the pkm/refactor branch 4 times, most recently from ac3f5dc to e496672 on August 26, 2016 08:20
@pkofod
Member Author

pkofod commented Aug 26, 2016

Fails because NelderMead is slow; maybe I should drop that test again.

@pkofod
Member Author

pkofod commented Aug 27, 2016

The fields that are in all states are now added with a macro. This is what you meant, right @Evizero? What do you think @johnmyleswhite (and others)? Is it a few lines fewer at the cost of a decrease in readability?

@Evizero
Contributor

Evizero commented Aug 27, 2016

That was what I meant, yes. If it is worth the reduction of code repetition I can't say, but it seems worth considering.

@pkofod
Member Author

pkofod commented Aug 29, 2016

That was what I meant, yes. If it is worth the reduction of code repetition I can't say, but it seems worth considering.

With the general refactoring + the macros (the initialize_linesearch could be a function I guess) there's a pretty nice +/- line ratio, at least.

clear!(lsr)
push!(lsr, zero(T), f_x, dphi0)
type AcceleratedGradientDescentState{T}
@add_generic_fields()
Contributor

Can I suggest we use composition rather than macros? In that approach, you'd create a new type called GenericOptimizationState and place that type inside of each of these types.

Member Author

We could do that. If I had to argue against it, it would probably be that it adds a layer of dots (state.generic.x), but I guess this can be handled with "unpack"-like functionality as in Parameters.
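
A rough sketch of the composition variant (type and field names hypothetical, not necessarily what this PR would end up with):

# the shared part lives in its own type...
type GenericOptimizationState{T}
    x::Array{T}          # iterate
    f_x::T               # objective value
    f_calls::Int64
    g_calls::Int64
end

# ...and each solver state wraps it alongside its own fields
type SomeSolverState{T}
    generic::GenericOptimizationState{T}
    s::Array{T}          # solver-specific search direction
end

# access gains a dot, state.generic.x, which an @unpack-style macro
# could flatten again inside each solver's update!:
# @unpack x, f_x = state.generic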

Member Author

Any comments on my comment, @johnmyleswhite? At least I think we should do this with the line search part, so

method.linesearch!(d, state.x, state.s, state.x_ls, state.g_ls, state.lsr,
                     state.alpha, state.mayterminate)

becomes

method.linesearch!(d, state.x, state.s, state.linesearch_buffers)

or whatever name the field might get.
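
For concreteness, a sketch of what the bundled type could look like (the name and exact contents are hypothetical):

using Optim   # for Optim.LineSearchResults

type LineSearchBuffers{T}
    x_ls::Array{T,1}                   # trial-point buffer
    g_ls::Array{T,1}                   # gradient buffer
    lsr::Optim.LineSearchResults{T}    # history used by the line searches
    alpha::T
    mayterminate::Bool
end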

@ahwillia
Contributor

ahwillia commented Sep 3, 2016

Still really excited about this! One thing I'd like to try is to build on top of this to optimize problems with constraints.

I'm not sure if this will end up being a dead end, but I decided to start experimenting by doing projected gradient descent to solve a non-negative least squares problem (a terrible way to solve it, but a useful toy problem nonetheless).

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)

iterations = 100  # fixed iteration budget; any number works for the experiment
for iter = 1:iterations
    Optim.update!(d, s, gd)
    project!(s.x)
end

Right now I get this error:

ERROR: LoadError: AssertionError: c > 0
 in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:186
 in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:176 (repeats 4 times)
 in update!(::Optim.DifferentiableFunction, ::Optim.GradientDescentState{Float64}, ::Optim.GradientDescent{Void}) at /Users/alex/.julia/v0.5/Optim/src/gradient_descent.jl:52
 in macro expansion; at /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl:23 [inlined]
 in macro expansion; at /Users/alex/.julia/v0.5/ProgressMeter/src/ProgressMeter.jl:473 [inlined]
 in anonymous at ./<missing>:?
 in include_from_node1(::String) at ./loading.jl:426
while loading /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl, in expression starting on line 22

I'll keep playing around with this, but am curious to hear feedback on whether this will mesh well with the intended approach here.

(I also want to mention #254 so that earlier discussion is linked to this PR)

@pkofod
Member Author

pkofod commented Sep 3, 2016

Great to see a simple example of how this could be useful outside of Optim.

Playing around with your example, I actually found a place in NelderMead where I hadn't added the state. prefix :) (apparently none of our tests causes a shrink!)

I'll see if I can investigate it further later.

@ahwillia
Contributor

ahwillia commented Sep 3, 2016

Changing the inner loop like this seems to work (though there is a lot of unnecessary overhead):

s = Optim.initial_state(gd, o, d, x0)
for iter = 1:iterations
    Optim.update!(d, s, gd)
    s = Optim.initial_state(gd, o, d, project!(s.x))
end

Looks like the problem is that update! calculates the gradient at the end of the function so I need to manually recalculate the gradient before moving to the next loop iteration. I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

@johnmyleswhite
Contributor

I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

I think this could be made to work if we want to make update! idempotent in the sense that it will not modify the current state if it detects convergence. But you also need to decide if you want to check for convergence at all.

@pkofod
Member Author

pkofod commented Sep 7, 2016

Changing the inner loop like this seems to work (though there is a lot of unnecessary overhead):

s = Optim.initial_state(gd, o, d, x0)
for iter = 1:iterations
    Optim.update!(d, s, gd)
    s = Optim.initial_state(gd, o, d, project!(s.x))
end

Looks like the problem is that update! calculates the gradient at the end of the function so I need to manually recalculate the gradient before moving to the next loop iteration. I haven't given this much thought, but would it make sense for update! to calculate the gradient before doing linesearch and moving the parameters?

Again, I think your example really shows why this refactoring is worth it, but we need to think about how to do this the best way.

We could move the gradient calls to the top. What is the issue with that? Well, then we have a superfluous gradient calculation in the first iteration, as we have already calculated the gradient for the pre-iteration convergence check. If we skip that check, we revive the "line search error if the initial point is a stationary point" problem. Instead, we could remove the initial convergence check and turn the line search error into a warning: when the warning is thrown, a flag is set to true, and as soon as update! is over, the major loop terminates.

This would then require some form of #263 to be implemented. I think that might work.
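
To make the flag idea concrete, a toy illustration of the control flow (all names are made up; the real change would live inside update! and the optimize loop):

type ToyFlagState
    x::Float64
    ls_failed::Bool
end

function toy_update!(state::ToyFlagState)
    ls_success = state.x > 1e-8        # stand-in for the line search outcome
    if !ls_success
        warn("Line search failed; terminating.")
        state.ls_failed = true         # record the failure instead of throwing
        return
    end
    state.x /= 10                      # stand-in for an actual step
end

state = ToyFlagState(1.0, false)
for iter in 1:100
    toy_update!(state)
    state.ls_failed && break           # the major loop stops cleanly after the warning
end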

@anriseth anriseth mentioned this pull request Sep 7, 2016
@pkofod
Member Author

pkofod commented Sep 19, 2016

Still really excited about this! One thing I'd like to try is to build on top of this to optimize problems with constraints.

I'm not sure if this will end up being a dead end, but I decided to start experimenting by doing projected gradient descent to solve a non-negative least squares problem (a terrible way to solve it, but a useful toy problem nonetheless).

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)

iterations = 100  # fixed iteration budget; any number works for the experiment
for iter = 1:iterations
    Optim.update!(d, s, gd)
    project!(s.x)
end

Right now I get this error:

ERROR: LoadError: AssertionError: c > 0
in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:186
in hz_linesearch!(::Optim.DifferentiableFunction, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Array{Float64,1}, ::Optim.LineSearchResults{Float64}, ::Float64, ::Bool, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Float64, ::Int64, ::Float64, ::Int64) at /Users/alex/.julia/v0.5/Optim/src/linesearch/hz_linesearch.jl:176 (repeats 4 times)
in update!(::Optim.DifferentiableFunction, ::Optim.GradientDescentState{Float64}, ::Optim.GradientDescent{Void}) at /Users/alex/.julia/v0.5/Optim/src/gradient_descent.jl:52
in macro expansion; at /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl:23 [inlined]
in macro expansion; at /Users/alex/.julia/v0.5/ProgressMeter/src/ProgressMeter.jl:473 [inlined]
in anonymous at ./<missing>:?
in include_from_node1(::String) at ./loading.jl:426
while loading /Users/alex/.julia/v0.5/Optim/examples/projected_gd.jl, in expression starting on line 22

I'll keep playing around with this, but am curious to hear feedback on whether this will mesh well with the intended approach here.

(I also want to mention #254 so that earlier discussion is linked to this PR)

This now works (as far as I can tell)

using Optim

const A = rand(10,10)
const b = rand(10)

f(x) = 0.5*sumabs2(A*x - b)
d = DifferentiableFunction(f)
x0 = randn(10)

# project onto non-negative orthant
project!(x) = map!(z-> z<0 ? 0.0 : z, x)

gd = GradientDescent()
o = OptimizationOptions()

s = Optim.initial_state(gd, o, d, x0)
iterations = 100
for iter = 1:iterations
    Optim.update_state!(d, s, gd)
    project!(s.x)
    Optim.update_g!(d, s, gd)
end

@pkofod
Member Author

pkofod commented Sep 19, 2016

att: @ahwillia

(btw, if I edit a post and add a mention, are people then pinged?)

@pkofod
Member Author

pkofod commented Sep 19, 2016

I believe most of the work in the PR is done by now. The only problem right now is the appearance of state. everywhere. We may want to unpack them à la Parameters.jl, but that is more of a prettifying thing. If doing it now would block work such as the linesearch PR, then I think it can wait. Comments are still very welcome.

@ahwillia
Contributor

👏 nice. I will play around with this later this week. Let me know if there is anything specific you'd like feedback on.

@pkofod
Member Author

pkofod commented Sep 26, 2016

Alright, I'll go ahead and merge this, but this doesn't have to be the final say. We can test it out, and tweak it before the next tag. Thanks for all the comments!

@pkofod pkofod merged commit 051641e into master Sep 26, 2016
@pkofod
Member Author

pkofod commented Sep 29, 2016

I'm not sure if that's right either. For example, this LLVM output looks pretty dire for keyword arguments in extremely simple functions: https://gist.github.com/johnmyleswhite/7484870a0f3656c735ae38bd1d5a0268

I know this is closed now, but @johnmyleswhite we were not the first to be surprised/confused by the wording in the docs JuliaLang/julia#9551 (comment)

@pkofod pkofod deleted the pkm/refactor branch October 17, 2016 20:33