misreported non-convergence of convex function #374

Closed
tpapp opened this issue Feb 21, 2017 · 11 comments

@tpapp (Contributor) commented Feb 21, 2017

This is an MWE I reduced from a bigger problem (finding ML estimates of a logistic regression using summary statistics). optimize always finds the same minimizer (as expected, since the function is convex; I also compared to GLM.jl). But BFGS() without autodiff reports non-convergence, while with autodiff it reports convergence. ConjugateGradient reports non-convergence both with and without autodiff.

using Optim
using StatsFuns

N0 = [1179,2986,581352,2464,3334,5618,2954,496,503]
N1 = [656,563,9354,545,468,720,678,162,190]
X = [1.0 -0.333333 -0.333333 -0.333333 -0.333333;
     1.0 -0.333333 0.666667 0.666667 -0.333333;
     1.0 -0.333333 0.666667 -0.333333 0.666667;
     1.0 -0.333333 0.666667 -0.333333 -0.333333;
     1.0 -0.333333 -0.333333 -0.333333 0.666667;
     1.0 0.666667 -0.333333 -0.333333 0.666667;
     1.0 0.666667 -0.333333 0.666667 -0.333333;
     1.0 -0.333333 -0.333333 0.666667 -0.333333;
     1.0 0.666667 -0.333333 -0.333333 -0.333333]

function ℓ_logistic_regression(X, N0, N1, β)
    Xβ = X*β
    sum(-N0.*Xβ - (N0+N1).*log1pexp.(-Xβ))
end

function init_logistic_regression(X, N0, N1)
    X \ logit.(N1./(N0+N1))
end

Optim.optimize(β -> -ℓ_logistic_regression(X, N0, N1, β),
               init_logistic_regression(X, N0, N1),
               BFGS(), Optim.Options(autodiff = true)) # change the last line for various results

For example:

julia> Optim.optimize(β -> -ℓ_logistic_regression(X, N0, N1, β),
       init_logistic_regression(X, N0, N1),
       BFGS())
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [-1.7194691162922449,-0.2770378660884508, ...]
 * Minimizer: [-1.8220199098825927,-0.25189789184931655, ...]
 * Minimum: 5.882142e+04
 * Iterations: 12
 * Convergence: false
   * |x - x'| < 1.0e-32: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: false
   * |g(x)| < 1.0e-08: false
   * f(x) > f(x'): true
   * Reached Maximum Number of Iterations: false
 * Objective Function Calls: 51
 * Gradient Calls: 51
@anriseth (Contributor)

You are hitting the allow_f_increases issue: by default, the solver stops if a step increases the objective value. If you use Optim.Options(allow_f_increases = true), the solver will ignore such increases and continue:

Optim.optimize(β -> -ℓ_logistic_regression(X, N0, N1, β),
               init_logistic_regression(X, N0, N1),
               BFGS(), Optim.Options(autodiff = false, allow_f_increases = true)) # change the last line for various results
Results of Optimization Algorithm
 * Algorithm: BFGS
 * Starting Point: [-1.7194691162922449,-0.277037866088451, ...]
 * Minimizer: [-1.8220201057317167,-0.2518974050806891, ...]
 * Minimum: 5.882142e+04
 * Iterations: 19
 * Convergence: true
   * |x - x'| < 1.0e-32: false
   * |f(x) - f(x')| / |f(x)| < 1.0e-32: true
   * |g(x)| < 1.0e-08: false
   * f(x) > f(x'): true
   * Reached Maximum Number of Iterations: false
 * Objective Function Calls: 76
 * Gradient Calls: 76

I don't do much regression, but it looks to me like your system is quite poorly scaled (the minimum value is larger than 10^4).
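
For illustration (an editorial sketch, not part of the original thread): one way to address the scaling, assuming the same MWE definitions as above, is to divide the objective by the total count so the minimum is O(1) and the absolute tolerances become meaningful:

# Hypothetical rescaling, not from the thread: dividing the negative
# log-likelihood by the total number of observations keeps the objective
# near O(1), so the absolute x- and gradient tolerances have a chance
# of being met.
scale = sum(N0 + N1)

Optim.optimize(β -> -ℓ_logistic_regression(X, N0, N1, β) / scale,
               init_logistic_regression(X, N0, N1),
               BFGS())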

@pkofod (Member) commented Feb 21, 2017

Maybe we should reword that output a bit.

@KristofferC (Contributor)

Definitely.

@anriseth (Contributor)

From the tolerances given by the user, the BFGS algorithm has not converged. It is close to the correct solution, but because of poor scaling (maybe?) and finite differences it has not reached the supplied tolerances.

@pkofod (Member) commented Feb 21, 2017

Yes to that part, but I think most users will fail to see the increase flag in the posted output and realize what happened.

@tpapp (Contributor, Author) commented Feb 21, 2017

My bad — the optimization packages (not in Julia) I have been using up to now had termination criteria like

|x - x'| < ε⋅(1+|x|)
|g(x)| < δ⋅(1+|f(x)|)

which are less sensitive to scaling of the objective, and I did not realize that here they are relative, not absolute. I wonder if the above criteria would make sense for Optim.

@tpapp (Contributor, Author) commented Feb 25, 2017

I was wondering about the best way to incorporate the convergence criteria above into Optim, so that I could submit a PR. They generalize with parameters α, β as

|x - x'| < ε⋅(α+|x|)
|g(x)| < δ⋅(β+|f(x)|)

where α=β=0 is the current convergence criterion and α=β=1 is the proposed one.
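
For concreteness, a sketch of what the parametric check could look like (illustrative only; the function name and signature are hypothetical, not Optim's API):

# Sketch of the generalized criteria proposed above; the name and
# signature are hypothetical, not part of Optim.
using LinearAlgebra: norm  # norm is in Base on the Julia of that era

function generalized_convergence(x, x_prev, fx, gx, ε, δ, α, β)
    x_conv = norm(x - x_prev) < ε * (α + norm(x))   # |x - x'| < ε⋅(α+|x|)
    g_conv = norm(gx, Inf)   < δ * (β + abs(fx))    # |g(x)| < δ⋅(β+|f(x)|)
    x_conv || g_conv
end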

@anriseth (Contributor) commented Feb 26, 2017

The current convergence checks can be found here: https://github.com/JuliaNLSolvers/Optim.jl/blob/master/src/utilities/assess_convergence.jl

I agree that it would be useful to implement other types of convergence criteria (esp. relative ones). Maybe even make it modular so that the user can supply their own tests as well? (Or can this be done with callbacks already?)

> where α=β=0 is the current convergence criterion and α=β=1 is the proposed one.

I believe the current check on the gradient is the absolute |g(x)| < tol, and so is the check on displacement |x - x'| < tol, so α = β = 0 will not cover either of the cases above.

@pkofod (Member) commented Mar 23, 2017

> My bad — the optimization packages (not in Julia) I have been using up to now had termination criteria like

What packages in particular?

@pkofod (Member) commented Apr 20, 2017

> (Or can this be done with callbacks already?)

Yes
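
For illustration (an editorial sketch, not from the thread): a user-supplied stopping test along the lines proposed above can be wired in through the callback option, assuming the callback receives the latest OptimizationState (with value and g_norm fields) and that returning true halts the run:

# Sketch of a custom stopping test via Optim.Options(callback = ...).
# Assumes the callback is passed the latest OptimizationState and that
# returning true stops the optimization.
δ = 1e-8
cb = state -> state.g_norm < δ * (1 + abs(state.value))

Optim.optimize(β -> -ℓ_logistic_regression(X, N0, N1, β),
               init_logistic_regression(X, N0, N1),
               BFGS(), Optim.Options(callback = cb))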

@pkofod (Member) commented Apr 21, 2017

Closing this, as it was a misunderstanding of the output. We can continue the discussion about stopping criteria in a dedicated issue.

@pkofod closed this as completed Apr 21, 2017
@pkofod mentioned this issue Apr 21, 2017