Skip to content

FEAT: Add linprog_simplex #413

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 11 commits into from
Closed

FEAT: Add linprog_simplex #413

wants to merge 11 commits into from

Conversation

QBatista
Copy link
Member

@QBatista QBatista commented Jun 3, 2018

Implements the original simplex method using Bland's rule and accelerated with Numba to be used in the repeated games outerapproximation algorithm implementation.

Performance comparison

For assessing the performance, I picked one of the real-world problems from the NETLIB LP test problem set, which are available in Scipy (see here).

linprog_test = np.load('linprog_benchmark_files/ADLITTLE.npz')

M_ub, N = linprog_test['A_ub'].shape
M_eq, N = linprog_test['A_eq'].shape

tableau = make_tableau(linprog_test['c'], linprog_test['A_ub'], linprog_test['b_ub'], linprog_test['A_eq'], linprog_test['b_eq'])

%timeit linprog_simplex(tableau, N, M_ub, M_eq)
%timeit linprog(linprog_test['c'], linprog_test['A_ub'],
                          linprog_test['b_ub'], linprog_test['A_eq'],
                          linprog_test['b_eq'], method='simplex')
%timeit linprog(linprog_test['c'], linprog_test['A_ub'],
                          linprog_test['b_ub'], linprog_test['A_eq'],
                          linprog_test['b_eq'], method='interior-point')
1000 loops, best of 3: 328 µs per loop
10 loops, best of 3: 24.3 ms per loop
10 loops, best of 3: 16.8 ms per loop

Line profiling

%load_ext line_profiler

%lprun -f linprog_simplex linprog_simplex(tableau, N, M_ub, M_eq)
Timer unit: 1e-06 s

Total time: 0.001692 s
File: <ipython-input-258-890f100c68e6>
Function: linprog_simplex at line 16

Line #      Hits         Time  Per Hit   % Time  Line Contents
==============================================================
    16                                           def linprog_simplex(tableau, N, M_ub=0, M_eq=0, maxiter=10000, tol=1e-10):
    17                                               """
    18                                               Solve the following LP problem using the original Simplex method with
    19                                               Bland's rule:
    20
    21                                               Minimize:    c.T @ x
    22                                               Subject to:  A_ub @ x ≤ b_ub
    23                                                            A_eq @ x = b_eq
    24                                                            x ≥ 0
    25
    26                                               Jit-compiled in `nopython` mode.
    27
    28                                               Parameters
    29                                               ----------
    30                                               tableau : ndarray(float, ndim=2)
    31                                                   2-D array containing the standardized LP problem in detached
    32                                                   coefficients form augmented with the artificial variables and the
    33                                                   infeasibility form and with nonnegative constant terms. The
    34                                                   infeasibility form is assumed to be placed in the last row, and the
    35                                                   objective function in the second last row.
    36
    37                                               N : scalar(int)
    38                                                   Number of control variables.
    39
    40                                               M_ub : scalar(int)
    41                                                   Number of inequality constraints.
    42
    43                                               M_eq : scalar(int)
    44                                                   Number of equality constraints.
    45
    46                                               maxiter : scalar(int), optional(default=10000)
    47                                                   Maximum number of pivot operation for each phase.
    48
    49                                               tol : scalar(float), optional(default=1e-10)
    50                                                   Tolerance for treating an element as nonpositive.
    51
    52                                               Return
    53                                               ----------
    54                                               sol : ndarray(float, ndim=1)
    55                                                   A basic solution to the linear programming problem.
    56
    57                                               References
    58                                               ----------
    59
    60                                               [1] Dantzig, George B., Linear programming and extensions. Rand Corporation
    61                                                   Research Study Princeton Univ. Press, Princeton, NJ, 1963
    62
    63                                               [2] Bland, Robert G. (May 1977). "New finite pivoting rules for the
    64                                                   simplex method". Mathematics of Operations Research. 2 (2): 103–107.
    65
    66                                               [2] http://mat.gsia.cmu.edu/classes/QUANT/NOTES/chap7.pdf
    67
    68                                               """
    69         1          4.0      4.0      0.2      M = M_ub + M_eq
    70
    71                                               # Phase I
    72         1          1.0      1.0      0.1      num_vars = N + M_ub + M
    73         1        297.0    297.0     17.6      simplex_algorithm(tableau, M, num_vars, maxiter=maxiter)
    74
    75         1         12.0     12.0      0.7      if abs(tableau[-1, -1]) > tol:
    76                                                   raise ValueError(infeasible_err_msg)
    77
    78         1         20.0     20.0      1.2      nonpos_coeff = tableau[-1, :] <= tol
    79         1        141.0    141.0      8.3      tableau = tableau[:-1, nonpos_coeff]
    80
    81                                               # Phase II
    82         1         35.0     35.0      2.1      num_vars = nonpos_coeff[:-1].sum()
    83         1        998.0    998.0     59.0      simplex_algorithm(tableau, M, num_vars, maxiter=maxiter)
    84
    85         1        183.0    183.0     10.8      sol = _find_basic_solution(tableau, nonpos_coeff, N, M)
    86
    87         1          1.0      1.0      0.1      return sol

Limitations

  • The solver fails on some problems that can be solved using interior point methods such as SHARE1B.npz.

  • Bland's rule can slow down the process massively. On the BLEND.npz example, the solver does not even reach phase II after more than 100 million iterations while Scipy's implementation without Bland's rule converges after 130 iterations. Scipy's implementation with Bland's rule did not converge after 500K iterations.

  • Numerical instability causes issues for some real world problems. For example, AGG.npz fails with the default tolerance but gets very close to the solution if the tolerance is lowered to 1e-7. Scipy's interior-point method works on this problem with the default tolerance.

@oyamad Should we aim to solve all the problems in Netlib LP? This is beyond the aim of this PR but it might be a nice goal if we want a fast and reliable LP solver.

@coveralls
Copy link

coveralls commented Jun 3, 2018

Coverage Status

Coverage increased (+0.3%) to 94.678% when pulling 7f926d2 on QBatista:master into 72bd401 on QuantEcon:master.

@oyamad
Copy link
Member

oyamad commented Jun 3, 2018

A few comments at the moment:

  1. You should try to explore the reason of the failure in each case.
  2. Did you try other tie breaking rules, such as the lexico-minimum ratio test?
  3. Be sure to indicate the source whenever you copy-paste code form somewhere else (such as SciPy).
  4. Regarding the organization, better to create a submodule (subdirectory) under quantecon, possibly with the name optimize?
  5. I would return a specific status code, rather than raising an error, when the maximum number of iteration is reached.

@QBatista
Copy link
Member Author

QBatista commented Jun 4, 2018

Sure, I'll work on those. linprog might be better unless we want to include other types of optimizers as well?

@oyamad
Copy link
Member

oyamad commented Jun 4, 2018

See the comments following #216 (comment).

@cc7768
Copy link
Member

cc7768 commented Jun 4, 2018

This is an excellent contribution! I hope that this allows the repeated games code to be more effective than it initially was in Python.

@jstac is planning on having more numba based optimization routines (the discussion @oyamad linked to) and I agree with everyone that having an optimize submodule is a great idea.

@oyamad, let me know if you need another set of eyes while reviewing this code. A quick glance reveals that the documentation is well done and thorough, but I haven't carefully read the code.

@oyamad
Copy link
Member

oyamad commented Jun 5, 2018

@cc7768 Your review will be appreciated. One of the great things of this PR is that a big set of tests is already prepared (from SciPy). As long as the tests pass (they do), we can first discuss for the "best" public API, which should depend on possible use cases.

@QBatista Can you make available somewhere your Python notebook in which your linprog is used in the repeated game routine as an example?

@mmcky mmcky requested review from cc7768 and oyamad June 5, 2018 23:55
@QBatista
Copy link
Member Author

QBatista commented Jun 6, 2018

Here is the notebook: https://nbviewer.jupyter.org/github/QBatista/Notebooks/blob/master/linprog_simplex%20%5BWIP%5D.ipynb
I haven't cleaned up the code yet, I only replaced the solver.

I also fixed a bug that was related to a misunderstanding of how the already existing min_ratio_test function works. From the documentation, I implicitly thought that only argmins[:num_argmins] would be modified when running min_ratio_test. This is however not necessarily the case, which lead to the bug. Would it be helpful to clarify this in the documentation?

@oyamad
Copy link
Member

oyamad commented Jun 6, 2018

What variable does min_ratio_test modify other than argmins?

@QBatista
Copy link
Member Author

QBatista commented Jun 6, 2018

I meant that I thought only the num_argmins first elements of argmins would be modified.

@oyamad
Copy link
Member

oyamad commented Jun 6, 2018

If the documentation says an array is "modified in place", the one should not expect that a particular part of that array will not be modified.

@QBatista
Copy link
Member Author

QBatista commented Jun 6, 2018

Ok, sounds good.

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

@oyamad I am not sure how you want me to use the lexico-minimum ratio test. Bland's rule determines how both the pivot row and pivot column are chosen while the lexico-minimum test as implemented in lemke_howson takes the pivot column as an argument and chooses the pivot row. Do you have a specific reference in mind you would like me to implement? It seems that "lexicographic rule" usually refers to this, but it is possible to find others.

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

I also figured out what was going wrong with the BLEND.npz example. I had an older version of Scipy (1.0.0) on computer which I followed for the implementation of Bland's rule and which actually contained a bug. To fix this, I need to keep track of the set of basic variables in tableau, which will require some refactoring but should actually make some parts of the code easier to understand.

@oyamad
Copy link
Member

oyamad commented Jun 7, 2018

Which part of that paper should I look at?

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

Part 2 (up to the proof of theorem 5) describes how to solve the generalized matrix problem in which the LP problem is embedded. Part 3 describes one of the possible ways of embedding it. The term "lexicographic rule" is not explicitly used in the paper, but the variant is referred to as using the lexicographic rule, for example, by Pan (2014) which you showed me.

@oyamad
Copy link
Member

oyamad commented Jun 7, 2018

A quick Google search found this.

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

Ok, I can implement this. I think it's the same as in the paper but explained in much simpler terms. The pivot column is chosen arbitrarily among nonbasic variables; should I try different rules and compare them?

@oyamad
Copy link
Member

oyamad commented Jun 7, 2018

Is _lex_min_ratio_test in lemke_howson not usable?

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

It should be for choosing the pivot row, but my point is about the pivot column. For example, the pivot column can be chosen to be the index of the first nonbasic variable found or that of the one with the largest coefficient.

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

To clarify, _lex_min_ratio_test takes pivot as an argument which is what I am referring to as the pivot column, and returns row_min which I am referring to as the pivot row.

@oyamad
Copy link
Member

oyamad commented Jun 7, 2018

I don't see the issue. Why does the way to choose the pivot column matter (for the purpose of tie breaking), once the right hand side is perturbed?

@QBatista
Copy link
Member Author

QBatista commented Jun 7, 2018

It doesn't matter for the specific purpose of tie breaking, but wouldn't it interact with the rule for choosing the pivot column in terms of overall performance? This might not actually have much of an impact on the performance, especially if ties don't occur frequently, but I was intrigued because there is no explicit choice of a rule.

@oyamad
Copy link
Member

oyamad commented Jun 8, 2018

A quick Google search found this (for the revised simplex). Also Pan (2014) has a chapter on "Pivot Rule".

@QBatista
Copy link
Member Author

QBatista commented Jun 8, 2018

Thanks, this looks very useful!

@QBatista QBatista force-pushed the master branch 3 times, most recently from d7e6d56 to a57af7f Compare June 9, 2018 13:26
@QBatista
Copy link
Member Author

QBatista commented Jul 5, 2018

@oyamad I've reorganized the commit history -- is this good for you?

Copy link
Member

@cc7768 cc7768 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just went through entire code and have effectively nothing useful to say. I made one comment in a file about being consistent between Roman numerals and numbers. I didn't check the algorithm itself very carefully, but I'm convinced it is doing what we expect given that the code is giving the right answer to the (extensive) tests that you have included.

The only other thing that we might consider is moving this into the optimize sub-package . What do @oyamad and @mmcky think about this?


infeasible_err_msg = "The problem appears to be infeasible"
unbounded_obj = "The problem appears to be unbounded."
max_iter_p1 = "The maximum number of iterations has been reached in Phase I"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Small thing, but in max_iter_p2 you use "Phase 2" and in max_iter_p1 you use "Phase I" -- Either change this to "Phase 1" or the other to "Phase II" just to be consistent

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cc7768 Thanks!

@QBatista QBatista force-pushed the master branch 7 times, most recently from 4390252 to 5259498 Compare August 1, 2018 13:40
@mmcky
Copy link
Contributor

mmcky commented Aug 13, 2018

thanks @cc7768 for your review - greatly appreciated.

This is looking pretty complete now. @oyamad I will leave you to merge this when you're happy with it.

@QBatista QBatista force-pushed the master branch 2 times, most recently from df73c24 to 3f7ddd2 Compare September 12, 2018 22:43
@mmcky
Copy link
Contributor

mmcky commented Apr 3, 2020

@QBatista is this still being worked on? Can we close these PR's

@QBatista
Copy link
Member Author

QBatista commented Apr 8, 2020

@mmcky Yes, I'll close this PR for now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants