-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Segfault when running a test with custom pivot #105
Comments
/CCing @vbadanov and @BenChampion since it was their code which was fixing the memory leak (and introducing this segafult as an unpleasant side effect). Could you guys please take a look and perhaps propose a solution to this problem? Thank you! |
I've been looking into this today, and I've found a different segfault - my nullptr dereference is here:
Once I fixed this and ran the repro from the correct directory, I can confirm that I can repro @sphynx's segfault on Windows 10 too, my callstack is:
|
Any feedback on this, please? |
I finally got to tracking this down. Debugging auto-generated Cython code is painful to say the least. The proximal cause of the seg fault is that CyLP/cylp/cy/CyPivotPythonBase.pyx Lines 41 to 44 in e619c4b
Then because the pointer to that that CyClpSimplex is only stored in a local variable, it is deallocated again once that variable goes out of scope, which then deallocates the ClpSimplex object. But this object is still in use. It wasn't a problem when the ClpSimplex object wasn't being properly deallocated (and this is in fact probably the reason the original author commented out the deallocation). The exact place this happens in the code is line 5284 of cylp/cy/CyPivotPythonBase.cpp , which is in some cryptic auto-generated part of the code. It's hard to see what's going on, so I'm making an educated guess on all of this.
Obviously, one hacky fix would be just to undo the original patch and let the memory leak, which is what has been happening up until recently. Fixing this the right way would probably involve a lot of effort and some intricate knowledge of both Cython and Clp. What is there is already kind of hacky and I'm wondering how many people are actually using this capability of CyLP versus just wanting to be able to make all the tests pass. I can think of a few hacks to fix this, but so far, I don't see one obvious solution and my Cython knowledge is somewhat limited. If anyone has ideas, let me know. For posterity, here is the backtrace at the exact moment in the execution when
|
it seems like some well-paced Py_INCREF calls in the cython code should be able to fix this. I tried throwing some in blindly and was not successful, but likely it wasn't in the right spot. |
Yeah, I was thinking of something along those lines, but the problem is that anything like that would likely introduce another memory leak because you have to keep that CyLP/cylp/cy/CyPivotPythonBase.cpp Lines 5284 to 5286 in e619c4b
I would guess that commenting out line 5286 would fix the issue, but then introduce a new leak. A relatively clean way to fix this would be to clone It seems pretty clear that the |
(Understood. Putting on my linux packaging hat, I would prioritize getting the tests to pass in the default configuration, even if it means removing some likely unused functionality.) |
I just want to comment that this issue is HUGE! I just downloaded and installed CyLP for one single purpose: I wanted to test some custom pivot rules. There aren't many easy ways to jump right into pivot rules from Python. Alas, no luck. |
On current master (which has the fix for the memory leak: 72d66b5 ) I'm getting a segfault when running standard cylp tests (from the file
cylp/py/test_PySolve.py
).I've converted that test to a simple repro:
File
p0033.mps
comes fromcylp/input
directory.I'm getting the following output:
I could reproduce this after building and installing
cylp
usingpip install .
on both ArchLinux (with system-widecoin-or-cbc
package of version2.10.5-4
) and WSL and also when installing usingconda
.Looking at core dump I can see that it failed in
ClpSimplexPrimal::statusOfProblemInPrimal
function (due to null pointer dereferencing):This problem does not happen if I do any of the following:
cylp
version before memory leak fixes, i.e. this commit: 875d5d5initialSolve
instead ofprimal
in the repro.I tried to understand what exactly is causing the null pointer, but the function
ClpSimplexPrimal::statusOfProblemInPrimal
is around 900 lines long, it's not easy to understand and debug.In general, it looks like this memory leak fix (as described in #102 ) is causing subtle memory issues, so we are either getting the segfault or the memory leaks.
Could you please take a look at this?
Thank you!
The text was updated successfully, but these errors were encountered: