Issue with PyPy on Travis #1730

Open · pganssle opened this issue Apr 2, 2019 · 11 comments · Fixed by #1746

@pganssle (Member) commented Apr 2, 2019

All jobs are now failing on PyPy; see this job. Locally I cannot replicate the failure with pypy2.7-X, but the same tests do fail with "Too many open files" on pypy3.6-7.1.0, which makes this very hard to debug. @gaborbernat tried to replicate it on a MacBook: he similarly can't reproduce the 2.7 failure, and his 3.6 failure shows a different error message than the one I get.

There may be two problems here, possibly with similar causes. It's possible that Travis has scaled down the amount of memory available and that's what is biting us in the pypy2.7 job, but it's hard to tell.

If we want a short-term mitigation, we can maybe try:

  1. Updating the pypy version in the job.
  2. Migrating the pypy job to Azure Pipelines (though that would require getting Add Azure Pipelines #1721 working).

This is a major blocking issue IMO.

@webknjaz (Member) commented Apr 2, 2019

You can try pypy2.7-6.0.0 and pypy3.5-6.0.0 under xenial.
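
In .travis.yml terms, that suggestion would look something like this (a sketch; the exact python: version strings are assumed to match what Travis provides on xenial):

dist: xenial
language: python
matrix:
  include:
    - python: pypy2.7-6.0.0
    - python: pypy3.5-6.0.0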

@webknjaz (Member) commented Apr 2, 2019

@pganssle it's very poorly advertised, but you can ask Travis support to enable debug mode on the repo: https://docs.travis-ci.com/user/running-build-in-debug-mode/
I did this myself today and they enabled it almost instantly :)
It gives you ssh access (via tmate) into the VM, and there are also helpers for running the individual Travis build steps.
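
For reference, once support has enabled it, a job can be restarted in debug mode through the API along these lines (a sketch; the token and job id are placeholders):

curl -s -X POST \
  -H "Content-Type: application/json" \
  -H "Travis-API-Version: 3" \
  -H "Authorization: token $TRAVIS_API_TOKEN" \
  -d '{ "quiet": true }' \
  https://api.travis-ci.org/job/<JOB_ID>/debug

Inside the tmate session, the build phases are wrapped in helper functions such as travis_run_before_install, travis_run_install and travis_run_script (see the documentation linked above).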

@pganssle (Member, Author) commented Apr 2, 2019

With @webknjaz's suggestion having fixed the CI, I have downgraded this from "blocker" to "minor". I'm leaving it open a bit longer because this seems to be a real issue that might bite us later, and it apparently has other manifestations on pypy3.6.

@webknjaz (Member) commented Apr 3, 2019

pypy3.6 isn't available on Travis yet, is it?

@pganssle (Member, Author) commented Apr 3, 2019

@webknjaz To be honest, I don't know, but pypy2.7 on Travis was failing with a resource error in the same tests where pypy3.6 currently fails with different resource errors. That suggests the tests we have may be more fragile or resource-intensive than we think. Without knowing why this puts particular stress on PyPy, it's hard to tell whether it will bite us in the future, so I'd like to investigate why the pypy3.6 problem (which is reproducible) is happening.

@webknjaz (Member) commented Apr 3, 2019

@pganssle can you show me the Travis job? I didn't see any pypy3.6 jobs on Travis...

@pganssle (Member, Author) commented Apr 3, 2019

@webknjaz It's not a Travis job. @gaborbernat and I both ran tox -e pypy3.6 (I used pypy3.6-7.1.0; not sure what Bernat used) and it failed for both of us: same tests, different errors.

@webknjaz (Member) commented Apr 3, 2019

Oh, I thought you were implying that it's possible on Travis :)

@gaborbernat (Contributor) commented:

$ pypy3 --version
Python 3.6.1 (dab365a465140aa79a5f3ba4db784c4af4d5c195, Feb 18 2019, 10:53:27)
[PyPy 7.0.0-alpha0 with GCC 4.2.1 Compatible Apple LLVM 10.0.0 (clang-1000.11.45.5)]

@pganssle (Member, Author) commented:

Doing a bit more debugging on this. With PyPy 3.5.3-7.0.0, I ran:

tox -e pypy3 -- -x --pdb -k test_build_meta

And when it was in the breakpoint, in another window I ran:

$ ps -aux | grep pypy | wc -l

While it was running I saw numbers as high as 469, and when execution was paused in the debugger there were still 351 processes alive. I think this may be causing the various resource exhaustion problems. With CPython, I only see around 30-40 processes.
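
A quick way to watch this live on Linux while the tests run (a sketch; the bracketed pattern keeps grep from matching its own process):

watch -n 1 'ps aux | grep "[p]ypy" | wc -l'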

I'm not really sure why so many processes are being spawned here at all. PEP 517 does require that the backend be run in a fresh process, but I was under the impression that in each of these builds we actually wait for execution to finish before moving on to the next test, so why are we seeing anything more than 2-3 processes? Could anyone who knows more about concurrency/multiprocessing weigh in? The relevant code is here.

pganssle added a commit to pganssle/setuptools that referenced this issue Apr 22, 2019
As a mitigation for pypa#1730, this commit limits the number of workers in
the ProcessPoolExecutor to 1 (default is the number of CPUs). On PyPy,
having a higher number of available workers dramatically increases the
number of concurrent processes, leading to some resource exhaustion
issues.

This does not address the root issue, but should improve the situation
until the root issue is addressed.
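
For illustration, the mitigation amounts to something like the following (a minimal sketch, not the actual setuptools code; call_hook_in_subprocess is a hypothetical stand-in for the hook-runner):

from concurrent.futures import ProcessPoolExecutor

def call_hook_in_subprocess(hook, *args, **kwargs):
    # max_workers defaults to os.cpu_count(); on PyPy the extra idle
    # workers multiplied the number of live subprocesses enough to
    # exhaust file descriptors.
    with ProcessPoolExecutor(max_workers=1) as executor:
        return executor.submit(hook, *args, **kwargs).result()
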
pganssle reopened this Apr 22, 2019
@webknjaz (Member) commented May 4, 2019

@pganssle FTR, you can now use just the pypy and pypy3 values in the Travis CI config; they will get you Python 2.7.13 (8cdda8b8cdb8, Apr 14 2019, 14:06:44) [PyPy 7.1.1 with GCC 6.2.0 20160901] and Python 3.6.1 (784b254d6699, Apr 14 2019, 10:22:42) [PyPy 7.1.1-beta0 with GCC 6.2.0 20160901] respectively.
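
With those aliases, the matrix entries can be as simple as (a sketch):

language: python
matrix:
  include:
    - python: pypy   # PyPy 7.1.1 (Python 2.7.13)
    - python: pypy3  # PyPy 7.1.1-beta0 (Python 3.6.1)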
