
Performance, large machines and few users: connection issues #486

Open
michelbl opened this issue Jan 21, 2020 · 6 comments
Labels
bug Something isn't working

Comments

@michelbl

michelbl commented Jan 21, 2020

I planned to use TLJH for a classroom of about 20 concurrent users. Following http://tljh.jupyter.org/en/latest/howto/admin/resource-estimation.html I estimated 4 GB of RAM would be enough with a good margin, so I installed it on a shared OVH server of type s1-4 (4 GB RAM, 1 vCore, 100 Mbit/s).

Whatever the number of concurrent users, launching a new server sometimes fails (maybe once out of 10 times). Retrying usually works. top does not report high CPU usage and the available swap is not used.

When the 20 users begin to use the jupyterhub, some of them (maybe 5-7 of them), after creating a notebook, have connection issues with the server. As a result, the execution of cells hangs. Restarting the server and logging out and in does not solve the issue.

Because I anticipated there could be performance issues with that configuration, I was ready to deploy a new TLJH on a slightly more powerful OVH server (b2-7: 7 GB RAM, 2 guaranteed vCores, 250 Mbit/s). But it did not solve the issues previously described. Eventually I had to make them develop locally.

I don't know if this issue is a bug report or a feature request, but several things could help users in such situations:

  • Even with https://tljh.jupyter.org/en/latest/troubleshooting/index.html I am not able to find any hint about the bottleneck (is it a RAM issue? a CPU issue?). sudo journalctl -u jupyterhub does not report anything
  • There is no tool to simulate the load of a whole class beforehand, so I had to test live. A way to stress test a hub would be great
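No such stress-test tool exists in TLJH today, but a minimal concurrency harness can be sketched. Everything here is hypothetical: the check function is a placeholder, and a real test would log in and spawn a server through the JupyterHub REST API instead.

```python
# Minimal load-test harness sketch: run a per-user check concurrently and
# count successes and failures. check_fn is a placeholder; a real version
# would make HTTP requests against the hub (login + server spawn).
import concurrent.futures


def stress(check_fn, n_users=20):
    """Run check_fn once per simulated user, all at the same time.

    Returns (successes, failures); check_fn should return True on success.
    """
    with concurrent.futures.ThreadPoolExecutor(max_workers=n_users) as pool:
        results = list(pool.map(lambda _: check_fn(), range(n_users)))
    ok = sum(results)
    return ok, n_users - ok


if __name__ == "__main__":
    # Placeholder check that always succeeds; swap in a real hub request.
    ok, failed = stress(lambda: True, n_users=20)
    print(f"{ok} succeeded, {failed} failed")
```

Running the placeholder obviously always succeeds; the point is the concurrency shape, which surfaces spawn failures only when the check actually hits the hub.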
@willirath
Collaborator

willirath commented Jan 21, 2020 via email

@michelbl
Author

http://tljh.jupyter.org/en/latest/howto/admin/resource-estimation.html suggests the memory for each user is around 150–180 MB (127 MB reported by nbresuse plus a margin of 20–40%). Where does the figure of 200 MB come from?

I added a swap file, restarted all the servers of my users. The swap was not used, yet the same issues appeared.

Going to a server with 7GB of RAM did not change anything.

Do you know of a way to know for sure if the issue is caused by not enough RAM?

@manics
Member

manics commented Jan 22, 2020

The figure of 200MB comes from your total memory (4000 MB) divided by your number of users (20). This leaves very little memory for the JupyterHub process and for the operating system.
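The arithmetic behind that figure, written out (the 4000 MB total and 20 users come from the earlier comments):

```python
# Per-user memory budget if user servers consumed all of RAM. In practice
# they can't: JupyterHub, the proxy, and the OS all need memory too, so the
# real per-user budget is lower than this.
total_mb = 4000
users = 20
per_user_mb = total_mb / users
print(per_user_mb)  # 200.0
```

At 150–180 MB per user, that 200 MB ceiling leaves essentially no headroom for the hub itself.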

If you've got dstat installed you can get a clearer picture of CPU and memory usage than top gives, e.g. run dstat --vmstat 5 (this updates continually, averaging over 5-second intervals, which is better than the instantaneous readings given by top).
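If dstat isn't available, one rough Linux-only check is to read MemAvailable straight from /proc/meminfo. This is only a sketch for a quick look, not a substitute for dstat's continuous view:

```python
# Read MemAvailable from /proc/meminfo (Linux only). MemAvailable is the
# kernel's estimate of memory usable by new workloads, which is more telling
# than "free" memory because it accounts for reclaimable caches.
def mem_available_mb():
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                kb = int(line.split()[1])  # value is reported in kB
                return kb / 1024
    raise RuntimeError("MemAvailable not found in /proc/meminfo")


if __name__ == "__main__":
    print(f"Available memory: {mem_available_mb():.0f} MB")
```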

@GeorgianaElena GeorgianaElena added the support Support questions (should be on discourse.jupyter.org instead) label Jan 28, 2020
@jesuisse

jesuisse commented Dec 9, 2020

I'm having related problems, also with about 20 concurrent users, but on a much stronger machine (16 GB RAM, 4 cores). People keep getting "Your server isn't running" warnings and sometimes they have trouble even reaching the proxy server. It's definitely not a RAM issue, half of my RAM is free and swap is unused. I was thinking maybe it's a problem with the number of open files or concurrent network connections, but both my per-user file limit and the system file limit seem large enough. Unfortunately I don't know how to debug a "too many concurrent network connections" problem, and I'm also not sure how heavy Jupyterhub is on network connections. I'd love to debug this if I knew how...
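A quick way to check the per-process file-descriptor limits mentioned above is Python's stdlib resource module. Note this sketch reports the limits of the process it runs in, so it should be run as the same user (and ideally the same service context) as the hub:

```python
# Report the soft and hard open-file limits for the current process.
# A low soft limit can cause "too many open files" errors under load,
# which would show up as dropped or refused connections.
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open-files soft limit: {soft}, hard limit: {hard}")
```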

@aolney

aolney commented Jun 10, 2021

Wondering if I have the same issue, but my server has 64 cores and 256 GB of memory, with only ~10 users. I've installed dstat and will see if I can trace this.

@consideRatio consideRatio added bug Something isn't working and removed support Support questions (should be on discourse.jupyter.org instead) labels Oct 25, 2021
@consideRatio consideRatio changed the title Performance issues with ~20 users Performance, large machines and few users: connection issues Oct 25, 2021
@sawula

sawula commented Dec 3, 2021

My connection to the server kept dropping out when I was the only person using the hub. digital blue, 4 GB RAM. I was doing arithmetic in the cells.

Feels like the setting for timing out is super short. (I'm not a technical person, so if that's a ridiculous thing to say...apologies).

8 participants