
Estimating BinderHub cluster size #60

@TomasBeuzen


I'm trying to get a good estimate of the required cluster size using this guide in Z2JH.

Here are my assumptions:

Memory

  • Max users = 50
  • Max expected concurrent users = 60% * max users = 30 (because it is unlikely that everyone will be using it at the same time)
  • Expected memory usage per user:
    • I used nbresuse to estimate a user's memory usage in the notebook (see the sketch after this list).
    • A notebook by itself uses about 120 MB. I tried to take it to the extreme, executing all the code in multiple chapters and loading in plenty of datasets, and was pushing ~300 MB of memory usage.
    • A single chapter was more commonly 100-200 MB (including data and plots).
    • Let's be conservative and assume 300 MB (we can downgrade in the future).
    • If a user uses more than the available amount of memory, their notebook kernel will restart and memory will be flushed.
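
nbresuse shows this number live in the notebook UI; for a quick manual check from a cell, something like the sketch below (assuming psutil is installed, which nbresuse itself relies on) reports the kernel's resident memory:

```python
# Quick manual check of a notebook kernel's memory footprint.
# Run in a notebook cell after loading a chapter's data and figures.
import psutil

rss_mb = psutil.Process().memory_info().rss / 1024**2
print(f"Kernel resident memory: {rss_mb:.0f} MB")
```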

memory = max concurrent users * memory per user + 128 MB (for JupyterHub overhead) = 30 * 300 MB + 128 MB = ~9 GB
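
The same arithmetic in a few lines of Python (every value here is just one of the assumptions listed above, not a measurement of the running cluster):

```python
# Memory estimate from the assumptions above.
max_users = 50
concurrent_fraction = 0.60      # ~60% of users active at the same time
mem_per_user_mb = 300           # conservative per-user estimate from nbresuse
hub_overhead_mb = 128           # JupyterHub overhead

concurrent_users = round(max_users * concurrent_fraction)           # 30
total_gb = (concurrent_users * mem_per_user_mb + hub_overhead_mb) / 1024
print(f"{concurrent_users} concurrent users -> ~{total_gb:.1f} GB")  # ~8.9 GB
```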

CPU

  • This is harder to estimate, but it's also less of an issue: if we're running low on CPU, things will just run slower, but nothing will break.
  • I took a look at the JupyterHub Tiffany set up for MDS: it has had a peak CPU usage of just 5% since we started MDS, so it's clearly a very conservatively sized instance.
  • That JupyterHub is using an m5.12xlarge.

Summary

To meet the memory and CPU requirements I'm going to start with 2 x m5.2xlarge instances (the cluster can scale to 4 if needed). I think this is conservative, but we'll see. I'll report back.

Here's a comparison of the two instances I mentioned:

Instance      CPU   RAM   Memory (GB)
m5.2xlarge      8    37            32
m5.12xlarge    48   168           192
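
As a sanity check on the node count, a rough sketch using the numbers above (the 3 GB per node reserved for the OS and kubelet is an assumption, not a measured figure):

```python
import math

# How many m5.2xlarge workers (32 GB each) does the ~9 GB estimate imply?
required_gb = 9                 # memory estimate from above
node_memory_gb = 32             # m5.2xlarge
system_reserved_gb = 3          # assumed headroom for OS / kubelet per node

min_nodes = math.ceil(required_gb / (node_memory_gb - system_reserved_gb))
print(min_nodes)  # 1 -- so starting with 2 and scaling to 4 leaves plenty of headroom
```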
