
Improving Network Performance #312

@mlucool

How can we optimally transfer assets to Jupyter clients (web browsers)?

Hypothesis: HTTP/2 (i.e. no head-of-line blocking) and compression would meaningfully improve page-load and large-notebook-load performance.

Experiment: Create an nginx config that adds SSL, HTTP/2, and compression, and use it as a simple reverse proxy in front of a JupyterLab 2.x server. Then use Chrome DevTools to understand the changes in performance. In this setup my server and browser are not in the same physical location, but are connected by a high-speed network. I had exactly one location block, so static assets still came via Tornado.
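
For reference, a minimal sketch of this kind of config, assuming a Tornado server on 127.0.0.1:8888; the hostname and certificate paths are placeholders, not the exact setup used:

```nginx
# Hypothetical reverse proxy adding SSL, HTTP/2, and compression in front
# of a JupyterLab/Tornado server listening on :8888.
server {
    listen 443 ssl http2;
    server_name jupyter.example.com;                    # placeholder hostname

    ssl_certificate     /etc/nginx/certs/jupyter.crt;   # placeholder paths
    ssl_certificate_key /etc/nginx/certs/jupyter.key;

    # On-the-fly compression; notebooks are JSON, bundles are JS.
    gzip on;
    gzip_types application/json application/javascript text/css;
    gzip_min_length 1024;

    # Exactly one location block: everything, including /static,
    # still goes through Tornado (as in the experiment).
    location / {
        proxy_pass http://127.0.0.1:8888;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        # WebSocket upgrade so kernels and terminals keep working.
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
```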

Conclusion: Surprisingly, these technologies, when naively put on top of a jupyterlab@2.x server, did not make a meaningful difference. The reverse proxy decreased the size of small assets but increased page-load time by ~10%. Large assets clearly shrank by a large factor, 10x to 23,000x (the latter being a generated, highly compressible test notebook), but the time to compress them on the fly meant there were minimal gains to be had. The ~10 MB vendors bundle I had was compressed to 2.5 MB but took longer to reach the browser. A 33 MB notebook shrank to 1.6 kB but still took about 30 s either way. I'll note that most of my notebooks are small (<5 MB).

I ran a second experiment where I put the notebook file directly behind the same nginx server. In this case I was able to download the 33 MB notebook in ~100 ms!
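
A sketch of what "directly behind nginx" might look like; the prefix and paths are illustrative, not the exact config used:

```nginx
# Hypothetical: serve the notebook file straight from disk, bypassing
# the Python server entirely (prefix and paths are placeholders).
location /direct/ {
    root /srv/notebooks;   # /direct/foo.json -> /srv/notebooks/direct/foo.json
    gzip on;               # still compress on the fly; nginx is fast at this
    gzip_types application/json;
    default_type application/json;
}
```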

In my view, this experiment points to some large gains that can be had by letting assets skip the Python server, or by thinning out the code path between the two. A few suggestions:

  1. Create a document for how to configure nginx/apache in front of Jupyter Server. This document should include tips on the right settings to skip the Jupyter server for certain assets (e.g. anything in /static); see the first sketch after this list.
  2. If an asset is in /static, Jupyter should treat it as such and set the right headers (today I see no-cache set, for example). Doing (1) should help people do this automatically, but doing it in jupyter_server may be useful for the average case; see the handler sketch after this list.
  3. Dig into where the 30 s goes when sending a large notebook. My guess is we'll need to skip over some steps in Python or preload notebooks in memory, but that's just a hunch and we'll need more data.
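
For (1), a minimal sketch of a location block that skips the Jupyter server for static assets. The alias path is an assumption: where JupyterLab's static files actually live depends on your install, and extensions may serve assets from other directories:

```nginx
# Hypothetical: serve /static straight from disk with long-lived cache
# headers instead of proxying to Tornado. The alias path is a guess and
# varies by install.
location /static/ {
    alias /opt/conda/share/jupyter/lab/static/;   # placeholder path
    expires 7d;
    add_header Cache-Control "public, max-age=604800, immutable";
    gzip on;
    gzip_types application/javascript text/css application/json;
}
```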
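For (2), a minimal Tornado sketch of the kind of change meant here, not current jupyter_server behavior: a StaticFileHandler subclass that overrides set_extra_headers so static assets get cacheable headers instead of no-cache. The route and path are placeholders:

```python
# Minimal sketch (assumed approach, not current jupyter_server code):
# mark /static assets as cacheable via Tornado's set_extra_headers hook.
import tornado.ioloop
import tornado.web


class CachedStaticHandler(tornado.web.StaticFileHandler):
    def set_extra_headers(self, path: str) -> None:
        # Hashed bundles can be cached aggressively rather than no-cache.
        self.set_header("Cache-Control", "public, max-age=604800, immutable")


app = tornado.web.Application([
    # Placeholder static path; a real integration would reuse Jupyter's
    # own static file locations.
    (r"/static/(.*)", CachedStaticHandler,
     {"path": "/opt/conda/share/jupyter/lab/static"}),
])

if __name__ == "__main__":
    app.listen(8888)
    tornado.ioloop.IOLoop.current().start()
```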

Pictures are worth 1000 words:
No optimizations page load: [screenshot]

Nginx page load: [screenshot]

No optimizations large notebook: [screenshot]

Nginx large notebook: [screenshot]

Directly sending the notebook (renamed to foo.json): [screenshot]

cc @goanpeca
