Proposal: Restore Notebook execution progress when a browser page is reloaded #1274

Open
@skukhtichev

Restore Notebook execution progress when a browser page is reloaded

Problem

Jupyter Notebook/Lab does not restore execution progress after the page is reloaded. As a result, there is no way to monitor execution progress or retrieve execution output for long-running notebooks.

This happens for the following reasons:

  1. The Notebook/Lab UI generates a new session id every time a notebook/lab page is reloaded.

    Jupyter Server supports replaying buffered kernel messages to the client after the kernel session is reconnected. Replay is keyed on the session id that the client sets when it connects to the kernel. Because the Jupyter Notebook/Lab UI generates a new session id every time the notebook page is loaded, the client cannot reconnect to the existing kernel session, so the buffered messages are never replayed after the page is re-opened.

  2. Message ids for submitted code are not persisted in the cell metadata.

    Kernel message ids are not persisted in the notebook metadata and are lost when the notebook window or tab is reloaded. The Jupyter frontend code therefore cannot link kernel messages to cells, so there is no way to show the output or the execution count.

  3. Unsaved cell output is missing after a notebook page is reloaded.

    If the kernel message with the execution result was sent to the browser but the notebook was not saved, the output is not displayed after the page is reloaded.
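To make the first reason concrete, here is a minimal toy model of message buffering keyed by the client's session id. It is not the Jupyter Server implementation: the class `ReplayBuffer` and its methods are hypothetical names chosen for illustration, and the rule that a mismatched session id discards the buffer is an assumption based on the behavior described above.

```python
class ReplayBuffer:
    """Toy model: one message buffer per kernel, owned by a client session id."""

    def __init__(self):
        self._owner = {}     # kernel_id -> session id that owns the buffer
        self._messages = {}  # kernel_id -> messages received while disconnected

    def disconnect(self, kernel_id, session_id):
        # The browser went away: start buffering on behalf of this session.
        self._owner[kernel_id] = session_id
        self._messages[kernel_id] = []

    def on_kernel_message(self, kernel_id, msg):
        # Only buffer while a client is disconnected from this kernel.
        if kernel_id in self._owner:
            self._messages[kernel_id].append(msg)

    def reconnect(self, kernel_id, session_id):
        # Replay only if the reconnecting client presents the same session id.
        # Either way the buffer is dropped (assumed behavior).
        owner = self._owner.pop(kernel_id, None)
        messages = self._messages.pop(kernel_id, [])
        return messages if owner == session_id else []


buf = ReplayBuffer()
buf.disconnect("kernel-1", "session-a")
buf.on_kernel_message("kernel-1", {"msg_type": "stream"})

# A reloaded page arrives with a freshly generated session id: nothing is replayed.
replayed = buf.reconnect("kernel-1", "session-b")
```

In this sketch a page reload always looks like a brand-new client, so `replayed` is empty even though output was buffered while the page was closed.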

Proposed Solution

The proposal is to enable restoring sessions for kernel connections, to move the notebook model (cell metadata) to Jupyter Server, and to synchronize changes triggered by the Notebook/Lab UI and by the kernel.

Restore kernel sessions

Make the kernel connection independent of the session id provided in the WebSocket session_id URL argument. Today a new session id is generated every time the notebook is reloaded, so buffered kernel messages are not replayed after the notebook page/tab is re-opened. JupyterLab and Notebook 7 support a collaboration mode that distinguishes between notebook users. The user info and the notebook path could together serve as a stable session identifier that is mapped to the kernel id.
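One possible shape for such an identifier is sketched below. The function `stable_session_key` and the class `KernelSessionMap` are hypothetical names, and hashing the user id together with the notebook path is only one assumed way to derive a key that survives a page reload.

```python
import hashlib


def stable_session_key(user_id: str, notebook_path: str) -> str:
    """Derive a reload-independent key from the user and the notebook path."""
    # A NUL separator keeps ("ab", "c") and ("a", "bc") from colliding.
    raw = "\x00".join([user_id, notebook_path]).encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class KernelSessionMap:
    """Hypothetical server-side map from stable session key to kernel id."""

    def __init__(self):
        self._kernels = {}

    def bind(self, user_id, notebook_path, kernel_id):
        self._kernels[stable_session_key(user_id, notebook_path)] = kernel_id

    def lookup(self, user_id, notebook_path):
        # Returns None when this user has no kernel session for the notebook.
        return self._kernels.get(stable_session_key(user_id, notebook_path))


sessions = KernelSessionMap()
sessions.bind("alice", "work/analysis.ipynb", "kernel-1")

# After a reload the same user and path resolve to the same kernel session.
kernel_after_reload = sessions.lookup("alice", "work/analysis.ipynb")
```

Because the key no longer depends on anything generated in the browser, a reloaded page can present the same identity and ask for the buffered messages of its previous connection.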

kernel_session_restore

Move the notebook model (cell metadata) to Jupyter Server and synchronize it with the kernel and the notebook opened in the UI

Storing the notebook model on Jupyter Server will make it possible to:

  • restore message ids for all code cells with submitted code after the page is reloaded;
  • restore execution progress and output for unsaved changes.

Implementation notes for enabling the notebook model on Jupyter Server:

  • The JupyterLab/Notebook UI sends a message whenever one of the following cell state changes occurs:
    • On Cell Changed
    • On Cell Inserted
    • On Cell Deleted
    • On Cell Cleared
    • On Cell Executed
      These messages are sent over the kernel's WebSocket connection with a new message type (e.g. nb_state). ZMQChannelsHandler then parses incoming messages and forwards notebook-state messages to NotebooksStatesManager, which is responsible for synchronizing the notebook state between the UI and Jupyter Server. It will also handle kernel messages.
  • When code is submitted for execution, a message id (msg_id) is returned to the client (browser). The client tracks execution progress by this message id. Currently the message id is stored only in the browser; it should not be persisted in the notebook ipynb file because it is relevant only at runtime. Since each cell has a unique id, the message id can be mapped to the cell id, stored on Jupyter Server for each submitted cell, and returned to the client when the notebook is reloaded. The client will then be able to handle incoming kernel messages and display execution progress.
  • When the notebook is reloaded, the cell metadata (including output) from the ipynb file is merged with the cell metadata from the notebook model stored on Jupyter Server.
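The notes above could be sketched roughly as follows. Everything here is an assumption for illustration: the payload fields of an nb_state message (`event`, `cell_id`, `source`, `msg_id`), the method names, the in-memory storage, and the choice to clear the stored message id once the kernel reports `idle` for that request. Outputs are kept as raw message content rather than converted to nbformat outputs.

```python
# IOPub message types that carry cell output in the Jupyter messaging protocol.
OUTPUT_TYPES = {"stream", "display_data", "execute_result", "error"}


class NotebooksStatesManager:
    """Hypothetical in-memory notebook state store, keyed by notebook path."""

    def __init__(self):
        self._cells = {}  # path -> {cell_id: {"source", "outputs", "msg_id"}}

    def handle_nb_state(self, path, content):
        """Apply one UI state change (assumed payload shape)."""
        cells = self._cells.setdefault(path, {})
        event, cell_id = content["event"], content["cell_id"]
        if event == "deleted":
            cells.pop(cell_id, None)
            return
        cell = cells.setdefault(
            cell_id, {"source": "", "outputs": [], "msg_id": None}
        )
        if event in ("inserted", "changed"):
            cell["source"] = content.get("source", cell["source"])
        elif event == "cleared":
            cell["outputs"] = []
        elif event == "executed":
            # Remember which execute request belongs to this cell.
            cell["outputs"] = []
            cell["msg_id"] = content["msg_id"]

    def handle_kernel_message(self, path, msg):
        """Route a kernel message to the cell whose request it answers."""
        parent_id = msg.get("parent_header", {}).get("msg_id")
        if parent_id is None:
            return
        for cell in self._cells.get(path, {}).values():
            if cell["msg_id"] != parent_id:
                continue
            if msg["msg_type"] in OUTPUT_TYPES:
                cell["outputs"].append(msg["content"])
            elif (msg["msg_type"] == "status"
                  and msg["content"].get("execution_state") == "idle"):
                cell["msg_id"] = None  # execution finished (assumed policy)
            return

    def pending(self, path):
        """Cell ids still executing, mapped to their message ids."""
        cells = self._cells.get(path, {})
        return {cid: c["msg_id"] for cid, c in cells.items() if c["msg_id"]}


mgr = NotebooksStatesManager()
mgr.handle_nb_state("a.ipynb", {"event": "inserted", "cell_id": "c1",
                                "source": "print(1)"})
mgr.handle_nb_state("a.ipynb", {"event": "executed", "cell_id": "c1",
                                "msg_id": "m1"})
still_running = mgr.pending("a.ipynb")
```

After a reload, the client could ask for `pending(path)` to learn which cells are still executing and which message ids to listen for.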

Additional context

The image below shows the components and data flow for the execution restore logic:

  1. When the notebook is loaded for the first time, ContentsManager creates a copy of the notebook model on Jupyter Server and sends it to the client (Notebook/Lab UI).
  2. When a user edits the notebook, the changes are sent to Jupyter Server over the kernel's WebSocket connection. ZMQChannelsHandler dispatches messages by type (e.g. nb_state) and forwards state-change messages to NotebooksStatesManager, which updates the notebook model stored on the server.
  3. When the kernel sends a message, ZMQChannelsHandler forwards it both to NotebooksStatesManager and to the Jupyter Notebook/Lab UI.
  4. When the notebook page is reloaded, ContentsManager loads the notebook file from storage (file system, cloud storage, etc.) and merges it with the notebook model stored on Jupyter Server, including the message ids of submitted code cells. This makes it possible to identify which cells are still executing and to restore execution progress.
  5. When the user saves the notebook, ContentsManager removes the message ids from the copy of the notebook model that is written to the file and saves the ipynb file to storage (the msg_id parameter still exists in the notebook model stored on Jupyter Server).

restore_execution_progress_components
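Steps 4 and 5 could look roughly like the sketch below. The helper names `merge_on_reload` and `strip_for_save`, the choice to carry the message id under a `msg_id` key in cell metadata, and the rule that server-side state wins over the file whenever the server has an entry for a cell are all assumptions; the proposal does not fix a merge policy.

```python
import copy


def merge_on_reload(file_cells, server_cells):
    """Overlay server-side runtime state onto the cells loaded from the ipynb file.

    file_cells:   list of cell dicts from the file, each with a unique "id".
    server_cells: dict of cell id -> {"outputs": [...], "msg_id": str or None}.
    """
    merged = []
    for cell in file_cells:
        cell = copy.deepcopy(cell)  # never mutate the model loaded from disk
        state = server_cells.get(cell["id"])
        if state is not None:
            # Assumed policy: the server copy is newer than the saved file.
            cell["outputs"] = copy.deepcopy(state.get("outputs", []))
            if state.get("msg_id"):
                cell.setdefault("metadata", {})["msg_id"] = state["msg_id"]
        merged.append(cell)
    return merged


def strip_for_save(cells):
    """Return a copy of the cells without runtime-only message ids."""
    saved = copy.deepcopy(cells)
    for cell in saved:
        cell.get("metadata", {}).pop("msg_id", None)
    return saved


file_cells = [
    {"id": "c1", "cell_type": "code", "metadata": {}, "outputs": []},
    {"id": "c2", "cell_type": "code", "metadata": {}, "outputs": []},
]
server_cells = {
    "c1": {
        "outputs": [{"output_type": "stream", "name": "stdout", "text": "hi"}],
        "msg_id": "m1",
    }
}

merged = merge_on_reload(file_cells, server_cells)  # sent to the browser
saved = strip_for_save(merged)                      # written to the ipynb file
```

Here `merged` carries the message id so the client can resume tracking the running cell, while `saved` omits it, so the ipynb file never contains runtime-only identifiers.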
