Add opt-in structured logging of executed notebook cells with user identity



### Problem

We host a platform that provides access to sensitive research data, and Jupyter is one of the environments offered to researchers. For audit and security purposes, we need to record which notebook cells each user executes, but **without logging cell output**, so that sensitive results are not exposed in the logs.

Jupyter Server currently does not provide a built-in way to:
- Capture executed code cells
- Reliably associate those executions with authenticated users across common deployment setups like JupyterHub or oauth2_proxy

This gap makes auditing or tracing activity infeasible in secure or compliance-sensitive environments.

---

### Proposed Solution

Introduce a new configuration flag, `ServerApp.log_cell_execution`, which is off by default and enabled with `ServerApp.log_cell_execution = True`.

When enabled:
- Log each incoming `execute_request` message from WebSocket clients.
- Extract the cell source code only; **do not capture outputs**.
- Write each execution as a structured JSON record with these fields:
  - `who`: user identity
  - `what`: executed code
  - `kernel_id`
  - `timestamp` (UTC, ISO 8601)
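
As a rough illustration of the record described above, here is a minimal sketch that turns an already-parsed Jupyter message into one JSON log line. The function name `build_execution_record` and the `now` parameter are hypothetical and exist only for this example; the `header.msg_type` and `content.code` fields follow the Jupyter messaging protocol. It is not the code from our patch.

```python
import json
from datetime import datetime, timezone
from typing import Any, Optional


def build_execution_record(
    msg: dict,
    who: str,
    kernel_id: str,
    now: Optional[datetime] = None,
) -> Optional[str]:
    """Return a JSON log line for an execute_request message, else None.

    Hypothetical helper for illustration only; not part of Jupyter Server.
    """
    header: Any = msg.get("header") or {}
    if header.get("msg_type") != "execute_request":
        return None  # only cell executions are logged

    content: Any = msg.get("content") or {}
    timestamp = now or datetime.now(timezone.utc)
    record = {
        "who": who,
        "what": content.get("code", ""),  # cell source only, never outputs
        "kernel_id": kernel_id,
        "timestamp": timestamp.isoformat(),  # UTC, ISO 8601
    }
    return json.dumps(record)
```

Because outputs travel in separate reply messages (for example `execute_result` or `stream`), filtering on `execute_request` alone keeps results out of the log.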

Identity resolution strategy:
1. Use `current_user.username` if available and meaningful.
2. If the value appears to be an opaque internal ID (e.g. UUID-style), and the connection is from a **trusted proxy** (`127.0.0.1`), attempt to extract identity from headers:
    - `X-Auth-Request-User`
    - `X-Auth-Request-Email`
3. If those headers are missing or empty, fall back to the original username.
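
The three steps above could look roughly like the following sketch. The names `resolve_identity`, `looks_opaque`, `TRUSTED_PROXY_IPS`, and `IDENTITY_HEADERS` are hypothetical, and treating "opaque" as "parses as a UUID" is an assumption made for this example; the actual heuristic may differ.

```python
import uuid
from typing import Mapping

# Assumption: only a local reverse proxy is trusted to set identity headers.
TRUSTED_PROXY_IPS = frozenset({"127.0.0.1"})
# Headers are checked in this order; the first non-empty value wins.
IDENTITY_HEADERS = ("X-Auth-Request-User", "X-Auth-Request-Email")


def looks_opaque(username: str) -> bool:
    """Treat a username as opaque if it parses as a UUID (assumed heuristic)."""
    try:
        uuid.UUID(username)
    except ValueError:
        return False
    return True


def resolve_identity(username: str, remote_ip: str, headers: Mapping[str, str]) -> str:
    """Pick the identity to log, following the three-step strategy."""
    # Step 1: a meaningful username is used as-is.
    if username and not looks_opaque(username):
        return username
    # Step 2: consult proxy headers only when the peer is a trusted proxy.
    # Note: a plain dict is case-sensitive; real header containers usually are not.
    if remote_ip in TRUSTED_PROXY_IPS:
        for name in IDENTITY_HEADERS:
            value = headers.get(name, "").strip()
            if value:
                return value
    # Step 3: fall back to the original username.
    return username
```

Restricting step 2 to trusted proxy addresses matters because any client that can reach the server directly could otherwise set these headers itself and forge the logged identity.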

The implementation hooks into the `ZMQChannelsWebsocketConnection.handle_incoming_message()` method, where `execute_request` messages are already parsed and routed. This allows us to log cell input at the exact point of entry, without impacting downstream kernel logic.
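
To show the shape of such a hook without depending on Jupyter Server internals, the sketch below uses a stand-in base class rather than the real `ZMQChannelsWebsocketConnection`. `StandInConnection`, `AuditedConnection`, and the `cell_audit` logger name are all hypothetical, and the sketch assumes messages arrive as JSON text frames, which may not hold for every WebSocket subprotocol.

```python
import json
import logging

audit_log = logging.getLogger("cell_audit")  # hypothetical logger name


class StandInConnection:
    """Stand-in for the real connection class; it only mimics the method shape."""

    def __init__(self) -> None:
        self.forwarded = []  # messages passed on toward the kernel

    def handle_incoming_message(self, incoming_msg: str) -> None:
        self.forwarded.append(incoming_msg)


class AuditedConnection(StandInConnection):
    """Logs cell source before delegating, leaving routing untouched."""

    log_cell_execution = True  # mirrors the proposed opt-in flag

    def handle_incoming_message(self, incoming_msg: str) -> None:
        if self.log_cell_execution:
            try:
                msg = json.loads(incoming_msg)
            except (TypeError, ValueError):
                msg = None  # not JSON; nothing to audit
            if isinstance(msg, dict):
                header = msg.get("header") or {}
                if header.get("msg_type") == "execute_request":
                    code = (msg.get("content") or {}).get("code", "")
                    audit_log.info(json.dumps({"what": code}))
        # Always forward the original message unchanged.
        super().handle_incoming_message(incoming_msg)
```

The point of the design is that the message is forwarded unmodified whether or not it was logged, so a parsing failure in the audit path cannot block execution.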

---

### Additional context

This feature is scoped, backwards-compatible, and entirely opt-in.

It enables a range of downstream uses:
- Compliance with institutional audit requirements
- Improved incident response and forensics in multi-user Jupyter deployments
- Integration with log collection tools (e.g. ELK, CloudWatch) via JSON structure

We have a patch ready to submit as a pull request and welcome feedback on the approach.


Add opt-in structured logging of executed notebook cells with user identity #1533

Problem

Proposed Solution

Additional context

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Add opt-in structured logging of executed notebook cells with user identity #1533

Description

Problem

Proposed Solution

Additional context

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions