JUmPER: Jupyter meets Performance

JUmPER brings performance engineering to Jupyter. It consists of two repositories:

  • JUmPER IPython extension (this repository)
  • JUmPER Score-P Jupyter kernel

This extension provides real-time performance monitoring in IPython environments and Jupyter notebooks. It allows you to gather performance data on CPU usage, memory consumption, GPU utilization, and I/O operations for individual cells and present it in the notebook/IPython session either as a text report or as a plot.

The Score-P kernel allows you to instrument your Python code in Jupyter and trace or profile it with Score-P for detailed performance analysis. The Score-P kernel and this IPython extension integrate seamlessly.


Installation

pip install jumper_extension

or install it from source, from the root of a clone of the repository:

pip install .

Optional GPU Support:

For NVIDIA GPU monitoring:

pip install pynvml

For AMD GPU monitoring:

pip install ADLXPybind

Both GPU libraries can be installed simultaneously to monitor mixed GPU systems.
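As a quick sanity check that NVIDIA monitoring will work, you can list the GPUs that pynvml can see. This uses standard pynvml calls and is independent of JUmPER:

    import pynvml

    pynvml.nvmlInit()
    for i in range(pynvml.nvmlDeviceGetCount()):
        handle = pynvml.nvmlDeviceGetHandleByIndex(i)
        print(i, pynvml.nvmlDeviceGetName(handle))  # device index and name
    pynvml.nvmlShutdown()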

Configuration

Environment Variables

  • JUMPER_LOG_DIR: Directory where JUmPER stores log files (info.log, debug.log, error.log)
    • Default: User's home directory
    • Example: export JUMPER_LOG_DIR=/path/to/logs
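The variable can also be set from Python, assuming it is set before the extension is loaded:

    import os
    os.environ["JUMPER_LOG_DIR"] = "/path/to/logs"  # set before %load_ext jumper_extension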

Quick Start

Load the Extension

%load_ext jumper_extension
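To load the extension automatically in every session, you can register it via the standard IPython configuration mechanism, e.g. in ~/.ipython/profile_default/ipython_config.py (a sketch using IPython's usual config API):

    c = get_config()  # provided by IPython inside config files
    c.InteractiveShellApp.extensions = ["jumper_extension"]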

Basic Usage

  1. Start monitoring:

    %perfmonitor_start [interval]

interval is an optional argument that sets the sampling interval for performance data collection, in seconds (default: 1). This command launches a performance monitoring daemon.

  2. Run your code

  3. View performance report:

    %perfmonitor_perfreport
    %perfmonitor_perfreport --cell 2:5 --level user

The first command prints an aggregate performance report for the entire notebook execution so far; the second restricts it to cells 2 through 5 at the user monitoring level:

    ----------------------------------------
    Performance Report
    ----------------------------------------
    Duration: 11.08s
    Metric                    AVG      MIN      MAX      TOTAL   
    -----------------------------------------------------------------
    CPU Util (Across CPUs)    10.55    3.86     45.91    -       
    Memory (GB)               7.80     7.74     7.99     15.40   
    GPU Util (Across GPUs)    27.50    5.00     33.00    -       
    GPU Memory (GB)           0.25     0.23     0.32     4.00    
    

    Options:

    • --cell RANGE: Specify cell range (e.g., 5, 2:8, :5)
    • --level LEVEL: Choose monitoring level (process, user, system, slurm)
  4. Plot performance data:

    %perfmonitor_plot

    Opens an interactive plot with widgets to explore performance metrics over time, filter by cell ranges, and select different monitoring levels. A minimal end-to-end session is sketched below.
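Putting steps 1 to 4 together, a minimal session could look like this (the NumPy workload is just an illustrative stand-in for your own code):

    %load_ext jumper_extension
    %perfmonitor_start 1

    # some work worth measuring (hypothetical example workload)
    import numpy as np
    a = np.random.rand(2000, 2000)
    b = a @ a

    %perfmonitor_perfreport --level process
    %perfmonitor_plot
    %perfmonitor_stop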

Direct plotting mode and exports

You can also run %perfmonitor_plot in a direct (non-widget) mode and save or export results.

  • Plot specific metrics (no ipywidgets):

    %perfmonitor_plot --metrics cpu_summary,memory
  • Choose monitoring level and cell range:

    %perfmonitor_plot --metrics cpu_summary --level user --cell 2:5
  • Save the plot as JPEG:

    %perfmonitor_plot --metrics cpu_summary,memory --save-jpeg performance_analysis.jpg
  • Export plot data to a pickle file (to reload later with full interactivity):

    %perfmonitor_plot --metrics cpu_summary --level user --pickle analysis_data.pkl

    The command prints a small Python snippet showing how to load the pickle and display the plot in a separate session.
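The authoritative loading snippet is the one the command prints; as a rough sketch, assuming the pickle holds a Matplotlib figure, reloading could look like:

    import pickle

    with open("analysis_data.pkl", "rb") as f:
        fig = pickle.load(f)  # assumed to contain a Matplotlib figure

    display(fig)  # `display` is available in IPython/notebook sessions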

Notes:

  • --metrics accepts a comma-separated list of metric keys (e.g., cpu_summary, memory, io_read, io_write, io_read_count, io_write_count, gpu_util_summary, gpu_band_summary, gpu_mem_summary).
  • --level supports the same levels as reports: process (default), user, system, and slurm (if available).
  • --cell supports formats like 5, 2:8, :5, 3:. Negative indices are supported (e.g., -3:-1).
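For example, plotting memory usage for the last three executed cells, using the negative-index syntax described above:

    %perfmonitor_plot --metrics memory --cell -3: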
  5. View cell execution history:

    %cell_history

    Shows an interactive table of all executed cells with timestamps and durations.

  6. Stop monitoring:

    %perfmonitor_stop
  7. Export data for external analysis:

    %perfmonitor_export_perfdata my_performance.csv --level system
    %perfmonitor_export_cell_history my_cells.json

    These commands export the performance measurements for the entire notebook together with the cell execution history and its timestamps, allowing you to project measurements onto specific cells (see the sketch below).
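As a sketch of projecting measurements onto cells with pandas; the column names here are assumptions, so check the exported files for the actual schema:

    import pandas as pd

    perf = pd.read_csv("my_performance.csv")   # timestamped samples
    cells = pd.read_json("my_cells.json")      # one row per executed cell

    # Hypothetical columns: 'timestamp' in perf, 'start_time'/'end_time' in cells.
    cell = cells.iloc[2]
    mask = perf["timestamp"].between(cell["start_time"], cell["end_time"])
    print(perf.loc[mask, "cpu_util"].mean())   # mean CPU utilization during cell 2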

Metrics Collection

Performance Monitoring Levels

The extension supports four different levels of metric collection, each providing different scopes of system monitoring:

  • Process: Metrics for the current Python process only
  • User: Metrics for all processes belonging to the current user
  • System: System-wide metrics across all processes and users (if visible)
  • Slurm: Metrics for processes within the current SLURM job

Collected Metrics

Metric          Description
cpu_util        CPU utilization percentage
memory          Memory usage in GB
io_read_count   Total number of read I/O operations
io_write_count  Total number of write I/O operations
io_read_mb      Total data read in MB
io_write_mb     Total data written in MB
gpu_util        GPU compute utilization percentage across GPUs
gpu_band        GPU memory bandwidth utilization percentage
gpu_mem         GPU memory usage in GB across GPUs

Note: GPU metrics support both NVIDIA GPUs (via pynvml library) and AMD GPUs (via ADLXPybind library). Both GPU types can be monitored simultaneously. Memory limits are automatically detected from SLURM cgroups when available.

GPU Support Details:

  • NVIDIA GPUs: Full support for all monitoring levels (process, user, system, slurm) including per-process GPU memory tracking
  • AMD GPUs: System-level monitoring supported; per-process and per-user metrics are limited by AMD ADLX API capabilities

Available Commands

  • %perfmonitor_fast_setup: Fast setup of JUmPER. Starts the monitor (1.0 s interval), enables perfreports (--level process) and interactive plots (ipympl)
  • %perfmonitor_help: Show all available commands with examples
  • %perfmonitor_resources: Display available hardware resources
  • %perfmonitor_start [interval]: Start monitoring (default: 1 second interval)
  • %perfmonitor_stop: Stop monitoring
  • %perfmonitor_perfreport [--cell RANGE] [--level LEVEL]: Show a performance report for a specific cell range and monitoring level
  • %perfmonitor_plot [--metrics LIST] [--cell RANGE] [--level LEVEL] [--save-jpeg FILE] [--pickle FILE]: Interactive plot with widgets, or direct plotting of selected metrics; optional export to JPEG or pickle
  • %cell_history: Show the execution history of all cells in an interactive table
  • %perfmonitor_enable_perfreports: Auto-generate reports after each cell
  • %perfmonitor_disable_perfreports: Disable auto-reports
  • %perfmonitor_export_perfdata [--file filename] [--level LEVEL]: Return performance data as a DataFrame, or write it to CSV if --file is set
  • %perfmonitor_export_cell_history [filename]: Export cell history to CSV/JSON
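Since line magics return values in IPython, the DataFrame export can plausibly be captured directly, assuming the magic returns the DataFrame when --file is not set:

    df = %perfmonitor_export_perfdata --level process
    df.describe()  # summary statistics of the collected samples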

Contribution and Citing

PRs are welcome. Feel free to use the pre-commit hooks provided in .githooks.

If you publish work that uses JUmPER, we would appreciate a citation of the following paper:

Werner, E., Rygin, A., Gocht-Zech, A., Döbel, S., & Lieber, M. (2024, November).
JUmPER: Performance Data Monitoring, Instrumentation and Visualization for Jupyter Notebooks.
In SC24-W: Workshops of the International Conference for High Performance Computing, Networking, Storage and Analysis (pp. 2003-2011). IEEE.
https://www.doi.org/10.1109/SCW63240.2024.00250
