Post-performance acoustic analysis for stand-up comedy.
PunchPlot is a Python tool that analyzes audio recordings of live stand-up comedy performances, extracting objective metrics about comedian timing, pauses, and audience acoustic response – laughs, applause, and silence.
Note: PunchPlot does not evaluate humor or comedic quality. It focuses strictly on measurable acoustic and temporal signals.
This project is actively under development. See Current Status and the Roadmap below.
This project was born from a personal creative idea: what if we could visualize the "shape" of a stand-up comedy set through its acoustic fingerprint?
I started PunchPlot as a hands-on way to explore audio signal processing, feature extraction, and event detection – topics I wanted to understand deeply by building something tangible rather than just studying theory. The domain of live comedy is a fun and challenging testing ground: the audio is messy, the signals overlap, and the distinction between laughter, applause, and background noise requires real-world problem-solving.
Beyond the learning aspect, PunchPlot has a genuine use case – giving comedians an objective, data-driven look at how their performances land, complementing the subjective experience of being on stage.
Given an audio recording (comedian + audience), PunchPlot detects and measures:
| Signal | Detection |
|---|---|
| Speech | When the comedian is speaking (Voice Activity Detection) |
| Laughter | Audience laughter events – timing, duration, intensity |
| Applause | Audience applause events (planned) |
| Pauses | Gaps between speech – micro (<0.5 s), short (0.5–2 s), long (>2 s) |
| Rhythm | Speech rate estimation and vocal energy dynamics |
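The pause buckets in the table can be expressed as a tiny classifier. This is only a sketch using the thresholds above; `classify_pause` is illustrative, not part of PunchPlot's API:

```python
def classify_pause(duration_s: float) -> str:
    """Bucket an inter-speech gap by its duration, using the thresholds
    from the table: micro (<0.5 s), short (0.5-2 s), long (>2 s)."""
    if duration_s < 0.5:
        return "micro"
    if duration_s <= 2.0:
        return "short"
    return "long"

print(classify_pause(0.3))  # micro
print(classify_pause(1.2))  # short
print(classify_pause(3.5))  # long
```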
- Metrics report – laughs/min, pause duration, speech ratio, audience response ratio, etc.
- Timeline visualization – multi-track plot showing speech, laughter, pauses, and energy over time
- JSON export – machine-readable metrics for further analysis
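For the JSON export, a plausible payload might look like the sketch below. Field names and values are invented for illustration; the actual schema may differ:

```python
import json

# Hypothetical metrics payload mirroring the report bullets above.
metrics = {
    "duration_s": 1820.4,             # total recording length
    "laughs_per_min": 3.2,
    "speech_ratio": 0.71,             # fraction of time the comedian is speaking
    "audience_response_ratio": 0.18,  # laughter/applause time vs. total
    "pauses": {"micro": 42, "short": 17, "long": 5},
}

print(json.dumps(metrics, indent=2))
```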
PunchPlot follows a phased development approach – each phase is only started after the previous one is validated with real audio.

| Phase | Description | Status |
|---|---|---|
| Phase 0 | Project foundation – structure, audio loading, feature extraction, CLI skeleton | Complete |
| Phase 1.0 | Exploratory visualization – plotting features to understand real audio patterns | Complete |
| Phase 1.1 | Voice Activity Detection – adaptive RMS threshold + spectral centroid rescue | Complete |
| Phase 1.2 | Laughter detection – heuristic classifier using spectral flatness + centroid | Complete |
| Phase 1.3 | Pause detection & speech rate estimation | Next |
| Phase 1.4–1.6 | Metrics calculation, timeline visualization, full CLI | Planned |
| Phase 2 | Robustness – noise reduction, adaptive thresholds, ML-based classifiers | Planned |
| Phase 3 | Presentation – web dashboard (Streamlit), interactive plots (Plotly) | Planned |
For the full breakdown, see docs/DEVELOPMENT_PLAN.md.
```
┌─────────┐   ┌──────────────┐   ┌──────────────┐   ┌────────────┐
│  INPUT  │──▶│ PRE-PROCESS  │──▶│   ANALYSIS   │──▶│   OUTPUT   │
│  Audio  │   │  & Features  │   │ & Detection  │   │  Metrics   │
└─────────┘   └──────────────┘   └──────────────┘   │  + Visual  │
                                                    └────────────┘
```
- Audio Loading – Accepts `.wav`, `.mp3`, `.ogg`, `.flac`; resamples to 16 kHz mono
- Feature Extraction – RMS energy, spectral centroid, spectral flatness, zero-crossing rate, MFCCs
- Voice Activity Detection – Adaptive threshold on RMS energy with spectral centroid rescue for soft-spoken segments
- Audience Response Detection – Classifies non-speech regions as laughter using rolling-median spectral flatness stability and centroid variance
- Metrics & Visualization – (in development)
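The adaptive-VAD idea above can be sketched in a few lines of numpy, assuming precomputed per-frame RMS and spectral-centroid arrays (e.g. from `librosa.feature.rms` and `librosa.feature.spectral_centroid`). The scaling factor, voice band, and rescue rule below are illustrative, not PunchPlot's actual values:

```python
import numpy as np

def detect_speech(rms: np.ndarray, centroid: np.ndarray,
                  rms_factor: float = 1.5,
                  centroid_band: tuple = (300.0, 3000.0)) -> np.ndarray:
    """Frame-level speech mask: adaptive RMS threshold with a
    spectral-centroid 'rescue' for soft-spoken frames (illustrative)."""
    # Adaptive threshold: scale the median energy of this recording.
    threshold = rms_factor * np.median(rms)
    loud_enough = rms > threshold
    # Rescue: quiet frames whose centroid sits in a typical voice band
    # and which still carry some energy.
    lo, hi = centroid_band
    voice_like = (centroid > lo) & (centroid < hi)
    quiet_but_voiced = (~loud_enough) & voice_like & (rms > 0.3 * threshold)
    return loud_enough | quiet_but_voiced

# Toy frames: frame 3 is quiet but voice-like, so the rescue keeps it.
rms = np.array([0.02, 0.10, 0.12, 0.03, 0.01])
centroid = np.array([4000.0, 1200.0, 1100.0, 900.0, 5000.0])
print(detect_speech(rms, centroid))
```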
- Adaptive VAD with centroid-variance rescue to handle soft-spoken comedy styles
- Heuristic laughter detection based on empirically observed spectral patterns (flatness consistency + centroid stability in laugh regions vs. speech)
- Rolling median smoothing for noise-robust feature analysis
- Modular architecture with independent detectors that can be tested and tuned separately
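To make the flatness/centroid heuristic concrete, here is a minimal pandas sketch of rolling-median smoothing plus a laugh-likeness rule. The window size and both cutoffs are invented for the demo and are not PunchPlot's tuned values:

```python
import numpy as np
import pandas as pd

def laugh_mask(flatness: np.ndarray, centroid: np.ndarray,
               window: int = 5) -> np.ndarray:
    """Mark frames as laugh-like where rolling-median spectral flatness is
    high (noise-like audio) and centroid variance is low (steady timbre),
    as opposed to articulated speech. Thresholds are illustrative."""
    roll = pd.Series(flatness).rolling(window, center=True, min_periods=1)
    flat_med = roll.median().to_numpy()
    cent_var = (pd.Series(centroid)
                .rolling(window, center=True, min_periods=1)
                .var()
                .fillna(0.0)   # single-frame windows have undefined variance
                .to_numpy())
    return (flat_med > 0.3) & (cent_var < 1e5)
```

In a real pipeline these arrays would come from per-frame features (e.g. librosa's spectral flatness and centroid) restricted to the non-speech regions the VAD produces, as the pipeline description above suggests.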
For a deep dive into the system design, see docs/ARCHITECTURE.md.
- Python 3.11+
- `ffmpeg` (for loading non-WAV formats via librosa)
# Clone the repository
git clone https://github.com/semedooo/PunchPlot.git
cd PunchPlot
# Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # Linux/macOS
# Install in development mode
pip install -e ".[dev]"

# Analyze a stand-up comedy audio file
punchplot analyze my_set.wav
# Exploratory visualization (plot audio features)
python -m punchplot.scripts.explore my_set.wav
python -m punchplot.scripts.explore my_set.wav --start 30 --end 60  # specific time range

```
PunchPlot/
├── src/punchplot/
│   ├── core/                # Audio loading, preprocessing & feature extraction
│   ├── detectors/
│   │   ├── vad.py           # Voice Activity Detection (adaptive RMS + centroid rescue)
│   │   ├── audience.py      # Audience laughter detection (spectral heuristics)
│   │   └── rhythm.py        # Pause detection & speech rate (planned)
│   ├── metrics/             # Performance metrics calculation (planned)
│   ├── visualization/       # Timeline plot generation (planned)
│   ├── scripts/
│   │   └── explore.py       # Interactive feature exploration & plotting
│   ├── models.py            # Data models (segments, events, metrics)
│   ├── pipeline.py          # Analysis pipeline orchestration
│   └── cli.py               # Command-line interface (click)
├── tests/                   # Unit tests
├── docs/
│   ├── ARCHITECTURE.md      # System design & technical concepts
│   └── DEVELOPMENT_PLAN.md  # Phased development roadmap
└── pyproject.toml           # Project configuration & dependencies
```
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Audio I/O | librosa, soundfile |
| Signal Processing | librosa, numpy |
| Data Modeling | dataclasses, pandas |
| Visualization | matplotlib |
| CLI | click |
| Testing | pytest |
See the full Development Plan for details. High-level goals:
- Short-term – Complete MVP: pause/rhythm detection, metrics calculation, timeline visualization, full CLI
- Mid-term – Robustness: noise reduction, adaptive thresholds, ML-based laugh classifier (SVM/Random Forest)
- Long-term – Web dashboard (Streamlit), interactive visualizations (Plotly), multi-performance comparison
- Architecture – System design, pipeline stages, technical concepts
- Development Plan – Phased roadmap with validation criteria
MIT