Post-performance acoustic analysis for stand-up comedy.
PunchPlot is a Python tool that analyzes audio recordings of live stand-up comedy performances, extracting objective metrics about comedian timing, pauses, and audience acoustic response – laughs, applause, and silence.
Note: PunchPlot does not evaluate humor or comedic quality. It focuses strictly on measurable acoustic and temporal signals.
This project is actively under development. See Current Status and the Roadmap below.
This project was born from a personal creative idea: what if we could visualize the "shape" of a stand-up comedy set through its acoustic fingerprint?
I started PunchPlot as a hands-on way to explore audio signal processing, feature extraction, and event detection – topics I wanted to understand deeply by building something tangible rather than just studying theory. The domain of live comedy is a fun and challenging testing ground: the audio is messy, the signals overlap, and the distinction between laughter, applause, and background noise requires real-world problem-solving.
Beyond the learning aspect, PunchPlot has a genuine use case – giving comedians an objective, data-driven look at how their performances land, complementing the subjective experience of being on stage.
Given an audio recording (comedian + audience), PunchPlot detects and measures:
| Signal | Detection |
|---|---|
| Speech | When the comedian is speaking (Voice Activity Detection) |
| Laughter | Audience laughter events – timing, duration, intensity |
| Applause | Audience applause events (planned) |
| Pauses | Gaps between speech – micro (<0.5 s), short (0.5–2 s), long (>2 s) |
| Rhythm | Speech rate estimation and vocal energy dynamics |
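The pause buckets in the table can be expressed as a tiny classifier. This is only a sketch using the thresholds above; `classify_pause` is illustrative, not part of PunchPlot's API:

```python
def classify_pause(duration_s: float) -> str:
    """Bucket an inter-speech gap by its duration, using the thresholds
    from the table: micro (<0.5 s), short (0.5-2 s), long (>2 s)."""
    if duration_s < 0.5:
        return "micro"
    if duration_s <= 2.0:
        return "short"
    return "long"

print(classify_pause(0.3))  # micro
print(classify_pause(1.2))  # short
print(classify_pause(3.5))  # long
```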
- Metrics report – laughs/min, pause duration, speech ratio, audience response ratio, etc.
- Timeline visualization – multi-track plot showing speech, laughter, pauses, and energy over time
- JSON export – machine-readable metrics for further analysis
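For the JSON export, a plausible payload might look like the sketch below. Field names and values are invented for illustration; the actual schema may differ:

```python
import json

# Hypothetical metrics payload mirroring the report bullets above.
metrics = {
    "duration_s": 1820.4,             # total recording length
    "laughs_per_min": 3.2,
    "speech_ratio": 0.71,             # fraction of time the comedian is speaking
    "audience_response_ratio": 0.18,  # laughter/applause time vs. total
    "pauses": {"micro": 42, "short": 17, "long": 5},
}

print(json.dumps(metrics, indent=2))
```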
PunchPlot follows a phased development approach – each phase is only started after the previous one is validated with real audio.

| Phase | Description | Status |
|---|---|---|
| Phase 0 | Project foundation – structure, audio loading, feature extraction, CLI skeleton | Complete |
| Phase 1.0 | Exploratory visualization – plotting features to understand real audio patterns | Complete |
| Phase 1.1 | Voice Activity Detection – adaptive RMS threshold + spectral centroid rescue | Complete |
| Phase 1.2 | Laughter detection – heuristic classifier using spectral flatness + centroid | Complete |
| Phase 1.3 | Pause detection & speech rate estimation | Next |
| Phase 1.4–1.6 | Metrics calculation, timeline visualization, full CLI | Planned |
| Phase 2 | Robustness – noise reduction, adaptive thresholds, ML-based classifiers | Planned |
| Phase 3 | Presentation – web dashboard (Streamlit), interactive plots (Plotly) | Planned |
For the full breakdown, see docs/DEVELOPMENT_PLAN.md.
```
┌─────────┐   ┌──────────────┐   ┌──────────────┐   ┌────────────┐
│  INPUT  │──▶│ PRE-PROCESS  │──▶│   ANALYSIS   │──▶│   OUTPUT   │
│  Audio  │   │  & Features  │   │ & Detection  │   │  Metrics   │
└─────────┘   └──────────────┘   └──────────────┘   │  + Visual  │
                                                    └────────────┘
```
- Audio Loading – Accepts `.wav`, `.mp3`, `.ogg`, `.flac`; resamples to 16 kHz mono
- Feature Extraction – RMS energy, spectral centroid, spectral flatness, zero-crossing rate, MFCCs
- Voice Activity Detection – Adaptive threshold on RMS energy with spectral centroid rescue for soft-spoken segments
- Audience Response Detection – Classifies non-speech regions as laughter using rolling-median spectral flatness stability and centroid variance
- Metrics & Visualization – (in development)
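The adaptive-VAD idea above can be sketched in a few lines of numpy, assuming precomputed per-frame RMS and spectral-centroid arrays (e.g. from `librosa.feature.rms` and `librosa.feature.spectral_centroid`). The scaling factor, voice band, and rescue rule below are illustrative, not PunchPlot's actual values:

```python
import numpy as np

def detect_speech(rms: np.ndarray, centroid: np.ndarray,
                  rms_factor: float = 1.5,
                  centroid_band: tuple = (300.0, 3000.0)) -> np.ndarray:
    """Frame-level speech mask: adaptive RMS threshold with a
    spectral-centroid 'rescue' for soft-spoken frames (illustrative)."""
    # Adaptive threshold: scale the median energy of this recording.
    threshold = rms_factor * np.median(rms)
    loud_enough = rms > threshold
    # Rescue: quiet frames whose centroid sits in a typical voice band
    # and which still carry some energy.
    lo, hi = centroid_band
    voice_like = (centroid > lo) & (centroid < hi)
    quiet_but_voiced = (~loud_enough) & voice_like & (rms > 0.3 * threshold)
    return loud_enough | quiet_but_voiced

# Toy frames: frame 3 is quiet but voice-like, so the rescue keeps it.
rms = np.array([0.02, 0.10, 0.12, 0.03, 0.01])
centroid = np.array([4000.0, 1200.0, 1100.0, 900.0, 5000.0])
print(detect_speech(rms, centroid))
```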
- Adaptive VAD with centroid-variance rescue to handle soft-spoken comedy styles
- Heuristic laughter detection based on empirically observed spectral patterns (flatness consistency + centroid stability in laugh regions vs. speech)
- Rolling median smoothing for noise-robust feature analysis
- Modular architecture with independent detectors that can be tested and tuned separately
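To make the flatness/centroid heuristic concrete, here is a minimal pandas sketch of rolling-median smoothing plus a laugh-likeness rule. The window size and both cutoffs are invented for the demo and are not PunchPlot's tuned values:

```python
import numpy as np
import pandas as pd

def laugh_mask(flatness: np.ndarray, centroid: np.ndarray,
               window: int = 5) -> np.ndarray:
    """Mark frames as laugh-like where rolling-median spectral flatness is
    high (noise-like audio) and centroid variance is low (steady timbre),
    as opposed to articulated speech. Thresholds are illustrative."""
    roll = pd.Series(flatness).rolling(window, center=True, min_periods=1)
    flat_med = roll.median().to_numpy()
    cent_var = (pd.Series(centroid)
                .rolling(window, center=True, min_periods=1)
                .var()
                .fillna(0.0)   # single-frame windows have undefined variance
                .to_numpy())
    return (flat_med > 0.3) & (cent_var < 1e5)
```

In a real pipeline these arrays would come from per-frame features (e.g. librosa's spectral flatness and centroid) restricted to the non-speech regions the VAD produces, as the pipeline description above suggests.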
For a deep dive into the system design, see docs/ARCHITECTURE.md.
- Python 3.11+
- `ffmpeg` (for loading non-WAV formats via librosa)
# Clone the repository
git clone https://github.com/semedooo/PunchPlot.git
cd PunchPlot
# Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\activate # Windows
# source .venv/bin/activate # Linux/macOS
# Install in development mode
pip install -e ".[dev]"

# Analyze a stand-up comedy audio file
punchplot analyze my_set.wav
# Exploratory visualization (plot audio features)
python -m punchplot.scripts.explore my_set.wav
python -m punchplot.scripts.explore my_set.wav --start 30 --end 60  # specific time range

```
PunchPlot/
├── src/punchplot/
│   ├── core/                # Audio loading, preprocessing & feature extraction
│   ├── detectors/
│   │   ├── vad.py           # Voice Activity Detection (adaptive RMS + centroid rescue)
│   │   ├── audience.py      # Audience laughter detection (spectral heuristics)
│   │   └── rhythm.py        # Pause detection & speech rate (planned)
│   ├── metrics/             # Performance metrics calculation (planned)
│   ├── visualization/       # Timeline plot generation (planned)
│   ├── scripts/
│   │   └── explore.py       # Interactive feature exploration & plotting
│   ├── models.py            # Data models (segments, events, metrics)
│   ├── pipeline.py          # Analysis pipeline orchestration
│   └── cli.py               # Command-line interface (click)
├── tests/                   # Unit tests
├── docs/
│   ├── ARCHITECTURE.md      # System design & technical concepts
│   └── DEVELOPMENT_PLAN.md  # Phased development roadmap
└── pyproject.toml           # Project configuration & dependencies
```
| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Audio I/O | librosa, soundfile |
| Signal Processing | librosa, numpy |
| Data Modeling | dataclasses, pandas |
| Visualization | matplotlib |
| CLI | click |
| Testing | pytest |
See the full Development Plan for details. High-level goals:
- Short-term – Complete MVP: pause/rhythm detection, metrics calculation, timeline visualization, full CLI
- Mid-term – Robustness: noise reduction, adaptive thresholds, ML-based laugh classifier (SVM/Random Forest)
- Long-term – Web dashboard (Streamlit), interactive visualizations (Plotly), multi-performance comparison
- Architecture – System design, pipeline stages, technical concepts
- Development Plan – Phased roadmap with validation criteria
MIT