
🎀 PunchPlot

Post-performance acoustic analysis for stand-up comedy.

PunchPlot is a Python tool that analyzes audio recordings of live stand-up comedy performances, extracting objective metrics about comedian timing, pauses, and audience acoustic response: laughs, applause, and silence.

Note: PunchPlot does not evaluate humor or comedic quality. It focuses strictly on measurable acoustic and temporal signals.

🚧 This project is actively under development. See Current Status and the Roadmap below.


Motivation

This project was born from a personal creative idea: what if we could visualize the "shape" of a stand-up comedy set through its acoustic fingerprint?

I started PunchPlot as a hands-on way to explore audio signal processing, feature extraction, and event detection, topics I wanted to understand deeply by building something tangible rather than just studying theory. The domain of live comedy is a fun and challenging testing ground: the audio is messy, the signals overlap, and distinguishing laughter, applause, and background noise requires real-world problem-solving.

Beyond the learning aspect, PunchPlot has a genuine use case: giving comedians an objective, data-driven look at how their performances land, complementing the subjective experience of being on stage.


What It Does

Given an audio recording (comedian + audience), PunchPlot detects and measures:

Signal Detection

| Signal | Description |
|---|---|
| 🗣️ Speech | When the comedian is speaking (Voice Activity Detection) |
| 😂 Laughter | Audience laughter events: timing, duration, intensity |
| 👏 Applause | Audience applause events (planned) |
| ⏸️ Pauses | Gaps between speech: micro (<0.5 s), short (0.5–2 s), long (>2 s) |
| 📈 Rhythm | Speech rate estimation and vocal energy dynamics |
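The pause buckets above can be sketched as a simple threshold function. This is a minimal illustration of the categories only; the function name and the handling of exact boundary values are assumptions, not PunchPlot's actual API:

```python
def classify_pause(duration_s: float) -> str:
    """Bucket a silence gap by duration, using the thresholds from the table above.

    Boundary handling (exactly 0.5 s or 2.0 s) is an assumption here.
    """
    if duration_s < 0.5:
        return "micro"
    if duration_s <= 2.0:
        return "short"
    return "long"

# Example: bucket the gaps detected between speech segments
gaps = [0.3, 1.2, 4.5]
print([classify_pause(g) for g in gaps])  # ['micro', 'short', 'long']
```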

Output

  • Metrics report – laughs/min, pause duration, speech ratio, audience response ratio, etc.
  • Timeline visualization – multi-track plot showing speech, laughter, pauses, and energy over time
  • JSON export – machine-readable metrics for further analysis
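A metrics report of this kind could be serialized with a dataclass and the standard library. This is a hypothetical sketch; the field names are illustrative and do not reflect PunchPlot's actual export schema:

```python
import json
from dataclasses import dataclass, asdict

@dataclass
class SetMetrics:
    # Illustrative fields only; PunchPlot's real schema may differ.
    duration_s: float
    laughs_per_min: float
    speech_ratio: float   # fraction of the set spent speaking
    mean_pause_s: float

metrics = SetMetrics(duration_s=600.0, laughs_per_min=3.2,
                     speech_ratio=0.71, mean_pause_s=1.4)

# Machine-readable export, pretty-printed for inspection
print(json.dumps(asdict(metrics), indent=2))
```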

Current Status

PunchPlot follows a phased development approach: each phase is only started after the previous one is validated with real audio.

| Phase | Description | Status |
|---|---|---|
| Phase 0 | Project foundation: structure, audio loading, feature extraction, CLI skeleton | ✅ Complete |
| Phase 1.0 | Exploratory visualization: plotting features to understand real audio patterns | ✅ Complete |
| Phase 1.1 | Voice Activity Detection: adaptive RMS threshold + spectral centroid rescue | ✅ Complete |
| Phase 1.2 | Laughter detection: heuristic classifier using spectral flatness + centroid | ✅ Complete |
| Phase 1.3 | Pause detection & speech rate estimation | 🔲 Next |
| Phase 1.4–1.6 | Metrics calculation, timeline visualization, full CLI | 🔲 Planned |
| Phase 2 | Robustness: noise reduction, adaptive thresholds, ML-based classifiers | 🔲 Planned |
| Phase 3 | Presentation: web dashboard (Streamlit), interactive plots (Plotly) | 🔲 Planned |

For the full breakdown, see docs/DEVELOPMENT_PLAN.md.


Technical Approach

Pipeline Architecture

┌──────────┐    ┌──────────────┐    ┌──────────────┐    ┌────────────┐
│  INPUT   │───▶│ PRE-PROCESS  │───▶│   ANALYSIS   │───▶│   OUTPUT   │
│  Audio   │    │  & Features  │    │  & Detection │    │  Metrics   │
└──────────┘    └──────────────┘    └──────────────┘    │  + Visual  │
                                                        └────────────┘
  1. Audio Loading – accepts .wav, .mp3, .ogg, .flac; resamples to 16 kHz mono
  2. Feature Extraction – RMS energy, spectral centroid, spectral flatness, zero-crossing rate, MFCCs
  3. Voice Activity Detection – adaptive threshold on RMS energy, with a spectral centroid rescue for soft-spoken segments
  4. Audience Response Detection – classifies non-speech regions as laughter using rolling-median spectral flatness stability and centroid variance
  5. Metrics & Visualization – (in development)
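Steps 2–3 can be illustrated with a numpy-only sketch of frame-level RMS energy plus an adaptive threshold. This is a simplified illustration under assumed parameters (frame/hop sizes, median + MAD statistic, k factor); PunchPlot's actual detector uses librosa features and adds a spectral centroid rescue:

```python
import numpy as np

def frame_rms(signal: np.ndarray, frame_len: int = 512, hop: int = 256) -> np.ndarray:
    """Frame-level RMS energy (frame and hop sizes here are illustrative)."""
    n_frames = 1 + max(0, len(signal) - frame_len) // hop
    frames = np.lib.stride_tricks.sliding_window_view(signal, frame_len)[::hop][:n_frames]
    return np.sqrt(np.mean(frames ** 2, axis=1))

def adaptive_vad(rms: np.ndarray, k: float = 1.0) -> np.ndarray:
    """Mark frames as speech when RMS exceeds an adaptive threshold.

    Threshold = median + k * MAD; the statistic and k are assumptions,
    not PunchPlot's tuned values.
    """
    med = np.median(rms)
    mad = np.median(np.abs(rms - med))
    return rms > med + k * mad

# Synthetic example: quiet noise with a louder "speech" burst in the middle
rng = np.random.default_rng(0)
sig = rng.normal(0, 0.01, 16000)
sig[6000:10000] += rng.normal(0, 0.2, 4000)
rms = frame_rms(sig)
speech = adaptive_vad(rms)  # True for frames inside the loud burst
```

Because the threshold adapts to the recording's own level statistics, the same code works on quiet club recordings and loud phone captures without retuning.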

Key Technical Concepts

  • Adaptive VAD with centroid-variance rescue to handle soft-spoken comedy styles
  • Heuristic laughter detection based on empirically observed spectral patterns (flatness consistency + centroid stability in laugh regions vs. speech)
  • Rolling median smoothing for noise-robust feature analysis
  • Modular architecture with independent detectors that can be tested and tuned separately
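Rolling-median smoothing, listed above for noise-robust feature analysis, can be sketched with numpy. The window size and reflection padding are assumptions for illustration; PunchPlot may use different parameters:

```python
import numpy as np

def rolling_median(x: np.ndarray, window: int = 5) -> np.ndarray:
    """Median-smooth a 1-D feature track; edges are padded by reflection
    so the output has the same length as the input."""
    pad = window // 2
    padded = np.pad(x, pad, mode="reflect")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

# A spiky feature track: the isolated outlier is suppressed,
# while the sustained level change survives.
track = np.array([0.1, 0.1, 9.0, 0.1, 0.1, 0.8, 0.8, 0.8])
smoothed = rolling_median(track, window=3)
# → [0.1, 0.1, 0.1, 0.1, 0.1, 0.8, 0.8, 0.8]
```

Unlike a moving average, the median ignores single-frame spikes entirely, which is why it suits features like spectral flatness that jump around frame to frame.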

For a deep dive into the system design, see docs/ARCHITECTURE.md.


Getting Started

Prerequisites

  • Python 3.11+
  • ffmpeg (for loading non-WAV formats via librosa)

Installation

# Clone the repository
git clone https://github.com/semedooo/PunchPlot.git
cd PunchPlot

# Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\activate       # Windows
# source .venv/bin/activate  # Linux/macOS

# Install in development mode
pip install -e ".[dev]"

Usage

# Analyze a stand-up comedy audio file
punchplot analyze my_set.wav

# Exploratory visualization (plot audio features)
python -m punchplot.scripts.explore my_set.wav
python -m punchplot.scripts.explore my_set.wav --start 30 --end 60  # specific time range

Project Structure

PunchPlot/
├── src/punchplot/
│   ├── core/              # Audio loading, preprocessing & feature extraction
│   ├── detectors/
│   │   ├── vad.py         # Voice Activity Detection (adaptive RMS + centroid rescue)
│   │   ├── audience.py    # Audience laughter detection (spectral heuristics)
│   │   └── rhythm.py      # Pause detection & speech rate (planned)
│   ├── metrics/           # Performance metrics calculation (planned)
│   ├── visualization/     # Timeline plot generation (planned)
│   ├── scripts/
│   │   └── explore.py     # Interactive feature exploration & plotting
│   ├── models.py          # Data models (segments, events, metrics)
│   ├── pipeline.py        # Analysis pipeline orchestration
│   └── cli.py             # Command-line interface (click)
├── tests/                 # Unit tests
├── docs/
│   ├── ARCHITECTURE.md    # System design & technical concepts
│   └── DEVELOPMENT_PLAN.md # Phased development roadmap
└── pyproject.toml         # Project configuration & dependencies

Tech Stack

| Component | Technology |
|---|---|
| Language | Python 3.11+ |
| Audio I/O | librosa, soundfile |
| Signal Processing | librosa, numpy |
| Data Modeling | dataclasses, pandas |
| Visualization | matplotlib |
| CLI | click |
| Testing | pytest |

Roadmap

See the full Development Plan for details. High-level goals:

  • Short-term – complete the MVP: pause/rhythm detection, metrics calculation, timeline visualization, full CLI
  • Mid-term – robustness: noise reduction, adaptive thresholds, ML-based laugh classifier (SVM/Random Forest)
  • Long-term – web dashboard (Streamlit), interactive visualizations (Plotly), multi-performance comparison

Documentation

  • docs/ARCHITECTURE.md – system design & technical concepts
  • docs/DEVELOPMENT_PLAN.md – phased development roadmap

License

MIT
