This repository contains code examples and applications that accompany the following paper:
- 📘 Peter Meier, Sebastian Strahl, Simon Schwär, Meinard Müller, and Stefan Balke
Pitch Estimation in Real Time: Revisiting SWIPE with Causal Windowing
In Proceedings of the International Symposium on Computer Music Multidisciplinary Research (CMMR), 2025, Accepted.
```bibtex
@inproceedings{MeierSSMB25_RealTimeSWIPE_CMMR,
  author    = {Peter Meier and Sebastian Strahl and Simon Schw{\"a}r and Meinard M{\"u}ller and Stefan Balke},
  title     = {Pitch Estimation in Real Time: Revisiting {SWIPE} With Causal Windowing},
  booktitle = {Proceedings of the International Symposium on Computer Music Multidisciplinary Research ({CMMR})},
  address   = {London, UK},
  year      = {2025, Accepted}
}
```

- 📘 Peter Meier, Meinard Müller, and Stefan Balke
A Multi-User Interface for Real-Time Intonation Monitoring in Music Ensembles
In Proceedings of the Workshop for Innovative Computer-Based Music Interfaces (ICMI): 1–5, 2025.
```bibtex
@inproceedings{MeierMB25_IntonationMonitoring_ICMI,
  author       = {Peter Meier and Meinard M{\"u}ller and Stefan Balke},
  title        = {A Multi-User Interface for Real-Time Intonation Monitoring in Music Ensembles},
  booktitle    = {Proceedings of the Workshop for Innovative Computer-Based Music Interfaces ({ICMI})},
  address      = {Chemnitz, Germany},
  doi          = {10.18420/muc2025-mci-ws06-202},
  howpublished = {Mensch und Computer 2025 -- Workshopband},
  publisher    = {Gesellschaft f{\"u}r Informatik e.V.},
  year         = {2025},
  pages        = {1--5},
}
```

- 📘 Peter Meier, Simon Schwär, Gerhard Krump, and Meinard Müller
Evaluating Real-Time Pitch Estimation Algorithms for Creative Music Game Interaction
In: INFORMATIK 2023 — Designing Futures: Zukünfte gestalten, Gesellschaft für Informatik e.V.: 873–882, 2023.
```bibtex
@incollection{MeierSKM23_EvaluatingPitchGame_GI,
  author    = {Peter Meier and Simon Schw{\"a}r and Gerhard Krump and Meinard M{\"u}ller},
  title     = {Evaluating Real-Time Pitch Estimation Algorithms for Creative Music Game Interaction},
  booktitle = {INFORMATIK 2023 -- Designing Futures: Zuk{\"u}nfte gestalten},
  publisher = {Gesellschaft f{\"u}r Informatik e.V.},
  address   = {Bonn, Germany},
  year      = {2023},
  doi       = {10.18420/inf2023_97},
  pages     = {873--882}
}
```

This project requires Python 3.12 or higher to be installed on your system. Please ensure you have the correct version before proceeding with installation and execution of the code. You can download Python from the official Python website or use a package manager suitable for your operating system.
To verify your Python version, run the following command in your terminal or command prompt:
```shell
python --version
```

This project uses uv for fast and reliable Python package management. If you don't have uv installed, please visit the official uv installation guide for installation instructions for your operating system.
Navigate to the project directory and install the basic dependencies:
```shell
cd rtswipe
uv sync
```

This command will:

- Create a virtual environment in `.venv/`.
- Install the core dependencies (numpy, resampy, soundfile) used by the `rtswipe` module.
For specific applications, you'll need to install additional dependency groups:

- For the `cli.py` applications (pitch estimation, OSC server):

  ```shell
  uv sync --group cli
  ```

- For the `game.py` pitch game (pygame):

  ```shell
  uv sync --group game
  ```

- For the `web.py` applications (FastAPI server, WebSockets):

  ```shell
  uv sync --group web
  ```

- For development (testing, linting, Jupyter notebooks):

  ```shell
  uv sync --group dev
  ```

Note: The Python package sounddevice requires PortAudio to be installed on your system to capture audio from a microphone.
To install all dependencies for the module and all applications, run:

```shell
uv sync --all-groups
```

- (Module) `src/rtswipe/rtswipe.py`: Core real-time SWIPE pitch estimation implementation.
- (Application 1) `src/cli/cli.py`: Command-line interface for multi-channel real-time pitch estimation and OSC output.
- (Application 2) `src/game/game.py`: Python-based pitch game implementation.
- (Application 3) `src/web/web.py`: Intonation monitoring system with FastAPI web application and real-time pitch visualization.
The rtswipe module provides a real-time pitch estimation algorithm based on the SWIPE method with causal windowing. It can process audio streams frame-by-frame and return pitch estimates with confidence values.
```python
import numpy as np
from rtswipe import RTSwipe

# Initialize the RTSwipe estimator
swipe = RTSwipe(
    fs=22050,        # Sample rate
    hop_len=256,     # Hop length (frame size)
    f_min=55.0,      # Minimum frequency (Hz)
    f_max=1760.0,    # Maximum frequency (Hz)
    num_channels=1,  # Number of audio channels
    delay=0.0        # Delay factor (0 = no delay, 1 = maximum delay)
)

# Process audio frames (shape: hop_len x num_channels)
audio_frame = np.random.randn(256, 1)  # Example audio frame
freqs, confs = swipe(audio_frame)
```

💡 How to use the `rtswipe` module inside an audio callback function is demonstrated in the `cli.py`, `game.py`, and `web.py` applications described below.
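For longer recordings, frame-by-frame processing means slicing the signal into non-overlapping hop-sized blocks before feeding each one to the estimator. The following is a minimal NumPy sketch of that framing step only (the `RTSwipe` call itself is left as a comment, since it is shown above):

```python
import numpy as np

hop_len = 256       # frame size, matching the RTSwipe initialization above
num_channels = 1

# One second of mono audio at fs = 22050 Hz (example signal)
signal = np.random.randn(22050, num_channels)

# Drop any trailing samples that do not fill a complete frame
num_frames = signal.shape[0] // hop_len
frames = signal[:num_frames * hop_len].reshape(num_frames, hop_len, num_channels)

for frame in frames:
    # Each frame has shape (hop_len, num_channels), as expected by the estimator
    pass  # freqs, confs = swipe(frame)

print(num_frames, frames.shape)
```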
Make sure you have the cli dependencies installed:
```shell
uv sync --group cli
```

Getting help:

```shell
uv run src/cli/cli.py --help
```
```text
usage: cli.py [-h] [-l] [--device ID] [--channel NUMBER] [--samplerate FS] [--blocksize SAMPLES] [--freq FMIN FMAX] [--ip IP] [--port PORT]

Pitch (C)ommand (L)ine (I)nterface.

options:
  -h, --help           show this help message and exit
  -l, --list-devices   show list of audio devices and exit
  --device ID          (None) device id for sounddevice input
  --channel NUMBER     (1) channel number for sounddevice input
  --samplerate FS      (44100) samplerate for sounddevice
  --blocksize SAMPLES  (512) blocksize for sounddevice
  --freq FMIN FMAX     ([55, 1760]) pitch range in Hz
  --ip IP              (0.0.0.0) ip address for OSC client
  --port PORT          (5005) port for OSC client
```

Show list of audio devices and exit:
```shell
uv run src/cli/cli.py -l
```

Start the application listening to your system's default input sound device:

```shell
uv run src/cli/cli.py
```

Terminal output example:
```text
OSC to 0.0.0.0:5005: time=20:23:06.972 freqs=[941.63, 941.63] confs=[0.6, 0.6]
OSC to 0.0.0.0:5005: time=20:23:06.994 freqs=[945.89, 945.89] confs=[0.6, 0.6]
OSC to 0.0.0.0:5005: time=20:23:07.004 freqs=[973.61, 973.61] confs=[0.56, 0.56]
OSC to 0.0.0.0:5005: time=20:23:07.015 freqs=[975.36, 975.36] confs=[0.59, 0.59]
OSC to 0.0.0.0:5005: time=20:23:07.026 freqs=[996.72, 996.72] confs=[0.55, 0.55]
OSC to 0.0.0.0:5005: time=20:23:07.036 freqs=[1030.57, 1030.57] confs=[0.58, 0.58]
OSC to 0.0.0.0:5005: time=20:23:07.047 freqs=[1102.74, 1102.74] confs=[0.6, 0.6]
OSC to 0.0.0.0:5005: time=20:23:07.058 freqs=[1098.77, 1098.77] confs=[0.46, 0.46]
OSC to 0.0.0.0:5005: time=20:23:07.068 freqs=[1188.52, 1188.52] confs=[0.59, 0.59]
OSC to 0.0.0.0:5005: time=20:23:07.078 freqs=[1220.04, 1220.04] confs=[0.59, 0.59]
```

- The `freqs` and `confs` lists provide a snapshot of the audio analysis for each audio frame and each audio channel.
- This data is both displayed in the terminal and transmitted as OSC messages, enabling integration with other systems or applications that support OSC for further processing or visualization.
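Each terminal line above follows a fixed pattern, so logged output can be post-processed with a small parser. The following is a hedged sketch; the `parse_pitch_line` helper is hypothetical and not part of this repository, and the line format is taken solely from the example output above:

```python
import re

def parse_pitch_line(line):
    """Parse one cli.py terminal line into (time, freqs, confs).

    Hypothetical helper for post-processing logged output.
    """
    match = re.search(r"time=(\S+) freqs=\[([^\]]*)\] confs=\[([^\]]*)\]", line)
    if match is None:
        return None
    time_str, freqs_str, confs_str = match.groups()
    freqs = [float(x) for x in freqs_str.split(",")]
    confs = [float(x) for x in confs_str.split(",")]
    return time_str, freqs, confs

line = "OSC to 0.0.0.0:5005: time=20:23:06.972 freqs=[941.63, 941.63] confs=[0.6, 0.6]"
print(parse_pitch_line(line))
```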
The game.py application is an interactive pitch-based game where players control a character not only with traditional controllers but also with their singing voice.
For detailed information and gameplay examples, please refer to the Dagstuhl Music Game Experiment Website.
Make sure you have the game dependencies installed:
```shell
uv sync --group game
```

Start the game:

```shell
uv run src/game/game.py
```

Getting help:

```shell
uv run src/game/game.py --help
```

```text
usage: game.py [-h] [-l] [--device ID] [--channel NUMBER] [--samplerate FS] [--blocksize SAMPLES] [--freq FMIN FMAX] [--rootnote NUMBER] [--user ID] [--level ID]

Pitch Game: Sing Your Way

options:
  -h, --help           show this help message and exit
  -l, --list-devices   show list of audio devices and exit
  --device ID          (None) device id for sounddevice input
  --channel NUMBER     (1) channel number for sounddevice input
  --samplerate FS      (48000) samplerate for sounddevice
  --blocksize SAMPLES  (512) blocksize for sounddevice
  --freq FMIN FMAX     ([55, 1760]) pitch range in Hz
  --rootnote NUMBER    (48) MIDI note number for root note in game level
  --user ID            (0) ID number of the user
  --level ID           (0) ID number of the game level
```

- The game listens to your system's default input sound device, or you can specify a different device with `--device ID`.
- This game implementation is designed to run experiments with multiple users, each having their own game level and pitch range.
- The `--rootnote NUMBER` argument specifies the MIDI note number for the root note of the game level (default: `48`). Singing this pitch will make a pitch line appear just above the ground level in the game.
- For experiments involving multiple users, you can set the user ID with `--user ID` (default: `0`) and the game level with `--level ID` (default: `0`).
- Starting a game automatically saves game information (user ID, level ID, timestamp, audio, events, pitch data) in the `src/game/users` directory for later analysis.
- This repository includes levels 0 to 5 from the Dagstuhl Music Game Experiment. For more details, visit the Dagstuhl Music Game Experiment Website.
- You can also create your own levels or modify existing ones using the Tiled Map Editor.
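The relationship between a MIDI note number such as `--rootnote 48` and the pitch you need to sing follows the standard equal-temperament formula f = 440 · 2^((m − 69) / 12). A small sketch (the helper name is illustrative, not from this repository):

```python
def midi_to_hz(midi_note, tuning=440.0):
    """Convert a MIDI note number to frequency in Hz (equal temperament)."""
    return tuning * 2.0 ** ((midi_note - 69) / 12.0)

# The default root note 48 (C3) corresponds to roughly 130.81 Hz
print(round(midi_to_hz(48), 2))
print(round(midi_to_hz(69), 2))  # A4 = 440.0 Hz
```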
The game supports multiple input methods:
⌨️ Keyboard Controls
- Left/Right Arrow Keys: Move character left/right
- Up Arrow: Fly mode (the vertical position is controlled by the pitch you sing)
- Down Arrow: Delete note block (while landing on it)
- Space: Jump
- F: Toggle mute (silence note block)
- D: Toggle drone sound (this plays the root note you set with `--rootnote` as a reference pitch)
🎮 Gamepad Controls
- D-Pad Left/Right (buttons 13/14): Move character left/right
- D-Pad Up (button 11): Fly mode (the vertical position is controlled by the pitch you sing)
- D-Pad Down (button 12): Delete note block (while landing on it)
- Button A (button 0): Jump
- Button B (button 1): Toggle mute (silence note block)
- Button X (button 2): Toggle drone sound (this plays the root note you set with `--rootnote` as a reference pitch)
- Button Y (button 3): Fly mode (the vertical position is controlled by the pitch you sing)
🎤 Voice Controls
- Singing: A pitch line appears on screen showing your current vocal pitch relative to the game world.
- The `--rootnote` parameter sets the reference pitch (default: MIDI note 48 = C3), with which the pitch line aligns to the bottom of the game world.
- Fly Mode: Control the character's vertical position by singing different pitches.
- Singing higher pitches moves the character up, lower pitches move down.
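The pitch-to-height mapping in fly mode can be pictured as placing the sung pitch on a vertical semitone axis anchored at the root note. The following sketch is an illustrative approximation, not the actual game code; `ground_y` and `pixels_per_semitone` are made-up parameters:

```python
import math

def pitch_to_screen_y(freq_hz, rootnote=48, ground_y=600,
                      pixels_per_semitone=20, tuning=440.0):
    """Map a sung frequency to a vertical screen position (illustrative only).

    The root note aligns just above the ground; higher pitches yield
    smaller y values, i.e. positions further up the screen.
    """
    midi = 69 + 12 * math.log2(freq_hz / tuning)
    semitones_above_root = midi - rootnote
    return ground_y - semitones_above_root * pixels_per_semitone

# Singing the root note (MIDI 48, roughly 130.81 Hz) lands at ground level
print(round(pitch_to_screen_y(130.81)))
```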
The game assets for `game.py` are from "Pixel Adventure" and "Pixel Adventure 2" by Pixel Frog and are licensed under Creative Commons CC0. Thank you!
The web application provides a web-based visualization interface for multi-channel pitch estimation and real-time intonation monitoring using FastAPI and WebSockets.
Make sure you have the web dependencies installed:
```shell
uv sync --group web
```

Getting help:

```shell
uv run src/web/web.py --help
```
```text
usage: web.py [-h] [-l] [--device ID_OR_NAME] [--channel NUMBER_OR_RANGE] [--samplerate FS] [--blocksize SAMPLES] [--freq FMIN FMAX] [--ip IP] [--port PORT] [--tuning TUNING] [--channel-names NAME [NAME ...]] [--smoothing FRAMES] [--gate DBFS] [--websocket-fps FPS]

Pitch (C)ommand (L)ine (I)nterface.

options:
  -h, --help           show this help message and exit
  -l, --list-devices   show list of audio devices and exit
  --device ID_OR_NAME  device id (int) or name (str) for sounddevice input
  --channel NUMBER_OR_RANGE
                       channel number or range for sounddevice input (e.g. 1 or 9-12)
  --samplerate FS      (44100) samplerate for sounddevice
  --blocksize SAMPLES  (512) blocksize for sounddevice
  --freq FMIN FMAX     ([55, 1760]) pitch range in Hz
  --ip IP              (0.0.0.0) ip address for OSC client
  --port PORT          (5005) port for OSC client
  --tuning TUNING      (440.0) tuning reference frequency in Hz
  --channel-names NAME [NAME ...]
                       (None) channel names (e.g. --channel-names A B C D)
  --smoothing FRAMES   (15) smoothing window size in frames for moving average
  --gate DBFS          (-60.0) gate threshold in dBFS; channels below this RMS level have confidence set to zero
  --websocket-fps FPS  (30) WebSocket update rate in frames per second; lower values reduce network traffic
```

Start the web server with default parameters:
```shell
uv run src/web/web.py
```

```text
Pitch Estimation Web App: {'list_devices': False, 'device': None, 'channel': '1', 'samplerate': 44100, 'blocksize': 512, 'freq': [55, 1760], 'ip': '0.0.0.0', 'port': 5005, 'tuning': 440.0, 'channel_names': None, 'smoothing': 15, 'gate': -60.0, 'websocket_fps': 30}
INFO:     Started server process [6319]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```

Open your browser and navigate to http://localhost:8000 to access the web interface.
Set specific parameters:
```shell
uv run src/web/web.py --device MixDevice --channel 1-4 --tuning 442 --channel-names Soprano Alto Tenor Bass
```

This example uses the audio device named MixDevice, listens to channels 1 to 4, sets the tuning reference frequency to 442 Hz, and assigns custom names (Soprano, Alto, Tenor, Bass) to the channels displayed in the web interface.
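The `--smoothing` and `--gate` parameters can be illustrated with plain NumPy: a moving average over the last FRAMES pitch estimates, and a dBFS gate that zeroes the confidence of quiet channels. This is a sketch of the two concepts under stated assumptions (full scale = 1.0), not the implementation in `web.py`:

```python
import numpy as np

def moving_average(values, frames=15):
    """Smooth a sequence of per-frame pitch estimates with a moving average."""
    values = np.asarray(values, dtype=float)
    window = min(frames, len(values))
    kernel = np.ones(window) / window
    return np.convolve(values, kernel, mode="valid")

def rms_dbfs(frame):
    """RMS level of an audio frame in dBFS (full scale = 1.0)."""
    rms = np.sqrt(np.mean(np.square(frame)))
    return 20 * np.log10(max(rms, 1e-12))

def gate_confidence(conf, frame, threshold_dbfs=-60.0):
    """Set confidence to zero when the channel is below the gate threshold."""
    return conf if rms_dbfs(frame) >= threshold_dbfs else 0.0

silence = np.zeros(512)
tone = 0.5 * np.sin(2 * np.pi * 440 * np.arange(512) / 44100)
print(gate_confidence(0.8, silence))  # quiet channel: confidence gated to zero
print(gate_confidence(0.8, tone))     # audible channel: confidence kept
```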
This project is licensed under the MIT License - see the LICENSE file for details.
The International Audio Laboratories Erlangen are a joint institution of the Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU) and Fraunhofer Institute for Integrated Circuits IIS. This work was funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under grant numbers 500643750 (MU 2686/15-1) and 555525568 (MU 2686/18-1).


