TypeScript Audio Encoder Blueprint

This guide shows how to replicate the functionality of the Python script librosa_encoder.py in TypeScript, using FFmpeg for preprocessing.

Goals

Preprocess with FFmpeg
- Downmix to mono, resample to 44.1 kHz, keep at most the first 12 seconds, and normalize loudness (integrated -16 LUFS, loudness range 11 LU, true peak -1 dBTP).

ffmpeg -y -i input.wav -ac 1 -ar 44100 -t 12 -filter:a loudnorm=I=-16:LRA=11:TP=-1 output.wav
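One way to run the command above from TypeScript is sketched here. It assumes ffmpeg is on the PATH and uses the node:child_process module, which Bun also provides. The names buildFfmpegArgs and preprocess are illustrative and are not taken from backend/audio_encoder.ts.

```typescript
import { spawnSync } from "node:child_process";

// Build the argument list for the preprocessing command.
// Hypothetical helper: name and signature are illustrative only.
function buildFfmpegArgs(input: string, output: string, maxSeconds = 12): string[] {
  return [
    "-y",                      // overwrite the output file without prompting
    "-i", input,
    "-ac", "1",                // downmix to mono
    "-ar", "44100",            // resample to 44.1 kHz
    "-t", String(maxSeconds),  // keep at most this many seconds
    "-filter:a", "loudnorm=I=-16:LRA=11:TP=-1",
    output,
  ];
}

// Run ffmpeg synchronously and fail loudly if it cannot start or exits non-zero.
function preprocess(input: string, output: string): void {
  const result = spawnSync("ffmpeg", buildFfmpegArgs(input, output), { encoding: "utf8" });
  if (result.error) throw result.error; // e.g. ffmpeg not found on PATH
  if (result.status !== 0) {
    throw new Error(`ffmpeg exited with code ${result.status}: ${result.stderr}`);
  }
}
```

Passing the arguments as an array avoids shell quoting problems with file names that contain spaces.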

Parse WAV
- Read the PCM samples from the processed file.
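A minimal sketch of this step follows. It assumes the processed file is 16-bit little-endian PCM, which is what ffmpeg writes to a .wav file by default, and it walks the RIFF chunks rather than assuming a fixed 44-byte header, since extra chunks can precede the audio data. The WavData type and parseWav function are hypothetical names, not the actual code in backend/audio_encoder.ts.

```typescript
interface WavData {
  sampleRate: number;
  channels: number;
  samples: Float32Array; // scaled to [-1, 1); interleaved if channels > 1
}

// Parse a RIFF/WAVE byte buffer containing 16-bit PCM audio.
function parseWav(bytes: Uint8Array): WavData {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Read a four-character chunk tag at the given offset.
  const tag = (o: number): string =>
    String.fromCharCode(view.getUint8(o), view.getUint8(o + 1), view.getUint8(o + 2), view.getUint8(o + 3));

  if (bytes.byteLength < 12 || tag(0) !== "RIFF" || tag(8) !== "WAVE") {
    throw new Error("not a RIFF/WAVE file");
  }

  let format = 0, channels = 0, sampleRate = 0, bitsPerSample = 0;
  let offset = 12;
  while (offset + 8 <= bytes.byteLength) {
    const id = tag(offset);
    const size = view.getUint32(offset + 4, true);
    const body = offset + 8;
    if (id === "fmt ") {
      format = view.getUint16(body, true);        // 1 = integer PCM
      channels = view.getUint16(body + 2, true);
      sampleRate = view.getUint32(body + 4, true);
      bitsPerSample = view.getUint16(body + 14, true);
    } else if (id === "data") {
      if (format !== 1 || bitsPerSample !== 16) {
        throw new Error("only 16-bit PCM is supported in this sketch");
      }
      // Clamp to the bytes actually present in case the header size is wrong.
      const available = Math.min(size, bytes.byteLength - body);
      const count = Math.floor(available / 2);
      const samples = new Float32Array(count);
      for (let i = 0; i < count; i++) {
        samples[i] = view.getInt16(body + 2 * i, true) / 32768;
      }
      return { sampleRate, channels, samples };
    }
    offset = body + size + (size % 2); // chunks are padded to even length
  }
  throw new Error("no data chunk found");
}
```

With Bun or Node, the input bytes can come from readFileSync in node:fs, whose Buffer result is a Uint8Array.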

Compute Spectral Features
- Use a naive DFT to calculate magnitudes for each frame.
- Average the magnitudes into nMels buckets to approximate a mel-spectrogram.
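The two steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in backend/audio_encoder.ts: the buckets here are equal-width groups of linear-frequency bins (a true mel filterbank uses mel-spaced triangular filters), no window function is applied, and frameSize and hop are hypothetical parameters. The final function averages the bucketed values over all frames.

```typescript
// Magnitudes of the first N/2 + 1 DFT bins of one frame.
// Direct O(N^2) evaluation, no FFT.
function dftMagnitudes(frame: Float32Array): Float32Array {
  const n = frame.length;
  const bins = Math.floor(n / 2) + 1;
  const mags = new Float32Array(bins);
  for (let k = 0; k < bins; k++) {
    let re = 0;
    let im = 0;
    for (let t = 0; t < n; t++) {
      const angle = (-2 * Math.PI * k * t) / n;
      re += frame[t] * Math.cos(angle);
      im += frame[t] * Math.sin(angle);
    }
    mags[k] = Math.hypot(re, im);
  }
  return mags;
}

// Average adjacent bins into nMels buckets (assumption: equal-width, linear spacing).
function bucketAverage(mags: Float32Array, nMels: number): Float32Array {
  const out = new Float32Array(nMels);
  for (let m = 0; m < nMels; m++) {
    const start = Math.floor((m * mags.length) / nMels);
    // Every bucket covers at least one bin, even when nMels > mags.length.
    const end = Math.max(start + 1, Math.floor(((m + 1) * mags.length) / nMels));
    let sum = 0;
    for (let k = start; k < end; k++) sum += mags[k];
    out[m] = sum / (end - start);
  }
  return out;
}

// Average the bucketed magnitudes over all complete frames of the signal.
function averagedSpectrum(
  samples: Float32Array,
  frameSize: number,
  hop: number,
  nMels: number,
): Float32Array {
  const acc = new Float32Array(nMels);
  let frames = 0;
  for (let start = 0; start + frameSize <= samples.length; start += hop) {
    const frame = samples.subarray(start, start + frameSize);
    const buckets = bucketAverage(dftMagnitudes(frame), nMels);
    for (let m = 0; m < nMels; m++) acc[m] += buckets[m];
    frames++;
  }
  if (frames > 0) {
    for (let m = 0; m < nMels; m++) acc[m] /= frames;
  }
  return acc;
}
```

The naive DFT costs O(N^2) per frame, so small frame sizes or a large hop keep a 12-second clip tractable; an FFT would be the usual replacement if speed matters.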

backend/audio_encoder.ts implements this flow. Run it with Bun:

bun backend/audio_encoder.ts input.wav processed.wav metadata.json

It writes the normalized clip to processed.wav and a JSON metadata file, metadata.json, containing the clip duration and an averaged mel-spectrogram array.
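As a rough illustration of the metadata file's contents, the sketch below builds an object with a duration and a spectrum array and serializes it. The field names duration and melSpectrogram, and the choice of seconds as the duration unit, are assumptions; check backend/audio_encoder.ts for the actual schema.

```typescript
// Hypothetical shape of the metadata file; field names are assumptions.
interface ClipMetadata {
  duration: number;         // seconds (assumed unit)
  melSpectrogram: number[]; // one averaged value per bucket
}

// Derive the duration from the per-channel sample count and the sample rate,
// and convert the typed array so it serializes as a plain JSON array.
function buildMetadata(
  sampleCount: number,
  sampleRate: number,
  spectrum: Float32Array,
): ClipMetadata {
  return {
    duration: sampleCount / sampleRate,
    melSpectrogram: Array.from(spectrum),
  };
}

// Serialize with two-space indentation for readability.
function serializeMetadata(meta: ClipMetadata): string {
  return JSON.stringify(meta, null, 2);
}
```

The resulting string can be written to disk with writeFileSync from node:fs or with Bun.write.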