Browser-only audio extraction and waveform visualization from large video files. Upload videos up to 5 GB, extract audio with ffmpeg.wasm, and render interactive waveforms — no server-side processing.
Try it live: https://audio-waveform.roomler.live/
https://github.com/gjovanov/audio-waveform/raw/master/audio-waveform-intro.mp4
Drop a video, extract audio, visualize the waveform — all in the browser. Zero server processing. Your files never leave your machine.
The intro video demonstrates a 3 GB MP4 file being processed entirely in the browser. The sample video is a 4x concatenation of the Blender Reel 2013 (Creative Commons Attribution 3.0, re-encoded to H.264).
| Category | Feature | Status |
|---|---|---|
| Upload | Drag-and-drop or file picker for video files | ✅ |
| | Chunked IndexedDB storage (50 MB chunks, up to 5 GB) | ✅ |
| | Storage quota monitoring | ✅ |
| | Persistent across page reloads | ✅ |
| Extraction | Audio track extraction via ffmpeg.wasm (stream copy, no re-encoding) | ✅ |
| | WORKERFS mount for files > 1.5 GB (avoids ArrayBuffer limit) | ✅ |
| | Multi-threaded with single-threaded fallback | ✅ |
| | Progress reporting | ✅ |
| Analysis | Adaptive downsampling (16 kHz default, 8 kHz for >2hr files) | ✅ |
| | Peak extraction for waveform rendering | ✅ |
| | Memory-efficient: never loads full audio into heap | ✅ |
| Visualization | Canvas-based mirrored waveform | ✅ |
| | Amplitude color coding (blue/green/orange/red) | ✅ |
| | Zoom in/out and fit-to-width | ✅ |
| | Click-to-seek with playback cursor | ✅ |
| | Auto-scroll during playback | ✅ |
| Playback | Audio playback via `<audio>` element | ✅ |
| | Synchronized cursor with requestAnimationFrame | ✅ |
| | Play/pause and seek controls | ✅ |
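
The chunked storage row above can be illustrated with a small sketch. It only plans byte ranges and slices a file into 50 MB pieces; the names `chunkRanges` and `sliceFile` are hypothetical and not taken from the project, and writing the resulting Blobs to IndexedDB is left out.

```javascript
// Hypothetical helpers (not the project's actual code): plan 50 MB chunks.
const CHUNK_SIZE = 50 * 1024 * 1024; // 50 MB, as listed in the feature table

// Return [{ index, start, end }] byte ranges covering totalBytes.
function chunkRanges(totalBytes, chunkSize = CHUNK_SIZE) {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError("totalBytes must be a non-negative integer");
  }
  const ranges = [];
  for (let start = 0, index = 0; start < totalBytes; start += chunkSize, index++) {
    ranges.push({ index, start, end: Math.min(start + chunkSize, totalBytes) });
  }
  return ranges;
}

// In the browser, each range maps to one Blob via File.slice(start, end).
// Slicing is lazy, so no file data is copied into memory at this point.
function sliceFile(file, chunkSize = CHUNK_SIZE) {
  return chunkRanges(file.size, chunkSize).map(({ index, start, end }) => ({
    index,
    blob: file.slice(start, end),
  }));
}
```

Each `{ index, blob }` pair could then be stored as one IndexedDB record, keyed by a file id plus `index` so the chunks can be reassembled in order.
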
| Layer | Technology |
|---|---|
| Frontend | Vanilla JS (ES modules), HTML5, CSS3 |
| Audio Extraction | ffmpeg.wasm 0.12 (WebAssembly) |
| Audio Analysis | ffmpeg downsampling (16 kHz adaptive) |
| Rendering | HTML5 Canvas (device-pixel-ratio aware) |
| Storage | IndexedDB (chunked Blob storage) |
| Dev Server | Node.js or Bun (COOP/COEP headers) |
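
As an illustration of the "device-pixel-ratio aware" rendering row, here is a minimal sketch of how a canvas can be sized so the waveform stays sharp on high-DPI screens. The function names are hypothetical, and the project's renderer may handle this differently.

```javascript
// Hypothetical helper: compute the backing-store size for a CSS-sized canvas.
function backingStoreSize(cssWidth, cssHeight, devicePixelRatio = 1) {
  const dpr = devicePixelRatio > 0 ? devicePixelRatio : 1;
  return {
    width: Math.round(cssWidth * dpr),
    height: Math.round(cssHeight * dpr),
    scale: dpr,
  };
}

// Apply the size to a canvas and scale the context so that drawing code
// can keep working in CSS pixels.
function resizeCanvas(canvas, cssWidth, cssHeight, dpr) {
  const { width, height, scale } = backingStoreSize(cssWidth, cssHeight, dpr);
  canvas.width = width;
  canvas.height = height;
  canvas.style.width = `${cssWidth}px`;
  canvas.style.height = `${cssHeight}px`;
  const ctx = canvas.getContext("2d");
  ctx.setTransform(scale, 0, 0, scale, 0, 0);
  return ctx;
}
```

In a page this would be called as `resizeCanvas(canvas, 800, 200, window.devicePixelRatio)`, and again whenever the layout or zoom level changes.
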
graph TB
subgraph Browser["Browser (zero server-side processing)"]
UI["Vanilla JS SPA<br/>Drag-and-drop · Controls · Log"]
IDB[("IndexedDB<br/>50 MB Blob chunks")]
FFMPEG["ffmpeg.wasm 0.12<br/>WebAssembly · WORKERFS"]
ANALYZER["Audio Analyzer<br/>Peak extraction from PCM"]
CANVAS["Canvas Renderer<br/>Waveform · Cursor · Zoom"]
AUDIO["<audio> Element<br/>Playback · Seek"]
end
subgraph Server["Dev Server (static files only)"]
NODE["Node.js / Bun<br/>COOP + COEP headers"]
FILES["Static Files<br/>HTML · JS · CSS · WASM"]
end
UI -->|"File.slice()"| IDB
IDB -->|"Blob reassembly"| FFMPEG
FFMPEG -->|"-vn -c:a copy"| FFMPEG
FFMPEG -->|"AAC audio Blob"| ANALYZER
FFMPEG -->|"16kHz mono f32le"| ANALYZER
ANALYZER -->|"peaks [{min,max}]"| CANVAS
UI -->|"click-to-seek"| AUDIO
AUDIO -->|"currentTime"| CANVAS
NODE --> FILES
style UI fill:#4fc3f7,color:#1a1a2e
style IDB fill:#78909c,color:#fff
style FFMPEG fill:#e65100,color:#fff
style ANALYZER fill:#00695c,color:#fff
style CANVAS fill:#0f3460,color:#fff
style AUDIO fill:#1565c0,color:#fff
style NODE fill:#ff9800,color:#fff
style FILES fill:#78909c,color:#fff
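
The `peaks [{min,max}]` edge in the diagram can be sketched as follows. This is a minimal illustration of min/max peak extraction from mono `f32le` PCM, not the project's analyzer; `f32leToSamples` and `extractPeaks` are hypothetical names, and a real implementation would likely process the PCM in slices rather than as one array.

```javascript
// Decode little-endian 32-bit float bytes (ffmpeg's f32le) into samples.
// DataView is used so the result does not depend on host endianness
// or on the byte offset being 4-byte aligned.
function f32leToSamples(bytes) {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const samples = new Float32Array(Math.floor(bytes.byteLength / 4));
  for (let i = 0; i < samples.length; i++) {
    samples[i] = view.getFloat32(i * 4, true);
  }
  return samples;
}

// Reduce samples to bucketCount {min, max} pairs, one per waveform column.
function extractPeaks(pcm, bucketCount) {
  if (!Number.isInteger(bucketCount) || bucketCount <= 0) {
    throw new RangeError("bucketCount must be a positive integer");
  }
  const peaks = [];
  const samplesPerBucket = pcm.length / bucketCount;
  for (let b = 0; b < bucketCount; b++) {
    const start = Math.floor(b * samplesPerBucket);
    const end = Math.min(
      pcm.length,
      Math.max(start + 1, Math.floor((b + 1) * samplesPerBucket))
    );
    let min = Infinity;
    let max = -Infinity;
    for (let i = start; i < end; i++) {
      if (pcm[i] < min) min = pcm[i];
      if (pcm[i] > max) max = pcm[i];
    }
    // Empty input yields silent buckets instead of Infinity values.
    peaks.push(start < end ? { min, max } : { min: 0, max: 0 });
  }
  return peaks;
}
```

A mirrored waveform can then be drawn by mapping each bucket's `min` and `max` to a vertical line around the canvas midline.
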
# Install dependencies (ffmpeg.wasm served locally)
bun install # or: npm install
# Start dev server (pick one)
bun server.bun.js # Bun
node server.js # Node.js
# Open http://localhost:3000
Note: The dev server sets `Cross-Origin-Opener-Policy: same-origin` and `Cross-Origin-Embedder-Policy: require-corp` headers, required for ffmpeg.wasm's SharedArrayBuffer support.
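
For readers serving the app from their own server, the two headers can be added with a small wrapper around any Node-style request handler. This is a sketch under assumptions: `withIsolationHeaders` and `serveStatic` are hypothetical names, and the bundled `server.js` may be structured differently.

```javascript
// The two headers that make a page cross-origin isolated.
const ISOLATION_HEADERS = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
};

// Wrap a (req, res) handler so every response carries both headers.
function withIsolationHeaders(handler) {
  return (req, res) => {
    for (const [name, value] of Object.entries(ISOLATION_HEADERS)) {
      res.setHeader(name, value);
    }
    return handler(req, res);
  };
}

// Usage with Node's built-in http module (serveStatic is a placeholder
// for whatever static-file handler is in use):
//   const http = require("node:http");
//   http.createServer(withIsolationHeaders(serveStatic)).listen(3000);
```

Once both headers are served, `self.crossOriginIsolated` evaluates to `true` in the page, which is a quick way to confirm the setup from the browser console.
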
| Command | Description |
|---|---|
| `bun server.bun.js` | Start Bun static server on port 3000 |
| `node server.js` | Start Node.js static server on port 3000 |
| `bun run start` | Alias for `node server.js` |
| `bun run start:bun` | Alias for `bun server.bun.js` |
| Requirement | Why |
|---|---|
| SharedArrayBuffer | ffmpeg.wasm multi-threaded mode |
| COOP/COEP headers | SharedArrayBuffer prerequisite |
| IndexedDB | Chunked file storage |
| WebAssembly | ffmpeg.wasm runtime |
| Canvas 2D | Waveform rendering |
Tested on: Chrome 120+, Firefox 120+, Edge 120+. Safari has limited SharedArrayBuffer support — ffmpeg falls back to single-threaded mode.
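
The fallback described above can be expressed as a small capability check. This is a minimal sketch, assuming the decision depends only on WebAssembly, SharedArrayBuffer, and cross-origin isolation; `pickFfmpegMode` is a hypothetical name, and the app's own detection logic may differ.

```javascript
// Decide which ffmpeg.wasm mode a given environment can support.
// env defaults to globalThis; passing an object makes the check testable.
function pickFfmpegMode(env = globalThis) {
  const hasWasm = typeof env.WebAssembly === "object" && env.WebAssembly !== null;
  if (!hasWasm) return "unsupported";
  const hasSharedMemory = typeof env.SharedArrayBuffer === "function";
  const isolated = env.crossOriginIsolated === true;
  return hasSharedMemory && isolated ? "multi-threaded" : "single-threaded";
}
```

Calling `pickFfmpegMode()` in a page served without COOP/COEP headers returns `"single-threaded"`, because `crossOriginIsolated` is `false` there.
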
| Document | Description |
|---|---|
| Architecture | Processing pipeline, data flow, memory strategy |
| Storage | IndexedDB chunking, quota management, Blob reassembly |
| Audio Extraction | ffmpeg.wasm loading, WORKERFS mount, codec strategies |
| Waveform | Analysis, peak extraction, canvas rendering, playback sync |
- Files > 2 GB require WORKERFS mount (automatic); extraction speed depends on browser I/O
- `Blob.arrayBuffer()` is limited to ~2 GB in most browsers; handled via WORKERFS fallback
- ffmpeg.wasm single-threaded mode is significantly slower (used when SharedArrayBuffer is unavailable)
- Audio analysis decodes to 16 kHz mono (8 kHz for files over 2 hours) — sufficient for smooth waveform visualization, not for high-fidelity analysis
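
The size and duration rules in the list above can be sketched as a small planning function. The 2 GB cutoff is taken from this list (the feature table mentions 1.5 GB, so treat the exact threshold as an assumption), and `planProcessing` and `analysisArgs` are hypothetical names. The ffmpeg flags used (`-vn`, `-ac`, `-ar`, `-f f32le`) are standard ffmpeg options, but the exact argument list the app passes is not confirmed by this document.

```javascript
const GB = 1024 ** 3;
const WORKERFS_THRESHOLD_BYTES = 2 * GB; // assumption: cutoff from the list above
const LONG_FILE_SECONDS = 2 * 60 * 60;   // "files over 2 hours"

// Choose the mount strategy and analysis sample rate for a file.
function planProcessing(fileSizeBytes, durationSeconds) {
  return {
    useWorkerFs: fileSizeBytes > WORKERFS_THRESHOLD_BYTES,
    sampleRate: durationSeconds > LONG_FILE_SECONDS ? 8000 : 16000,
  };
}

// Build ffmpeg arguments that decode audio to mono f32le PCM at sampleRate.
function analysisArgs(inputName, outputName, sampleRate) {
  return [
    "-i", inputName,
    "-vn",                     // drop the video stream
    "-ac", "1",                // downmix to mono
    "-ar", String(sampleRate), // resample to 16 kHz or 8 kHz
    "-f", "f32le",             // raw 32-bit float little-endian PCM
    outputName,
  ];
}
```

For example, `planProcessing(3 * GB, 600)` selects the WORKERFS path at 16 kHz, while a small three-hour recording stays in memory and is analyzed at 8 kHz.
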
The app is deployed at audio-waveform.roomler.live via Nginx + systemd. Deployment scripts: audio-waveform-deploy.
MIT
