rizukirr/audx

audx

A flexible audio processing tool for transcoding, noise reduction, and filtering. Features a modular processor-based pipeline for professional audio workflows.

Demo

Listen to the noise reduction in action:

[Audio samples: Before (Noisy) and After (Clean)]

Features

  • Noise Reduction: RNNoise integration with embedded model and Voice Activity Detection, plus support for custom models
  • Universal Format Support: Decode/encode MP3, AAC, Opus, FLAC, ALAC, WAV, and any FFmpeg-supported format
  • Audio Filtering: Tempo, volume, EQ, normalization, effects, and full FFmpeg filter chain support
  • Quality Control: Explicit bitrate control for all codecs
  • Modular Architecture: Extensible processor pipeline for custom audio processing workflows
  • Automatic Conversion: Sample rate and channel layout conversion handled transparently

Installation

Dependencies

| Component | Version | Description |
|-----------|---------|-------------|
| CMake | 4.0+ | Build system |
| GCC/Clang | C99+ | Compiler |
| FFmpeg | 8.0+ | Audio codec libraries |
| RNNoise | 0.2+ | Noise reduction library |

Quick Build

Debian/Ubuntu:

sudo apt install cmake build-essential libavcodec-dev libavformat-dev \
  libavutil-dev libavfilter-dev libswresample-dev libswscale-dev
./scripts/install-rnnoise.sh
./scripts/build.sh

Arch Linux:

sudo pacman -S cmake gcc ffmpeg
./scripts/install-rnnoise.sh
./scripts/build.sh

Output: build/bin/audx
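
Before running the build scripts, it can help to confirm that the basic tools are on the PATH. The sketch below uses a hypothetical helper, `missing_tools`, which is not part of audx. It prints each named command it cannot find. It checks only for executables, not for the library versions listed under Dependencies.

```shell
# Hypothetical preflight helper, not shipped with audx.
# Prints the name of every argument that is not an executable on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: report which build tools still need installing.
# missing_tools cmake cc
```

An empty result means every listed tool was found.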

Usage

audx <input> <output> [OPTIONS]

Options

| Option | Short | Description | Example |
|--------|-------|-------------|---------|
| --help | -h | Show help message | |
| --version | -v | Show version information | |
| --codec=<name> | -c | Output codec (auto-detects from extension if not specified) | --codec=libopus |
| --bitrate=<rate> | -b | Bitrate (e.g., 192k, 320k) | --bitrate=192k |
| --sample-rate=<hz> | -r | Output sample rate in Hz | --sample-rate=48000 |
| --channels=<n> | | Output channels (1=mono, 2=stereo, up to 8) | --channels=2 |
| --filter=<expr> | -f | FFmpeg filter expression | --filter="atempo=1.25,volume=0.5" |
| --denoise | -d | Enable RNNoise denoising (uses embedded model) | -d |
| --model=<path> | | Custom RNNoise model file (requires -d/--denoise) | --model=/path/to/model.bin |
| --denoise-vad-threshold=<n> | | Voice Activity Detection threshold (0.0-1.0, default: 0.5) | --denoise-vad-threshold=0.7 |
| --list-models | | List available RNNoise models | |

Supported Codecs

Lossy Codecs

| Codec | Name | Quality Bitrates |
|-------|------|------------------|
| MP3 | libmp3lame | 128k / 192k / 256k / 320k |
| AAC | aac | 96k / 160k / 256k / 320k |
| Opus | libopus | 96k / 128k / 192k / 256k |
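
To get a feel for what these bitrates mean in practice, the following sketch estimates the encoded size of a file from its bitrate and duration. The helper name `estimate_bytes` is hypothetical. It assumes the `k` suffix means 1000 bits per second (the usual FFmpeg convention), a constant bitrate, and no container overhead, so treat the result as a rough lower bound.

```shell
# Hypothetical helper: rough encoded size in bytes.
# Usage: estimate_bytes <bitrate, e.g. 192k> <duration in whole seconds>
estimate_bytes() {
  rate=$1
  seconds=$2
  case $rate in
    *k) bps=$(( ${rate%k} * 1000 )) ;;  # strip the suffix: 192k -> 192000 bit/s
    *)  bps=$rate ;;                    # already in bit/s
  esac
  echo $(( bps * seconds / 8 ))         # 8 bits per byte
}

estimate_bytes 192k 60    # one minute at 192k: prints 1440000
```

By the same arithmetic, three minutes at 320k comes to 7200000 bytes, roughly 7.2 MB.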

Lossless Codecs

| Codec | Name | Compression Levels |
|-------|------|--------------------|
| FLAC | flac | 5 / 8 / 10 / 12 |
| ALAC | alac | 5 / 8 / 10 / 12 |

Raw Audio

| Format | Name | Description |
|--------|------|-------------|
| WAV/PCM | pcm_s16le | 16-bit signed little-endian PCM |

Examples

Noise Reduction

Remove background noise from a voice recording:

audx noisy_audio.wav clean.mp3 -d --codec=libmp3lame --bitrate=256k

Denoise a podcast with the VAD threshold raised from the default 0.5 to 0.7:

audx podcast.wav output.mp3 -d --denoise-vad-threshold=0.7 \
  --codec=libmp3lame --bitrate=192k

Denoise with a custom model:

audx input.wav output.wav -d --model=/path/to/custom_model.bin

Denoise and normalize:

audx interview.wav clean.mp3 -d \
  --filter="highpass=f=80,loudnorm=I=-16:TP=-1.5" \
  --codec=libmp3lame --bitrate=256k

Format Conversion

MP3 to Opus:

audx input.mp3 output.opus --codec=libopus --bitrate=192k

FLAC to high-quality MP3:

audx input.flac output.mp3 --codec=libmp3lame --bitrate=320k

Convert to lossless FLAC:

audx input.mp3 output.flac --codec=flac

Audio Processing

Speed up audio by 25%:

audx input.mp3 output.mp3 --codec=libmp3lame --bitrate=256k --filter="atempo=1.25"

Reduce volume by 50%:

audx input.mp3 output.mp3 --codec=libmp3lame --bitrate=256k --filter="volume=0.5"

Chain multiple filters:

audx input.mp3 output.mp3 --codec=libmp3lame --bitrate=320k \
  --filter="atempo=1.25,volume=0.8,highpass=f=100"

Loudness normalization:

audx input.mp3 output.mp3 --codec=libmp3lame --bitrate=256k \
  --filter="loudnorm=I=-16:LRA=11:TP=-1.5"
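
The single-file examples above extend naturally to a whole directory. The sketch below is a dry run: the hypothetical function `audx_batch_plan` prints one audx command per .wav file instead of executing it, so the plan can be reviewed first. The codec and bitrate are copied from the noise reduction examples. The simple quoting assumes file names contain no double quotes.

```shell
# Hypothetical helper: print (do not run) one audx command per .wav file.
# Usage: audx_batch_plan <source dir> <output dir>
audx_batch_plan() {
  src_dir=$1
  out_dir=$2
  for in_file in "$src_dir"/*.wav; do
    [ -e "$in_file" ] || continue        # glob matched nothing
    base=$(basename "$in_file" .wav)     # recording.wav -> recording
    printf 'audx "%s" "%s/%s.mp3" -d --codec=libmp3lame --bitrate=192k\n' \
      "$in_file" "$out_dir" "$base"
  done
}

# Review the plan, then pipe it to a shell to execute it:
# audx_batch_plan ./raw ./clean
# audx_batch_plan ./raw ./clean | sh
```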

Common Filters

| Filter | Purpose | Example |
|--------|---------|---------|
| atempo | Change playback speed | atempo=1.5 (50% faster) |
| volume | Adjust volume | volume=0.5 (50% quieter) |
| loudnorm | Loudness normalization | loudnorm=I=-16:TP=-1.5:LRA=11 |
| highpass | High-pass filter (cut low freq) | highpass=f=200 (cut below 200Hz) |
| lowpass | Low-pass filter (cut high freq) | lowpass=f=3000 (cut above 3kHz) |
| aecho | Echo effect | aecho=0.8:0.88:60:0.4 |
| equalizer | Parametric EQ | equalizer=f=1000:width=200:g=-10 |
| aresample | Resample audio | aresample=48000 |

Full documentation: https://ffmpeg.org/ffmpeg-filters.html#Audio-Filters
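
The volume filter takes a linear amplitude factor, while audio levels are usually discussed in decibels. The conversion is dB = 20 * log10(factor). The sketch below applies it with awk; the helper name `gain_to_db` is hypothetical.

```shell
# Hypothetical helper: convert a linear gain factor to decibels.
# awk's log() is the natural logarithm, so divide by log(10) to get log10.
gain_to_db() {
  awk -v g="$1" 'BEGIN { printf "%.2f\n", 20 * log(g) / log(10) }'
}

gain_to_db 0.5    # halving the amplitude: prints -6.02
gain_to_db 2      # doubling the amplitude: prints 6.02
```

FFmpeg's volume filter also accepts a value with a dB suffix, so `volume=0.5` and `volume=-6.02dB` are approximately equivalent.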

Technical Notes

Opus Sample Rates

Opus supports only these sample rates: 48000, 24000, 16000, 12000, and 8000 Hz.

For any other sample rate, add a resampling filter:

audx input.mp3 output.opus --codec=libopus --bitrate=128k --filter="aresample=48000"
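
A script that handles inputs of varying sample rates could pick the resampling target automatically. The sketch below is one possible policy, not audx behavior: it chooses the smallest Opus-supported rate that is at least the input rate, and falls back to 48000 Hz for anything higher. The function name `opus_rate_for` is hypothetical.

```shell
# Hypothetical helper: choose an Opus-compatible sample rate.
# Policy (an assumption): round up to the next supported rate, cap at 48000.
opus_rate_for() {
  in_rate=$1
  for rate in 8000 12000 16000 24000 48000; do
    if [ "$in_rate" -le "$rate" ]; then
      echo "$rate"
      return 0
    fi
  done
  echo 48000
}

opus_rate_for 44100    # prints 48000
opus_rate_for 22050    # prints 24000

# Possible use with the documented --filter option:
# audx input.mp3 output.opus --codec=libopus \
#   --filter="aresample=$(opus_rate_for 44100)"
```

Rounding up avoids discarding high-frequency content, at the cost of a slightly larger stream than rounding down would give.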

RNNoise Processing

  • Default model: Uses embedded RNNoise model (no external files needed)
  • Custom models: Support for custom RNNoise models via --model flag
  • Internal processing: 48kHz mono (automatic resampling)
  • Frame size: 480 samples (handled automatically via buffering)
  • Multi-channel: Each channel processed independently
  • VAD threshold: 0.5 default; lower = more aggressive, higher = more conservative
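
At 48 kHz, a 480-sample frame covers 10 ms of audio, so input whose length is not a multiple of 480 leaves a partial final frame. The arithmetic below shows how many frames a given sample count needs and how many samples of padding would complete the last frame. Zero-padding the tail is an assumption made for illustration; the source does not say how audx handles the remainder. The helper name `rnnoise_frames` is hypothetical.

```shell
# Hypothetical helper: frames and tail padding for fixed 480-sample frames.
# Prints "<frame count> <padding samples>" for a per-channel sample count.
rnnoise_frames() {
  samples=$1
  frame=480
  frames=$(( (samples + frame - 1) / frame ))   # ceiling division
  pad=$(( frames * frame - samples ))           # samples needed to fill the last frame
  echo "$frames $pad"
}

rnnoise_frames 48000   # one second at 48 kHz: prints "100 0"
rnnoise_frames 1000    # prints "3 440"
```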

Default Behavior

Without --codec, audx outputs raw PCM data for backward compatibility.

Architecture

audx uses a modular processor-based pipeline architecture:

Input File → Decoder → [Denoiser] → [Filter] → Encoder → Output File
             AVFrame    AVFrame       AVFrame    AVFrame
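
The bracketed stages in the diagram are optional. As a rough illustration, the sketch below maps command-line options to the stages that would run. It assumes the Denoiser stage is enabled by -d/--denoise and the Filter stage by --filter, which is an inference from the options table rather than something confirmed by the source code. The function `pipeline_stages` is hypothetical.

```shell
# Hypothetical illustration: which pipeline stages a set of options implies.
# Stage order is fixed by the diagram, regardless of option order.
pipeline_stages() {
  use_denoise=0
  use_filter=0
  for arg in "$@"; do
    case $arg in
      -d|--denoise) use_denoise=1 ;;
      --filter=*)   use_filter=1 ;;
    esac
  done
  stages="Decoder"
  if [ "$use_denoise" -eq 1 ]; then stages="$stages -> Denoiser"; fi
  if [ "$use_filter" -eq 1 ]; then stages="$stages -> Filter"; fi
  echo "$stages -> Encoder"
}

pipeline_stages --filter="volume=0.5" -d
# prints: Decoder -> Denoiser -> Filter -> Encoder
```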

Processors

| Processor | Location | Purpose |
|-----------|----------|---------|
| Decoder | src/core/processors/decoder.c | FFmpeg decoding + format conversion |
| Denoiser | src/core/processors/denoiser.c | RNNoise with VAD support |
| Filter | src/core/processors/ffmpeg_filter.c | FFmpeg filtergraph integration |
| Encoder | src/core/processors/encoder.c | FFmpeg encoding with quality presets |

Key Features

  • Plugin Architecture: Easy to extend with custom processors
  • Format Negotiation: Automatic conversion between processor requirements
  • Frame Buffering: Handles fixed-size requirements (RNNoise 480 samples, codec frame sizes)
  • Pipeline Orchestration: Manages processor lifecycle and error handling

For developers: See REDESIGN.md for full architecture documentation and CLAUDE.md for development guidelines.

License

audx source code is licensed under the MIT License.

FFmpeg libraries used by audx are licensed under the LGPL 2.1 or later.

Important Legal Information

For Commercial Use: Some audio codecs (particularly AAC) may require patent licenses in certain jurisdictions, independent of copyright licensing. Consult with a lawyer before using audx in commercial products.

Complete licensing information, FFmpeg compliance details, and patent notices: See LEGAL_NOTICES.md

Quick Compliance Summary

  • Personal/Non-Commercial Use: Free to use with all codecs
  • Commercial Use: Verify patent requirements for your jurisdiction
  • Distribution: Must include LICENSE and LEGAL_NOTICES.md files
  • Modifications: Must maintain MIT license for audx, LGPL for FFmpeg libraries

For FFmpeg legal information: https://www.ffmpeg.org/legal.html
