Audio QC API

AI-powered audio processing: transcribe speech, remove noise, and trim audio automatically.

What You Need

Must have:

Linux computer (Ubuntu/Debian recommended)
Docker installed

Nice to have:

NVIDIA GPU (optional; processing is roughly 5-10x faster with one)

Setup

Step 1: Install Docker

sudo apt update
sudo apt install docker.io docker-compose
sudo usermod -aG docker $USER

Log out and back in after this step so the new docker group membership takes effect.
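
Before moving on, it can help to confirm that the install worked. The following is a minimal sketch, not part of this repository: `check_docker_setup` is a hypothetical helper name, and it only checks that the two commands are on the PATH and that the current session is in the docker group.

```shell
# check_docker_setup: report whether Docker, Compose, and docker-group
# membership look ready. Hypothetical helper, not shipped with the project.
check_docker_setup() {
  status=0
  for tool in docker docker-compose; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "$tool: found"
    else
      echo "$tool: missing"
      status=1
    fi
  done
  # id -nG lists the groups of the current session; "docker" shows up
  # here only after logging out and back in.
  case " $(id -nG) " in
    *" docker "*) echo "docker group: yes" ;;
    *) echo "docker group: no (log out and back in)"; status=1 ;;
  esac
  return "$status"
}
```

Run `check_docker_setup` in a new terminal; a non-zero exit status means at least one of the three checks failed.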

Step 2: GPU Support (Optional but Recommended)

sudo apt install nvidia-container-toolkit
sudo systemctl restart docker
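
To see whether containers can actually reach the GPU, something like the sketch below may be useful. `check_gpu` is a hypothetical helper name; it assumes the NVIDIA driver is already installed on the host and that Docker can pull the public `ubuntu` image.

```shell
# check_gpu: report whether a GPU is usable from inside a container.
# Hypothetical helper, not part of this repository.
check_gpu() {
  if ! command -v nvidia-smi >/dev/null 2>&1; then
    echo "no NVIDIA driver found; the API will run on CPU"
    return 1
  fi
  # Runs nvidia-smi in a throwaway container; this succeeds only if
  # Docker has been configured to expose the GPU.
  if docker run --rm --gpus all ubuntu nvidia-smi >/dev/null 2>&1; then
    echo "GPU is visible inside containers"
  else
    echo "GPU found on host but not inside containers"
    return 1
  fi
}
```

If the second check fails, NVIDIA's install guide also has you run `sudo nvidia-ctk runtime configure --runtime=docker` before restarting Docker; whether that is needed here depends on your system.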

Step 3: Run the API

git clone <your-repo-url>
cd audio-qc-api   # the directory git clone created; adjust if your repo has a different name
mkdir -p models workspace temp
docker-compose up --build

That's it! The API will be running at http://localhost:8000
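
The first start can take a while, so a script that talks to the API may want to wait until it answers. A minimal sketch, assuming the docs page at http://localhost:8000/docs is a reasonable readiness signal; `wait_for_api` is a hypothetical helper name.

```shell
# wait_for_api URL [ATTEMPTS]: poll URL until it answers or the attempts
# run out. Each attempt takes up to about 3 seconds.
wait_for_api() {
  url="$1"
  remaining="${2:-200}"
  while [ "$remaining" -gt 0 ]; do
    # -f: treat HTTP errors as failure; -s: silent;
    # --max-time: cap each attempt at 2 seconds.
    if curl -fs --max-time 2 -o /dev/null "$url"; then
      echo "API is up at $url"
      return 0
    fi
    sleep 1
    remaining=$((remaining - 1))
  done
  echo "API did not respond at $url" >&2
  return 1
}
```

For example, `wait_for_api http://localhost:8000/docs` in a second terminal returns once the server is reachable.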

What It Does

Transcribe - Converts speech to text with timestamps
Denoise - Removes background noise from audio
Clean - Transcribes, denoises, and trims the audio down to just the speech
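
To check which endpoints the running server exposes for these operations, one option is to read its OpenAPI schema. This sketch assumes the server publishes the schema at /openapi.json, which is a common default for servers that provide a /docs page but is not confirmed for this project. `list_paths` is a hypothetical helper that does a rough text match rather than real JSON parsing.

```shell
# list_paths: read an OpenAPI JSON document on stdin and print each
# route path once. Rough text match: any JSON key that starts with "/".
list_paths() {
  grep -o '"/[^"]*"[[:space:]]*:' \
    | sed 's/^"//; s/"[[:space:]]*:$//' \
    | sort -u
}
```

Usage: `curl -s http://localhost:8000/openapi.json | list_paths`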

How to Use

Go to http://localhost:8000/docs in your browser for an easy web interface.

Or use these commands:

# Just transcribe
curl -X POST "http://localhost:8000/transcribe/" \
  -F "file=@your-audio.wav" \
  -F "expected_text=what you think it says"

# Clean everything (recommended)
curl -X POST "http://localhost:8000/clean/" \
  -F "file=@your-audio.wav" \
  -F "expected_text=what you think it says" \
  -o cleaned-audio.wav
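
To clean several files in one go, the command above can be wrapped in a loop. A minimal sketch, assuming every file in the folder shares the same expected text and that the /clean/ response is a WAV file as in the example above; `cleaned_name` and `clean_all` are hypothetical helper names.

```shell
# cleaned_name FILE: output path for a cleaned file,
# e.g. clips/talk.wav -> clips/talk.cleaned.wav
cleaned_name() {
  echo "${1%.*}.cleaned.wav"
}

# clean_all DIR TEXT: send every .wav in DIR to /clean/ with the same
# expected text and save each result next to its input.
clean_all() {
  for f in "$1"/*.wav; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    # --form-string sends TEXT literally, even if it starts with @ or <.
    curl -fsS -X POST "http://localhost:8000/clean/" \
      -F "file=@$f" \
      --form-string "expected_text=$2" \
      -o "$(cleaned_name "$f")" || echo "failed: $f" >&2
  done
}
```

For example, `clean_all ./clips "what you think it says"` writes one `.cleaned.wav` file per input.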

Important Notes

The first run is slow because it downloads about 2 GB of AI models
Works with WAV, MP3, and FLAC files
A GPU makes processing 5-10x faster
Without a GPU everything still works, just more slowly
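
Since only WAV, MP3, and FLAC are listed as supported, a script can filter files before uploading them. A small sketch: `is_supported_audio` is a hypothetical helper that looks at the file extension only, not at the file contents.

```shell
# is_supported_audio FILE: succeed if FILE ends in .wav, .mp3, or .flac
# (case-insensitive). Checks the name only.
is_supported_audio() {
  case "$1" in
    *.*) ;;            # has an extension, keep going
    *) return 1 ;;     # no extension at all
  esac
  # Lowercase the extension so TALK.WAV and talk.wav are treated alike.
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    wav|mp3|flac) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, `is_supported_audio talk.mp3 && echo ok` prints `ok`, while a `.txt` file is rejected.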
