ad-segment-trimmer

A self-hosted REST API (with a frontend) that removes ads from audio/video files using OpenAI's Whisper and an LLM.

How does it work?

A transcript is produced through the Fireworks AI API running Whisper (specifically Whisper-v3-large-turbo), an open-source ASR model, which returns the full transcription along with word-level timestamps. The full transcription is then sent to an LLM (Gemini 2.0 Flash), which extracts the advertisement segments. The start_time and end_time of each segment are used to build an FFmpeg command that removes those segments from the original audio file, and the cleaned audio file is returned to the user.
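
The final step, turning ad segments into an FFmpeg command, can be sketched as follows. This is a minimal illustration and not the project's actual code: the helper names keep_intervals and build_ffmpeg_command are hypothetical, and the use of FFmpeg's aselect and asetpts audio filters is an assumed approach, since this README does not say which FFmpeg options the backend uses.

```python
from typing import List, Tuple

Segment = Tuple[float, float]  # (start_time, end_time) in seconds


def keep_intervals(ads: List[Segment], duration: float) -> List[Segment]:
    """Return the parts of [0, duration] that are not covered by any ad segment."""
    kept: List[Segment] = []
    cursor = 0.0
    for start, end in sorted(ads):
        # Clamp each ad segment to the bounds of the file.
        start = min(max(start, 0.0), duration)
        end = min(max(end, 0.0), duration)
        if start > cursor:
            kept.append((cursor, start))
        cursor = max(cursor, end)  # also handles overlapping ad segments
    if cursor < duration:
        kept.append((cursor, duration))
    return kept


def build_ffmpeg_command(src: str, dst: str, kept: List[Segment]) -> List[str]:
    """Build an argv list that keeps only the given intervals of the audio."""
    if not kept:
        raise ValueError("nothing left to keep after removing ad segments")
    # Assumed approach: select samples inside the kept intervals, then
    # regenerate timestamps so the output plays without gaps.
    expr = "+".join(f"between(t,{a:.3f},{b:.3f})" for a, b in kept)
    audio_filter = f"aselect='{expr}',asetpts=N/SR/TB"
    return ["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst]
```

The returned list can be passed to subprocess.run without a shell. The sketch covers audio only; a video file would also need the matching video filters.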

How much does it cost?

Whisper is billed at $0.0009 per audio minute (billed per second), and Gemini 2.0 Flash at $0.40 per million output tokens ($0.0000004 per token). For an hour-long podcast, the whole process costs around 0.11 USD.
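
The two quoted rates can be combined into a rough per-file estimate. The sketch below is an illustration only: estimate_cost is a hypothetical helper, the output token count is something you would have to measure yourself, and input-token charges are left out because this README does not quote a rate for them. It does not attempt to reproduce the 0.11 USD figure above.

```python
WHISPER_USD_PER_MINUTE = 0.0009                   # rate quoted above
GEMINI_USD_PER_OUTPUT_TOKEN = 0.40 / 1_000_000    # rate quoted above


def estimate_cost(audio_seconds: float, output_tokens: int) -> dict:
    """Estimate the cost in USD from the two rates quoted in this README."""
    # Whisper is billed per second, so fractional minutes are charged pro rata.
    transcription = audio_seconds / 60 * WHISPER_USD_PER_MINUTE
    llm_output = output_tokens * GEMINI_USD_PER_OUTPUT_TOKEN
    return {
        "transcription_usd": transcription,
        "llm_output_usd": llm_output,
        "total_usd": transcription + llm_output,
    }
```

For example, estimate_cost(3600, 2000) prices one hour of audio with an assumed 2,000 output tokens; the token count here is a placeholder, not a measured value.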

Usage with Docker (recommended)

Prerequisites

Clone the repository and navigate to the directory:

git clone https://github.com/nocdn/ad-segment-trimmer.git
cd ad-segment-trimmer/

Copy the .env.example file to .env:

cp .env.example .env

Make sure you have a Gemini API key, set as an environment variable called GEMINI_API_KEY in the .env file.
Make sure you have a Fireworks AI API key, set as an environment variable called FIREWORKS_API_KEY in the .env file.
Set any rate limits you want in the .env file (optional).
Build and run the Docker image:

docker compose up -d --build

(the -d flag runs the container in detached mode, and the --build flag rebuilds the image if there are any changes)

The frontend should now be running on port 6030, and the API on port 7070.

To access the API, you can use the following curl command:

curl -F "file=@audio.mp3" -OJ http://localhost:7070/process

(replace audio.mp3 with the path to your audio file; the -O and -J flags make curl save the response under the filename returned by the server, which has the _edited suffix)
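
The same request can be made from Python using only the standard library. This is a sketch under assumptions: the helper names are hypothetical, the server is assumed to send the output filename in a Content-Disposition header (which is what curl's -J flag relies on), and the fallback filename is a guess for the case where that header is missing.

```python
import uuid
from email.message import Message
from pathlib import Path
from typing import Optional, Tuple
from urllib.request import Request, urlopen

API_URL = "http://localhost:7070/process"  # endpoint from the curl example


def build_multipart(field: str, filename: str, payload: bytes) -> Tuple[bytes, str]:
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def filename_from_header(content_disposition: Optional[str], fallback: str) -> str:
    """Extract the filename parameter from a Content-Disposition header."""
    if not content_disposition:
        return fallback
    msg = Message()
    msg["Content-Disposition"] = content_disposition
    return msg.get_filename() or fallback


def trim_ads(path: str) -> Path:
    """Upload a file to the API and save the cleaned result in the current directory."""
    src = Path(path)
    body, content_type = build_multipart("file", src.name, src.read_bytes())
    request = Request(API_URL, data=body, headers={"Content-Type": content_type})
    with urlopen(request) as response:
        fallback = f"{src.stem}_edited{src.suffix}"  # assumed naming if no header
        name = filename_from_header(response.headers.get("Content-Disposition"), fallback)
        out = Path(Path(name).name)  # drop any directory components
        out.write_bytes(response.read())
    return out
```

With the containers running, trim_ads("audio.mp3") would upload the file and return the path of the saved result.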

Installation for local development

Prerequisites

Python 3.10+

Clone the repository and navigate to the directory:

git clone https://github.com/nocdn/ad-segment-trimmer.git
cd ad-segment-trimmer/

Fill out the .env file by copying the .env.example file:

cp .env.example .env

Make sure you have a Gemini API key, set as an environment variable called GEMINI_API_KEY in the .env file.
Make sure you have a Fireworks AI API key, as an environment variable called FIREWORKS_API_KEY in the .env file.
Set any rate limits you want in the .env file (optional).
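
Before starting the backend, it can help to confirm that both keys are actually visible to the process. The check below is a hypothetical helper, not part of the project, and it assumes the variables from .env have already been loaded into the environment (copying the file does not do that by itself).

```python
import os
from typing import List, Mapping

# Variable names taken from the prerequisites above.
REQUIRED_KEYS = ("GEMINI_API_KEY", "FIREWORKS_API_KEY")


def missing_keys(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variables that are unset or blank in env."""
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling missing_keys() with no arguments checks the current process environment; an empty list means both keys are set.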

backend

Install the dependencies:

cd backend
pip install -r requirements.txt

Run the backend:

python app.py

frontend

From the repository root, install the dependencies:

cd frontend
npm install

Run the frontend:

npm run dev

License

This project is licensed under the MIT License - see the LICENSE file for details.
