Blueprint for training and deploying a machine learning model that effectively detects synthetic and modified audio content.

mozilla-ai/fake-audio-detection

Lightweight Machine Learning Method for Audio Forgery Detection

This blueprint guides you through training and deploying a machine learning model that effectively detects synthetic and modified audio content.

The primary goal of this model is to offer a lightweight alternative to deep learning approaches: it is easier to train and deploy while still delivering strong detection results. This makes audio forgery detection more accessible for applications with limited computational resources.

Warning: Despite its advantages, this model has inherent limitations and may not detect all types of audio manipulations.

Quick-start

Try out our demo on Hugging Face Spaces.

Try out the demo locally

Install dependencies with pip:

pip install .

Run the demo using the run.sh script:

# run this in the fake_audio_detection root directory
./demo/run.sh

How It Works

This demo uses an SVM model trained on the FOR_rerec and FOR_2sec datasets. You can retrieve these datasets from UncovAI's Hugging Face page to train your own model. For a detailed walkthrough, see the Step-by-step guide.

Feature Extraction

In the fake-audio-detection/ folder, you'll find extract_features.py, which contains functions for extracting features such as MFCC, IMFCC, and spectral information from raw audio files and from whole datasets. You can add further features to improve model performance.
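To make the idea of cepstral and spectral features concrete, here is a minimal NumPy-only sketch. It is not the code in extract_features.py: every function name below is hypothetical, the parameter values (26 filters, 13 coefficients, 512-sample frames) are arbitrary choices, and IMFCC is approximated by flipping the mel filterbank, which is one common construction but possibly not the one this project uses.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular filters spaced evenly on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):     # falling edge of the triangle
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

def frame_signal(x, frame_len, hop):
    """Split x into overlapping Hamming-windowed frames (requires len(x) >= frame_len)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def cepstral_features(x, sr, n_filters=26, n_ceps=13, n_fft=512, hop=256, inverted=False):
    """MFCC-style coefficients per frame; inverted=True flips the filterbank (IMFCC-style)."""
    frames = frame_signal(x, n_fft, hop)
    power = np.abs(np.fft.rfft(frames, n_fft, axis=1)) ** 2 / n_fft
    fb = mel_filterbank(n_filters, n_fft, sr)
    if inverted:
        # Assumption: mirror the filterbank so resolution is finest at high frequencies.
        fb = fb[::-1, ::-1]
    log_energy = np.log(power @ fb.T + 1e-10)
    # Unnormalized DCT-II basis, keeping the first n_ceps coefficients.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
    return log_energy @ basis.T            # shape (n_frames, n_ceps)

def spectral_centroid(x, sr, n_fft=512, hop=256):
    """Magnitude-weighted mean frequency of each frame, in Hz."""
    frames = frame_signal(x, n_fft, hop)
    mag = np.abs(np.fft.rfft(frames, n_fft, axis=1))
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    return (mag @ freqs) / (mag.sum(axis=1) + 1e-10)

def clip_features(x, sr):
    """Summarize a clip as the per-feature mean and standard deviation over frames."""
    mfcc = cepstral_features(x, sr)
    imfcc = cepstral_features(x, sr, inverted=True)
    centroid = spectral_centroid(x, sr)[:, None]
    per_frame = np.hstack([mfcc, imfcc, centroid])   # (n_frames, 27)
    return np.concatenate([per_frame.mean(axis=0), per_frame.std(axis=0)])

# Synthetic 2-second tone standing in for a real audio clip.
sr = 16000
t = np.arange(2 * sr) / sr
low_tone = np.sin(2 * np.pi * 440.0 * t)
features = clip_features(low_tone, sr)
print(features.shape)  # (54,)
```

Averaging over frames gives one fixed-length vector per clip, which is the shape a classical classifier such as an SVM expects. In practice an audio library would usually replace the hand-written filterbank and DCT.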

Training

model.py contains the basic code for training a model and making predictions with it. You can modify it to tune performance for your use case.
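As an illustration of what training and predicting with an SVM can look like, here is a minimal sketch using scikit-learn. It does not reproduce model.py: the use of scikit-learn, the RBF kernel, the hyperparameters, and the random placeholder data (two Gaussian clusters standing in for real and fake feature vectors) are all assumptions made for the example.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Placeholder data: in a real run, each row would be the feature vector of one audio clip.
n_per_class, n_features = 200, 20
X_real = rng.normal(0.0, 1.0, size=(n_per_class, n_features))
X_fake = rng.normal(1.0, 1.0, size=(n_per_class, n_features))
X = np.vstack([X_real, X_fake])
y = np.array([0] * n_per_class + [1] * n_per_class)   # 0 = real, 1 = fake

# Hold out a stratified test split so both classes appear in training and evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# SVMs are sensitive to feature scale, so standardize before fitting.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0, gamma="scale"))
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
predictions = model.predict(X_test[:5])
print(f"held-out accuracy: {accuracy:.3f}")
print("first predictions:", predictions)
```

Keeping the scaler and the classifier in one pipeline means the same scaling learned on the training data is applied automatically at prediction time. The kernel, C, and gamma values are natural things to vary when tuning for a specific use case.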

Results

For a detailed overview of the datasets and the results of this method, check out the Results section.

License

This project is licensed under the Apache 2.0 License. See the LICENSE file for details.

Contributing

Contributions are welcome! To get started, you can check out the CONTRIBUTING.md file.
