This blueprint guides you through training and deploying a machine learning model that effectively detects synthetic and modified audio content.
The primary objective of this model is to provide a lightweight alternative to deep learning approaches: it is easier to train and deploy while still delivering strong detection results, making audio forgery detection more accessible for applications with limited computational resources.
Warning: Despite its advantages, this model has inherent limitations and may not detect all types of audio manipulations.
Try out our demo on HF Spaces:
Install dependencies with pip:

```bash
pip install .
```
Run the demo using the `run.sh` script:

```bash
# run this in the fake_audio_detection root directory
./demo/run.sh
```
This demo uses an SVM model trained on the FOR_rerec and FOR_2sec datasets. You can retrieve these datasets from UncovAI's HuggingFace page to train your own model. For a detailed guide, please check out the Step-by-step guide.
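As a rough sketch, the datasets can be pulled with the `datasets` library. The repository IDs below are assumptions for illustration; verify the exact names on UncovAI's HuggingFace page:

```python
# Hypothetical sketch: the dataset repo IDs under the UncovAI
# organization are assumptions -- check the Hub for the real ones.
from datasets import load_dataset

for_rerec = load_dataset("UncovAI/FOR_rerec")  # placeholder repo ID
for_2sec = load_dataset("UncovAI/FOR_2sec")    # placeholder repo ID

print(for_rerec)
print(for_2sec)
```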
In the `fake-audio-detection/` folder, you'll find `extract_features.py`, which contains functions to extract features such as MFCC, IMFCC, and spectral information from raw audio and datasets. There are many features you can add to improve model performance!
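To give a feel for the approach, here is a minimal feature-extractor sketch built on `librosa`. It illustrates the general technique rather than the repository's actual code; the function name and feature set are assumptions (IMFCC is omitted, since `librosa` has no built-in for it):

```python
# Illustrative sketch, not the actual extract_features.py implementation.
import numpy as np
import librosa

def extract_features(path: str, sr: int = 16000, n_mfcc: int = 20) -> np.ndarray:
    """Return a fixed-length feature vector for one audio clip."""
    audio, sr = librosa.load(path, sr=sr)

    # MFCCs, summarized by their mean and standard deviation over time
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=n_mfcc)

    # A couple of simple spectral descriptors
    centroid = librosa.feature.spectral_centroid(y=audio, sr=sr)
    rolloff = librosa.feature.spectral_rolloff(y=audio, sr=sr)

    return np.concatenate([
        mfcc.mean(axis=1), mfcc.std(axis=1),
        centroid.mean(axis=1), rolloff.mean(axis=1),
    ])
```

Summarizing each feature over time with simple statistics is what keeps the vector fixed-length, so clips of any duration map to the same input size for the classifier.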
`model.py` contains the basics for training and making predictions with your model. You can modify it to find the best performance for your use case.
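The training loop itself can be quite small. The sketch below is a generic scikit-learn SVM pipeline, not necessarily what `model.py` does; it assumes a feature matrix `X` and binary labels `y`, filled here with random placeholders so the snippet runs standalone:

```python
# Hedged sketch of an SVM train/evaluate loop; see model.py for the
# repository's actual implementation.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# Placeholder data -- replace with vectors from extract_features.py
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 42))    # 200 clips, 42-dim feature vectors
y = rng.integers(0, 2, size=200)  # 0 = real, 1 = fake

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Feature scaling matters for SVMs; an RBF kernel is a common default
model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
model.fit(X_train, y_train)

print(classification_report(y_test, model.predict(X_test)))
```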
For a detailed overview of the datasets and the results of this method, check out the Results section.
This project is licensed under the Apache 2.0 License. See the LICENSE file for details.
Contributions are welcome! To get started, you can check out the CONTRIBUTING.md file.