This repository contains different CNN methods for audio classification. It starts with canceling noise from audio. Then it converts the audio into a mel-spectrogram and trains with CNN models.

awal-ahmed/AudioViT

AudioViT

Prerequisite

  • Anaconda

Code with default settings

Download the audio from here.
Extract the dataset and move the DangerDetection folder into the AudioViT folder.

To run all scripts with the default settings, use the following commands:

conda env create -f audioViT_env.yml
conda activate audioViT_env
python noise_cancel.py 
python train_audio.py
python test_audio.py

Code with custom settings

Prepare your own dataset and move it into the AudioViT folder. If you need to crop all your audio files to the same length, follow the instructions in this repository.

Managing virtual environment

Create a conda environment:

conda env create -f audioViT_env.yml

Activate the environment:

conda activate audioViT_env

Reduce noise

To run with the default values:

python noise_cancel.py 

To change the noise reduction models:

python noise_cancel.py --noise_reducer=mfcc_up
or
python noise_cancel.py -nr=mfcc_up
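
To give a feel for what one of these reducers might do, here is a minimal pure-Python sketch of a sliding median filter, the general technique the option name median suggests. It is an illustration only: the function name median_filter and the kernel_size parameter are assumptions, and the implementation in noise_cancel.py may differ.

```python
from statistics import median


def median_filter(signal, kernel_size=3):
    """Replace each sample with the median of its neighborhood.

    Hypothetical helper for illustration; not taken from noise_cancel.py.
    Samples beyond the edges are clamped to the first/last sample.
    """
    if kernel_size < 1 or kernel_size % 2 == 0:
        raise ValueError("kernel_size must be a positive odd integer")
    half = kernel_size // 2
    last = len(signal) - 1
    filtered = []
    for i in range(len(signal)):
        # Collect the window around sample i, clamping indices at the edges.
        window = [signal[min(max(j, 0), last)] for j in range(i - half, i + half + 1)]
        filtered.append(median(window))
    return filtered


# A quiet signal with two impulsive spikes (5.0 and -4.0).
noisy = [0.0, 0.1, 5.0, 0.2, 0.1, -4.0, 0.0]
print(median_filter(noisy))  # -> [0.0, 0.1, 0.2, 0.2, 0.1, 0.0, 0.0]
```

The two spikes are suppressed while the surrounding low-level samples are kept, which is why median filtering is a common choice for impulsive noise.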

Options to customize noise reduction

  • --src_root: Path to the root of the noisy dataset.
         For example, the dataset may have a folder structure like this:
         |---DangerDetection
         |  |---test
         |  |  |---Child
         |  |  |---Normal
         |  |  |---Women
         |  |---train
         |  |  |---Child
         |  |  |---Normal
         |  |  |---Women
       Example: --src_root=./DangerDetection
  • --dst_root: Path to the root folder where the cleaned dataset will be saved.
         The script creates a folder structure like this:
         |---CleanData
         |  |---test
         |  |  |---Child
         |  |  |---Normal
         |  |  |---Women
         |  |---train
         |  |  |---Child
         |  |  |---Normal
         |  |  |---Women
       Example: --dst_root=CleanData
  • --noise_reducer, -nr: Name of the noise reduction method to use.
         Options: butter, noise_reduce, deNoise, power, centroid_s, centroid_mb, mfcc_up, mfcc_down, median
       Example: --noise_reducer=median or -nr=median
  • --sr: Sampling rate (in Hz) to which the audio is resampled.
       Example: --sr=40000
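
The relationship between --src_root and --dst_root described above can be sketched as a path-mirroring step. This is a hedged illustration, not the code in noise_cancel.py: the helper names plan_outputs and make_output_dirs are hypothetical, and the assumption that the dataset consists of .wav files is ours.

```python
import tempfile
from pathlib import Path

# Assumption: the dataset consists of .wav files; adjust if yours differs.
AUDIO_SUFFIXES = {".wav"}


def plan_outputs(src_root, dst_root):
    """Pair every audio file under src_root with a mirrored path under dst_root."""
    src_root, dst_root = Path(src_root), Path(dst_root)
    pairs = []
    for src in sorted(src_root.rglob("*")):
        if src.is_file() and src.suffix.lower() in AUDIO_SUFFIXES:
            # Keep the split/class sub-folders (e.g. train/Child) unchanged.
            pairs.append((src, dst_root / src.relative_to(src_root)))
    return pairs


def make_output_dirs(pairs):
    """Create the destination folders so cleaned files can be written later."""
    for _, dst in pairs:
        dst.parent.mkdir(parents=True, exist_ok=True)


# Small demo on a throwaway folder that imitates the layout shown above.
with tempfile.TemporaryDirectory() as tmp:
    demo_src = Path(tmp) / "DangerDetection"
    (demo_src / "train" / "Child").mkdir(parents=True)
    (demo_src / "train" / "Child" / "a.wav").touch()
    demo_pairs = plan_outputs(demo_src, Path(tmp) / "CleanData")
    make_output_dirs(demo_pairs)
    for src, dst in demo_pairs:
        print(src.relative_to(tmp), "->", dst.relative_to(tmp))
```

Mirroring the relative path keeps the test/train split and the class folders intact, so the cleaned dataset can be passed directly to the training and testing scripts.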

Train audio signal

To train with the default parameters:

python train_audio.py

To change the model type:

python train_audio.py --model_type=audiovit

Options to customize audio training

  • --model_type: Name of the model to use for training.
         Options: conv1d, mobilenetv2, inceptionv3, xception, dencenet, resnet50, resnet101, lstm, audiovit, conv2d, vgg19
       Example: --model_type=audiovit
  • --training_root: Path to the root of the training folder.
       Example: --training_root=./CleanData/train
  • --batch_size: Batch size for training.
       Example: --batch_size=4
  • --delta_time: Length, in seconds, of each audio clip used for training.
       Example: --delta_time=1.0
  • --sr: Sampling rate (in Hz) that you used to resample the audio during noise reduction.
       Example: --sr=40000
  • --noise_reduce: Name of the noise reducer that you used during noise reduction.
         Options: butter, noise_reduce, deNoise, power, centroid_s, centroid_mb, mfcc_up, mfcc_down, median
       Example: --noise_reduce=median
  • --old: Folder where the trained models will be saved.
       Example: --old=./testoutput
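
The --sr and --delta_time options together determine how many samples each training clip contains. The sketch below shows that arithmetic and one plausible way to cut a signal into fixed-length clips. The function names are hypothetical, and dropping the trailing partial clip is an assumption; train_audio.py may handle clip boundaries differently.

```python
def samples_per_clip(sr, delta_time):
    """Number of samples in one clip: sampling rate (Hz) times clip length (s)."""
    return int(round(sr * delta_time))


def split_into_clips(signal, sr, delta_time):
    """Cut a 1-D signal into consecutive, non-overlapping clips.

    Assumption: a trailing remainder shorter than one clip is dropped.
    """
    step = samples_per_clip(sr, delta_time)
    if step <= 0:
        raise ValueError("sr * delta_time must be at least one sample")
    return [signal[i:i + step] for i in range(0, len(signal) - step + 1, step)]


# With the example values from the options above, each clip has 40000 samples.
print(samples_per_clip(40000, 1.0))  # -> 40000

# Tiny demonstration: 10 samples at 4 Hz, 1-second clips.
print(split_into_clips(list(range(10)), sr=4, delta_time=1.0))
# -> [[0, 1, 2, 3], [4, 5, 6, 7]]
```

This is also why --sr should match the value used during noise reduction: a mismatched rate changes the real duration that each fixed-size clip covers.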

Test audio signal

To test with the default parameters:

python test_audio.py

To change the model type:

python test_audio.py --model_type=audiovit

Options to customize audio testing

  • --model_type: Name of the model to test your data with.
         Options: conv1d, mobilenetv2, inceptionv3, xception, dencenet, resnet50, resnet101, lstm, audiovit, conv2d, vgg19
       Example: --model_type=audiovit
  • --test_dir: Path to the root of the testing folder.
       Example: --test_dir=./CleanData/test
  • --noise_reduce: Name of the noise reducer that you used during noise reduction.
         Options: butter, noise_reduce, deNoise, power, centroid_s, centroid_mb, mfcc_up, mfcc_down, median
       Example: --noise_reduce=median
  • --old: Folder where your trained models are saved.
       Example: --old=./testoutput
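
Because the three scripts share several values (sampling rate, noise reducer, model type, model folder), it can help to build all three commands from one set of settings. The following is a minimal sketch that uses only the options documented above. The helper build_pipeline_commands is hypothetical and not part of the repository, and the assumption that train and test live directly under the cleaned dataset root follows the folder structure shown earlier.

```python
import shlex


def build_pipeline_commands(src_root, dst_root, noise_reducer, sr, model_type, model_dir):
    """Return the noise-reduction, training, and testing commands as argv lists.

    Hypothetical helper: it only assembles the documented flags so that the
    shared values stay consistent across the three steps.
    """
    return [
        ["python", "noise_cancel.py",
         f"--src_root={src_root}", f"--dst_root={dst_root}",
         f"--noise_reducer={noise_reducer}", f"--sr={sr}"],
        ["python", "train_audio.py",
         f"--model_type={model_type}", f"--training_root={dst_root}/train",
         f"--sr={sr}", f"--noise_reduce={noise_reducer}", f"--old={model_dir}"],
        ["python", "test_audio.py",
         f"--model_type={model_type}", f"--test_dir={dst_root}/test",
         f"--noise_reduce={noise_reducer}", f"--old={model_dir}"],
    ]


commands = build_pipeline_commands(
    src_root="./DangerDetection",
    dst_root="./CleanData",
    noise_reducer="median",
    sr=40000,
    model_type="audiovit",
    model_dir="./testoutput",
)
for cmd in commands:
    # Print the commands; nothing is executed here.
    print(shlex.join(cmd))
```

To actually run the steps, each list can be passed to subprocess.run(cmd, check=True) from inside the AudioViT folder with the conda environment activated.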
