A Python project developed as a university assignment for the Signal Processing and Voice Recognition course. The goal was to build an ASR (automatic speech recognition) system that predicts spoken digits from a voice signal using a neural network. The dataset used for this assignment is AudioMNIST.
The steps of the algorithm are:
- Train a simple feed-forward neural network using only Mel spectrograms as features.
- Separate the foreground from the background using the REPET (REpeating Pattern Extraction Technique) algorithm.
- Extract the individual digits from the foreground signal using a sliding-window technique.
- Finally, feed these digit segments to the model and make predictions.
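
The sliding-window step can be illustrated with a minimal sketch. This is not the project's actual implementation: the function names (`frame_energies`, `find_segments`), the energy-threshold criterion, and all parameter values are assumptions chosen for illustration. The idea is to slide a fixed-size window over the signal, mark windows whose mean energy exceeds a threshold, and merge consecutive marked windows into one candidate digit segment.

```python
import math


def frame_energies(signal, win, hop):
    """Mean squared amplitude of each window of `win` samples, stepped by `hop`."""
    energies = []
    for start in range(0, len(signal) - win + 1, hop):
        frame = signal[start:start + win]
        energies.append(sum(x * x for x in frame) / win)
    return energies


def find_segments(signal, win=400, hop=200, threshold=0.01, min_frames=2):
    """Return (start_sample, end_sample) pairs for runs of high-energy windows.

    All defaults are illustrative assumptions, not values from the project.
    """
    segments = []
    run_start = None
    # A trailing -inf sentinel closes a run that lasts until the final window.
    energies = frame_energies(signal, win, hop) + [float("-inf")]
    for i, energy in enumerate(energies):
        if energy >= threshold and run_start is None:
            run_start = i  # first active window of a new run
        elif energy < threshold and run_start is not None:
            if i - run_start >= min_frames:  # drop very short bursts
                segments.append((run_start * hop, (i - 1) * hop + win))
            run_start = None
    return segments


# Synthetic stand-in for a foreground signal: two tone bursts between silences.
def tone(n_samples, offset, freq=440.0, sr=8000, amp=0.5):
    return [amp * math.sin(2 * math.pi * freq * (offset + n) / sr)
            for n in range(n_samples)]


signal = ([0.0] * 1000 + tone(2000, 1000) + [0.0] * 1000
          + tone(2000, 4000) + [0.0] * 1000)
segments = find_segments(signal)
print(segments)
```

Each returned pair marks a slice of the signal that could be converted to a Mel spectrogram and passed to the trained network. With overlapping windows, the segment boundaries extend slightly into the surrounding silence, which leaves a small margin around each detected digit.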
To run this project:
- Install the libraries listed in requirement.txt and download the AudioMNIST dataset.
- Run Dataset.py first, then run Network.py.
- Finally, run prediction.py and enter the file path when prompted.