The C# GUI feature extractor is in AudioFeaturesGUI; the PyTorch model code is in pytorchML.
Assignment chosen - Option 1: implement an audio classification system that involves audio signal
processing and machine learning. It may require some research and system integration
(with existing open-source machine learning programs).
This project uses a PyTorch machine learning model to differentiate speech and music audio files.
Format for test data - #, f1, f2, ..., fn, label
where # is the file name, f1, ..., fn are the n features extracted from that file, and
label has the value “yes” if the file is a music clip and “no” if it is a speech file.
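For illustration, two rows in this format might look like the following (the file names and feature values here are hypothetical, not drawn from our dataset):

```
music01.wav, 0.12, 0.034, 1450.2, 2100.7, yes
speech01.wav, 0.05, 0.008, 980.5, 1410.3, no
```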
- How to run the code, how to use the system, functionalities of each program file:
● To run the GUI: run the .exe file
● For the ML model, we've included a Jupyter notebook (FINAL MODEL.ipynb) for easy
visualization - the notebook should automatically install the necessary libraries, so you just
need to make sure you can run it.
○ https://jupyter.org/install
● NOTE: Make sure there is an export.csv file after exporting the data using the GUI application.
● Simply open the .ipynb file after launching Jupyter Notebook and run through all of the cells;
by the end you should see a visualization of our process and our final result.
● The notebook, however, is mostly a formality, as we've documented the process and results within
this report, so there is no need to run it unless you want to verify our code.
- Brief introduction of libraries/tools/techniques you used in your development:
● C# Feature extraction GUI
○ NAudio - used to process and extract amplitude
○ AForge.Math - used to process and extract frequencies and magnitude
○ Newtonsoft.Json - used to export features for the ML code to read
○ WinForms - used as the GUI framework
● Python ML model
○ PyTorch - ML library used to train our model
○ scikit-learn - used for cross-validation in the ML process
○ pandas - used to manipulate the dataset for eventual PyTorch processing
○ Matplotlib - data visualization library used to show results
- List of features and their values for the audio files, which files are used as training data, and what is tested:
The features being used are (see the sketch after this list):
● Zero Crossing Rate
● Average Energy
● Bandwidth
● Spectral Centroid (over frequency)
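The extraction itself is done by the C# GUI; purely as an illustration of what these features mean, here is a minimal NumPy sketch of their standard textbook definitions for a mono clip (this is not the project's C# implementation, and the exact formulas there may differ):

```python
import numpy as np

def extract_features(samples: np.ndarray, sample_rate: int):
    """Standard definitions of the four features for one mono float clip."""
    # Zero Crossing Rate: fraction of adjacent samples that change sign
    zcr = np.mean(np.abs(np.diff(np.signbit(samples).astype(int))))
    # Average Energy: mean squared amplitude
    energy = np.mean(samples ** 2)
    # Magnitude spectrum and its frequency bins
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=1.0 / sample_rate)
    # Spectral Centroid: magnitude-weighted mean frequency
    centroid = np.sum(freqs * spectrum) / np.sum(spectrum)
    # Bandwidth: magnitude-weighted spread of frequencies around the centroid
    bandwidth = np.sqrt(np.sum((freqs - centroid) ** 2 * spectrum) / np.sum(spectrum))
    return zcr, energy, bandwidth, centroid
```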
The audio files used to collect features for training data are in audio.zip in the Assignment 4 folder of
the CSS 484 Canvas course files.
We are testing the features mentioned above with our ML model - our PyTorch model loads the above
4 features with pandas and converts them into tensors, roughly as sketched below.
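A minimal sketch of that loading step (the column order of export.csv is assumed from the test-data format above; exact column names are not confirmed from the code):

```python
import pandas as pd
import torch

# Read the GUI's exported features; assumed column order: file name, f1..f4, label
df = pd.read_csv("export.csv")

# Feature columns sit between the file name and the label
X = torch.tensor(df.iloc[:, 1:-1].to_numpy(), dtype=torch.float32)

# Map "yes" (music) / "no" (speech) to 1 / 0
y = torch.tensor((df.iloc[:, -1] == "yes").astype(int).to_numpy(), dtype=torch.float32)
```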
The data is processed through linear (fully connected) layers trained with an Adam optimizer, with the layers structured
as input layer > h1 (hidden layer) > h2 (hidden layer) > final output layer (binary output, either 1, meaning
the clip is music, or 0, meaning it is not).
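A minimal PyTorch sketch of that architecture (the hidden-layer sizes, ReLU activations, sigmoid output, and learning rate are our assumptions; the report does not pin down the exact dimensions):

```python
import torch
import torch.nn as nn

class SpeechMusicNet(nn.Module):
    """Feed-forward classifier: input > h1 > h2 > binary output."""
    def __init__(self, n_features=4, h1=16, h2=8):  # hidden sizes assumed
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, h1),  # input layer -> hidden layer 1
            nn.ReLU(),
            nn.Linear(h1, h2),          # hidden layer 1 -> hidden layer 2
            nn.ReLU(),
            nn.Linear(h2, 1),           # hidden layer 2 -> single output
            nn.Sigmoid(),               # probability that the clip is music
        )

    def forward(self, x):
        return self.net(x)

model = SpeechMusicNet()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.BCELoss()  # binary target: 1 = music ("yes"), 0 = speech ("no")
```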
About two-thirds of the data is reserved for training, while the remaining third is used for the testing phase.
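The split can be done with scikit-learn's train_test_split, which the project already pulls in; a sketch using the df from the loading step above (the fixed seed is our own addition):

```python
from sklearn.model_selection import train_test_split

# Reserve roughly one third of the rows for testing
train_df, test_df = train_test_split(df, test_size=1/3, random_state=0)
```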