This project is completed in MATLAB. All experiments and model training are performed using the MATLAB language and software.
Key Terms : Machine Learning, Audio Synthesis, Multimedia Forensics, AI Synthesised Audio, Human Voice, Spectral Analysis.
Abstract :
Digital technology has made previously unimaginable applications possible. While it is exciting to have a handful of tools for easy editing and manipulation, this also raises alarming concerns, since manipulated audio can propagate as speech clones, duplicates, or deep fakes. Validating the authenticity of a speech recording is one of the primary problems of digital audio forensics. We propose an approach to distinguish human speech from AI synthesised speech by exploiting bi-spectral and cepstral analysis. Higher-order statistics show less correlation for human speech than for synthesised speech. Cepstral analysis also reveals a durable power component in human speech that is missing from synthesised speech. We combine both analyses and propose a machine learning model to detect AI synthesised speech.
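As a rough illustration of the analysis described in the abstract, the sketch below estimates a coarse bispectrum directly from the FFT and summarises it together with a low-quefrency cepstral power term. The function name extractFeatures, the grid size K, the 20 ms quefrency window, and the specific statistics are illustrative assumptions, not the exact features used in the project.

    function feats = extractFeatures(x, fs)
        % x  : mono audio samples, fs : sampling rate in Hz
        x = x(:) - mean(x);                         % remove DC offset
        N = 2^nextpow2(numel(x));
        X = fft(x, N);

        % Coarse bi-spectral estimate B(f1,f2) = X(f1) * X(f2) * conj(X(f1+f2))
        K = 64;                                     % low-frequency grid (assumption)
        [F1, F2] = meshgrid(1:K, 1:K);
        B = X(F1) .* X(F2) .* conj(X(F1 + F2 - 1));

        % Simple magnitude / phase statistics of the bispectrum
        bMag   = mean(abs(B(:)));
        bPhase = mean(abs(angle(B(:))));

        % Real cepstrum and its low-quefrency power (first ~20 ms)
        c = real(ifft(log(abs(X) + eps)));
        q = min(round(0.02 * fs), N);
        cepPower = sum(c(2:q).^2);

        feats = [bMag, bPhase, cepPower];
    end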
Work Done :
This work includes extracting features from various audio samples based on spectral and cepstral analysis, and then using those features to distinguish AI synthesised speech from human speech with machine learning. Several machine learning models were trained, including Quadratic and Linear SVM, KNN, and Logistic Regression. The Quadratic SVM achieved the highest cross-validation accuracy and classifies AI synthesised speech versus human speech with an accuracy of 98.5 % on test data.
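A minimal sketch of the training and evaluation step is given below, assuming the extracted features and labels are already arranged as matrices and categorical arrays. The variable names, the 5-fold split, and the use of fitcsvm with a second-order polynomial kernel (MATLAB's "Quadratic SVM") are illustrative assumptions rather than the project's exact scripts.

    % featTrain, featTest   : rows = audio samples, columns = extracted features
    % labelTrain, labelTest : categorical labels, e.g. "AI" vs "Human"

    mdl = fitcsvm(featTrain, labelTrain, ...
        'KernelFunction', 'polynomial', 'PolynomialOrder', 2, ...  % Quadratic SVM
        'Standardize', true);

    cvMdl = crossval(mdl, 'KFold', 5);          % 5-fold cross-validation
    cvAcc = 1 - kfoldLoss(cvMdl);               % cross-validation accuracy

    pred    = predict(mdl, featTest);           % classify held-out test samples
    testAcc = mean(pred == labelTest);          % test accuracy

    fprintf('Cross-validation accuracy: %.1f %%\n', 100 * cvAcc);
    fprintf('Test accuracy:             %.1f %%\n', 100 * testAcc);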