AudioClassifier is a full-stack application for real-time environmental sound classification. At its core is a custom ResNet-style deep CNN that converts audio into Mel spectrograms for robust feature extraction and classification. The backend, built with Python, PyTorch, and FastAPI, is optimized for scalable, serverless GPU inference on Modal. The interactive frontend, developed with Next.js, React, and Tailwind CSS, lets users upload audio, view predictions with confidence scores, and visualize internal model feature maps, waveforms, and spectrograms, making it a comprehensive platform for audio AI development.
- 🧠 Deep Audio CNN for sound classification
- 🧱 ResNet-style architecture with residual blocks
- 🎼 Mel Spectrogram audio-to-image conversion
- 🎛️ Data augmentation with Mixup & Time/Frequency Masking
- ⚡ Serverless GPU inference with Modal
- 📊 Interactive Next.js & React dashboard
- 👁️ Visualization of internal CNN feature maps
- 📈 Real-time audio classification with confidence scores
- 🌊 Waveform and Spectrogram visualization
- 🚀 FastAPI inference endpoint
- ⚙️ Optimized training with AdamW & OneCycleLR scheduler
- 📈 TensorBoard integration for training analysis
- 🛡️ Batch Normalization for stable & fast training
- 🎨 Modern UI with Tailwind CSS & Shadcn UI
- ✅ Pydantic data validation for robust API requests
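The Mixup and time/frequency masking augmentations listed above can be sketched in a few lines of NumPy. This is a minimal illustration of the two techniques, not the project's actual implementation; the function names and parameter values here are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(spec_a, spec_b, label_a, label_b, alpha=0.2):
    """Mixup: blend two spectrograms and their one-hot labels
    with a weight drawn from a Beta(alpha, alpha) distribution."""
    lam = rng.beta(alpha, alpha)
    mixed_spec = lam * spec_a + (1 - lam) * spec_b
    mixed_label = lam * label_a + (1 - lam) * label_b
    return mixed_spec, mixed_label

def time_freq_mask(spec, max_freq=8, max_time=8):
    """SpecAugment-style masking: zero out one random frequency band
    and one random time band of the (n_mels, n_frames) spectrogram."""
    spec = spec.copy()
    n_mels, n_frames = spec.shape
    f0 = rng.integers(0, n_mels - max_freq)
    spec[f0:f0 + rng.integers(1, max_freq + 1), :] = 0.0
    t0 = rng.integers(0, n_frames - max_time)
    spec[:, t0:t0 + rng.integers(1, max_time + 1)] = 0.0
    return spec
```

Both operate on the Mel spectrogram "image" rather than the raw waveform, which is what lets the CNN training pipeline treat audio augmentation like image augmentation.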
Follow these steps to install and set up the project.

**Backend**

1. Clone the repository:

   ```bash
   git clone https://github.com/sid995/AudioClassifier.git
   ```

2. Download and install Python if not already installed. Use the link below for guidance on installation: Python Download

3. Create a virtual environment with Python 3.12.

4. Navigate to the project folder:

   ```bash
   cd AudioClassifier
   ```

5. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```

6. Set up Modal:

   ```bash
   modal setup
   ```

7. Run on Modal:

   ```bash
   modal run main.py
   ```

8. Deploy the backend:

   ```bash
   modal deploy main.py
   ```

**Frontend**

1. Install dependencies:

   ```bash
   cd frontend
   npm i
   ```

2. Run the development server:

   ```bash
   npm run dev
   ```
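Once the backend is deployed, the FastAPI endpoint can be called with a JSON body containing the audio. The exact request schema is defined by the Pydantic model in `main.py`; the field name `audio_data` and the base64 encoding below are assumptions, shown only to illustrate the shape of such a client:

```python
import base64
import json

def build_payload(audio_path: str) -> str:
    """Read an audio file and wrap it in a JSON body for the
    inference endpoint. The "audio_data" field name is hypothetical;
    check the Pydantic request model in main.py for the real schema."""
    with open(audio_path, "rb") as f:
        audio_b64 = base64.b64encode(f.read()).decode("utf-8")
    return json.dumps({"audio_data": audio_b64})
```

POST this body with `Content-Type: application/json` to the URL that `modal deploy main.py` prints; the response should contain the predicted classes and confidence scores shown in the dashboard.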