Gauri-Tripathi/Multi_Modal_MA_

Multi-Modal AI Music Assistant

An AI-powered system that combines facial emotion detection, contextual awareness, and advanced music analysis to deliver personalized music recommendations.

Live Demo: https://multi-modal-ma.onrender.com


Core Features

  • Emotion Detection: Real-time facial emotion analysis using deep learning
  • Context-Aware Recommendations: Incorporates weather, time, and location
  • Advanced Music Analysis: Uses Matern kernel similarity for music matching
  • Spotify Integration: Direct playback through Spotify
  • Personalized Learning: Adapts to user feedback and preferences

Algorithmic Workflow

  1. Emotion Detection Process
    • User uploads a facial image through the web interface
    • The EmotionDetector class processes the image
    • A pre-trained model classifies the emotional state (happy, sad, angry, etc.)
    • The detected emotion serves as the primary input for the recommendation process
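The final classification step can be sketched as follows. The label set and the softmax-output interface are assumptions for illustration; the repository's pre-trained model defines its own classes:

```python
import numpy as np

# Hypothetical label set; the actual pre-trained model defines its own classes.
EMOTIONS = ["angry", "disgust", "fear", "happy", "neutral", "sad", "surprise"]

def classify_emotion(probabilities):
    """Map a model's softmax output vector to an emotion label and confidence."""
    probs = np.asarray(probabilities, dtype=float)
    idx = int(np.argmax(probs))          # index of the most probable class
    return EMOTIONS[idx], float(probs[idx])

label, confidence = classify_emotion([0.05, 0.02, 0.03, 0.70, 0.10, 0.06, 0.04])
# label == "happy"
```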

  2. Context Acquisition Flow
    • Weather data is fetched from OpenWeather API using coordinates
    • Time features are extracted based on the user's timezone
    • All contextual data is aggregated into a meta-features object
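A minimal sketch of the aggregation step, assuming illustrative field names and time-of-day boundaries (the app's actual meta-features schema may differ):

```python
from datetime import datetime

def build_meta_features(weather_condition, temperature_c, when):
    """Aggregate weather and time signals into a single meta-features dict.
    Field names and hour boundaries are illustrative assumptions."""
    hour = when.hour
    if 5 <= hour < 12:
        time_of_day = "morning"
    elif 12 <= hour < 17:
        time_of_day = "afternoon"
    elif 17 <= hour < 21:
        time_of_day = "evening"
    else:
        time_of_day = "night"
    return {
        "weather": weather_condition.lower(),
        "temperature_c": temperature_c,
        "time_of_day": time_of_day,
        "is_weekend": when.weekday() >= 5,
    }

meta = build_meta_features("Rain", 14.0, datetime(2024, 5, 3, 8, 30))
```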

  3. Recommendation Pipeline
    • User preferences are retrieved from historical feedback data
    • Songs matching the detected emotion are filtered from the database
    • Multiple scoring components are calculated:
      • Content-based similarity to user preferences
      • Collaborative filtering scores (if sufficient user data exists)
      • Contextual relevance scores (based on meta-features)
    • Final scores are computed as a weighted combination of these components
    • The highest-scoring tracks are presented to the user with Spotify integration
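The weighted combination at the end of the pipeline can be sketched as below; the weights and component scores are illustrative, not the repository's actual values:

```python
def combined_score(content, collaborative, context, weights=(0.5, 0.3, 0.2)):
    """Blend the three scoring components into one final score.
    The weight split is an illustrative assumption."""
    w_content, w_collab, w_context = weights
    return w_content * content + w_collab * collaborative + w_context * context

# Rank two hypothetical tracks by their blended scores.
ranked = sorted(
    [("track_a", combined_score(0.9, 0.4, 0.6)),
     ("track_b", combined_score(0.5, 0.8, 0.7))],
    key=lambda pair: pair[1],
    reverse=True,
)
```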

Advanced Machine Learning Components

  • Matrix Factorization
    • Implemented using Singular Value Decomposition (SVD)
    • Decomposes the user-item interaction matrix into latent factors
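A minimal SVD factorization of a toy user-item matrix, shown here with NumPy rather than the repository's actual implementation:

```python
import numpy as np

# Toy user-item feedback matrix (rows: users, columns: songs); zeros are unrated.
R = np.array([[5., 3., 0., 1.],
              [4., 0., 0., 1.],
              [1., 1., 0., 5.],
              [0., 1., 5., 4.]])

# Decompose into latent factors and keep only the top-k components.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2                                              # number of latent factors
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]      # rank-k reconstruction
```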

  • Similarity Calculations
    • Audio features are processed using multiple kernel functions:
      • Matern kernel (with varying smoothness parameters)
      • Radial Basis Function (RBF) kernel
      • Rational Quadratic kernel
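These kernel comparisons can be sketched with scikit-learn's kernel classes; the feature vectors, length scales, and smoothness values below are illustrative assumptions:

```python
import numpy as np
from sklearn.gaussian_process.kernels import Matern, RBF, RationalQuadratic

# Toy audio-feature vectors (e.g. energy, valence, danceability), already scaled.
candidate = np.array([[0.8, 0.9, 0.7]])
profile   = np.array([[0.7, 0.8, 0.6]])

kernels = {
    "matern_smooth": Matern(length_scale=1.0, nu=2.5),   # smoother variant
    "matern_rough":  Matern(length_scale=1.0, nu=0.5),   # rougher variant
    "rbf":           RBF(length_scale=1.0),
    "rq":            RationalQuadratic(length_scale=1.0, alpha=1.0),
}

# Each kernel maps the pair of vectors to a similarity in (0, 1].
similarities = {name: float(k(candidate, profile)[0, 0])
                for name, k in kernels.items()}
```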
  • Bayesian Rating
    • Implements a beta distribution to handle uncertainty in small sample sizes
    • Provides more reliable estimates when limited feedback is available
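A sketch of a Bayesian rating with a Beta prior; the uniform Beta(1, 1) prior is an assumption, not necessarily what the repository uses:

```python
def bayesian_rating(likes, dislikes, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean of Beta(prior_alpha + likes, prior_beta + dislikes).
    With few votes the estimate stays pulled toward the prior mean instead
    of jumping to 0 or 1."""
    return (prior_alpha + likes) / (prior_alpha + likes + prior_beta + dislikes)

# One like out of one vote is pulled toward 0.5, not treated as a perfect score.
small_sample = bayesian_rating(1, 0)     # 2/3
large_sample = bayesian_rating(90, 10)   # 91/102
```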

Context-Aware Adjustments

The system applies context-based weights to recommendations:

  1. Weather Conditions
    • Clear weather → Boost happy and energetic songs
    • Rain → Increase acoustic and sad songs
    • Snow → Enhance instrumental and acoustic tracks
    • Thunderstorm → Favor energetic and loud songs
  2. Time of Day
    • Morning → Boost energetic and positive songs
    • Afternoon → Increase danceable and energetic tracks
    • Evening → Favor instrumental and acoustic songs
    • Night → Enhance acoustic and instrumental tracks
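The rules above could be encoded as a simple boost table; the numeric multipliers and tag names here are illustrative assumptions, not the repository's actual weights:

```python
# Illustrative boost table mirroring the weather and time-of-day rules.
CONTEXT_BOOSTS = {
    "clear":        {"happy": 1.2, "energetic": 1.2},
    "rain":         {"acoustic": 1.2, "sad": 1.15},
    "snow":         {"instrumental": 1.2, "acoustic": 1.15},
    "thunderstorm": {"energetic": 1.25, "loud": 1.2},
    "morning":      {"energetic": 1.2, "positive": 1.15},
    "afternoon":    {"danceable": 1.15, "energetic": 1.1},
    "evening":      {"instrumental": 1.15, "acoustic": 1.1},
    "night":        {"acoustic": 1.2, "instrumental": 1.15},
}

def apply_context(base_score, track_tags, *contexts):
    """Multiply a track's base score by every boost whose tag the track carries."""
    score = base_score
    for ctx in contexts:
        for tag, boost in CONTEXT_BOOSTS.get(ctx, {}).items():
            if tag in track_tags:
                score *= boost
    return score

# A sad acoustic track on a rainy night receives three boosts.
score = apply_context(0.7, {"acoustic", "sad"}, "rain", "night")
```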

Technology Stack

  • Backend: Flask, Python 3.10+
  • ML/AI: TensorFlow, OpenCV, scikit-learn
  • Database: Supabase
  • Frontend: HTML5, CSS3, JavaScript
  • APIs: Spotify API, Weather API
  • Deployment: Docker, Gunicorn

Quick Start

  1. Clone the repository
  2. Install dependencies: pip install -r requirements.txt
  3. Set up environment variables in .env
  4. Run with Docker: docker-compose up
  5. Visit http://localhost:8000

Contributing

Contributions welcome! Please read our Contributing Guidelines.

License

MIT License - see LICENSE file for details.
