A deep learning-based Speech Emotion Recognition (SER) model trained primarily on Indian languages. Designed for applications in call centers, sentiment analysis, and accessibility tools.

aaivu/KuralNet

🔊 KuralNet: Multilingual Speech Emotion Recognition (SER) Model

Version 1.0 | Trained on 13 Languages & 14 Datasets

A multilingual Speech Emotion Recognition model focused primarily on South Indian languages. It detects emotions from speech for use in call centers, sentiment analysis, and accessibility tools.

🚀 Key Features

  • Emotion Detection: Capable of detecting emotions from speech in multiple Indian languages.
  • Use Cases: Call centers, sentiment analysis, and accessibility tools.
  • Optimized Performance: Designed for real-time emotion analysis.

🎯 Purpose

This model aims to enhance user experiences by detecting emotions from speech across multiple languages. It targets industries such as customer service, where emotional tone plays a crucial role.

🧠 Model Architecture

The model uses a dual-branch approach: a Whisper branch produces contextual embeddings, and a traditional feature branch (MFCC, etc.) captures acoustic details. An AttentionFusion module then dynamically combines the two feature sets using cross-attention. Finally, classification layers predict the emotion.

In essence: Raw Audio -> (Whisper + Traditional Features) -> AttentionFusion -> Emotion Prediction.
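
The fusion step above can be illustrated with a minimal sketch. This is not the repository's AttentionFusion implementation: it assumes a single pooled acoustic vector acting as the query over a sequence of Whisper frame embeddings, uses no learned projections, and fills every tensor with random stand-in values. The names `cross_attention` and `classify`, the sizes, and the emotion labels are all hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross_attention(query, keys, values):
    """Scaled dot-product attention for a single query vector.

    Returns the attention-weighted sum of `values` and the weights.
    """
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(values[0])
    fused = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return fused, weights


def classify(features, weight_rows, labels):
    """Linear layer followed by softmax; returns (label, probabilities)."""
    probs = softmax([dot(row, features) for row in weight_rows])
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs


# Hypothetical sizes and labels, chosen only for illustration.
DIM, FRAMES = 8, 5
LABELS = ["angry", "happy", "neutral", "sad"]

rng = random.Random(0)
# Stand-in for Whisper encoder output: one embedding per audio frame.
whisper_frames = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(FRAMES)]
# Stand-in for a pooled traditional feature vector (e.g. MFCC statistics).
acoustic = [rng.gauss(0, 1) for _ in range(DIM)]

# The acoustic vector queries the Whisper frames (cross-attention).
context, attn = cross_attention(acoustic, whisper_frames, whisper_frames)
# Concatenate the acoustic features with the attended Whisper context.
fused = acoustic + context
# Untrained random classifier weights, so the predicted label is meaningless.
weights = [[rng.gauss(0, 0.1) for _ in range(2 * DIM)] for _ in LABELS]
label, probs = classify(fused, weights, LABELS)
print(label, [round(p, 3) for p in probs])
```

In a trained model the query, key, and value projections and the classifier weights would be learned, and both branches would be computed from real audio rather than sampled at random.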

[Figure: model architecture diagram]

📈 Performance Overview

Radar Chart by Language

[Figure: radar chart of performance by language]

📊 Detailed Performance Table

[Figure: detailed performance table]

📜 Citation

If you use this model or its research findings, please cite the following paper:

@article{placeholder2024,
  author    = {Author(s)},
  title     = {Paper Title},
  journal   = {Conference/Journal},
  year      = {2024},
  volume    = {X},
  number    = {Y},
  pages     = {ZZ-ZZ},
  doi       = {10.XXXX/placeholder},
}

📬 Contact

| 🏷️ Name | 📧 Email | 🔗 LinkedIn | 📚 Google Scholar |
|---|---|---|---|
| Luxshan Thavarasa | luxshan.20@cse.mrt.ac.lk | LinkedIn | Google Scholar |
| Jubeerathan Thevakumar | jubeerathan.20@cse.mrt.ac.lk | LinkedIn | Google Scholar |
| Thanikan Sivatheepan | thanikan.20@cse.mrt.ac.lk | LinkedIn | Google Scholar |
| Uthayasanker Thayasivam | rtuthaya@cse.mrt.ac.lk | LinkedIn | Google Scholar |

🙏 Acknowledgment

We would like to thank Dr. Uthayasanker Thayasivam for his guidance as our supervisor, Braveenan Sritharan for his mentorship, and all the dataset owners for making their datasets available to us through open access or upon request. Your support has been invaluable.
