Version 1.0 | Trained on 13 Languages & 14 Datasets
A multilingual Speech Emotion Recognition (SER) model with a primary focus on South Indian languages. It detects emotions from speech, supporting applications such as call centers, sentiment analysis, and accessibility tools.
- Emotion Detection: Capable of detecting emotions from speech in multiple Indian languages.
- Use Cases: Call centers, sentiment analysis, and accessibility tools.
- Optimized Performance: Designed for real-time emotion analysis.
This model aims to enhance user experiences by detecting emotions from speech across multilingual datasets. The focus is to apply it in industries like customer service, where emotional tone plays a crucial role.
The model uses a dual-branch approach: a Whisper branch extracts contextual embeddings, while a traditional feature branch (MFCC and related descriptors) captures acoustic details. An AttentionFusion module then dynamically combines the two feature streams using cross-attention, and classification layers predict the final emotion.
In essence: Raw Audio -> (Whisper + Traditional Features) -> AttentionFusion -> Emotion Prediction.
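The dual-branch pipeline above can be sketched in PyTorch. This is a minimal illustration, not the released implementation: the hidden width, number of attention heads, number of emotion classes, and mean pooling are all assumptions, and `whisper_emb` is assumed to come from a separately run Whisper encoder.

```python
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    """Cross-attention fusion: traditional features query the Whisper embeddings."""

    def __init__(self, dim: int, heads: int = 4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, trad: torch.Tensor, whisper: torch.Tensor) -> torch.Tensor:
        # Traditional-feature frames attend over Whisper embedding frames.
        fused, _ = self.attn(query=trad, key=whisper, value=whisper)
        return self.norm(trad + fused)  # residual connection + layer norm


class DualBranchSER(nn.Module):
    """Sketch of the dual-branch SER architecture (dimensions are assumptions)."""

    def __init__(self, whisper_dim=768, trad_dim=40, hidden=256, num_emotions=7):
        super().__init__()
        # Project both branches to a shared width before fusion.
        self.whisper_proj = nn.Linear(whisper_dim, hidden)
        self.trad_proj = nn.Linear(trad_dim, hidden)
        self.fusion = AttentionFusion(hidden)
        self.classifier = nn.Sequential(
            nn.Linear(hidden, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_emotions),
        )

    def forward(self, whisper_emb: torch.Tensor, trad_feats: torch.Tensor) -> torch.Tensor:
        # whisper_emb: (batch, T1, whisper_dim) from a Whisper encoder
        # trad_feats:  (batch, T2, trad_dim), e.g. MFCC frames
        w = self.whisper_proj(whisper_emb)
        t = self.trad_proj(trad_feats)
        fused = self.fusion(t, w)        # (batch, T2, hidden)
        pooled = fused.mean(dim=1)       # temporal average pooling
        return self.classifier(pooled)   # emotion logits


model = DualBranchSER()
logits = model(torch.randn(2, 50, 768), torch.randn(2, 100, 40))
print(logits.shape)  # torch.Size([2, 7])
```

The two time axes need not match: cross-attention lets each traditional-feature frame aggregate information from all Whisper frames, which is what allows the fusion to combine the branches dynamically rather than by simple concatenation.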
If you are using this model or research findings, please cite the following paper:
```bibtex
@article{placeholder2024,
  author  = {Author(s)},
  title   = {Paper Title},
  journal = {Conference/Journal},
  year    = {2024},
  volume  = {X},
  number  = {Y},
  pages   = {ZZ-ZZ},
  doi     = {10.XXXX/placeholder},
}
```
| 🏷️ Name | 📧 Email | 📚 Google Scholar |
|---|---|---|
| Luxshan Thavarasa | luxshan.20@cse.mrt.ac.lk | Google Scholar |
| Jubeerathan Thevakumar | jubeerathan.20@cse.mrt.ac.lk | Google Scholar |
| Thanikan Sivatheepan | thanikan.20@cse.mrt.ac.lk | Google Scholar |
| Uthayasanker Thayasivam | rtuthaya@cse.mrt.ac.lk | Google Scholar |
We would like to thank Dr. Uthayasanker Thayasivam for his guidance as our supervisor, Braveenan Sritharan for his mentorship, and all the dataset owners for making their datasets available to us through open access or upon request. Your support has been invaluable.