This project predicts a person's mental state from their speech using a deep learning model: a hybrid BiLSTM-CNN network that analyzes features extracted from audio files. You can read more about the published paper here.

Tech stack:
- Python: Primary language for development
- Flask: For backend and serving the model
- TensorFlow: For model building and inference
- librosa: For audio processing (MFCC extraction; see the sketch after this list)
- NumPy, SciPy: For mathematical operations
- scikit-learn: For evaluation and metrics
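
For context, MFCC extraction with librosa looks roughly like this. This is a minimal sketch assuming a 16 kHz mono signal and 40 coefficients; the project's actual preprocessing parameters may differ:

```python
# Minimal sketch of MFCC extraction with librosa.
# The sample rate and n_mfcc here are assumptions, not the project's settings.
import librosa

# Load a clip as 16 kHz mono (librosa resamples on load).
signal, sr = librosa.load("preprocessed_audio/346_AUDIO_processed.wav", sr=16000)

# Compute 40 MFCCs per frame; the result is a (40, n_frames) array.
mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
print(mfcc.shape)
```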
To get the project running locally:

1. Clone the repository:

   ```bash
   git clone https://github.com/axlerquiza/mental-state-recognizer.git
   cd mental-state-recognizer
   ```

2. Set up a virtual environment (optional but recommended):

   ```bash
   python -m venv venv
   source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
   ```

3. Install the required dependencies:

   ```bash
   pip install -r requirements.txt
   ```

4. Run the Flask server:

   ```bash
   python server.py
   ```

5. Open your browser and navigate to `http://127.0.0.1:5000/` to start using the web interface.
The project follows this directory structure:

```
/mental-state-recognizer
├── /assets
│   ├── /scripts                 # JavaScript functionality
│   │   └── app.js
│   └── /styles
│       └── style.css            # CSS styling
├── /models                      # Trained models
├── /mfcc_images                 # MFCC images generated from audio
├── /preprocessed_audio          # Sample preprocessed audio files
│   ├── 308_AUDIO_processed.wav  # Level 4 example
│   └── 346_AUDIO_processed.wav  # Level 1 example
├── /uploads                     # Uploaded audio files
├── index.html                   # Main HTML file for the web interface
├── server.py                    # Flask backend server script
└── requirements.txt             # List of dependencies
```
How it works:

- Upload an audio file (preferably a 10-second speech recording) via the web interface (`index.html`).
- The backend extracts MFCC features from the audio file and passes them through the trained model (see the sketch below).
- The model predicts the mental state, and the result is displayed on the webpage.
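
The exact logic lives in `server.py`, but the flow amounts to something like the sketch below. The route name, model filename, label mapping, and input shape are all assumptions for illustration, not the repo's actual values:

```python
# Sketch of the upload -> MFCC -> predict flow. The route, model file,
# label mapping, and input shape are assumptions, not the repo's code.
import librosa
import numpy as np
import tensorflow as tf
from flask import Flask, request, jsonify

app = Flask(__name__)
model = tf.keras.models.load_model("models/bilstm_cnn.h5")  # hypothetical filename
LABELS = ["Level 1", "Level 2", "Level 3", "Level 4"]       # assumed 4-level mapping

@app.route("/predict", methods=["POST"])
def predict():
    # Save the uploaded file, mirroring the /uploads folder in the tree above.
    file = request.files["audio"]
    path = "uploads/" + file.filename
    file.save(path)

    # Extract MFCCs and add batch/channel axes to match a CNN-style input.
    signal, sr = librosa.load(path, sr=16000)
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
    x = mfcc[np.newaxis, ..., np.newaxis]  # shape (1, 40, n_frames, 1)

    probs = model.predict(x)[0]
    return jsonify({"prediction": LABELS[int(np.argmax(probs))]})

if __name__ == "__main__":
    app.run(debug=True)
```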
To try it:

1. Click the "Upload Audio" button on the web interface.
2. Choose a 10-second audio file (preferably a speech recording).
3. Wait for the model to process the file and display the predicted mental state.
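
If you want to script this instead of clicking through the interface, a `requests` call along these lines should work, assuming the backend exposes a prediction endpoint (the route and form field name here are guesses; check `server.py` for the real ones):

```python
# Sketch: posting a sample clip to the running server with requests.
# The endpoint path and form field name are assumptions -- check server.py.
import requests

with open("preprocessed_audio/346_AUDIO_processed.wav", "rb") as f:
    resp = requests.post(
        "http://127.0.0.1:5000/predict",
        files={"audio": f},
    )
print(resp.json())
```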
The `/preprocessed_audio` directory contains sample audio files:

- `308_AUDIO_processed.wav`: a Level 4 example (indicating a specific mental state).
- `346_AUDIO_processed.wav`: a Level 1 example (indicating a different mental state).

These files can be used for testing the system or for understanding the different mental state levels.
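
For a quick offline comparison of the two levels, you can load both samples and look at their MFCC summaries. A small illustrative snippet:

```python
# Sketch: comparing MFCC summaries of the two bundled samples.
import librosa

for path in ("preprocessed_audio/308_AUDIO_processed.wav",   # Level 4
             "preprocessed_audio/346_AUDIO_processed.wav"):  # Level 1
    signal, sr = librosa.load(path, sr=None)  # keep each file's native rate
    mfcc = librosa.feature.mfcc(y=signal, sr=sr, n_mfcc=40)
    # Print duration and the first three mean coefficients per file.
    print(path, f"{len(signal) / sr:.1f}s", mfcc.mean(axis=1)[:3])
```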