A Penny for Your Thoughts: Decoding Speech from Inexpensive, Non-Invasive Brain Signals

Background and Objectives

We investigate whether neural networks can learn a decoding function that maps non-invasive brain signals, recorded as EEG, to speech (brain-to-speech decoding).

We use EEG data recorded while a subject listened to audio. The model is trained with a contrastive CLIP loss that aligns two sets of embeddings: those our model produces from the EEG data, and those a pre-trained transformer-based English speech model produces from the corresponding audio. We propose three alterations to the current state-of-the-art architecture, two of which improved performance in our experiments: (i) adding an attention mechanism to the subject layer (0.29% improvement relative to baseline), (ii) personalizing the spatial attention score for each subject (1.28% improvement relative to baseline), and (iii) using a dual-path RNN in combination with attention layers (6.13% reduction in performance relative to baseline).
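To make the training objective concrete, here is a minimal sketch of a CLIP-style contrastive loss over a batch of paired EEG and audio embeddings. It is an illustration under stated assumptions, not the code in this repository: the function names, the symmetric two-direction form, and the temperature value are assumptions, and the example uses plain Python lists so it runs without dependencies.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if its norm is zero)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax of a list of scores."""
    m = max(row)
    log_sum = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - log_sum for x in row]


def clip_loss(eeg_emb, audio_emb, temperature=0.1):
    """Symmetric contrastive loss; row i of each input is a matched pair.

    The temperature of 0.1 is an assumed value for illustration only.
    """
    eeg = [l2_normalize(v) for v in eeg_emb]
    audio = [l2_normalize(v) for v in audio_emb]
    n = len(eeg)
    # logits[i][j]: scaled cosine similarity of EEG segment i and audio segment j
    logits = [
        [sum(a * b for a, b in zip(e, s)) / temperature for s in audio]
        for e in eeg
    ]
    # EEG -> audio: each EEG segment should pick out its own audio segment
    loss_e2a = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # audio -> EEG: the same objective on the transposed similarity matrix
    logits_t = [list(col) for col in zip(*logits)]
    loss_a2e = -sum(log_softmax(logits_t[i])[i] for i in range(n)) / n
    return 0.5 * (loss_e2a + loss_a2e)
```

Calling `clip_loss` on correctly paired embeddings should return a value near zero, while shuffling the audio rows relative to the EEG rows should raise it. That gap is the signal the model is trained on.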

Our results are promising for applications in brain-computer interaction, such as accessibility tools for people with speech impairments.

This project was implemented as part of the CMU 11-785 course on Deep Learning.

Datasets

We work with EEG data as it is non-invasive and inexpensive to record compared to other brain signal recording methods such as MEG or fMRI. We rely on data from Brennan and Hale. This dataset contains EEG data collected using 62 sensors from 33 subjects, totaling approximately 6.7 hours of recordings.
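For a rough sense of scale, the figures above can be turned into an average recording length per subject. This sketch assumes the 6.7 hours are spread evenly across the 33 subjects, which the description above does not state.

```python
# Figures quoted from the dataset description above.
TOTAL_HOURS = 6.7
NUM_SUBJECTS = 33

# Assumption: recordings are evenly distributed across subjects.
minutes_per_subject = TOTAL_HOURS * 60 / NUM_SUBJECTS
print(f"~{minutes_per_subject:.1f} minutes of EEG per subject")
```

Under that assumption, each subject contributes roughly twelve minutes of data, a small amount for training a deep model.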

Data Access:

Raw EEG and audio data: DeepBlue Dataset
Audio embeddings (embedded with Wav2Vec): Google Drive

Repository Structure

This repository contains the following files and directories:

brainmagick_updated folder: Includes Meta's original code and our modifications to the model architecture:
- common.py
- simpleconv.py
pretrained-models folder: Contains the pre-trained model Wav2Vec used for audio embeddings.
notebooks folder: A collection of Jupyter notebooks:
- mne-eeg-preprocessing.ipynb: Includes preprocessing of raw EEG data.
- baseline model_runner (submitted as mid-term).ipynb: Contains the runner used to obtain baseline results.
- complete-training-pipeline-eda.ipynb: Features the custom pipeline to run experiments, as well as exploratory data analysis.
- experiments-1.ipynb: Provides the runner for conducting ablations.
- experiments-2.ipynb: Includes another runner for conducting ablations.
- metrics-plots.ipynb: Displays plots with the results of ablations.

Note: All data used in this project is sourced ethically, and the analysis adheres to the highest standards of research integrity and ethical guidelines.
