We investigate whether neural networks can learn a decoding function that converts brain signals, recorded as EEG, into speech (brain-to-speech decoding).
Given EEG data recorded while a subject listened to audio, we train our model with a contrastive CLIP loss that compares the embeddings our model produces from the EEG data with embeddings of the corresponding audio produced by a pre-trained transformer-based English speech model. We propose three alterations to the current state-of-the-art architecture, two of which improved performance in our experiments: (i) adding an attention mechanism to the subject layer (0.29% improvement relative to baseline), (ii) personalizing the spatial attention score for each subject (1.28% improvement relative to baseline), and (iii) using a dual-path RNN in combination with attention layers (6.13% reduction in performance relative to baseline).
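To make the training objective more concrete, here is a minimal NumPy sketch of a contrastive CLIP-style loss over a batch of paired EEG and audio embeddings. It is an illustration under stated assumptions, not the code used in this repository: the function name `clip_loss`, the `temperature` value, the flattened `(batch, dim)` embedding shape, and the symmetric averaging of both directions are all hypothetical choices, and the actual implementation may differ (for example, by scoring only the EEG-to-audio direction).

```python
import numpy as np


def log_softmax(x, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def clip_loss(eeg_emb, audio_emb, temperature=0.1):
    """Symmetric contrastive loss for paired embeddings.

    eeg_emb, audio_emb: arrays of shape (batch, dim); row i of each
    array is assumed to come from the same audio segment.
    temperature: hypothetical scaling value, not taken from the project.
    """
    # L2-normalize so the dot product is a cosine similarity.
    eeg = eeg_emb / np.linalg.norm(eeg_emb, axis=1, keepdims=True)
    audio = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)

    # logits[i, j] = similarity between EEG sample i and audio sample j.
    logits = eeg @ audio.T / temperature
    idx = np.arange(logits.shape[0])

    # Cross-entropy with the matching pair (the diagonal) as the target,
    # once over audio candidates (rows) and once over EEG candidates (columns).
    loss_eeg_to_audio = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_audio_to_eeg = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_eeg_to_audio + loss_audio_to_eeg)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.normal(size=(8, 16))
    eeg = audio + 0.01 * rng.normal(size=(8, 16))  # nearly aligned pairs
    print("aligned pairs:   ", clip_loss(eeg, audio))
    print("misaligned pairs:", clip_loss(eeg[::-1], audio))
```

The loss is low when each EEG embedding is most similar to its own audio embedding and high when the pairing is scrambled, which is the property that pushes the two embedding spaces into alignment during training.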
Our results are promising for applications in brain-computer interaction, such as accessibility tools for people with speech impairments.
This project was implemented as part of the CMU 11-785 course on Deep Learning.
We work with EEG data as it is non-invasive and inexpensive to record compared to other brain signal recording methods such as MEG or fMRI. We rely on data from Brennan and Hale. This dataset contains EEG data collected using 62 sensors from 33 subjects, totaling approximately 6.7 hours of recordings.
Data Access:
- Raw EEG and audio data: DeepBlue Dataset
- Audio embeddings (embedded with Wav2Vec): Google Drive
This repository contains the following files and directories:
- brainmagick_updated folder: Includes Meta's original code and our modifications to the model architecture:
  - common.py
  - simpleconv.py
- pretrained-models folder: Contains the pre-trained Wav2Vec model used for audio embeddings.
- notebooks folder: A collection of Jupyter notebooks:
  - mne-eeg-preprocessing.ipynb: Includes preprocessing of raw EEG data.
  - baseline model_runner (submitted as mid-term).ipynb: Contains the runner used to obtain baseline results.
  - complete-training-pipeline-eda.ipynb: Features the custom pipeline to run experiments, as well as exploratory data analysis.
  - experiments-1.ipynb: Provides the runner for conducting ablations.
  - experiments-2.ipynb: Includes another runner for conducting ablations.
  - metrics-plots.ipynb: Displays plots with the results of ablations.
Note: All data used in this project is sourced ethically, and the analysis adheres to the highest standards of research integrity and ethical guidelines.