GitHub - yzhang511/bit: A cross-species neural foundation model for end-to-end speech decoding

BraIn-to-Text (BIT)

BraIn-to-Text (BIT) is an end-to-end speech BCI framework that connects a transformer-based neural encoder with audio LLMs to directly generate sentences from neural activity.

Paper
Setup
Training
Evaluation

Setup

conda env create -f env.yaml

Brain2Text 25

Download the dataset from DRYAD and rename it to brandman_2024_text.

Brain2Text 24

Download competitionData.tar.gz from DRYAD and rename it to willett_2023_text.
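The download-and-rename step can also be scripted. The sketch below is not part of this repository: the helper name extract_and_rename and the data_root location are hypothetical, and it assumes the archive is a regular gzip-compressed tarball. It extracts the archive and places its contents in a folder with the expected dataset name.

```python
import shutil
import tarfile
import tempfile
from pathlib import Path


def extract_and_rename(archive, data_root, target_name):
    """Extract a .tar.gz archive and place its contents at data_root/target_name.

    Hypothetical helper, not part of the BIT code base.
    """
    data_root = Path(data_root)
    target = data_root / target_name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    data_root.mkdir(parents=True, exist_ok=True)
    # Extract into a scratch directory inside data_root so the final move
    # stays on the same filesystem.
    with tempfile.TemporaryDirectory(dir=data_root) as scratch:
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(scratch)
        entries = list(Path(scratch).iterdir())
        if len(entries) == 1 and entries[0].is_dir():
            # A single top-level folder: move it to the target name.
            shutil.move(str(entries[0]), str(target))
        else:
            # Several top-level entries: collect them under the target folder.
            target.mkdir()
            for entry in entries:
                shutil.move(str(entry), str(target / entry.name))
    return target
```

For example, extract_and_rename("competitionData.tar.gz", "YOUR_DATA_DIR", "willett_2023_text") would leave the data under YOUR_DATA_DIR/willett_2023_text.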

Training

Update the trainer YAML to point to your own data, checkpoint, and log directories. For example, change the following entries in configs/finetune/phoneme/ndt/trainer.yaml:

dirs:
  data_dir: YOUR_DATA_DIR
  checkpoint_dir: YOUR_CHECKPOINT_DIR
  log_dir: YOUR_LOG_DIR
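Before launching a run, it can help to check that these paths are usable. The following is a minimal sketch, not code from this repository: prepare_dirs is a hypothetical helper, and it assumes the data directory must already exist while the checkpoint and log directories may be created on demand.

```python
from pathlib import Path


def prepare_dirs(data_dir, checkpoint_dir, log_dir):
    """Validate data_dir and create the output directories if they are missing.

    Hypothetical helper; the keys mirror the dirs entries in trainer.yaml.
    """
    data = Path(data_dir).expanduser()
    if not data.is_dir():
        raise FileNotFoundError(f"data_dir does not exist: {data}")
    dirs = {"data_dir": data}
    for key, value in (("checkpoint_dir", checkpoint_dir), ("log_dir", log_dir)):
        path = Path(value).expanduser()
        # Create the directory (and any parents); no error if it already exists.
        path.mkdir(parents=True, exist_ok=True)
        dirs[key] = path
    return dirs
```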

Run the following command to train a model:

python train.py --training_mode MODE \
                --dataset DATASET \
                --features FEATURES \
                --encoder ENCODER \
                --task TASK \
                [--ft_ckpt CKPT] \
                [--ds_config DS_CONFIG] \
                [--kwargs KEY=VALUE ...]

--training_mode: train_from_scratch, finetune
--encoder: ndt
--task: phoneme, sentence
--dataset: willett_2023_text, brandman_2024_text
--features: all, tx1, spikePow
--ft_ckpt: path to finetuned model checkpoint (optional)
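When launching many runs, the command line can be assembled and checked programmatically. The sketch below is an illustration only: build_train_command is a hypothetical helper, and it assumes the values listed above are the complete set of valid choices, which train.py may not enforce in the same way.

```python
# Allowed values as listed in this README (assumed to be exhaustive).
CHOICES = {
    "training_mode": {"train_from_scratch", "finetune"},
    "dataset": {"willett_2023_text", "brandman_2024_text"},
    "features": {"all", "tx1", "spikePow"},
    "encoder": {"ndt"},
    "task": {"phoneme", "sentence"},
}


def build_train_command(training_mode, dataset, features, encoder, task, ft_ckpt=None):
    """Return the argument list for a train.py run, validating each choice first."""
    values = {
        "training_mode": training_mode,
        "dataset": dataset,
        "features": features,
        "encoder": encoder,
        "task": task,
    }
    for name, value in values.items():
        if value not in CHOICES[name]:
            allowed = ", ".join(sorted(CHOICES[name]))
            raise ValueError(f"--{name} must be one of: {allowed} (got {value!r})")
    cmd = ["python", "train.py"]
    for name, value in values.items():
        cmd += [f"--{name}", value]
    # --ft_ckpt is optional, so it is appended only when a path is given.
    if ft_ckpt is not None:
        cmd += ["--ft_ckpt", str(ft_ckpt)]
    return cmd
```

The returned list can be passed to subprocess.run(cmd, check=True) from the repository root.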

Example

Train from scratch for phoneme decoding:

python train.py --training_mode train_from_scratch \
                --dataset brandman_2024_text \
                --features all \
                --encoder ndt \
                --task phoneme

Fine-tune the above model for sentence decoding:

python train.py --training_mode finetune \
                --dataset brandman_2024_text \
                --features all \
                --encoder ndt \
                --task sentence \
                --ft_ckpt YOUR_MODEL_PATH

Evaluation

Once you have the fine-tuned model, you can generate sentence predictions in two stages:

Run the following command to predict phonemes:

python eval_phoneme.py --model_path YOUR_MODEL_PATH --eval_split val

--eval_split: val, test, holdout
"val" corresponds to the benchmark test set; use "holdout" for the competition's holdout set.
The phoneme logits are saved to {eval_split}_phoneme_logits.pt for use in language model rescoring.
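The naming rule for the saved logits can be written down as a small helper. This is a sketch under assumptions: phoneme_logits_path is a hypothetical function, and output_dir stands in for wherever eval_phoneme.py writes its outputs, which this README does not specify.

```python
from pathlib import Path

# Splits accepted by --eval_split, as listed in this README.
EVAL_SPLITS = ("val", "test", "holdout")


def phoneme_logits_path(output_dir, eval_split):
    """Return the expected logits file for a split, e.g. val_phoneme_logits.pt."""
    if eval_split not in EVAL_SPLITS:
        raise ValueError(f"eval_split must be one of {EVAL_SPLITS}, got {eval_split!r}")
    # output_dir is an assumption; only the file name pattern comes from the README.
    return Path(output_dir) / f"{eval_split}_phoneme_logits.pt"
```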

Run the following command to predict sentences using an LLM:

python eval_llm.py --model_path YOUR_MODEL_PATH --eval_split val

--nbest: number of candidate sentences for nucleus sampling (optional)
--phoneme_logits_path: path to saved phoneme logits (optional)
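The two evaluation stages can be chained from a single script. The sketch below only assembles the two command lines shown above; build_eval_commands is a hypothetical helper, and the defaults used when the optional flags are omitted are whatever eval_llm.py defines, which is not documented here.

```python
def build_eval_commands(model_path, eval_split="val", nbest=None, phoneme_logits_path=None):
    """Return [phoneme_cmd, llm_cmd] for the two evaluation stages, in run order."""
    common = ["--model_path", str(model_path), "--eval_split", eval_split]
    phoneme_cmd = ["python", "eval_phoneme.py", *common]
    llm_cmd = ["python", "eval_llm.py", *common]
    # Both flags are optional, so they are added only when a value is supplied.
    if nbest is not None:
        llm_cmd += ["--nbest", str(nbest)]
    if phoneme_logits_path is not None:
        llm_cmd += ["--phoneme_logits_path", str(phoneme_logits_path)]
    return [phoneme_cmd, llm_cmd]
```

Running each command in order with subprocess.run(cmd, check=True) stops the pipeline if the phoneme stage fails, so the LLM stage never runs on missing logits.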

Citation

Please cite our paper if you use this code in your own work:

@inproceedings{zhangcross,
  title={A cross-species neural foundation model for end-to-end speech decoding},
  author={Zhang, Yizi and He, Linyang and Fan, Chaofei and Liu, Tingkai and Yu, Han and Le, Trung and Li, Jingyuan and Linderman, Scott and Duncker, Lea and Willett, Francis R and others},
  booktitle={The Fourteenth International Conference on Learning Representations}
}

