👂 💉 EHRSHOT

A benchmark/dataset for few-shot evaluation of foundation models for electronic health records (EHRs). You can read the paper here.

Whereas most prior EHR benchmarks are limited to the ICU setting, EHRSHOT contains the full longitudinal health records of 6,739 patients from Stanford Medicine and a diverse set of 15 classification tasks tailored towards few-shot evaluation of pre-trained models.

📖 Table of Contents

  1. Quick Start
  2. Pre-trained Foundation Model
  3. Dataset + Tasks
  4. Comparison to Prior Work
  5. Citation

Quick Start

Use the following steps to run the EHRSHOT benchmark.

1): Install EHRSHOT

conda create -n EHRSHOT_ENV python=3.10 -y
conda activate EHRSHOT_ENV

git clone https://github.com/som-shahlab/ehrshot-benchmark.git
cd ehrshot-benchmark
pip install -r requirements.txt

2): Install FEMR

For our data preprocessing pipeline we use FEMR (Framework for Electronic Medical Records), a Python package for building deep learning models with EHR data.

You must also have CUDA/cuDNN installed (we recommend CUDA 11.8 and cuDNN 8.7.0).

Note that this currently only works on Linux machines.

pip install femr==0.0.21
pip install --upgrade "jax[cuda11_pip]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

3): Download dataset + model from Redivis here and place the results in a directory called EHRSHOT_ASSETS/.

4): Run the benchmark end-to-end with:

bash run_all.sh

Folder Structure

Your final folder structure should look like this:

  • ehrshot-benchmark/
    • EHRSHOT_ASSETS/
      • database/
        • We provide this asset from Redivis, which contains deidentified EHR data as a FEMR extract.
      • labels/
        • We provide this asset from Redivis, which contains labels and few-shot samples for all our tasks.
      • models/
        • We provide this asset from Redivis, which contains our pretrained foundation model for EHRs.
      • splits.csv
We provide this asset from Redivis, which determines which patients belong to which split.
    • ehrshot/
We provide the scripts to run the benchmark here.

Pre-trained Foundation Model

Access: The model is on HuggingFace here and requires signing a research usage agreement.

We publish the model weights of a 141 million parameter clinical foundation model pre-trained on the deidentified structured EHR data of 2.57M patients from Stanford Medicine.

We are one of the first to fully release such a model for coded EHR data; in contrast, most prior models released for clinical data (e.g. GatorTron, ClinicalBERT) only work with unstructured text and cannot process the rich, structured data within an EHR.

We use Clinical Language-Model-Based Representations (CLMBR) as our model. CLMBR is an autoregressive model designed to predict the next medical code in a patient's timeline given the previous codes. It employs causally masked local attention, which guarantees a forward-only flow of information; this is vital for prediction tasks and contrasts with BERT-based models, which are bidirectional. Our base model is a transformer with 141 million trainable parameters, trained with a next-code prediction objective at minute-level EHR resolution rather than the day-level aggregation of the original model formulation.
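The causally masked attention and next-code objective described above can be illustrated with a toy example (a NumPy sketch of the general technique, not the FEMR implementation; the code IDs are made up):

```python
import numpy as np

# Toy timeline of medical-code IDs for one patient, in temporal order.
codes = np.array([5, 17, 3, 42, 8])
T = len(codes)

# Causal mask: position i may attend only to positions j <= i,
# so information flows strictly forward in time.
causal_mask = np.tril(np.ones((T, T), dtype=bool))

# Next-code prediction objective: the model at position i is trained
# to predict the code at position i + 1.
inputs, targets = codes[:-1], codes[1:]

print(causal_mask.astype(int))
print(inputs.tolist(), targets.tolist())  # [5, 17, 3, 42] [17, 3, 42, 8]
```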

Dataset + Tasks

Access: The EHRSHOT-2023 dataset is on Redivis here and requires signing a research usage agreement.

EHRSHOT-2023 contains:

  • 6,739 patients
  • 41.6 million clinical events
  • 921,499 visits
  • 15 prediction tasks

Each patient consists of an ordered timeline of clinical events taken from the structured data of their EHR (e.g. diagnoses, procedures, prescriptions, etc.).

Each task is a classification task with a canonical train/val/test split. The tasks are defined as follows:

| Task | Type | Prediction Time | Time Horizon |
|---|---|---|---|
| Long Length of Stay | Binary | 11:59pm on day of admission | Admission duration |
| 30-day Readmission | Binary | 11:59pm on day of discharge | 30 days post-discharge |
| ICU Transfer | Binary | 11:59pm on day of admission | Admission duration |
| Thrombocytopenia | 4-way Multiclass | Immediately before result is recorded | Next result |
| Hyperkalemia | 4-way Multiclass | Immediately before result is recorded | Next result |
| Hypoglycemia | 4-way Multiclass | Immediately before result is recorded | Next result |
| Hyponatremia | 4-way Multiclass | Immediately before result is recorded | Next result |
| Anemia | 4-way Multiclass | Immediately before result is recorded | Next result |
| Hypertension | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Hyperlipidemia | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Pancreatic Cancer | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Celiac | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Lupus | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Acute MI | Binary | 11:59pm on day of discharge | 1 year post-discharge |
| Chest X-Ray Findings | 14-way Multilabel | 24hrs before report is recorded | Next report |
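Few-shot evaluation on these tasks means training a classification head on only k labeled examples per class. A minimal sketch of balanced k-shot sampling (the helper name, seed, and toy data are illustrative, not the benchmark's actual sampling code):

```python
import random

def sample_k_shots(labeled_examples, k, seed=0):
    """Draw k training examples per class, as in few-shot evaluation."""
    rng = random.Random(seed)
    by_class = {}
    for example, label in labeled_examples:
        by_class.setdefault(label, []).append(example)
    shots = []
    for label in sorted(by_class):
        shots.extend((e, label) for e in rng.sample(by_class[label], k))
    return shots

# Toy binary task: patient IDs labeled positive (1) or negative (0).
data = [(pid, pid % 2) for pid in range(20)]
few_shot_train = sample_k_shots(data, k=2)
print(few_shot_train)  # 2 negative and 2 positive examples
```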

Comparison to Prior Work

Most prior benchmarks are (1) limited to the ICU setting and (2) not tailored towards few-shot evaluation of pre-trained models.

In contrast, EHRSHOT contains (1) the full breadth of longitudinal data that a health system would expect to have on the patients it treats and (2) a broad range of tasks designed to evaluate models' task adaptation and few-shot capabilities:

| Benchmark | Source | ICU/ED Visits | Non-ICU/ED Visits | # of Patients | # of Tasks | Few Shot | Dataset via DUA | Preprocessing Code | Model Weights |
|---|---|---|---|---|---|---|---|---|---|
| EHRSHOT | Stanford Medicine | ✓ | ✓ | 7k | 15 | ✓ | ✓ | ✓ | ✓ |
| MIMIC-Extract | MIMIC-III | ✓ | -- | 34k | 5 | -- | ✓ | ✓ | -- |
| Purushotham 2018 | MIMIC-III | ✓ | -- | 35k | 3 | -- | ✓ | ✓ | -- |
| Harutyunyan 2019 | MIMIC-III | ✓ | -- | 33k | 4 | -- | ✓ | ✓ | -- |
| Gupta 2022 | MIMIC-IV | ✓ | * | 257k | 4 | -- | ✓ | ✓ | -- |
| COP-E-CAT | MIMIC-IV | ✓ | * | 257k | 4 | -- | ✓ | ✓ | -- |
| Xie 2022 | MIMIC-IV | ✓ | * | 216k | 3 | -- | ✓ | ✓ | -- |
| eICU | eICU | ✓ | -- | 73k | 4 | -- | ✓ | ✓ | -- |
| EHR PT | MIMIC-III / eICU | ✓ | -- | 86k | 11 | ✓ | ✓ | ✓ | -- |
| FIDDLE | MIMIC-III / eICU | ✓ | -- | 157k | 3 | -- | ✓ | ✓ | -- |
| HiRID-ICU | HiRID | ✓ | -- | 33k | 6 | -- | ✓ | ✓ | -- |
| Solares 2020 | CPRD | ✓ | ✓ | 4M | 2 | -- | -- | -- | -- |

Citation

If you find this project helpful, please cite our paper:

@article{wornow2023ehrshot,
      title={EHRSHOT: An EHR Benchmark for Few-Shot Evaluation of Foundation Models}, 
      author={Michael Wornow and Rahul Thapa and Ethan Steinberg and Jason Fries and Nigam Shah},
      year={2023},
      eprint={2307.02028},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

The source code of this repo is released under the Apache License 2.0. The model licenses are listed on their corresponding webpages.