Disparities in MIMIC-III

This repository recreates the plots from "Can AI Help Reduce Disparities in General Medical and Mental Health Care?" by Chen, Szolovits, and Ghassemi (AMA Journal of Ethics, 2019).

Because the psychiatric dataset is proprietary, we cannot share it. The same code is used for both datasets.

We demonstrate:

Data heterogeneity in the MIMIC clinical notes, shown through LDA topic modeling, and disparities in topics by race, gender, and insurance type
Disparities in predictive accuracy by race, gender, and insurance type
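
To make the second point concrete, here is a minimal sketch of how a per-group error rate could be computed. It is not the code in run_error.py; the function name error_rate_by_group, the zero-one loss, and the toy labels are assumptions chosen only for illustration.

```python
from collections import defaultdict


def error_rate_by_group(y_true, y_pred, groups):
    """Return {group: fraction of misclassified examples} (zero-one loss)."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] += 1
        errors[group] += int(truth != pred)
    return {group: errors[group] / totals[group] for group in totals}


# Toy, made-up labels purely for illustration; this is not MIMIC data.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 0, 0]
groups = ["public", "public", "public", "private", "private", "private"]

rates = error_rate_by_group(y_true, y_pred, groups)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.3f}")
```

A gap between the per-group rates is the kind of disparity the plots visualize; a real analysis would also need confidence intervals, since small groups give noisy estimates.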

Recreating results

Extract the MIMIC notes with make_mimic_notes.py. You will need to adjust the username and the location of the MIMIC data.
Generate Mallet topics from the notes. make_mallet_data.py converts the notes into separate text files, and run_mallet_topics.sh then runs Mallet on them.
Create the plots in Recreate_Plots.ipynb.
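
The note-to-file conversion in the second step can be sketched as follows. This is an illustrative stand-in, not the contents of make_mallet_data.py: the function write_notes_as_files, the dictionary input, and the "<note id>.txt" naming are all assumptions. The idea is one plain-text file per note, which is the layout Mallet's import-dir command reads.

```python
import tempfile
from pathlib import Path


def write_notes_as_files(notes, out_dir):
    """Write each note to its own UTF-8 text file and return the paths.

    `notes` maps a note identifier to the note text (hypothetical input shape).
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    written = []
    for note_id, text in notes.items():
        file_path = out_path / f"{note_id}.txt"
        # Collapse runs of whitespace so each note becomes a single line.
        file_path.write_text(" ".join(text.split()), encoding="utf-8")
        written.append(file_path)
    return written


# Toy, made-up notes purely for illustration; this is not MIMIC data.
toy_notes = {
    101: "Patient admitted with  chest pain.\nStable overnight.",
    102: "Follow-up visit; no acute distress.",
}
demo_dir = tempfile.mkdtemp()
paths = write_notes_as_files(toy_notes, demo_dir)
print([p.name for p in paths])
```

Writing one file per note keeps document boundaries explicit, so each note is treated as a separate document when topics are inferred.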
