SQL scripts to establish an ICU patient cohort receiving intravenous meropenem in the MIMIC-IV database and extract covariates for machine-learning-based risk-factor analysis.
This repository provides:
- A cohort definition based on meropenem administered via IV.
- Reusable SQL for covariate extraction suitable for downstream ML and statistical modeling.
- Notes for running on either PostgreSQL (local) or Google BigQuery.
MIMIC-IV is a large, de-identified critical care database hosted on PhysioNet. Access requires credentialing and a data use agreement. (PubMed, PhysioNet)
- Complete human-subjects/privacy training (CITI Program “Data or Specimens Only Research”). (PhysioNet)
- Request PhysioNet credentialed access and sign the Data Use Agreement (DUA) for MIMIC-IV. (PhysioNet)
- Choose your compute environment: PostgreSQL (local) or Google BigQuery.
Cite MIMIC-IV: Johnson AEW, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1. doi:10.1038/s41597-022-01899-x. (PubMed, Nature)
.
├── sql/
│ ├── sql_for_meropenem_iv_cohort.sql # cohort + covariates
│ └── utils/ # optional helpers (e.g., views)
├── README.md
└── LICENSE
- Inclusion
  - ICU stays in MIMIC-IV with documented IV meropenem administration.
  - Index time = first IV meropenem start during the ICU stay.
- Exclusions (example; adjust as needed)
  - Age < 18 years at admission.
  - Missing key timestamps or implausible intervals.
Adapt filters to your study protocol and local IRB requirements.
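
The inclusion and exclusion rules can be prototyped without credentialed data. The following is a minimal sketch, not the repository's actual query: it uses Python's built-in `sqlite3` with toy tables (`icustays`, `patients`, `meropenem_iv`) whose names and columns, including the precomputed `age_at_admission`, are simplified assumptions rather than the real MIMIC-IV schema. It takes the first in-stay IV meropenem start as the index time and applies the two example exclusions.

```python
import sqlite3

# Toy stand-ins for MIMIC-IV tables (hypothetical, simplified columns).
SCHEMA = """
CREATE TABLE icustays (stay_id INTEGER, subject_id INTEGER, intime TEXT, outtime TEXT);
CREATE TABLE patients (subject_id INTEGER, age_at_admission INTEGER);
CREATE TABLE meropenem_iv (stay_id INTEGER, starttime TEXT);
"""

# Index time = earliest IV meropenem start that falls inside the ICU stay.
COHORT_SQL = """
SELECT icu.stay_id,
       icu.subject_id,
       MIN(mer.starttime) AS index_time
FROM icustays AS icu
JOIN patients AS pat
  ON pat.subject_id = icu.subject_id
JOIN meropenem_iv AS mer
  ON mer.stay_id = icu.stay_id
 AND mer.starttime >= icu.intime
 AND mer.starttime <= icu.outtime
WHERE pat.age_at_admission >= 18      -- exclusion: minors
  AND icu.intime IS NOT NULL          -- exclusion: missing timestamps
  AND icu.outtime IS NOT NULL
  AND icu.outtime > icu.intime        -- exclusion: implausible interval
GROUP BY icu.stay_id, icu.subject_id
ORDER BY icu.stay_id
"""


def demo_connection():
    """Build an in-memory database with a few synthetic rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO icustays VALUES (?, ?, ?, ?)", [
        (1, 10, "2150-01-01 08:00:00", "2150-01-05 08:00:00"),
        (2, 11, "2150-02-01 08:00:00", "2150-02-03 08:00:00"),  # patient is 17
        (3, 12, "2150-03-01 08:00:00", None),                   # missing outtime
        (4, 13, "2150-04-01 08:00:00", "2150-04-02 08:00:00"),  # never on meropenem
    ])
    conn.executemany("INSERT INTO patients VALUES (?, ?)",
                     [(10, 64), (11, 17), (12, 70), (13, 55)])
    conn.executemany("INSERT INTO meropenem_iv VALUES (?, ?)", [
        (1, "2149-12-31 23:00:00"),  # before ICU admission, ignored
        (1, "2150-01-02 10:00:00"),
        (1, "2150-01-01 12:00:00"),  # earliest in-stay dose
        (2, "2150-02-01 09:00:00"),
        (3, "2150-03-01 10:00:00"),
    ])
    return conn


def build_cohort(conn):
    """Return (stay_id, subject_id, index_time) rows for eligible stays."""
    return conn.execute(COHORT_SQL).fetchall()


if __name__ == "__main__":
    print(build_cohort(demo_connection()))
```

Only stay 1 survives in this toy data. ISO-formatted timestamp strings are used so that text comparison matches chronological order in SQLite; on PostgreSQL or BigQuery the same comparisons would run on native timestamp columns.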
The script demonstrates how to join cohort rows to commonly used MIMIC tables to derive:
- Demographics & admission details
- Comorbidity summaries (e.g., Charlson)
- Severity scores (e.g., SOFA/OASIS at or prior to index)
- Vitals/labs around index window
- Organ support (e.g., ventilation) and early interventions
Exact fields and windows are configurable near the top of the SQL.
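
To illustrate what a window around the index time means for vitals and labs, here is a small Python sketch. The window bounds (24 hours before to 6 hours after index) and the function name `closest_in_window` are assumptions chosen for illustration; the actual windows are whatever is configured in the SQL script.

```python
from datetime import datetime, timedelta


def closest_in_window(measurements, index_time,
                      before=timedelta(hours=24), after=timedelta(hours=6)):
    """Return the (charttime, value) pair nearest to index_time within the window.

    measurements: iterable of (charttime, value) tuples for one stay and one item.
    Returns None when no non-missing value falls inside the window.
    """
    lo, hi = index_time - before, index_time + after
    candidates = [(t, v) for t, v in measurements
                  if v is not None and lo <= t <= hi]
    if not candidates:
        return None
    return min(candidates, key=lambda tv: abs(tv[0] - index_time))


if __name__ == "__main__":
    index = datetime(2150, 1, 1, 12, 0)
    creatinine = [
        (index - timedelta(hours=30), 1.0),  # outside the window
        (index - timedelta(hours=5), 1.4),
        (index + timedelta(hours=2), 2.1),   # nearest to index
        (index + timedelta(hours=10), 3.0),  # outside the window
    ]
    print(closest_in_window(creatinine, index))
```

Taking the value nearest to index is only one possible rule; first, last, minimum, or maximum within the window are equally common and depend on the study protocol.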
BigQuery
- Upload `sql_for_meropenem_iv_cohort.sql` to the BigQuery Console or run it via the `bq` CLI.
- Set the project and dataset, then execute the script against MIMIC-IV v3.1 (`mimiciv_v3_1_hosp`, `mimiciv_v3_1_icu`). (PhysioNet)

PostgreSQL (local)
- Load MIMIC-IV into Postgres (see community loaders). (GitHub, PhysioNet)
- Set your `search_path` to the MIMIC schemas, then run:
  `psql 'dbname=mimic4 user=<you> options=--search_path=mimiciv' -f sql/sql_for_meropenem_iv_cohort.sql`

- Versioning: Note the MIMIC-IV version and date (e.g., v3.1). Schema names/tables can change between versions. (PhysioNet)
- Determinism: Seed any stochastic steps downstream (e.g., train/val/test splits).
- Provenance: Record your SQL commit hash and BigQuery job IDs or Postgres dump checksum.
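
One way to act on the determinism and provenance notes, sketched in Python under stated assumptions: the seed value, split fractions, and function names below are hypothetical and not part of this repository. Splitting by `subject_id` rather than by stay keeps all stays of one patient in the same partition, and the file checksum can be logged alongside the SQL commit hash.

```python
import hashlib
import random


def split_subjects(subject_ids, seed=42, val_frac=0.1, test_frac=0.1):
    """Deterministically partition subject IDs into train/val/test lists."""
    ids = sorted(set(subject_ids))      # fixed starting order, independent of input order
    random.Random(seed).shuffle(ids)    # local RNG; global random state is untouched
    n_test = int(len(ids) * test_frac)
    n_val = int(len(ids) * val_frac)
    return {
        "test": ids[:n_test],
        "val": ids[n_test:n_test + n_val],
        "train": ids[n_test + n_val:],
    }


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    splits = split_subjects(range(1000))
    print({name: len(ids) for name, ids in splits.items()})
```

Calling `split_subjects` twice with the same seed and the same set of IDs yields identical partitions regardless of input order, which makes the split reproducible across runs.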
- Use only for approved research purposes consistent with the MIMIC-IV DUA.
- Do not attempt re-identification.
- Follow institutional IRB/ethics guidance.
- MIMIC-IV dataset page (v3.1 / v2.2): versions, schema notes, and updates. (PhysioNet)
- PhysioNet credentialing & DUA: steps to request access. (PhysioNet)
- CITI course instructions for PhysioNet: recommended training track. (PhysioNet)
- Postgres loading examples: community loaders and scripts. (GitHub, PhysioNet)
- MIMIC-IV paper (Sci Data, 2023): canonical reference. (PubMed, Nature)
- MIMIC-IV: Johnson AEW, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1. doi:10.1038/s41597-022-01899-x. (PubMed)
This code is released under the MIT License. Note that MIMIC-IV data remain governed by the PhysioNet DUA and are not redistributed here.