SQL scripts to establish an ICU patient cohort from the MIMIC-IV database for machine learning-based risk factor analysis.

MahbubAlam231/SQL-mimic-meropenem-iv-cohort

SQL: Meropenem IV Cohort (MIMIC-IV)

SQL scripts to establish an ICU patient cohort receiving intravenous meropenem in the MIMIC-IV database and extract covariates for machine-learning-based risk-factor analysis.

Overview

This repository provides:

  • A cohort definition based on meropenem administered via IV.
  • Reusable SQL for covariate extraction suitable for downstream ML and statistical modeling.
  • Notes for running on either PostgreSQL (local) or Google BigQuery.

MIMIC-IV is a large, de-identified critical care database hosted on PhysioNet. Access requires credentialing and a data use agreement. (PubMed, PhysioNet)


Data Access (MIMIC-IV)

  1. Complete human-subjects/privacy training (CITI Program “Data or Specimens Only Research”). (PhysioNet)

  2. Request PhysioNet credentialed access and sign the Data Use Agreement (DUA) for MIMIC-IV. (PhysioNet)

  3. Choose your compute environment:

    • BigQuery (recommended for ease & scale): MIMIC-IV v3.1 is available as mimiciv_v3_1_hosp / mimiciv_v3_1_icu. (PhysioNet)
    • Local PostgreSQL: load MIMIC-IV using community scripts (see links below). (GitHub, PhysioNet)

Cite MIMIC-IV: Johnson AEW, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1. doi:10.1038/s41597-022-01899-x. (PubMed, Nature)


Repository Structure

.
├── sql/
│   ├── sql_for_meropenem_iv_cohort.sql   # cohort + covariates
│   └── utils/                            # optional helpers (e.g., views)
├── README.md
└── LICENSE

Cohort Definition (Summary)

  • Inclusion

    • ICU stays in MIMIC-IV with documented meropenem administration via IV.
    • Index time = first IV meropenem start during the ICU stay.
  • Exclusions (example; adjust as needed)

    • Age < 18 years at admission.
    • Missing key timestamps or implausible intervals (e.g., an ICU discharge time recorded before the ICU admission time).

Adapt filters to your study protocol and local IRB requirements.
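
As a rough illustration of the inclusion and exclusion rules above, the following sketch builds a tiny in-memory SQLite database and derives one index time per ICU stay. It is not the repository's script. It assumes meropenem is identified from a prescriptions-style table with drug, route, and starttime columns (the actual script may use a different source table), that route is exactly 'IV', and that age at admission is approximated as anchor_age plus the difference between admission year and anchor_year. All rows are made up, and the date functions are SQLite-specific.

```python
import sqlite3

# Toy tables that mimic a small subset of MIMIC-IV columns (assumption: the
# real cohort script may draw meropenem from a different table).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patients (subject_id INTEGER, anchor_age INTEGER, anchor_year INTEGER);
CREATE TABLE admissions (hadm_id INTEGER, subject_id INTEGER, admittime TEXT);
CREATE TABLE icustays (stay_id INTEGER, hadm_id INTEGER, subject_id INTEGER,
                       intime TEXT, outtime TEXT);
CREATE TABLE prescriptions (hadm_id INTEGER, drug TEXT, route TEXT, starttime TEXT);

INSERT INTO patients VALUES (1, 60, 2150), (2, 16, 2150), (3, 45, 2150);
INSERT INTO admissions VALUES
  (10, 1, '2150-03-01 08:00:00'),
  (20, 2, '2151-01-01 00:00:00'),
  (30, 3, '2150-06-01 00:00:00');
INSERT INTO icustays VALUES
  (100, 10, 1, '2150-03-01 10:00:00', '2150-03-05 10:00:00'),
  (200, 20, 2, '2151-01-01 02:00:00', '2151-01-03 02:00:00'),
  (300, 30, 3, '2150-06-01 02:00:00', '2150-06-02 02:00:00');
INSERT INTO prescriptions VALUES
  (10, 'Meropenem',  'IV', '2150-03-02 09:00:00'),
  (10, 'Meropenem',  'IV', '2150-03-01 12:00:00'),
  (10, 'Meropenem',  'IV', '2150-03-06 09:00:00'),  -- after ICU discharge
  (20, 'Meropenem',  'IV', '2151-01-01 05:00:00'),  -- patient younger than 18
  (30, 'Vancomycin', 'IV', '2150-06-01 03:00:00');  -- different drug
""")

COHORT_SQL = """
SELECT i.stay_id, MIN(p.starttime) AS index_time
FROM icustays i
JOIN admissions a     ON a.hadm_id = i.hadm_id
JOIN patients pt      ON pt.subject_id = i.subject_id
JOIN prescriptions p  ON p.hadm_id = i.hadm_id
WHERE LOWER(p.drug) LIKE '%meropenem%'
  AND UPPER(p.route) = 'IV'                 -- simplification: exact match only
  AND p.starttime BETWEEN i.intime AND i.outtime
  AND i.outtime > i.intime                  -- drop implausible stay intervals
  AND pt.anchor_age
      + (CAST(strftime('%Y', a.admittime) AS INTEGER) - pt.anchor_year) >= 18
GROUP BY i.stay_id
ORDER BY i.stay_id
"""

def build_cohort(connection):
    """Return (stay_id, index_time) rows for adult stays with IV meropenem."""
    return connection.execute(COHORT_SQL).fetchall()

print(build_cohort(conn))
```

Timestamps are stored as ISO-8601 text so that string comparison matches chronological order in SQLite. On PostgreSQL or BigQuery the same logic would use native timestamp types and each engine's own year-extraction function.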


Covariates (Examples)

The script demonstrates how to join cohort rows to commonly used MIMIC tables to derive:

  • Demographics & admission details
  • Comorbidity summaries (e.g., Charlson)
  • Severity scores (e.g., SOFA/OASIS at or prior to index)
  • Vitals/labs around index window
  • Organ support (e.g., ventilation) and early interventions

Exact fields and windows are configurable near the top of the SQL.
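
To make the idea of a covariate window around the index time concrete, here is a minimal sketch that picks the most recent lab value in a look-back window before the index time. The cohort and labevents tables are toy stand-ins, the 24-hour default is an arbitrary choice, and the helper name last_lab_before_index is hypothetical. The itemid 50912 is used for creatinine; verify any itemid against d_labitems in your MIMIC-IV version before relying on it.

```python
import sqlite3

def last_lab_before_index(conn, itemid, window_hours=24):
    """Latest value of one lab item per stay within the look-back window.

    Stays with no measurement in the window are kept with a None value.
    """
    sql = """
    WITH ranked AS (
      SELECT c.stay_id, l.valuenum,
             ROW_NUMBER() OVER (
               PARTITION BY c.stay_id ORDER BY l.charttime DESC
             ) AS rn
      FROM cohort c
      JOIN labevents l
        ON l.hadm_id = c.hadm_id
       AND l.itemid = ?
       AND l.charttime <= c.index_time
       AND l.charttime >= datetime(c.index_time, ?)
    )
    SELECT c.stay_id, r.valuenum
    FROM cohort c
    LEFT JOIN ranked r ON r.stay_id = c.stay_id AND r.rn = 1
    ORDER BY c.stay_id
    """
    modifier = f"-{int(window_hours)} hours"  # SQLite date modifier
    return conn.execute(sql, (itemid, modifier)).fetchall()

# Toy data: made-up rows, not real MIMIC-IV records.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cohort (stay_id INTEGER, hadm_id INTEGER, index_time TEXT);
CREATE TABLE labevents (hadm_id INTEGER, itemid INTEGER, charttime TEXT, valuenum REAL);
INSERT INTO cohort VALUES
  (100, 10, '2150-03-01 12:00:00'),
  (101, 11, '2150-04-01 12:00:00');
INSERT INTO labevents VALUES
  (10, 50912, '2150-02-27 06:00:00', 0.9),  -- older than 24 h
  (10, 50912, '2150-03-01 06:00:00', 1.1),
  (10, 50912, '2150-03-01 11:00:00', 1.4),  -- latest value before index
  (10, 50912, '2150-03-01 13:00:00', 2.0),  -- after index, ignored
  (11, 50912, '2150-03-30 09:00:00', 1.0);  -- older than 24 h
""")

creatinine = last_lab_before_index(conn, itemid=50912, window_hours=24)
print(creatinine)
```

The LEFT JOIN keeps every cohort row, so missingness stays visible to downstream imputation instead of silently shrinking the cohort. The window function requires SQLite 3.25 or newer; the same pattern is available in PostgreSQL and BigQuery.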


Quick Start

Option A: BigQuery

  1. Open sql/sql_for_meropenem_iv_cohort.sql in the BigQuery console, or run it with the bq CLI.
  2. Set your project and dataset, then execute the script against MIMIC-IV v3.1 (mimiciv_v3_1_hosp, mimiciv_v3_1_icu). (PhysioNet)
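
If the SQL is written with PostgreSQL-style schema prefixes, one way to point it at the BigQuery datasets named above is a small text rewrite before submitting the query. The sketch below is only an illustration: it assumes the script qualifies tables as mimiciv_hosp.<table> and mimiciv_icu.<table>, and it assumes physionet-data as the hosting project ID. The helper name to_bigquery is hypothetical. Check both assumptions against your own access setup.

```python
import re

# Dataset names are taken from the Quick Start text; the source schema names
# and the default project ID below are assumptions.
BIGQUERY_DATASETS = {
    "mimiciv_hosp": "mimiciv_v3_1_hosp",
    "mimiciv_icu": "mimiciv_v3_1_icu",
}

_SCHEMA_REF = re.compile(r"\b(" + "|".join(BIGQUERY_DATASETS) + r")\.(\w+)")

def to_bigquery(sql: str, project: str = "physionet-data") -> str:
    """Rewrite schema.table references as backtick-quoted project.dataset.table."""
    def repl(match):
        dataset = BIGQUERY_DATASETS[match.group(1)]
        return f"`{project}.{dataset}.{match.group(2)}`"
    return _SCHEMA_REF.sub(repl, sql)

example = (
    "SELECT * FROM mimiciv_icu.icustays i "
    "JOIN mimiciv_hosp.prescriptions p USING (hadm_id)"
)
print(to_bigquery(example))
```

This is a plain regular-expression substitution, so it also rewrites matches inside string literals or comments. Dialect differences such as date functions still need to be handled separately.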

Option B: Local PostgreSQL

  1. Load MIMIC-IV into Postgres (see community loaders). (GitHub, PhysioNet)
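  2. Set your search_path to the MIMIC schemas, then run the command below. Replace mimiciv with the schema names your loader created (the mimic-code PostgreSQL build scripts, for example, create mimiciv_hosp and mimiciv_icu):
     psql 'dbname=mimic4 user=<you> options=--search_path=mimiciv' -f sql/sql_for_meropenem_iv_cohort.sql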

Reproducibility

  • Versioning: Record the MIMIC-IV version you queried and the date of extraction (e.g., v3.1). Schema and table names can change between versions. (PhysioNet)
  • Determinism: Seed any stochastic steps downstream (e.g., train/val/test splits).
  • Provenance: Record your SQL commit hash and BigQuery job IDs or Postgres dump checksum.
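
The determinism and provenance points can be sketched in a few lines of standard-library Python. This is an illustrative pattern, not part of the repository: the function names, the 70/15/15 split, and the choice to split at the subject_id level (so one patient never appears in two partitions) are assumptions. The SHA-256 of the SQL text is a stand-in for the commit hash when the script is run outside a git checkout.

```python
import hashlib
import json
import random

def split_subjects(subject_ids, seed=42, fractions=(0.7, 0.15, 0.15)):
    """Deterministically split unique subject IDs into train/val/test lists."""
    ids = sorted(set(subject_ids))      # fixed starting order before shuffling
    random.Random(seed).shuffle(ids)    # local RNG, global state untouched
    n_train = int(len(ids) * fractions[0])
    n_val = int(len(ids) * fractions[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],  # remainder goes to the test set
    }

def provenance_record(sql_text, mimic_version, seed):
    """Collect the facts needed to reproduce an extraction run."""
    return {
        "mimic_version": mimic_version,
        "sql_sha256": hashlib.sha256(sql_text.encode("utf-8")).hexdigest(),
        "split_seed": seed,
    }

splits = split_subjects(range(1, 11), seed=7)
record = provenance_record("SELECT 1;", mimic_version="v3.1", seed=7)
print(json.dumps(record, indent=2))
```

Storing the JSON record next to the exported cohort table makes it possible to check later which SQL text, data version, and seed produced a given model input.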

Ethics & Compliance

  • Use only for approved research purposes consistent with the MIMIC-IV DUA.
  • Do not attempt re-identification.
  • Follow institutional IRB/ethics guidance.

Helpful Links

  • MIMIC-IV dataset page (v3.1 / v2.2): versions, schema notes, and updates. (PhysioNet)
  • PhysioNet credentialing & DUA: steps to request access. (PhysioNet)
  • CITI course instructions for PhysioNet: recommended training track. (PhysioNet)
  • Postgres loading examples: community loaders and scripts. (GitHub, PhysioNet)
  • MIMIC-IV paper (Sci Data, 2023): canonical reference. (PubMed, Nature)

Citation

  • MIMIC-IV: Johnson AEW, et al. MIMIC-IV, a freely accessible electronic health record dataset. Sci Data. 2023;10(1):1. doi:10.1038/s41597-022-01899-x. (PubMed)

License

This code is released under the MIT License. Note that MIMIC-IV data remain governed by the PhysioNet DUA and are not redistributed here.

