This project ingests NEMSIS-compliant XML files into a dynamic PostgreSQL database schema, creating tables and columns dynamically based on the XML structure. It is designed for scalable, flexible EMS data warehousing and analysis of exports from NEMSIS-compliant software vendors. As long as the software is configured to export agency-specific custom questions as well, they will be included in the database for analysis.
The hope is that by allowing agencies to build data lakes or data warehouses internally, this tool will reduce the need for form building or external reporting mechanisms for KPI gathering and quality management systems.
While this tool is useful for creating an accessible database for use as a datalake and to ensure up-to-date and accurate KPI creation and visual analysis of data, it is crucial to remember that implementation must involve a proper compliance team specialized in healthcare IT. These records DO contain Protected Health Information (PHI) and must be handled in accordance with the principle of minimum necessary to achieve the task at hand. Compliance with HIPAA and the 21st Century CURES Act is necessary. Always ensure that all data handling, storage, and access are reviewed and approved by qualified compliance professionals to protect patient privacy and meet all legal and regulatory requirements.
- Dynamic Table Creation: Tables are created based on XML tag structure.
- UUID-based Overwrite: Data is keyed by PatientCareReport UUID for safe updates.
- Table Comments: Each table stores its XML path as a PostgreSQL table comment.
- Bulk Ingestion: Easily process all XML files in a directory.
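The dynamic table creation can be illustrated with a minimal sketch (assuming `xml.etree.ElementTree` parsing and a simplified naming scheme; the actual ingestion scripts may derive names and types differently):

```python
import xml.etree.ElementTree as ET

def table_sql_for_element(xml_text: str) -> str:
    """Derive a CREATE TABLE statement from the structure of an XML element.

    Hypothetical simplification: the root tag becomes the table name and
    each child tag becomes a TEXT column, keyed by the report UUID.
    """
    root = ET.fromstring(xml_text)
    table = root.tag.lower()
    columns = sorted({child.tag.lower() for child in root})
    column_defs = ", ".join(f"{c} TEXT" for c in columns)
    return f"CREATE TABLE IF NOT EXISTS {table} (pcr_uuid UUID, {column_defs});"

sample = """<eTimes>
  <eTimes01>2024-01-01T00:00:00Z</eTimes01>
  <eTimes03>2024-01-01T00:05:00Z</eTimes03>
</eTimes>"""
print(table_sql_for_element(sample))
```

Because tables are keyed on the PatientCareReport UUID, re-ingesting the same report replaces its rows rather than duplicating them.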
- Python 3.8+
- PostgreSQL (with a database you can connect to)
- Python packages:
  - psycopg2
  - python-dotenv
Install dependencies:

```shell
pip install psycopg2 python-dotenv
```
Create a `.env` file in the project root with your PostgreSQL connection details:
```
PG_HOST=localhost
PG_PORT=5432
PG_DATABASE=your_database
PG_USER=your_user
PG_PASSWORD=your_password
```
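To show how these variables map onto a database connection, here is a small stdlib-only sketch (the helper name and defaults are illustrative; in the project, `python-dotenv`'s `load_dotenv()` populates the environment first):

```python
import os

def pg_connect_kwargs(env=os.environ) -> dict:
    """Map the PG_* environment variables (loaded from .env by
    python-dotenv) to psycopg2.connect keyword arguments."""
    return {
        "host": env.get("PG_HOST", "localhost"),
        "port": int(env.get("PG_PORT", "5432")),
        "dbname": env.get("PG_DATABASE"),
        "user": env.get("PG_USER"),
        "password": env.get("PG_PASSWORD"),
    }

# Hypothetical usage: conn = psycopg2.connect(**pg_connect_kwargs())
print(pg_connect_kwargs({"PG_HOST": "db", "PG_DATABASE": "nemsis"}))
```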
- Create the PostgreSQL database (if it does not exist):
Connect to your PostgreSQL server and run:

```sql
CREATE DATABASE your_database;
```
- Create tables and initialize schema:

```shell
python database_setup.py
```

This will create the required tables in your database.
To ingest a single XML file:

```shell
python main_ingest.py nemsis_xml/your_file.xml
```
To ingest all XML files in a directory (PowerShell example):

```powershell
Get-ChildItem -Path .\nemsis_xml\*.xml | ForEach-Object { python main_ingest.py $_.FullName }
```
Or (CMD example):

```bat
for %f in (nemsis_xml\*.xml) do python main_ingest.py "%f"
```
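A cross-platform alternative to the shell loops above is a short Python driver (a sketch; it assumes `main_ingest.py` is in the current directory and accepts a single file path, as shown above):

```python
import subprocess
import sys
from pathlib import Path

def ingest_all(directory: str) -> list:
    """Run main_ingest.py once per XML file and collect any failures."""
    failed = []
    for xml_file in sorted(Path(directory).glob("*.xml")):
        result = subprocess.run([sys.executable, "main_ingest.py", str(xml_file)])
        if result.returncode != 0:
            failed.append(xml_file.name)
    return failed

if __name__ == "__main__":
    failures = ingest_all("nemsis_xml")
    print(f"{len(failures)} file(s) failed: {failures}")
```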
To generate or refresh all normalized SQL views for the NEMSIS data, run:

```shell
python create_views.py [--verbose]
```
- The `--verbose` flag (optional) will print the generated SQL for each view.
- This script will also update the `ElementDefinitions` and `FieldDefinitions` tables from the latest NEMSIS sources.
To manually (re)populate the `ElementDefinitions` and `FieldDefinitions` tables from the official NEMSIS sources:

```shell
python create_definitions.py
```
- This will download the latest definitions and update the tables in your database.
To import vendor-specific Excel exports into your database, use:

```shell
python vendor_import.py -file_path <path_to_excel> -vendor <vendor_name> -source <source_name>
```
- Example:

```shell
python vendor_import.py -file_path "./vendor_data.xlsx" -vendor imagetrend -source new_hampshire
```
- The script will create tables named `<source>_<sheetname>` for each supported sheet, using only non-blank rows and the columns specified in the script for that vendor.
- Sheet and column mappings are hardcoded per vendor in the script. Update `VENDOR_SPECS` in `vendor_import.py` to add or modify vendor logic.
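The shape of `VENDOR_SPECS` is roughly a nested mapping from vendor to sheet names and column lists, as in this sketch (the vendor, sheet, and column names below are illustrative, not the script's actual entries):

```python
# Hypothetical example of the per-vendor mapping in vendor_import.py;
# the real sheet and column names are defined in the script itself.
VENDOR_SPECS = {
    "imagetrend": {
        "Incidents": ["Incident Number", "Incident Date", "Disposition"],
        "Medications": ["Incident Number", "Medication", "Dose"],
    },
}

def table_name(source: str, sheet: str) -> str:
    """Build the <source>_<sheetname> table name used for imports."""
    return f"{source}_{sheet}".lower().replace(" ", "_")

print(table_name("new_hampshire", "Incidents"))
```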
- Processed XML files are archived in the `processed_xml_archive/` directory.
- Data is available in your PostgreSQL database, with dynamic tables for each XML tag type and vendor import.
- SQL views are available for normalized, analysis-ready querying.
- Element and field definitions are available in the `ElementDefinitions` and `FieldDefinitions` tables.
- The ingestion script will skip or update data based on the PatientCareReport UUID.
- Table comments in PostgreSQL will contain the XML path for each table.
- You can use standard SQL tools to query and analyze the ingested data.
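The UUID-keyed skip-or-update behavior corresponds to a PostgreSQL upsert; a hedged sketch of the kind of statement involved (table and column names are illustrative, and the actual ingestion script may build its SQL differently):

```python
def upsert_sql(table: str, columns: list) -> str:
    """Build an INSERT ... ON CONFLICT statement keyed on the
    PatientCareReport UUID, so re-ingesting a file updates in place."""
    cols = ", ".join(["pcr_uuid"] + columns)
    placeholders = ", ".join(["%s"] * (len(columns) + 1))
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in columns)
    return (
        f"INSERT INTO {table} ({cols}) VALUES ({placeholders}) "
        f"ON CONFLICT (pcr_uuid) DO UPDATE SET {updates};"
    )

print(upsert_sql("etimes", ["etimes01", "etimes03"]))
```

The XML path stored as a table comment can be read back in SQL with PostgreSQL's `obj_description('etimes'::regclass)`.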
- Database connection errors: Ensure your `.env` is correct and the database exists.
- Missing dependencies: Install with `pip install psycopg2 python-dotenv`.
- Permission errors: Make sure your PostgreSQL user has rights to create tables and comments.
To spin up a local PostgreSQL instance with Docker:

```shell
docker run -p 5432:5432 -d -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=nemsis postgres
```
MIT License