-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
One-Click Installation Universal Script for MIMIC-III Clinical Database CareVue subset with PostgreSQL on MAC OS, Windows, and Linux #1471
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This clearly took a lot of effort and I am very appreciative of the PR!
I like the idea of a very easy installation (we've always tried to make it simpler). We did start off with a Makefile for creating MIMIC, but in the end I found it was very difficult to keep things friendly across operating systems, and opted for a fully SQL approach.
With that in mind, after a quick look, there are some important changes.
- I'd rather not split mimic-iii into mimic-iii and mimic-iii-carevue. Since mimic-iii-carevue is a subset of mimic-iii, it's easier to keep the one folder and use it for mimic-iii-carevue. You should be able to use most of the mimic-iii scripts to build the data. The only catch is the validation script, which we might need a new version of.
- You'll have to use the existing concepts, and not create new SQL files. That should only require judicious use of relative paths.
- I don't like having a registry file to register a new menu item in Windows. Instead, I'd ask the user to install the bash software you need, and tell them they need to launch a terminal in the folder. That avoids the awkwardness of having a registry file and it means all the operating systems have a mostly equivalent starting point.
- For install.sh, I've made comments inline above
- Internationalization of bash scripts is possible with some additional work, which seems like a better approach than the back and forth if statements.
Before you go ahead and put more work into this though, I'd like to discuss this with some colleagues (@tompollard / @briangow) - I'm not 100% sure we'd want to commit to maintaining something like this. It is hard enough to keep this repo up to date as is!
user="postgres" | ||
test_db_login=false | ||
ptn=" reject$| md5$| password$| scram-sha-256$| gss$| sspi$| ident$| peer$| pam$| ldap$| radius$| cert$" | ||
sed_ptn="s/$ptn/ trust/g" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's a good idea to change the PostgreSQL security for inexperienced users to trust. Instead, try to get this to work for OS authentication (either via md5 or scram-sha-256). I think the best approach is:
- Detect the type of auth. If it's incompatible (e.g. reject), ask the user if you can modify it to OS-level authentication (scram-sha-256). If they refuse, exit script.
- If it's password, offer to export the PGPASSWORD env var at the start of the run.
sed_version="$(which sed)" | ||
|
||
# reset the mode of local connection as "trust" | ||
if [[ "$os" == "Darwin" ]] && [[ "$sed_version" == "/usr/bin/sed" ]]; then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use "$os" == "Darwin"
because users may install a different version of sed
on their platform, instead you can check if --version has a valid return with gnu_sed=`sed --version 2>/dev/null`
-> if $gnu_sed is non-empty, then it's GNU sed
# the database user with the same name as the system user exists | ||
echo "" | ||
else | ||
# the database user with the same name as the system user does not exist |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments which repeat the below echo statement are redundant, and can be removed
else | ||
# the database user with the same name as the system user does not exist | ||
echo "--- The database user with the same name as the system user does not exist. ---" | ||
echo "--- The database user with the same name as the system user will be created. ---" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change this to carry on from the last sentence, e.g. "A new user will be created"
echo "" | ||
|
||
# create database user with the same name as the system user | ||
psql -U postgres -c "create user $(whoami) with superuser createdb createrole replication bypassrls password '$(whoami)';" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prompt the user for a password, rather than default it to the username.
build_starttime=$(date +%s) | ||
|
||
# drop database if exists | ||
psql -c "drop database if exists mimic3_carevue;" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The database name, schema name, and derived schema name should be environment variables set at the top of the script. This will allow you to add a --carevue flag which changes these variables to carevue specific ones.
I would suggest the database remains simple (mimic
), but the schema is updated to mimiciii_cv
, and the derived schema is also changed mimiciii_cv_derived
.
Thank you, Johnson! I think I can solve the problems you pointed out in the next version according to your requirements. But I need to check with you a few questions before I PR the next version of the shell script:
|
Dear Administrator of MIMIC Team:
I have completed the re-edition of SQL codes to install MIMIC-III Clinical Database CareVue subset(I delete the SQL query for Metavision subset and convert BigQuery function to Postgresql function), and I have written a shell script that can install the database and Concepts with Postgresql on Mac, Linux, and Windows.
MIMIC new beginners only need to type
bash install.sh
in the terminal (For windows, the terminal was provided bybash.exe
of Git software inC:\Program Files\Git\bin
), and then the database can be automatically configured and installed. The whole process does not require any further operation.Sepsis3 has been added to the Concepts, which was edited based on A Comparative Analysis of Sepsis Identification Methods in an Electronic Database. sepsis3-mimic created byAlistair Johnson。
Charlson Comorbidity Index has been added to the Concepts, which was edited based onCharlson in Concepts of MIMICIV。
The readme profile contains detailed illustrations, which should be clear enough for beginners. Although the process is simple, I plan to upload a tutorial for MAC Linux, and Windows on Youtube. But the video tutorial is still being edited, and the video link will be added to the readme profile later. I hope the PR I submitted can be accepted, if you have any further questions, please get in touch with me. I'm looking forward to hearing back from you.
@alistairewj @tompollard
Chris Ning
ningyile@qq.com