Code for preprocessing, dataloader creation, and Self-Supervised Learning (SSL) evaluation on the current ENIGMA-PTSD dataset. To access the ENIGMA-PTSD data, please contact Dr. Xi Zhu.
This repository contains:
- Preprocessing pipelines, in this folder, that generate:
  - structural features (sMRI-derived)
  - fALFF/ReHo (rs-fMRI-derived)
  - RSData (resting-state 4D time-series–derived)
- Dataloader creation utilities for downstream SSL evaluation/training in this folder.
To run the preprocessing and DataLoader creation, follow the steps below in sequence:
git clone https://github.com/BRAINLAB-UTA/ENIGMA-PTSD.git
cd ENIGMA-PTSD
Make sure pip is installed before continuing.
Using Conda
conda create -n enigma-ptsd python=3.11 -y
conda activate enigma-ptsd
pip install -r requirements.txt
Using venv
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
For a full installation guide covering all operating systems, follow this Installation Guide.
Here we describe the intended flow for producing the derived files needed by the dataloader.
After you get access to the ENIGMA dataset, running ls -lrth in the main folder shows its top-level structure. Note that this data is NOT structured in BIDS format. The main folder holds the input subfolders RSData, Structure, and falff_ReHo, whose absolute paths you will use in your Python code.
The data consists of per-subject inputs distributed across the modality folders as follows:
- The thickness, volume, and surface .xlsx files in the folder Structural
- Resting-state fMRI (4D NIfTI) files, the TR metadata .tsv files, and the corresponding .json files in the folder RSData
- ALFF/fALFF/ReHo 3D images in the folder falff_reHo, also denoted as the aCompcorOnly inner folder
- Site/subject mapping tables (IDs, site names), as used by the ENIGMA project in the main ENIGMA annotation spreadsheet, ENIGMA-PGC_master_v1.3.1.xlsx
The inner folder of each modality is structured like this:
/DATA_Modality/
    siteA/
        sub-XXXX/
            ..rest.tsv
            or
            ..structure_thick_vol_surf.xlsx
            or
            ..falff_reho.nii.gz
    siteB/
        sub-YYYY/
            ...
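As a minimal sketch of how the site/subject layout above can be traversed, the snippet below indexes the files of one modality folder by site and subject. The function name `index_subjects` and the suffix argument are hypothetical and are not taken from this repository; the only assumption is the `<modality>/<site>/<subject>/` nesting described above.

```python
from pathlib import Path


def index_subjects(modality_root, suffix):
    """Map (site, subject) -> sorted list of files whose name ends with `suffix`.

    Hypothetical helper: assumes the <modality>/<site>/<subject>/ nesting
    described in this README, not any function of the repository.
    """
    index = {}
    root = Path(modality_root)
    for site_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for sub_dir in sorted(p for p in site_dir.iterdir() if p.is_dir()):
            files = sorted(
                f for f in sub_dir.iterdir()
                if f.is_file() and f.name.endswith(suffix)
            )
            # Subjects without a matching file are left out of the index.
            if files:
                index[(site_dir.name, sub_dir.name)] = files
    return index
```

Because suffixes differ per site and modality, a call such as `index_subjects("/path/to/RSData", "rest.tsv")` would probably need a different suffix for each modality folder.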
Again, this data does not follow the standard BIDS format. The current code handles the different annotations and suffixes that come with each site and modality. The data is in the process of being re-annotated and restructured into BIDS.
To create the 3D structural images on the Destrieux and Desikan-Killiany-Tourville (DKT) atlases, run the following Python command in the preprocessing folder:
python create_parcellation_structural.py <structural_path>
This code generates three types of 3D images in the Structural folder, vol (Volume), thick (Cortical Thickness), and surf (Surface), with the corresponding suffixes per subject and site.
This command can take a couple of minutes depending on your CPU, so be patient. Note that you must set the absolute or relative paths for all modalities before running this command and the ones that follow.
The next command generates projected 4D nii.gz images from the preprocessed .tsv files in the RSData folder. It builds the 4D images from the Schaefer and Brainnetome atlases at the original resolution of the MNI152NLin2009cAsym MNI mask from FreeSurfer (193 x 215 x 193 voxels in X, Y, Z). You will need FreeSurfer to obtain those masks and atlases. To install FreeSurfer, select the appropriate tarball here and follow the instructions.
python create_parcellation_images_mni.py <rsdata_path> <atlases_path>
To create alternative 4D images resampled along the X, Y, and Z voxel dimensions by an integer decimation factor, run the following command:
python create_parcellation_images_mni_smaller_resample.py <decimation_factor> <rsdata_path> <atlases_path>
This code generates the interim projected, or projected and resampled, images in the RSData folder with the corresponding atlas suffix (schaefer or brainnetome) per subject and site.
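To illustrate the idea behind the two commands above, here is an array-level sketch of projecting ROI time series onto an atlas and then decimating the result by an integer factor. It is not the repository's implementation: the function names `project_timeseries` and `decimate` are hypothetical, the atlas is assumed to be an integer label volume with 0 as background, and the decimation uses plain strided slicing, whereas the actual scripts may resample with interpolation and update the NIfTI affine.

```python
import numpy as np


def project_timeseries(atlas, timeseries):
    """Paint each ROI time series onto the voxels carrying that ROI label.

    atlas: 3D integer array, 0 = background, labels 1..R (assumption).
    timeseries: 2D array of shape (T, R); column r - 1 belongs to label r.
    Returns a 4D array of shape atlas.shape + (T,).
    """
    n_timepoints, n_rois = timeseries.shape
    if atlas.max() > n_rois:
        raise ValueError("atlas has more labels than time-series columns")
    # Prepend a zero column so that label 0 (background) maps to zeros.
    lookup = np.concatenate([np.zeros((n_timepoints, 1)), timeseries], axis=1)
    # lookup.T has shape (R + 1, T); indexing it with the label volume
    # yields an array of shape (X, Y, Z, T).
    return lookup.T[atlas]


def decimate(volume, factor):
    """Keep every `factor`-th voxel along X, Y, Z (no anti-alias filtering)."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    # Any trailing axis (such as time) is left untouched.
    return volume[::factor, ::factor, ::factor]


# Toy example: a 4 x 4 x 4 atlas with two ROIs and three time points.
atlas = np.zeros((4, 4, 4), dtype=int)
atlas[:2] = 1
atlas[2:] = 2
timeseries = np.arange(6, dtype=float).reshape(3, 2)
image_4d = project_timeseries(atlas, timeseries)  # shape (4, 4, 4, 3)
small_4d = decimate(image_4d, 2)                  # shape (2, 2, 2, 3)
```

Strided slicing is the simplest form of integer decimation; it keeps label boundaries sharp but can alias fine structure, which is one reason a real pipeline might prefer a proper resampling routine.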
The fALFF/ReHo images are already derived (they come from the aCompcorOnly folder), so they do not need to be processed before running the dataloader.
To run a visual QC after the images are generated per site and subject, first install FSL following the steps on the official website https://fsl.fmrib.ox.ac.uk/fsl/docs/install/.
To install FSL, download the fslinstaller.py file, run the command below, and choose a folder on your local machine to hold the bin files.
python fslinstaller.py
Now add the bin folder to your shell environment (bashrc) using the .sh script here. First locate where the FSL files are on your local machine and edit the script with the right path, then run:
source fsl_load.sh
Now you can inspect the quality of your data with fsleyes using the following bash command, which takes the paths of the 3D Structural, 4D RSData, and 3D fALFF/ReHo images of the same subject and site. The paths in the example are for site AMC. In fsleyes, set the opacity of the 4D images to 0.5 and change the colormap of the 3D parcellation images from grayscale to HSV.
fsleyes ../../AMC/sub-1132/sub-1132_schaefer_4d_mni_image.nii.gz ../../AMC/sub-1132/sub-1132_brainnetome_4d_image.nii.gz ../../AMC/sub-1132/sub-1132_schaefer_mni_image.nii.gz ../../AMC/sub-1132/sub-1132_brainnetome_mni_image.nii.gz ../../AMC/sub-1132/1132_Destrieux_thick_struct3D.nii.gz
Check the time-series plots and the alignment of the different ROIs on each image to make sure the code is working as you expect.
After concatenating all subjects that have RSData, Structural, and fALFF/ReHo data, we obtained a final overlap of 1665 subjects across the sites ['Duke', 'Muenster', 'WacoVA', 'Capetown', 'AMC', 'Lawson', 'Vanderbilt', 'Ghent', 'MinnVA', 'Milwaukee', 'Emory', 'Masaryk', 'Beijing', 'UWash', 'Grupe', 'McLean', 'NanjingYixing', 'Tours', 'Toledo', 'Groningen'].
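The overlap across modalities can be pictured as a set intersection over (site, subject) keys. The sketch below is an illustration under that assumption only; the helper name `overlapping_subjects` is hypothetical, and the way the repository actually matches subject IDs across sites and suffixes is not shown here.

```python
def overlapping_subjects(*modalities):
    """Return the sorted (site, subject) keys present in every modality.

    Each argument is any iterable of (site, subject) tuples, for example
    the keys of a per-modality index (hypothetical input format).
    """
    if not modalities:
        return []
    common = set(modalities[0])
    for keys in modalities[1:]:
        # Keep only the subjects that also appear in this modality.
        common &= set(keys)
    return sorted(common)


# Toy example with three modalities.
rsdata = [("AMC", "sub-1132"), ("Duke", "sub-0001")]
structural = [("AMC", "sub-1132"), ("Duke", "sub-0001"), ("Duke", "sub-0002")]
falff_reho = [("AMC", "sub-1132")]
shared = overlapping_subjects(rsdata, structural, falff_reho)
```

In this toy example only the AMC subject appears in all three lists, so `shared` contains that single key.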
To check the quality of your code, follow the bash commands here, which set up a standardized Python linter. Run the following bash command in each subfolder:
source standardize_template.sh
Customize your toml file following the rules here, and make sure each folder reaches a minimum grade of 8 out of 10 in the static code evaluation.