Alexandre Routier edited this page Jun 10, 2020 · 12 revisions

Data handling tools

This page describes data handling tools provided by Clinica for BIDS and CAPS compliant datasets. These tools provide easy interaction mechanisms with datasets, including generating subject lists or merging all tabular data into a single TSV for analysis with external statistical software tools.

## create-subjects-visits - Generate the list of all subjects and visits of a given dataset

A TSV file with two columns (participant_id and session_id) containing the list of visits for each subject can be created as follows:

clinica iotools create-subjects-visits <bids_directory> <output_tsv>

where:

  • bids_directory: input folder of a BIDS compliant dataset,
  • output_tsv: output TSV file containing the subjects with their sessions.

Here is an example of the file generated by this tool:

participant_id   session_id
sub-01           ses-M00
sub-02           ses-M24
sub-03           ses-M24
...

!!! note
    The format of the participant ID and the session ID follows the BIDS standard.

!!! example
    clinica iotools create-subjects-visits /home/ADNI_BIDS/ adni_participants.tsv
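Once the file exists, it can be read with any TSV reader. The following is a minimal sketch, not part of Clinica, that groups sessions by participant using only the Python standard library. The inline sample and the helper name `sessions_by_participant` are assumptions made for illustration; the sample assumes tab-separated columns, as expected for a TSV file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample mirroring the example file shown above. A real file
# would be opened with open("adni_participants.tsv", newline="") instead.
SUBJECTS_VISITS_TSV = (
    "participant_id\tsession_id\n"
    "sub-01\tses-M00\n"
    "sub-02\tses-M24\n"
    "sub-03\tses-M24\n"
)


def sessions_by_participant(handle):
    """Group session labels by participant from a tab-separated file object."""
    visits = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        visits[row["participant_id"]].append(row["session_id"])
    return dict(visits)


visits = sessions_by_participant(io.StringIO(SUBJECTS_VISITS_TSV))
for participant, sessions in visits.items():
    print(participant, sessions)
```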

## check-missing-modalities - Check missing modalities for each subject

Starting from a BIDS compliant dataset, this command creates:

  1. A TSV file for each session available with the list of the modalities found for each subject. The name of the files produced will be <prefix>_ses-<session_label>.tsv.
  2. A text file containing the number and the percentage of modalities missing for each session. The name of the files produced will be <prefix>_summary.txt.

If no value for <prefix> is specified by the user, the default will be missing_mods.

clinica iotools check-missing-modalities <bids_directory> <output_directory> [-op]

where:

  • bids_directory: input folder of a BIDS compliant dataset
  • output_directory: output folder
  • -op / --output_prefix (Optional): prefix used for the name of the output files. If not specified the default value will be missing_mods

If, for example, only the session M00 is available and the parameter -op is not specified, the command will create the files:

  • missing_mods_ses-M00.tsv
  • missing_mods_summary.txt

The content of missing_mods_ses-M00.tsv will look like:

participant_id   T1w   DWI
sub-01           1       1
sub-02           1       0
sub-03           1       0

The column participant_id lists all the subjects found, and the remaining columns correspond to the modalities available in the dataset. Availability is expressed as a boolean value (1 if the modality is present for the subject, 0 otherwise).

The nomenclature of the modalities tries to follow, as much as possible, the one proposed by the BIDS standard.

!!! example
    clinica iotools check-missing-modalities /Home/ADNI_BIDS/ /Home/
    clinica iotools check-missing-modalities /Home/ADNI_BIDS/ /Home/ -op new_name
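The per-session file can also be summarized outside Clinica. The sketch below recomputes the number and percentage of subjects missing each modality from a table like the one shown above. It is an illustration only, not Clinica's implementation: the helper name `missing_per_modality` is hypothetical, and the sample assumes tab-separated columns with availability encoded as `1` or `0`.

```python
import csv
import io

# Hypothetical sample mirroring missing_mods_ses-M00.tsv as shown above.
MISSING_MODS_TSV = (
    "participant_id\tT1w\tDWI\n"
    "sub-01\t1\t1\n"
    "sub-02\t1\t0\n"
    "sub-03\t1\t0\n"
)


def missing_per_modality(handle):
    """Return {modality: (n_missing, percent_missing)} for one session file."""
    rows = list(csv.DictReader(handle, delimiter="\t"))
    if not rows:
        return {}
    # Every column except participant_id is treated as a modality.
    modalities = [name for name in rows[0] if name != "participant_id"]
    summary = {}
    for modality in modalities:
        n_missing = sum(1 for row in rows if row[modality] == "0")
        summary[modality] = (n_missing, 100.0 * n_missing / len(rows))
    return summary


missing_summary = missing_per_modality(io.StringIO(MISSING_MODS_TSV))
for modality, (count, percent) in missing_summary.items():
    print(f"{modality}: {count} missing ({percent:.1f}%)")
```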

## merge-tsv - Gather BIDS and CAPS data into a single TSV file

BIDS and CAPS datasets are composed of multiple TSV files for the different subjects and sessions. While this has some advantages, it may not be convenient when performing statistical analyses (with external statistical software tools for instance). This command merges all the TSV files into a single larger TSV file and can be run with the following command line:

clinica iotools merge-tsv <bids_directory> <output_tsv>

where:

  • bids_directory is the input folder containing the dataset in a BIDS hierarchy.
  • output_tsv is the path of the output TSV file. If a directory is specified instead of a file name, the file is created in that directory with the default name merge-tsv.tsv.

The optional arguments allow the user to also merge data from a CAPS directory, which will be concatenated to the BIDS summary. The main optional arguments are the following:

  • -caps: input folder of a CAPS compliant dataset. If a CAPS folder is given, data generated by the pipelines of Clinica (regional measures) will be merged into the output file, and a summary file containing the names of the atlases merged will be generated in the same folder.
  • -tsv: input list of subjects and sessions. If an input list of subjects and sessions is given, the merged file will only gather information from the pairs of subjects and sessions specified.

!!! example
    clinica iotools merge-tsv /Home/ADNI_BIDS /Home/merge-tsv.tsv -caps /Home/ADNI_CAPS -tsv /Home/list_subjects.tsv

The output file will contain one row for each visit:
```Text
participant_id   session_id   date_of_birth   ...   ..._ROI-0   ..._ROI-1  ...
sub-01           ses-M00      25/04/41        ...   9.824750    0.023562
sub-01           ses-M18      25/04/41        ...   8.865353    0.012349
sub-02           ses-M00      09/01/91        ...   9.586342    0.027254
...
```
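
The effect of the `-tsv` option can be pictured as a filter on (participant_id, session_id) pairs. The sketch below is an illustration under stated assumptions, not Clinica's code: the helper names `read_rows` and `keep_requested_visits` are hypothetical, the merged sample keeps only the first three columns of the example above, and the list of requested visits is invented for the demonstration.

```python
import csv
import io

# Trimmed, hypothetical version of the merged file shown above.
MERGED_TSV = (
    "participant_id\tsession_id\tdate_of_birth\n"
    "sub-01\tses-M00\t25/04/41\n"
    "sub-01\tses-M18\t25/04/41\n"
    "sub-02\tses-M00\t09/01/91\n"
)

# Hypothetical list of subjects and sessions, in the two-column format
# produced by create-subjects-visits.
REQUESTED_TSV = (
    "participant_id\tsession_id\n"
    "sub-01\tses-M18\n"
    "sub-02\tses-M00\n"
)


def read_rows(text):
    """Parse tab-separated text into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def keep_requested_visits(merged_rows, requested_rows):
    """Keep only the rows whose (participant, session) pair was requested."""
    wanted = {(r["participant_id"], r["session_id"]) for r in requested_rows}
    return [
        row
        for row in merged_rows
        if (row["participant_id"], row["session_id"]) in wanted
    ]


selected = keep_requested_visits(read_rows(MERGED_TSV), read_rows(REQUESTED_TSV))
for row in selected:
    print(row["participant_id"], row["session_id"])
```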

## center-nifti - Center NIfTI files of a BIDS directory

Your BIDS dataset may contain NIfTI files whose origin does not correspond to the center of the image (i.e. the anterior commissure). SPM is especially sensitive to this case, and segmentation procedures may result in blank images, or even fail. To mitigate this issue, we propose a simple tool that converts your BIDS dataset into a dataset with centered NIfTI files for the selected modalities.

Only NIfTI volumes whose center is more than 50 mm from the origin of the world coordinate system are centered (the --center_all_files flag overrides this check). This threshold was chosen empirically after a set of experiments to determine the distance from the origin at which SPM segmentation and coregistration procedures stop working properly. By default, this tool only centers T1w images, but you can specify other modalities.

clinica iotools center-nifti <bids_directory> <new_bids_directory> [--modality modality] [--center_all_files]

where:

  • bids_directory is the input folder containing the dataset in a BIDS hierarchy.
  • new_bids_directory is the output path to the new version of your BIDS dataset, with the faulty NIfTI files centered. This folder can be empty or nonexistent.

Optional arguments:

  • --modality is a parameter that defines which modalities are converted. (Only T1w images are centered by default.)
  • --center_all_files is an option that forces Clinica to center all the files of the modalities selected with the --modality flag.

!!! note
    The images contained in the input bids_directory folder that do not need to be centered will also be copied to the output folder new_bids_directory.

If you want to center FDG PET images (e.g. with `_acq-fdg` key/value in PET filename), use:
```Text
clinica iotools center-nifti bids_directory new_bids_directory --modality "fdg_pet"
```

If you want to center both AV45 PET and T1w images, use:

```Text
clinica iotools center-nifti bids_directory new_bids_directory --modality "av45_pet t1w"
```

To decide whether a NIfTI image must be centered, the algorithm first checks its filename. For example, regarding the file `bids/sub-01/ses-M0/anat/sub-01_ses-M0_T1w.nii`:

 - The filename is `sub-01_ses-M0_T1w.nii`.
 - The algorithm tests (in a case insensitive way) if the string `fdg_pet` is in the filename: False.
 - The algorithm tests (in a case insensitive way) if the string `t1w` is in the filename: True!
 - The algorithm tests if the volume has its center at more than 50 mm (Euclidean distance) from the origin: True.
 - This file will be centered by the algorithm.
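
The decision steps listed above can be sketched as a small function. This is a simplified illustration, not Clinica's implementation: the function name `should_center` and its parameters are hypothetical, and the position of the volume center is passed in directly, because the way it is read from the NIfTI header is not described here.

```python
import math

THRESHOLD_MM = 50.0  # distance threshold quoted in the text


def should_center(filename, center_mm, modalities, center_all_files=False):
    """Decide whether a volume would be centered, following the steps above.

    center_mm is the (x, y, z) position of the volume center in world
    coordinates, in millimetres (assumed to be known by the caller).
    """
    name = filename.lower()
    # Case-insensitive test: is any selected modality string in the filename?
    if not any(modality.lower() in name for modality in modalities):
        return False
    # With --center_all_files, the distance check is skipped.
    if center_all_files:
        return True
    # Euclidean distance between the volume center and the world origin.
    distance = math.sqrt(sum(coord * coord for coord in center_mm))
    return distance > THRESHOLD_MM


# Hypothetical center located 60 mm from the origin along the y axis.
print(should_center("sub-01_ses-M0_T1w.nii", (0.0, 60.0, 0.0), ["fdg_pet", "t1w"]))
```

Passing `center_all_files=True` mirrors the `--center_all_files` flag: any file matching a selected modality is then centered regardless of its distance from the origin.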

Understanding this, you can now center any modality you want! If your files are named following this pattern: `sub-X_ses-Y_magnitude1.nii.gz`, specify the modality as follows: `--modality "magnitude1"`.

The list of the converted files will appear in a text file in `new_bids_directory/centered_nifti_list_TIMESTAMP.txt`.