[REF TEST] Rework DWI EPI pipeline #902

Merged

Conversation

NicolasGensollen
Member

@NicolasGensollen NicolasGensollen commented Mar 31, 2023

This PR proposes to refactor the epi_pipeline which is a sub-workflow of the dwi_preprocessing_using_t1 pipeline. The "Description" section below gives an overview of the changes made in this PR and explains the reasons behind them.

Description

The base idea was to split the epi_pipeline into two workflows, perform_ants_registration and perform_dwi_epi_correction, in order to simplify things.
The epi_pipeline now does nothing more than call these two sub-workflows with proper parameter passing.

These workflows are tested in isolation and therefore now have proper writing capabilities, implemented through the output_dir argument. When a value is provided for this argument, a data sink node is added at the end of the workflow and the workflow's outputs are written to the corresponding location.
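As a rough sketch of this pattern (not Clinica's actual code; the function and argument names here are made up for illustration), the idea is that the sink step is only attached when output_dir is given:

```python
from pathlib import Path

def run_workflow(outputs, output_dir=None):
    """Toy sketch: the workflow computes its outputs, and only when
    output_dir is provided is a 'data sink' step appended that writes
    them to the corresponding location."""
    if output_dir is not None:
        sink = Path(output_dir)
        sink.mkdir(parents=True, exist_ok=True)
        for name, content in outputs.items():
            (sink / name).write_text(content)
    return outputs
```

In the real pipelines this is done with a Nipype DataSink node connected to the workflow's output node, but the control flow is the same: no output_dir, no writing.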

Perform ants registration

Testing the perform_ants_registration pipeline gave me some trouble because it relies on the ANTs RegistrationSynQuick utility, which is stochastic. In order to be able to compare the output to a reference value, I had to add a random_seed parameter, exposed to the user through the CLI, leveraging the recent work done in #916. This means that the non-regression test of perform_ants_registration only works with ANTs versions recent enough to support the random seed option.
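For illustration (this is not Clinica's code, and the helper name is invented): recent antsRegistrationSyNQuick.sh versions accept a -e flag to fix the random seed, so a seed forwarded from the CLI could be appended to the command only when set, keeping older ANTs versions usable when no seed is requested:

```python
def build_syn_quick_command(fixed, moving, output_prefix, random_seed=None):
    """Sketch: assemble an antsRegistrationSyNQuick.sh call, adding the
    -e flag (fix random seed) only when a seed is explicitly given."""
    cmd = [
        "antsRegistrationSyNQuick.sh",
        "-d", "3",            # 3D images
        "-f", fixed,          # fixed (reference) image
        "-m", moving,         # moving image
        "-o", output_prefix,  # prefix for output files
    ]
    if random_seed is not None:
        cmd += ["-e", str(random_seed)]
    return cmd
```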

Perform dwi epi correction

The perform_dwi_epi_correction pipeline is where things get difficult. The pipeline works on each DWI direction separately and produces intermediary data for each of them (warped image, warp field, Jacobian...). The amount of intermediary data produced is quite large, mostly for the following reasons:

  • The number of DWI directions (139 with the PREVDEMALS data used in the CI)
  • The data type, which is float64 by default

We could tackle the first point by switching to a new test dataset with fewer directions, but that is not considered in this PR; we can always do it later, independently of the changes made here.

For the second point, I used the fact that some (though unfortunately not all) utilities expose a parameter controlling precision. The main problem comes from the AntsApplyTransforms utility, which only exposes a float boolean flag defaulting to False (i.e. computations are done in double, 64-bit precision). If set to True, it uses 32-bit precision. I therefore added a use_double_precision boolean parameter to control this. I haven't exposed it to users through the CLI, but we can do so if needed.
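Some back-of-the-envelope arithmetic (not from the PR itself, just the numbers mentioned in it) shows why the precision matters. Each (208, 256, 256) volume holds about 13.6 million voxels, and with 139 directions a single set of per-direction volumes already occupies roughly 14 GiB at float64, before counting the multiple files (warped image, warp field, Jacobian...) produced per direction:

```python
# Rough size estimate for per-direction intermediary data,
# using the dimensions from this PR: 208x256x256 volumes, 139 directions.
voxels_per_volume = 208 * 256 * 256   # ~13.6 million voxels
n_directions = 139

def gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / 2**30

float64_total = voxels_per_volume * n_directions * 8  # 8 bytes per voxel
float32_total = voxels_per_volume * n_directions * 4  # 4 bytes per voxel
```

Switching to 32-bit precision halves the footprint, which mitigates the problem but does not make it disappear.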

This mitigates the issue to some extent, but the pipeline still needs a lot of disk space to run. The delete_cache option, implemented in #766, enables the pipeline to clean these intermediary data at the end of the perform_dwi_epi_correction pipeline. Nonetheless, if the pipeline crashes a few times without cleaning up, it will very quickly fill up the temp folder of any machine we use for CI.

A possible improvement that I didn't explore would be to specify a working directory for the ANTs utilities (which is different from Clinica's working_dir), such that they would write their intermediary data to a dedicated mass-storage volume.

The perform_dwi_epi_correction workflow was previously using a custom function to call antsApplyTransforms as a subprocess. This function was deleted in favor of the Nipype interface, which simplifies the code.

The perform_dwi_epi_correction workflow produces a single output: a (208, 256, 256, 139) NIfTI image weighing 5.8 GB when using 64-bit precision. The non-regression test compares this output to a reference value. Because of the size of the image, the usual similarity_measure function we use for tests was crashing (I suppose because of memory issues, but I haven't investigated much...). I ended up implementing a few testing utility functions in clinica.utils.testing_utils to compare two NIfTI images. The assert_large_nifti_almost_equal function compares each pair of (208, 256, 256) volumes iteratively, only loading two such volumes into memory at a time. This is still very slow, so I might end up sampling volumes instead of comparing them all...
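The idea behind the volume-by-volume comparison can be sketched as follows (this is an illustration, not the actual code in clinica.utils.testing_utils; in the real function the inputs would be lazy nibabel proxies such as image.dataobj, so that slicing only materializes one volume at a time):

```python
import numpy as np

def assert_4d_almost_equal_by_volume(img1, img2, decimal=6):
    """Compare two 4D images one 3D volume at a time, so that only two
    volumes need to be held in memory simultaneously. img1/img2 can be
    anything sliceable along the last axis (numpy arrays, nibabel
    array proxies, ...)."""
    assert img1.shape == img2.shape, "shape mismatch"
    for t in range(img1.shape[-1]):
        vol1 = np.asarray(img1[..., t])  # materialize a single volume
        vol2 = np.asarray(img2[..., t])
        np.testing.assert_array_almost_equal(vol1, vol2, decimal=decimal)
```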

Required data for testing

Requires data PR https://github.com/aramis-lab/clinica_data_ci/pull/51

@NicolasGensollen NicolasGensollen force-pushed the rework-dwi-epi-pipeline branch 4 times, most recently from 294c093 to 7c5874c Compare July 6, 2023 07:13
@NicolasGensollen NicolasGensollen marked this pull request as ready for review July 7, 2023 09:42
@NicolasGensollen
Member Author

Alright, I believe this is finally ready for review.
The PR is quite large, so I did my best to explain the different changes and the reasons behind them in the opening message of the PR.
Let me know if anything is unclear.

Contributor

@MatthieuJoulot MatthieuJoulot left a comment

LGTM ! Just one little thing !
Also, I will test with the new data.

Copy link
Contributor

@MatthieuJoulot MatthieuJoulot left a comment

LGTM !

@NicolasGensollen NicolasGensollen merged commit ada5b61 into aramis-lab:dev Oct 1, 2023
18 checks passed
@NicolasGensollen NicolasGensollen deleted the rework-dwi-epi-pipeline branch October 1, 2023 09:56