[REF TEST] Rework DWI EPI pipeline #902
Alright, I believe this is finally ready for reviews.
LGTM! Just one little thing!
Also, I will test with the new data.
clinica/pipelines/dwi_preprocessing_using_t1/dwi_preprocessing_using_t1_workflows.py
LGTM!
This PR proposes to refactor the `epi_pipeline`, which is a sub-workflow of the `dwi_preprocessing_using_t1` pipeline. The "Description" section below gives an overview of the changes made in this PR and explains the reasons behind them.

Description
The base idea was to split the `epi_pipeline` into two workflows, `perform_ants_registration` and `perform_dwi_epi_correction`, in order to simplify things. The `epi_pipeline` now does nothing more than call these two sub-workflows with proper parameter passing.

These workflows are tested in isolation and therefore now have proper writing capabilities, implemented through the `output_dir` argument. When a value is provided for this argument, a data sink node is added at the end of the workflow and the workflow's outputs are written to the corresponding location.

Perform ants registration
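As a purely illustrative sketch of the optional `output_dir` mechanism described above (the function and step names below are hypothetical, not the actual Clinica or Nipype code): the workflow is assembled the same way in both cases, and a trailing data sink step is appended only when an output directory is provided.

```python
# Toy model of the optional ``output_dir`` pattern: a sink step is
# appended to the workflow only when an output directory is given.
# (Names are illustrative; the real code builds Nipype nodes instead.)

def build_epi_workflow(output_dir=None):
    """Return the ordered list of workflow steps.

    When ``output_dir`` is given, a trailing "datasink" step is added
    so that the workflow's outputs are written to that location.
    """
    steps = ["ants_registration", "dwi_epi_correction"]
    if output_dir is not None:
        steps.append(f"datasink->{output_dir}")
    return steps

# Without an output directory, nothing extra is written to disk:
assert build_epi_workflow() == ["ants_registration", "dwi_epi_correction"]
# With one, the sink step is appended at the end of the workflow:
assert build_epi_workflow("/tmp/out")[-1] == "datasink->/tmp/out"
```

This keeps the sub-workflows usable both standalone (for tests, with outputs persisted) and as components of the parent pipeline (without the extra sink).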
Testing the `perform_ants_registration` pipeline gave me some trouble because it relies on the ANTs `RegistrationSynQuick` utility, which is stochastic. In order to be able to compare the output to a reference value, I had to add a `random_seed` parameter, which is exposed to the user through the CLI, leveraging the recent work done in #916. This means that the non-regression test of `perform_ants_registration` can only work with recent enough versions of ANTs that support the random seed option.

Perform dwi epi correction
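The testing strategy for a stochastic tool described above boils down to: fix the seed, and the output becomes deterministic, so it can be compared bit-for-bit against a stored reference. A minimal NumPy illustration of that idea (a stand-in function, not ANTs itself):

```python
import numpy as np

def stochastic_registration(image, random_seed=None):
    """Stand-in for a stochastic tool such as RegistrationSynQuick:
    the result depends on random initialization unless a seed is fixed."""
    rng = np.random.default_rng(random_seed)
    perturbation = rng.normal(scale=1e-3, size=image.shape)
    return image + perturbation

image = np.zeros((4, 4))

# With the same seed, two runs are bit-identical, so the output can be
# compared against a stored reference in a non-regression test.
a = stochastic_registration(image, random_seed=42)
b = stochastic_registration(image, random_seed=42)
assert np.array_equal(a, b)
# Without a seed, runs generally differ, which is what made testing hard.
```

The same logic explains the version constraint: if the installed ANTs cannot accept a seed, the reference comparison cannot be made deterministic.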
The `perform_dwi_epi_correction` pipeline is where things get bad. The pipeline basically works on each DWI direction separately and produces intermediary data for each of them (warped image, warp field, Jacobian...). The amount of intermediary data produced is quite large, mostly for the following reasons:

- the DWI image has a large number of directions (as in the PREVDEMALS data used in the CI);
- computations are done in `float64` by default.

We could tackle the first point by switching to a new test dataset with fewer directions, but this is not considered in this PR and we can always do that later, independently of what was done here.
For the second point, I used the fact that some utilities (not all, unfortunately) expose a parameter for precision. The main problem comes from the `AntsApplyTransforms` utility, which only exposes a `float` boolean flag defaulting to `False` (i.e. computations are done in double, 64-bit, precision). If set to `True`, it uses 32-bit precision. I therefore added a `use_double_precision` boolean parameter to control this. I haven't exposed it to users through the CLI, but we can do it if needed.

This mitigates the issue to some extent, but the pipeline still needs a lot of disk space to run. The `delete_cache` option, implemented in #766, enables the pipeline to clean these intermediary data at the end of the `perform_dwi_epi_correction` pipeline. Nonetheless, if the pipeline crashes a few times without cleaning up, it will very quickly fill up the `temp` folder of any machine we'll use for CI.

A possible improvement that I didn't explore would be to specify a working directory to ANTs utilities (which is different from Clinica's `working_dir`), such that they would write their intermediary data to a dedicated mass-storage volume.

The `perform_dwi_epi_correction` workflow was previously using a custom function to call `antsApplyTransforms` as a subprocess. This function was deleted and the Nipype interface is used instead, which simplifies the code.

The `perform_dwi_epi_correction` workflow produces a single output: a `(208, 256, 256, 139)` nifti image weighing 5.8G when using 64-bit precision. The non-regression test compares this output to a reference value. Because of the size of the image, the usual `similarity_measure` function that we use for tests was crashing (I suppose because of memory issues, but I haven't investigated much...). I ended up implementing a few testing utility functions in `clinica.utils.testing_utils` to compare two nifti images. The `assert_large_nifti_almost_equal` function compares each pair of `(208, 256, 256)` volumes iteratively, only loading two such volumes in memory at the same time. This is still very slow, so I might end up sampling volumes instead of comparing them all...

Required data for testing
Requires data PR https://github.com/aramis-lab/clinica_data_ci/pull/51
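For reference, a minimal sketch of the volume-by-volume comparison idea behind `assert_large_nifti_almost_equal`, using plain NumPy arrays in place of nibabel image proxies (with real nifti files, slicing `img.dataobj[..., i]` would lazily load one volume at a time instead of the whole image):

```python
import numpy as np

def assert_4d_almost_equal_by_volume(img1, img2, decimal=6):
    """Compare two 4D arrays one 3D volume at a time.

    Only two (X, Y, Z) volumes are compared at once, which keeps the
    peak memory footprint small even for very large 4D images.
    """
    assert img1.shape == img2.shape, "images must have the same shape"
    for i in range(img1.shape[-1]):
        # With nibabel, ``img.dataobj[..., i]`` would load only volume i.
        np.testing.assert_array_almost_equal(
            img1[..., i], img2[..., i], decimal=decimal
        )

rng = np.random.default_rng(0)
a = rng.random((8, 8, 8, 5))
b = a + 1e-9  # well below the decimal=6 tolerance
assert_4d_almost_equal_by_volume(a, b)  # passes, volume by volume
```

Sampling a subset of the `range(img1.shape[-1])` indices instead of iterating over all of them would be the "compare fewer volumes" speed-up mentioned above.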