can pipeline be more robust to handling missing data? #992
Comments
I've also run into this issue... I remember that BIDS considered such datasets problematic, but it's such a prevalent use case...
@sappelhoff Do you know if and how BIDS supports data with missing sessions?
Reading the specs: I don't see anything that would disallow missing sessions. But this use case isn't explicitly mentioned anywhere, either…
To the best of my knowledge, it is not disallowed. I also remember preparing a mixed design experiment once, where half of participants had a session that the other half did not have, and that led to a warning (to check whether this was intended), but not an error.
Yep, that would be my proposal. However, we also have runs within sessions, and then stuff gets really complicated; at least I remember seeing some rather convoluted code handling runs within subjects and sessions … maybe we should re-think all of this and create a clean, clear lookup table early on in the pipeline run and then just refer to this downstream.
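A rough sketch of what such a lookup table could look like, assuming a standard BIDS layout in which filenames carry `sub-`, `ses-`, and `run-` entities. The names `parse_entities` and `build_lookup_table` are hypothetical and not part of MNE-BIDS-Pipeline; only the Python standard library is used.

```python
from collections import defaultdict
from pathlib import Path


def parse_entities(filename):
    """Split a BIDS filename into its key-value entities.

    Hypothetical helper: "sub-01_ses-a_run-1_meg.fif" becomes
    {"sub": "01", "ses": "a", "run": "1"}; the suffix ("meg") is dropped.
    """
    stem = filename.split(".", 1)[0]
    entities = {}
    for part in stem.split("_"):
        key, sep, value = part.partition("-")
        if sep:  # parts without "-" (the suffix) are not entities
            entities[key] = value
    return entities


def build_lookup_table(bids_root, datatype="meg"):
    """Map (subject, session) to the sorted run labels found on disk.

    The session is None for datasets without a session level, and the
    run list is empty when the recordings carry no run entity.
    """
    runs = defaultdict(set)
    # "**" also matches zero directories, so sub-01/meg/... is found too.
    pattern = f"sub-*/**/{datatype}/sub-*_{datatype}.*"
    for path in Path(bids_root).glob(pattern):
        entities = parse_entities(path.name)
        key = (entities["sub"], entities.get("ses"))
        runs[key].add(entities.get("run"))
    return {
        key: sorted(run for run in found if run is not None)
        for key, found in runs.items()
    }
```

Downstream steps could then iterate over the keys of this table instead of re-deriving subjects, sessions, and runs from the config each time; a combination that is absent on disk simply never appears in it.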
+1
it's great if it's just a warning. I have definitely seen this warning in the past.
Was that warning from the BIDS validator, or from MNE-BIDS-Pipeline though? If the standard allows it, IMO the pipeline shouldn't error out.
It was a warning from the BIDS validator.
agreed
Sure this would work. Or (equivalently) something like And I think it would be good to have a |
Another "I have a longitudinal study" question here. I have some subjects with missing timepoints (AKA missing sessions). I want the pipeline to look at the config file (which says `subjects = "all"` and `sessions = ["a", "b", "c"]`) and then look at the BIDS data tree and process all the subject × session combinations that are present, without crashing because subject 101 is missing session "b". In practice, the resulting error is confusing: it is raised while looking for a split partial file, when the real problem is that there isn't even a `.../sub-101/ses-b/` directory.
Proposed solutions
just make the error message better, and do something complicated in the config where I figure out which subject-session combos I have, and run the pipeline multiple times (e.g. once for each session, with different lists of subjects each time)
blithely add an `allow_missing=True` here, and hope that it doesn't have catastrophic downstream effects:
mne-bids-pipeline/mne_bids_pipeline/steps/init/_02_find_empty_room.py, line 45 in 61f2961
it seems like there are a lot of places where we do `for subject in subjects` and `for session in sessions`, where `subjects` and `sessions` are taken straight from the config file each time. So it won't work to do a simple `Path(...).exists()` type of check to see if the entire session folder is missing, right about here:
mne-bids-pipeline/mne_bids_pipeline/steps/init/_01_init_derivatives_dir.py, lines 79 to 81 in 61f2961
But we could do that check, then store in a private config location the subject × session combos that are actually present, and use that info to get the subjects/sessions to loop over throughout the codebase.
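To make the last idea concrete, here is a minimal sketch of the existence check, assuming the usual `sub-<label>/ses-<label>/` folder layout. The function name `find_present_combos` and the warning text are hypothetical; nothing here is existing pipeline code, and where the result would be stored in the config is left open.

```python
import logging
from itertools import product
from pathlib import Path

logger = logging.getLogger(__name__)


def find_present_combos(bids_root, subjects, sessions):
    """Return the (subject, session) pairs whose session folder exists.

    Hypothetical helper. `subjects` may be the string "all", as in the
    config, or a list of labels; `sessions` is a list of labels.
    """
    bids_root = Path(bids_root)
    if subjects == "all":
        # Discover subjects from the sub-* folders in the BIDS root.
        subjects = sorted(
            p.name[len("sub-"):] for p in bids_root.glob("sub-*") if p.is_dir()
        )
    present = []
    for subject, session in product(subjects, sessions):
        session_dir = bids_root / f"sub-{subject}" / f"ses-{session}"
        if session_dir.is_dir():
            present.append((subject, session))
        else:
            # Warn instead of failing, mirroring the BIDS validator.
            logger.warning("Skipping missing session folder: %s", session_dir)
    return present
```

The `for subject in subjects` / `for session in sessions` loops could then become a single `for subject, session in present_combos` loop, so that a missing `ses-b` for subject 101 produces one early warning rather than a late error about a split file.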
Open to other ideas!