Alternative proposal to BEP038 #1856

oesteban · 2024-06-12T21:31:59Z

Rendered proposal: https://bids-specification--1856.org.readthedocs.build/en/1856/derivatives/atlas.html

This is a proposal to achieve the aims of BEP 38 as derivative datasets, without the need to introduce a new DatasetType. The structure of derivatives is well-established in BIDS, and this would require the introduction of fewer new concepts and less code complexity in supporting all BIDS DatasetTypes.

To demonstrate, the proposal has several examples with 'classical' templates/atlases. In addition to that, I'm working on uploading PS13 to templateflow (further supporting the practical implementation of the proposal), and, if accepted, I can commit to generating bids-examples following the proposal for PS13.

I incorporate the BEP metadata from this spec, for the moment unchanged as the absence of macros made it really hard to carefully edit (that said, I think that part of the current proposal is okay, and I would possibly suggest some additions only).

This proposal only includes the following new entities:

seg: derived from the original BEP038 proposal
scale: multi-scale atlases are specified (which is fully lacking in the current draft)

As I understand it, the atlas BEP has taken the shape it has because it was felt that derivatives datasets would not satisfy the needs. I believe I have shown that it can with only minor modifications.

I include a skeleton of the new terms in this PR. If we agree in principle, we can work on schematizing these changes and tightening up the text.

My proposal is issued to address three central issues and other relevant problems. For the interested, I include arguments against specific choices in the BEP as-is, but I hope my arguments for this proposal stand on their own. To see these arguments, please unfold this paragraph by clicking the initial arrow.

Issue 1: Opening `DatasetType` to values other than `raw` and `derivatives` should have its own BEP.

Adding values to DatasetType is a major change to the specification that should be broadly discussed by the community, with a preliminary analysis of potential side-effects by the SG and/or Maintainers.

It took a long while before derivatives became a relevant part of BIDS and many years of discussion about them. I contend that DatasetType should keep its special status and be discussed separately. After DatasetType is agreed as the appropriate mechanism, the BEP leads intending to add values should state it when presenting the BEP draft to the SG before it becomes listed as an active BEP.

For instance, it would not be crazy to contemplate the possibility of having a DatasetType such as freesurfer, which has a very stable and standardized data structure, to allow it as a standalone dataset. Opening DatasetType means opening BIDS to the creation of standards within the standard. Where to draw the line between raw and derivative has traditionally be a contention point, so enabling more options should be considered very carefully, and provided with prescriptions of how to do it and how to decide beforehand. Otherwise, BEPs proposing new dataset types will creep up as we all tend to think that our area of specialization is special.

Please note that this issue does not enter into the actual value of atlas proposed by the BEP. That is reviewed next.

Proposed solution: (1) drop this part of the proposal; (2) discuss the issue as BIDS prescribes; (3) establish whether the intent of DatasetType may be open to other dataset types.

Issue 2: The new value `atlas` for `DatasetType` evades the actual problem.

Evading *the* problem that exists. By creating the new DatasetType metadata, the overarching problem is escaped: the fact that BIDS-Derivatives has not been developed far enough to represent "second-level" analyses, as in, analyses where data from several subjects, or sessions, or runs, are pooled together. Instead, the current BEP proposal cordons off the problem by creating its own little island.

Solving a problem that does not exist. The use of the new DatasetType is justified to enable the sharing of "atlas", as stated in the initial paragraph, and later:

This will allow sharing existing atlases as stand-alone datasets,
validating them via the BIDS validator and enabling their integration as sub-datasets of other BIDS datasets.

which suggests that, if a dataset is of derivative type then the following is not supported:

The sharing of the dataset stand-alone (which is factually false, derivative datasets are already standalone)
The validation of a derivative dataset (which is circumstancial because the vision is that derivatives are validated as raw one day)
The derivative dataset cannot be integrated as a sub-dataset of another BIDS dataset (which is factually false).

Therefore, this approach seems to indicate that atlases are somewhere in between "raw" and "derivative" and hence they require their own DatasetType.

Proposed solution: My proposal encodes atlas-derived results and atlas-generating pipelines results within current BIDS-derivatives specifications. If I'm reviewing a paper corresponding to a new template and/or atlas, I would feel better equipped to understand the pipeline and the results if delivered as BIDS-Derivatives, with the most salient intermediate steps there (or transformations so that I can replicate them) instead of a final structure that looks like templateflow's resources putting atlas- first. The first reports the atlas creation process, while the second is a fast-track mechanism to emancipate the blobs a researcher wants be reused from the outputs and reporting of the generating pipeline. My understanding of BIDS is that it wants to achieve the first. The act of sharing data and ensuring FAIRness in the delivery of the service is more of a responsibility of other players such as OpenNeuro or TemplateFlow.

Issue 3: the folder structure is inconsistent with current BIDS raw and derivatives

This PR proposes an alternative that is consistent with current BIDS. While for raw and first-level analyses derivatives the spatial reference is established by that of individual subjects, for higher-than-first-level analyses this PR proposes the concept of template, which is the aggregation of feature maps that serve for reference at the individual level (e.g., aggregation of runs, sessions or sets of subjects). That allows for a more consistent organization, which has been already tested in the wild with TemplateFlow.

In addition, there are several aspects of atlases (and templates) that this BEP did not cover:

Problem 1: longitudinal templates (and atlases)

The cohort entity of templateflow could resolve this. I can update my PR if it is accepted to contemplate this.

Problem 2: multi-scale atlases

My proposal includes a new scale- entity.

Problem 3: probabilistic surface parcellations.

This would require finding a GIFTI encoding of FreeSurfer's GCS format. This is not really a problem of atlas, but BIDS-Derivatives in general.

Proposed solution: Implemented by this PR against BEP038.

Other issues

Downstream problems of the proposed DatasetType. It seems the intent is to have these datasets uploaded to BIDS-compatible platforms such as OpenNeuro as a new means of disseminating and distributing atlases. OpenNeuro does implement FAIR pretty comprehensively, which is fundamental for this intent not to become extremely dangerous, but at the outskirt, the BIDS specifications should refrain from suggesting OpenNeuro should be used for sharing. These atlases will likely be shared through other venues where data versioning, accessibility, etc. are not as transparent or available and that will have the opposite effect that is intended in this BEP (undermined reproducibility and limited reusability of the atlas). But even assuming OpenNeuro as the mechanism for redistribution, there are other issues that are covered in our TemplateFlow paper, which will be problematic if not exacerbated:

Lack of a controled vocabulary for templates' and atlases' names: no one can avoid that two templates are given the same label to the atlas entity, and I don't think it would be good for BIDS to attempt to control that. The experience would revive the issues hit with template specifications (https://bids-specification.readthedocs.io/en/stable/appendices/coordinate-systems.html). I also provided an example of this problem within BEP Proposal: Atlas specification #1281.
Existing templates and atlases will not adopt this. The main way of disseminating templates and atlases remains software packages. It is highly unlikely that software packages will adopt this standard because it adds insecurity (what if BIDS changes the standard? what if my atlases cannot be represented with this specification?) at a very low turnover (because here the sharing is with yourself as a developer, you organize the data as it is most convenient for your application).
Upcoming atlases will not adopt this. If an atlas creator wants their template be reused, they either distribute it with the format of a popular tool (e.g., FreeSurfer or AFNI) or it is unlikely to be adopted (except for applications that can query TemplateFlow).
Unfortunately, many template/atlas generators set copyleft and (worse) no-derivs restrictions on the license, which conflict with the purpose of sharing the resource (since these resources are meant to create derived works). That defeats the noble purpose of "sharing" standalone (even if that were a problem). If a derivative is protected with no-derivs (or the raw, like the HCP data), that is within the scope of possibilities. However, DatasetType atlas allows people to mark a resource as atlas and confusingly set no-derivs (and maybe request royalties after use?). For derivative it is not assumed that you can create further derivatives and the license is checked.

Intro of the proposal misses the point. The introduction of the current proposal is largely devoted to explain what an atlas is. BIDS should not be a neuroimaging handbook, and therefore, BEPs should not require such justifications. I believe this is a consequence of issue 2 to justify the choice.

oesteban · 2024-06-12T21:45:04Z

cc @jdkent @melanieganz @CPernet @dorahermes @Remi-Gau @effigies @ericearl @francopestilli

There are some wrinkles to iron out (e.g., missing glossary definitions breaking documentation building), but this is a general summary of how I see this. Happy to discuss use cases that are not immediately clear how they would be encoded under this proposal.

Thank you all for your patience, this PR was long overdue.

effigies · 2024-06-13T00:19:36Z

To ignore the arguments and boil this down to the practical difference between BEP38 and this counter-proposal, it seems to be:

Don't say atlas-<labelA>..._space-<labelB>..., say tpl-<labelB>/<datatype>/tpl-<labelB>..._atlas-<labelA>....
Don't create a new atlas DatasetType, use derivative and define four new entities.
Don't use an atlases/ subdirectory in the dataset root to store atlases.

As far as I can tell, anything that could be named under the existing BEP38 could be named under this (notwithstanding some comments on things that need clearing up below) proposal, so that's a good start.

The last point I'm inferring just by its absence. Any recommendation on what people who saw value in this construct do? My personal inclination would be to use sourcedata/atlases/, but BIDS has not defined the structure of sourcedata/. It could be a matter of convention, and the specific locations could be tracked in DatasetLinks.

Some questions on your entities. I'll start with my understanding of how they seem to be used:

tpl-<label>: The name of a template, which has the same status as (and is mutually exclusive with) sub-<label>, to be used when data have been sampled to some space <label> and then combined across subjects.
atlas-<label>: The name of an atlas, which is simply a name canonically associated with a collection of files.
seg-<label>: A specific segmentation, if an atlas defines (or is commonly used to define) more than one segmentation.
scale-<label>: A further specifier, if an atlas defines segmentations at multiple spatial scales.

tpl-<label> is not defined in your proposal so far. Is it required to be a controlled vocabulary, such as in https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems? Can it, like space, be uncontrolled, provided there is a link within the metadata somewhere? Are you considering study-derived templates at this point, or leaving that to another effort?

My understanding is that the 4-tuple (atlas, seg, scale, suffix) is intended to be unique such that any two files containing the same set of path components have comparable values (e.g., an integer label in a dseg means the same thing in two files where these entities match). How global is this? For example, is there supposed to be a registry that controls this vocabulary, similar to space-<label>, or would I need to verify with the atlas metadata when I receive two datasets with an overlapping atlas label?

I don't really understand scale-. At first I thought it overlaps with res- or den-, but it seems to be something else. Is it "degree of subdivision of segmentation" or "number of subjects used to derive"? Or is seg-<> for qualitative differences (different types of quantities mapped) while scale- is for quantitative differences, and the meaning of each is atlas-specific?

I've only had a quick read-through and so I might have more thoughts later. I don't see any show-stopping problems, but I would like to hear from others who've been more in-the-weeds. Might be good to get people together in Seoul to discuss?

effigies · 2024-06-13T13:13:21Z

One question regarding datatype under tpl: Is that required, optional, or what? You can derive segmentations from any of a number of modalities (or multiple modalities at once) and use them in others; does it make sense to drop datatype altogether under tpl-, or leave it optional? I think required is not tenable.

oesteban · 2024-06-13T13:46:04Z

Any recommendation on what people who saw value in this construct do? My personal inclination would be to use sourcedata/atlases/, but BIDS has not defined the structure of sourcedata/. It could be a matter of convention, and the specific locations could be tracked in DatasetLinks.

I think recommending sourcedata/atlases is a good starting point.

tpl-<label> is not defined in your proposal so far. Is it required to be a controlled vocabulary, such as in https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems? Can it, like space, be uncontrolled, provided there is a link within the metadata somewhere? Are you considering study-derived templates at this point, or leaving that to another effort?

Yes, you're right -- tpl should be defined and it's not, I will address that ASAP. Controlled language - I see it as space in that it is semi-controlled. I would recommend using template space names from https://bids-specification.readthedocs.io/en/stable/appendices/coordinate-systems.html#standard-template-identifiers but allow any label if those standard names do not represent the data.

For example, is there supposed to be a registry that controls this vocabulary, similar to space-<label>, or would I need to verify with the atlas metadata when I receive two datasets with an overlapping atlas label?

I don't know whether there's interest in maintaining another informal 'registry' like https://bids-specification.rtfd.io/en/stable/appendices/coordinate-systems.html#image-based-coordinate-systems? for spaces. My impression is that the spaces list has been pretty stable because the effect of adding new items is minimal.

Perhaps this proposal should also have some sort of atlas-<label>_description.json file given at the root of the structure which is inherited by all files containing atlas-<label>.

Otherwise, if there's a single file (e.g., a single atlas-<label>_dseg file; https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R79), that could alternatively serve the purpose.

Another interesting route would be to allow YAML to facilitate a natural language description of the methods of the atlas (i.e., embed a README into the metadata file). Some sort of atlas-<label>_description.yml.

Finally, it may be useful to have an atlases.tsv and atlases.json.

I'm open to any suggestion to resolve this issue.

I don't really understand scale-. At first I thought it overlaps with res- or den-, but it seems to be something else.

It is something else. It is common for atlases to define several levels (scales) of granularity of the defined ROIs. They are typically related hierarchically. E.g., say we have a parcellation that has 7 regions for each hemisphere at the lowest scale. Those regions are then divided in a number of regions at the next level, and so on up to dividing the hemisphere into 1000 ROIs in the highest scale. I think a very interesting paper that describes this as the choice of 'brain unit' is https://www.nature.com/articles/s41593-020-00726-z

One question regarding datatype under tpl: Is that required, optional, or what? You can derive segmentations from any of a number of modalities (or multiple modalities at once) and use them in others; does it make sense to drop datatype altogether under tpl-, or leave it optional? I think required is not tenable.

I think this is a general question for BIDS Derivatives—by not saying anything explicit, we leave it open, and one day, BIDS Derivatives will address this issue. Validator-wise, I'd make it optional.

jdkent · 2024-06-13T18:26:44Z

Thanks for your work on this @oesteban! I largely agree with your approach.

Scope of BEP

As a grounding for me (and hopefully for others), the Atlas BEP scope is to cover:

Atlas generation
Atlas application/consumption

(as is the case with many/all derivatives)

And an atlas can be created at either the:

group level (incorporating multiple participants' data)
individually (using only a single participant's data)

But the atlas will always be applied to individual participant data.

`seg-` entity consensus

I believe one point to find consensus on is how to apply/define the entity seg-

In previous discussions about atlas application/consumption, seg- was understood to be an application of an atlas to a particular participant, with the same label (e.g., atlas-AAL becomes seg-AAL within the participant directory).
Thus atlas->seg mirrored the relationship between tpl->space

In a discussion of atlas generation, the entity seg is more about differentiating the same atlas by differing criteria for parcel/segmentation creation.

moving forward we could:

use the atlas- entity consistently for the application of the atlas on individual data.
create another term to differentiate the conflicting definitions of seg-

General Comments

I have a couple agreements/comments/clarifications

Issue 1: Opening DatasetType to values other than raw and derivatives should have its own BEP.

Agree, I never felt comfortable adding a new datatype, I also think atlases are derivatives.

Issue 2: The new value atlas for DatasetType evades the actual problem.

Agree, I think the issue is more of a technical one of openneuro not supporting uploading of standalone derivative datasets than anything the standard specifies.

Issue 3: the folder structure is inconsistent with current BIDS raw and derivatives
Problem 1: longitudinal templates (and atlases)
The cohort entity of templateflow could resolve this. I can update my PR if it is accepted to contemplate this.

I am open to a cohort entity, could potentially also be absorbed by the seg- entity, the criteria being the timepoint/age-range for the data selected as input.

Problem 2: multi-scale atlases
My proposal includes a new scale- entity.

I can see how scale and seg are being used in the examples, but in my mind there is still a decent amount of overlap between the entities.

seg- is REQUIRED when a single atlas has several different realizations (for instance, segmentations and parcellations created with different criteria) that need disambiguation.
scale- is REQUIRED to disambiguate different atlas 'scales', when the atlas has more than one 'brain unit' resolutions, typically relating to the area covered by regions.

In my mind, I could describe that a different number of parcels is a different criteria and would fit under the definition of seg-.

I have a request for the examples:

MIAL67ThalamicNuclei-pipeline/
├─ tpl-MNI152NLin2009cAsym/
│  └─ anat/
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv
│     ├─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_res-1_dseg.nii.gz
│     └─ tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_res-1_probseg.nii.gz
├─ sub-01
│  └─ anat/
│     ├─ sub-01_label-ThalamicNuclei_dseg.json
│     ├─ sub-01_label-ThalamicNuclei_dseg.tsv
│     ├─ sub-01_label-ThalamicNuclei_dseg.nii.gz
│     ├─ sub-01_space-MNI152NLin2009cAsym_T1w.nii.gz
│     └─ sub-01_T1w.nii.gz
┇
└─ sub-67
   └─ anat/
      ├─ sub-67_label-ThalamicNuclei_dseg.json
      ├─ sub-67_label-ThalamicNuclei_dseg.tsv
      ├─ sub-67_label-ThalamicNuclei_dseg.nii.gz
      ├─ sub-67_space-MNI152NLin2009cAsym_T1w.nii.gz
      └─ sub-67_T1w.nii.gz

In these examples, I would prefer if tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json and tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv be represented at the top level if possible (as atlas-MIAL67ThalamicNuclei_dseg.[json|tsv]) so that the atlas information is more findable, also could reduce repetition if the atlas was generated in multiple template spaces.

PeerHerholz · 2024-06-14T09:20:50Z

Hi everyone,

thanks for all your work on this @oesteban!

As mentioned by @effigies, it would be great to also discuss this during the upcoming Brainhack if possible.

@jdkent: how would @oesteban's proposal relate to the updates and examples you've worked on? It seems that both are more aligned than the previous BEP038 versions we had, no?

Thanks again.

Best, Peer

pwighton · 2024-06-14T19:19:13Z

Thanks for this proposal, @oesteban.

I haven't had a chance to review it in detail yet, but will set aside some time next week to do so.

For the PS-13 use case, at a high level, we are interested in 2 things:

Being able to share an atlas as a standalone dataset
Being able to validate an atlas

Would this proposal be able accommodate that?

oesteban · 2024-06-17T09:33:45Z

In these examples, I would prefer if tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.json and tpl-MNI152NLin2009cAsym_atlas-MIAL67ThalamicNuclei_dseg.tsv be represented at the top level if possible (as atlas-MIAL67ThalamicNuclei_dseg.[json|tsv]) so that the atlas information is more findable, also could reduce repetition if the atlas was generated in multiple template spaces.

Thanks for your feedback @jdkent. I think the above is the only caveat you found, so I'll go ahead and address your request with 'a little twist'. In the example, as it stands, the only metadata that can be generalized across items is label-ThalamicNuclei_dseg.tsv, shared by the 67 subjects that were segmented to build the atlas. Since we only have one template space, then the atlas metadata does not need to be generalized (could be done, without issues, if you want to see the metadata at the top level).

However, generalization would be expected if two different template spaces are created (this is the twist). I've updated accordingly (see f159e61)

As mentioned by @effigies, it would be great to also discuss this during the upcoming Brainhack if possible.

@PeerHerholz definitely :)

Would this proposal be able accommodate that?

@pwighton —that's exactly the purpose. Yes, both are requirements of any BEP, and the proposal must abide by them.

@effigies - I've tried to address some of your questions in 905160d. I'm afraid I'll need to keep working to make the specs render again.

PeerHerholz

Thanks @oesteban! I think this looks great.

I was wondering if we should add a little bit of information concerning the different naming conventions, ie

tpl-MNI152NLin2009cAsym_from-MNI152NLin6Asym_mode-image_xfm.h5

vs.

sub-01_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5

to prevent confusion in users (and other stakeholders). That's somewhat outside the scope of BEP038 but as BEP014 is still in development, a little explanation as to how certain transforms are named might be beneficial. WDYT?

oesteban · 2024-06-17T11:05:51Z

Thanks @oesteban! I think this looks great.

I was wondering if we should add a little bit of information concerning the different naming conventions, ie

tpl-MNI152NLin2009cAsym_from-MNI152NLin6Asym_mode-image_xfm.h5

vs.

sub-01_from-T1w_to-MNI152NLin2009cAsym_mode-image_xfm.h5

to prevent confusion in users (and other stakeholders). That's somewhat outside the scope of BEP038 but as BEP014 is still in development, a little explanation as to how certain transforms are named might be beneficial. WDYT?

I added a little mention to BEP014 in that commit: https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R177-R179 I believe we should not attempt to get very deep into transforms here and let it happen within BEP14.

PeerHerholz · 2024-06-17T11:12:12Z

I added a little mention to BEP014 in that commit: https://github.com/bids-standard/bids-specification/pull/1856/files#diff-930106228fdeff531c65486378dd4138c6f27c38cbce3bd7621743e4a42453e0R177-R179 I believe we should not attempt to get very deep into transforms here and let it happen within BEP14.

Definitely! Sorry, I didn't mean to say that we should explain why there are different naming patterns for transform, just that they exist and refer to transforms between template spaces in one case and transforms between subject and template spaces in the other. Simply to avoid confusion. However, maybe that's just me, haha.

oesteban · 2024-06-17T11:28:54Z

src/derivatives/atlas.md

+Please note that the specification for spatial transforms (BEP 014) is currently
+under development, and therefore, the specification of transforms files may
+change in the future.


@PeerHerholz this is the mention.

I didn't want to explicitly get down the from- / to- issue because that has great potential to change (or establish some extra rules so that it is unambiguous).

pwighton · 2024-06-17T19:21:12Z

@pwighton —that's exactly the purpose. Yes, both are requirements of any BEP, and the proposal must abide by them.

Thanks @oesteban! With that out of the way, I have a few minor comments:

I'm glad to see the cohort entity, but cohort is currently defined as: "A subset of a defined template space". Seems like an odd definition, so I'm wondering if I'm missing something? I'd suggest changing the definition to something like "a sub-population over which an atlas or template was computed"
The suggested directory structure looks like:

mni152nlin2009casym-pipeline/
├─ tpl-MNI152NLin2009cAsym/
│  └─ anat/

Just curious what the role of the anat directory is here. Is it required? What do you think this would look like for the PS13 example? There, we have PET data mapped to an anatomical template so would we use anat to signify it is in an anatomical space or pet to signify it is derived from PET data?

I think the altas metadata should include Units as a RECOMMENDED feild. I understand metadata came from the previous proposal, so this comment applies to both proposals.

@CPernet

As @CPernet mentioned, the spec was unclear because the example illustrating ``seg-`` was using an atlas name for the label. I grepped the whole spec for ``seg-`` and only found ``seg-Desikan``. This commit replaces that label, edits a bit the text around it and updates the `seg-` entity definition in the schema to be more clear. Even though its use with ``atlas-`` was already mentioned: > For atlases, `seg-<label>` distinguish different realizations of a given > segmentation or parcellation. > For example, the [Yeo 2011 atlas](https://doi.org/10.1152/jn.00338.2011) > is distributed within *FreeSurfer* with two different parcellations > (*7 networks* and *17 networks*). I have updated the entity's description to be more clear (hopefully).

@CPernet

Makes more clear the structure of what encodes what: - Space: an abstract concept meaning that anatomy gives raise to "space" (i.e., subjects also generate a space) - Template: an instance of a space that represents a population by averaging features, in a way, template encodes the sample and registration algorithm to create feature maps - Atlas: an annotation of knowledge such as partitions (or families of partitions such as 7-networks and 17-networks) in a space with reference to a template. In other words, atlas would encode the methodology to create such partitions. - Seg(mentation): a label given to a partition of the space, which potentially instances an atlas' partition (and hence, its combined used with atlas). cc/ @CPernet, @melanieganz, @mnoergaard

for more information, see https://pre-commit.ci

oesteban · 2024-10-09T05:49:02Z

I'm happy to report that @melanieganz, @CPernet, @mnoergaard, and I had a productive Zoom call where they shared their concerns. I have tried to address them as follows:

CRITICAL - There needed to be a clear difference between the most relevant concepts introduced in the proposal --in particular, the differences between space and template, as well as between atlas and segmentation. I have re-worked this in d62f0d9
CRITICAL - @melanieganz made me realize of a gap in the examples, especially lacking those about when a new template (tpl-<label>) and/or new atlas(es) (atlas-<label>) are necessary. This one is addressed in e5e27f1
IMPORTANT - @CPernet showed me how the proposal was ambiguous at points, with examples that seemed to suggest prescriptions different from what the proposal meant. Addressed with two commits: 3accb28 and 8ffda51.

Melanie, Cyril, and Martin are currently going through the changes and we will meet again or work here to continue addressing problems, should they find new issues or old issues remain unresolved.

In the meantime, I believe this is also a good point to address @pwighton's comments. Thanks for your patience, Paul, because I have been unresponsive for too long.

In the PS13 example, the PET imaging data does not define the space and is therefore not a template, so wouldn't space-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz be more appropriate than tpl-fsaverage_atlas-ps13_desc-nopvc_dseg.nii.gz?

I hope this is addressed quite directly by d62f0d9 as it feels falling within the critique (1) above.

A template is an instance map implementing an abstract space. Hence, an actual map engendering the stereotactic reference of an individual or average brain (space) is better called a template.

Nonetheless, we were already on the same page because:

I think we are saying the same thing. This seems equivalent to me saying "a template is imaging data that defines a space". Do those statements seem equivalent to you? Maybe I don't actually know what stereotaxy means in this context.

Yes, and I hope this is now clearer in the proposal :)

This comment was followed by:

The issue is we do not want stereotaxy to be implemented from the PS13 data. We use the MNI152Lin template to put the PS13 data in MNI152Lin space, but we make no claims about the PS13 data being an authoritative reference that defines the space.

which aligns with comment (2) above. Basically, I've added an example for when you do want PS13 to generate stereotaxy (therefore, tpl-PS13). Now, the specs showcase the contrast between when you do NOT want stereotaxy (with the pre-existent examples) and the new counter-examples (tpl-PS13 and tpl-SUIT).

After that, Paul read the templateflow paper (BIG THANKS for that), and suggested:

Template: An aggregation of [continuous-valued|discrete-valued|continuous- or discrete-valued] data that [MAY|MUST] serve as the authoritative definition of a space.

authoritative definition of a space -> I love this. Let me make further edits to include it (will do asap). Regarding the data type (continuous/discrete), it feels like a rabbit hole. After all, all modalities are digitized and hence discrete -- this will probably create more confusion than clarity.

It's not clear to me if the it's the intention of the proposal to harmonize the definition of Template with the Nature 2022 paper or to harmonize the definition of Template and Atlas to Nature 2022, or neither, but some clarity around this would be very helpful, I think.

Neither. The paper heavily showcases how I think about templates and atlases, and templateflow was totally driven by BIDS principles. Templateflow has some flaws and limitations that should not make it into BIDS. Once the BEP is finished, and if it is convergent with templateflow, we will definitely leverage the BEP to update templateflow.

It seems the proposal needed much more clarity on the definitions/principles front. Thanks for the help clarifying those. Let me know if further changes are necessary.

melanieganz

@mnoergaard, @CPernet and I are satisfied with this!

src/schema/objects/common_principles.yaml

melanieganz · 2024-10-09T12:14:18Z

src/schema/objects/common_principles.yaml

@@ -186,3 +184,4 @@ template:
    Like subjects' feature maps generate a *native* spatial frame of reference for analyses,
    templates engender a *generic* or *standard* space of analysis were subjects can be spatiotemporally
    aligned into.
+    In other words, *templates* are specific feature maps that instantiate an abstract *space*.


yes, this is what we needed here! Do others understand this?

melanieganz · 2024-10-09T12:15:07Z

src/schema/objects/entities.yaml

-    (*7 networks* and *17 networks*).
+    is distributed within *FreeSurfer* with two different parcellations:
+    the *7 networks* MAY be described with `seg-7n`, for instance, and
+    the *17 networks* MAY correspondingly be described with `seg-17n`.


Looks good, but there are some updates in the examples needed.

What examples are you missing? Or maybe are you referring to instances where label- was used instead of seg- (I already fixed that)

melanieganz

Thanks, @CPernet, @mnoergaard and I added some small suggestions!

melanieganz · 2024-10-09T12:20:59Z

src/derivatives/atlas.md

+**Producing a new template AND atlas.**
+Atlasing is often performed with reference to a *custom* standard space.
+In this case, a feature template map is generated from all the participant(s)
+in the study, and the atlas' artifacts are produced with reference to that


It's not clear what artifacts means? Should it be label instead?
Could also just say "and the atlas is produced with reference..."

Responded below, because it can mean several things (segmentations, parcellations, landmarks, connected landmarks, fiber bundle geometries, etc.), I used the term artifacts.

Happy to find a good alternative if it is confusing (e.g., with imaging artifacts).

melanieganz · 2024-10-09T12:27:07Z

src/derivatives/atlas.md

+map and after that, a corresponding [*atlas*](../common-principles.md#definitions)
+was defined.
+In that case, the [`atlas-<label>`](../glossary.md#atlas-entities) SHOULD be omitted
+except several atlases need specification:


So the argument here is that "_atlas-PS13" should not appear alongside "_tpl-PS13" in the example below? This would only be the case if we have several instantiations aka templates of the same atlas.

melanieganz · 2024-10-09T12:31:21Z

src/derivatives/atlas.md

+            "tpl-PS13_dseg.json": "",
+            "tpl-PS13_dseg.tsv": "",
+            "tpl-PS13_stat-std_desc-fnirtNopvc_mimap.json": "",
+            "tpl-PS13_stat-std_desc-fnirtNopvc_mimap.nii.gz": "",


In reality this would rarely happen without the definition of "_space-fsaverage", aka an anatomical reference space. So maybe adding "_space-" entities in these examples to make this more realistic? (We do understand that it is possible, but is not what most people would do.)
Also in order to keep the two templates for the different spaces in the same folder, we need to have "_space-fsaverage" and "_space-MNI152Lin" to disambiguate.

Responded below -- if I understand correctly, I think this is a duplicate comment right? #1856 (comment)

src/derivatives/atlas.md

melanieganz · 2024-10-09T12:33:07Z

src/derivatives/atlas.md

+   "ps13-with-atlases-pipeline": {
+      "tpl-PS13": {
+         "pet": {
+            "tpl-PS13_atlas-Economo1916_desc-nopvc_dseg.nii.gz": "",


In which space is this? We are guessing this is in the PS13 space, but for generic users this might be hard to guess.
So for consistency it would be good to add space here everywhere as well (fsaverage and MNI152Lin).

You are guessing what the example intends to express. Since the template instantiates the space, then space is not necessary anymore. Think of subject:

sub-01_atlas-Economo1916_seg-thalamus_dseg.nii.gz - tells you that this subject has been segmented using the Economo atlas, and this file contains discrete segment (or segments) corresponding to the Thalamus. It would not make sense to say:

sub-01_atlas-Economo1916_space-01_seg-thalamus_dseg.nii.gz, because regardless of what other space that atlas was brought in, it is now in the subject's native space.

Now, your second sentence makes me doubt whether that was the question:

So for consistency it would be good to add space here everywhere as well (fsaverage and MNI152Lin).

In this case, fsaverage and MNI152NLin are intendedly removed (they are shown above anyway). Here, the idea was to show the case that a PS13 template was generated in a custom space (very much revolving around @pwighton's comment about whether you want your results to be generative or not of a new space, in this case showing the other part when you do want to generate a new space).

So tpl-PS13_atlas-Economo1916_space-{fsaverage,MNI152Lin}_desc-nopvc_dseg.nii.gz would mean a very different thing.

Also, for transparency and reproducibility, I would recommend people instead (or in addition) provide the transformations between the new-space-generating template and fsaverage and MNI152Lin.

I'm not sure adding those conversions would help clarify this example -- I'd prefer to keep it focused on what it is trying to show.

melanieganz · 2024-10-09T12:35:10Z

src/derivatives/atlas.md

+})
+}}
+
+where the `atlas-RamonCajal1908` and `atlas-Economo1916` hypothetically define


Why hypothetically? We don't need to say hypothetical if we use real cases as mentioned above.

It's easier for me to come up with examples using hypothetical atlases, but I can see how real cases are more valuable. I'll try to find time to correct my laziness.

src/derivatives/atlas.md

melanieganz · 2024-10-09T12:39:40Z

src/derivatives/atlas.md

+two different atlases (please note that, often, atlases are named after
+the first author and indicating a year of a reference communication).
+The original *default* or *implicit* atlas' artifacts such as
+the `tpl-PS13_desc-nopvc_dseg.nii.gz` segmentation,


Here you define segmentation as an atlas artifact. But what are other atlas artifacts? Why do we need to use "artifact" instead of directly using "segmentation"? This is confusing.

Because it could be a table of landmarks, for example, or clusterings of fiber bundles, etc. Happy to consider a different term if artifacts is annoying/unclear.

Does subproduct or derivative sound better than artifact to you? (I would avoid derivative in this context because the term is loaded with the overarching BIDS-Derivatives meaning)

** they don't sound much better to me, TBH.

melanieganz · 2024-10-09T12:42:13Z

src/derivatives/atlas.md

+            "LICENSE": "",
+            "README.md": "",
+            "dataset_description.json": "",
+            "tpl-SUIT_T1w.nii.gz": "",


We know that SUIT was originally by Diedrichsen defined in MNIXXX space, but a user might want to have the space entity here again.

The template is distributed with transformations to MNI152NLin2009cAsym. I don't have the details very fresh in my mind now, but that suggests it was generated in a space that can be aligned better to MNI.

My friend @yasseraleman is using it and indeed, he first aligns SUIT's template and then either use its transforms to move everything back to MNI or bring labels from there.

I mentioned the SUIT example precisely because, despite being close to MNI, it's not quite there (hence, deserving it's own space denomination).

melanieganz

very nice!

melanieganz

Again, super that worked on this! We have a suggestion for the last example.

src/derivatives/imaging.md

melanieganz · 2024-10-09T12:58:35Z

Thank you for all the proposed changes @oesteban!
We have made comments throughout and except for these minor comments we think it is ready for community review by @effigies @PeerHerholz @jdkent @francopestilli @pwighton @bendhouseart

pwighton · 2024-10-11T14:07:19Z

Thanks for the thoughtful reply and updates to the proposal, @oesteban. I'll review this next week and be in touch.

@effigies

Addresses @effigies' comment bids-standard#1856 (comment) Co-authored-by: Chris Markiewicz <effigies@gmail.com>

@melanieganz

Follows the comment by @melanieganz: bids-standard#1856 (comment) Co-authored-by: melanieganz <ganz@di.ku.dk>

Co-authored-by: melanieganz <ganz@di.ku.dk>

Co-authored-by: Paul Wighton <PWIGHTON@mgh.harvard.edu>

@effigies

cc @effigies @tsalo does this look reasonable? will the ``atlas/`` folder be picked up automatically?

for more information, see https://pre-commit.ci

src/schema/objects/columns.yaml

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

first iteration of the alternative proposal

4fa15e5

oesteban requested a review from effigies as a code owner June 12, 2024 21:31

This comment was marked as resolved.

Sign in to view

effigies added 2 commits June 12, 2024 19:23

STY: codespell

0180e3b

Replace links to the schema with links to the glossary

bba4159

enh: generalization of atlas metadata cc/ @jdkent

f159e61

oesteban requested review from tsalo, erdalkaraca and DimitriPapadopoulos as code owners June 17, 2024 09:32

oesteban force-pushed the bep038-review branch from 5207209 to e5c8927 Compare June 17, 2024 09:38

enh: address some of @effigies' comments

9390602

oesteban force-pushed the bep038-review branch from e5c8927 to 9390602 Compare June 17, 2024 09:39

oesteban added 5 commits June 17, 2024 11:47

enh: reference entities from glossary

3f8c210

fix: folder => directory

927335c

fix: resolving issues with/within entities

6a3dcc7

enh: add mention to transforms files cc/ @PeerHerholz

f7e906d

fix: revise more glossary links

57160cd

PeerHerholz reviewed Jun 17, 2024

View reviewed changes

oesteban commented Jun 17, 2024

View reviewed changes

oesteban force-pushed the bep038-review branch 2 times, most recently from 8f72fc1 to 58a5117 Compare October 7, 2024 08:34

oesteban force-pushed the bep038-review branch from f24a858 to d62f0d9 Compare October 7, 2024 08:53

[pre-commit.ci] auto fixes from pre-commit.com hooks

161db11

for more information, see https://pre-commit.ci

melanieganz reviewed Oct 9, 2024

View reviewed changes

src/derivatives/imaging.md Outdated Show resolved Hide resolved

src/derivatives/imaging.md Show resolved Hide resolved

src/derivatives/imaging.md Outdated Show resolved Hide resolved

oesteban and others added 13 commits October 11, 2024 16:55

fix: addressing some inconsistencies about segmentations

44cb808

fix: add clarification about cohort

c77df10

Addresses @effigies' comment bids-standard#1856 (comment) Co-authored-by: Chris Markiewicz <effigies@gmail.com>

fix: replace _mimap with _pet

947f279

Follows the comment by @melanieganz: bids-standard#1856 (comment) Co-authored-by: melanieganz <ganz@di.ku.dk>

fix: replace label- with seg- where applies

a5111aa

enh: update schema according to this proposal

c05146c

fix: folder -> directory (pacify pre-commit)

f221ad0

fix: order of scale- in the schema

9ec94cd

enh: refine atlas' definition in common principles

1a258b5

Co-authored-by: melanieganz <ganz@di.ku.dk>

enh: add "authoritative definition of spaces" to template

79f1ea9

Co-authored-by: Paul Wighton <PWIGHTON@mgh.harvard.edu>

enh: update schema to consider tpl- and cohort-

3f2bde5

cc @effigies @tsalo does this look reasonable? will the ``atlas/`` folder be picked up automatically?

[pre-commit.ci] auto fixes from pre-commit.com hooks

95a85a1

for more information, see https://pre-commit.ci

enh(schema): add atlas to entities

0d7f965

fix: delete photo from atlas' file rules

a9fa24a

effigies reviewed Oct 12, 2024

View reviewed changes

src/schema/objects/columns.yaml Outdated Show resolved Hide resolved

oesteban and others added 3 commits October 12, 2024 22:11

Update src/schema/objects/columns.yaml

44c3ad8

Co-authored-by: Chris Markiewicz <effigies@gmail.com>

fix: dereference twice schema files

b66c3ee

fix: add missing properties to atlas entity definition

c3b6f41

Alternative proposal to BEP038 #1856

Are you sure you want to change the base?

Alternative proposal to BEP038 #1856

Conversation

oesteban commented Jun 12, 2024 • edited by effigies Loading

Issue 1: Opening DatasetType to values other than raw and derivatives should have its own BEP.

Issue 2: The new value atlas for DatasetType evades the actual problem.

Issue 3: the folder structure is inconsistent with current BIDS raw and derivatives

Problem 1: longitudinal templates (and atlases)

Problem 2: multi-scale atlases

Problem 3: probabilistic surface parcellations.

Other issues

oesteban commented Jun 12, 2024

This comment was marked as resolved.

effigies commented Jun 13, 2024

effigies commented Jun 13, 2024 • edited Loading

oesteban commented Jun 13, 2024

jdkent commented Jun 13, 2024 • edited Loading

Scope of BEP

seg- entity consensus

General Comments

PeerHerholz commented Jun 14, 2024 • edited Loading

pwighton commented Jun 14, 2024

oesteban commented Jun 17, 2024

PeerHerholz left a comment

Choose a reason for hiding this comment

oesteban commented Jun 17, 2024

PeerHerholz commented Jun 17, 2024

Choose a reason for hiding this comment

pwighton commented Jun 17, 2024

oesteban commented Oct 9, 2024

melanieganz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

melanieganz left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

melanieganz left a comment

Choose a reason for hiding this comment

melanieganz left a comment

Choose a reason for hiding this comment

melanieganz commented Oct 9, 2024

pwighton commented Oct 11, 2024

oesteban commented Jun 12, 2024 •

edited by effigies

Loading

Issue 1: Opening `DatasetType` to values other than `raw` and `derivatives` should have its own BEP.

Issue 2: The new value `atlas` for `DatasetType` evades the actual problem.

effigies commented Jun 13, 2024 •

edited

Loading

jdkent commented Jun 13, 2024 •

edited

Loading

`seg-` entity consensus

PeerHerholz commented Jun 14, 2024 •

edited

Loading