Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[REF] Consistently reference object keys in rules #1203

Draft
wants to merge 12 commits into
base: master
Choose a base branch
from

Conversation

tsalo
Copy link
Member

@tsalo tsalo commented Aug 16, 2022

This PR proposes that we use object keys (i.e., the unique keys of the dictionaries in the object schema files) as our primary object identifiers in the rule schema files. At the moment, we use a mixture of keys and fields within the object definitions, depending on the object type.

Suffixes and extensions are referenced using their object definitions' "value" field, while most (all?) other object types are referenced using the object keys.

Why we might want to make the change

  • It is more consistent to use the same identifier for all object types.
  • It would not be difficult to update our code to use object keys instead of a range of different fields.
  • The current system assumes that no extensions or suffixes will ever be overloaded (i.e., have more than one meaning). However, if, for some reason, we do need to overload an extension (e.g., two BEPs use established formats with the same extension), then we would need to reference unique identifiers like the keys.

Why we might not want to make the change

  • Using the unique object keys for some fields (i.e., extensions and suffixes) will be confusing for contributors. Suffixes and extensions are usually more interpretable in their literal form (e.g., .nii.gz instead of niigz or "Gzipped NIFTI"), and we try to ensure that neither object type is overloaded.
    • The steering group and maintainers have ensured that it is essentially a rule that certain objects cannot be overloaded, including suffixes. We can recommend against using existing extensions for different file formats, but there are some extreme edge cases where that could be impossible, such as adding in new modalities that use the same extension for different, well-established file formats.

Changes proposed:

  • Replace extension values with extension keys in the rules files.
    • For example, the key for TSV files is "tsv", while the value is ".tsv".
  • Replace suffix values with suffix keys in the rules files.

The problem is that test_rules looks for any dictionaries (including files) that have a key matching an object type, such as "suffixes", "metadata", or "entities". The name "entities" can thus be considered reserved.
Some were mistyped and didn't match the objects.
@tsalo
Copy link
Member Author

tsalo commented Aug 16, 2022

@TheChymera it looks like a bunch of validator tests are now failing, probably because of the changes I made to the rules files.

@tsalo tsalo added bug Something isn't working exclude-from-changelog This item will not feature in the automatically generated changelog schema-code Updates or changes to the code used to parse, filter, and render the schema. schema Issues related to the YAML schema representation of the specification. Patch version release. labels Aug 16, 2022
@tsalo tsalo removed the request for review from erdalkaraca August 16, 2022 14:07
@tsalo
Copy link
Member Author

tsalo commented Aug 16, 2022

One limitation of the test is that it doesn't recursively look through subfolders, so the derivatives folders are not being tested.

Also, labelgii and dlabelnii aren't defined in objects/extensions.yaml yet.

@TheChymera
Copy link
Collaborator

TheChymera commented Aug 16, 2022

@tsalo so this is an easy fix, which would just need to go deeper in the nested dictionary structure after this line. I can fix it, but I still don't think this extension reorganization is a good idea.... If I'm the only one, we can go ahead, but maybe explicitly asking for comments somehow would be warranted. This just adds imho unnecessary higher-level categorizations which will make schema editing more intransparent and enable all sorts of exciting discussions about how we divvy up strings/regexes among these abstraction categories.

- .nii.gz
- .nii
- .json
- niigz
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we maybe not do this? =^.^=

see #1203 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I still think we should, but it's up to the maintainers/steering group now.

@tsalo tsalo changed the title [FIX] Fix schema test checking rules against objects [REF] Consistently reference object keys in rules Aug 16, 2022
@tsalo tsalo marked this pull request as draft August 16, 2022 19:31
@tsalo tsalo added the discussion ongoing discussion label Aug 16, 2022
Co-authored-by: Chris Markiewicz <effigies@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working discussion ongoing discussion exclude-from-changelog This item will not feature in the automatically generated changelog schema Issues related to the YAML schema representation of the specification. Patch version release. schema-code Updates or changes to the code used to parse, filter, and render the schema.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants