-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Get initial traits in place for NeuroCore #1
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a pretty epic starting point, particular so much useful documentation for so many attributes (is this taken from elsewhere or written just for this library?)
I do feel like it's a lot of generic functions to support all these attributes and wonder whether you've tried a more dictionary-like interface for metadata. Having a single generic function to extract a "metadata dictionary-like object", followed by use of a type stable getproperty
interface on that metadata could provide a convenient syntax for many attributes without exporting such a lot of functions. This would also tie in with a more "table/data driven" approach to metadata, for example, functions to interactively list or search through all the standard metadata documentation would be invaluable.
Also I'm always a bit suspicious of lifting metadata into the type domain unless actually required ;-) In that regard, I'd be interested to know how you want to use the modality singleton types. Dispatching on values can be perfectly fine (very much depending on the context of course) and much kinder to the compiler.
I also saw that a lot of this is not neuroscience specific; rather there's many general MRI concepts here. Calling it NeruoCore
is fine, but it already seems like a general tool for medical imaging. (This is true of BIDS in general; while it's ostensibly for the brain and says a lot about neuroscience-specific modalities, the BIDS conventions are relevant for a wide variety of medical imaging datasets.)
In any case, it's great to see this being worked on, cheers!
src/coordinates.jl
Outdated
Returns affine matrix. For an instance that returns `spacedirections` this is | ||
the corresponding tuple converted to an array. | ||
""" | ||
affine_matrix(x::Any) = getter(x, "affine_matrix", MMatrix{4,4,Float64,16}, i -> default_affinemat(i)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you consider using CoordinateTransformations.AffineMap
for affine maps?
Personally I consider the use of 4x4 matrices for affine transformations to be an unfortunate failure of abstraction: applying them to lenght-3 vectors involves lifting those vectors into homogenous coordinates and back again, and this lifting provides extra complication without any real value.
As far as I know, the only time homogenous coordinates are really worthwhile is
- In computer vision applications where the set of 4x4 matrices also supports projection and there are some neat algebraic identities which can help with things like camera calibration.
- In languages which have builtin matrices but don't have free abstractions. Like matlab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I was originally using but then there were so many weird transformations that NIfTI required that I just used a 4x4 matrix. That being said, I think you're right about this and it would be better to use CoordinateTransformations.AffineMap
.
src/NeuroCore.jl
Outdated
effective_echo_spacing, | ||
effective_echo_spacing!, | ||
total_readout_time, | ||
total_readout_time! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that this is a large number of names to export, especially given
- A lot of these seem to be trivial wrappers around an internal dictionary-like interface.
- There's a large number of other attributes which aren't covered here, and achieving full coverage of the very large number of DICOM attributes seems impractical.
Did you consider a more dictionary-like interface for some of these attributes? For example, perhaps allowing syntax such as
metadata(x).manufacturer
which is possible to make type stable if done carefully
(Achieving x.manufacturer
is possible and tantalizing but likely undesirable due to the need to "steal" getproperty
on some non-concrete supertype of x
.)
I feel there may be two rough categories of metadata:
- Metadata which is directly required for working with the pixel data. For example, pixel spacings and coordinate systems.
- Metadata which is somewhat irrelevant or only indirectly relevant, such as the institution name.
The first type is a more limited set of metadata and we may plausibly get a minimal required set of functions for accessing this like ImageCore
. The latter type of metadata is fairly open ended and I'd suggest it's better handled with some kind of dictionary like lookup. Does the images ecosystem also have a similar distinction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a large number of other attributes which aren't covered here, and achieving full coverage of the very large number of DICOM attributes seems impractical.
I agree. That's why I've focused on what's in BIDS. It's still a lot but theoretically this can be broken up into smaller packages if it's found to be useful (eg., BIDSImage, BIDSEeg, BIDSData, etc).
The latter type of metadata is fairly open ended and I'd suggest it's better handled with some kind of dictionary like lookup.
A large portion of this is in fact a wrapper around a dictionary lookup but makes it type stable. JuliaImages uses properties
to retrieve a dictionary with keytype == String and valtype == Any. ImageMeta has a nice some nice syntax that allows img[::String]
to search the underlying property structure. However, that would restrict everything to have ImageMeta
as the top level wrapper where it might not be appropriate (e.g., non arrays such as graphs, fiber tracks, etc.).
One example where this could be immediately useful is if you look at MRIReco.jl where there are distinct structures for raw acquisitional data, sequence parameters, etc. But very little of that is needed in NIfTI files. There's some overlap but it's not enough to agree upon a single structure for say acquisition parameters. This example doesn't even account for GIFTI, CIFTI, BED , etc. where there the data isn't an image or is a mesh instead.
All that being said, I'm certainly open to a better solution. I just don't think there will ever be a single structure everyone can agree on which as you pointed out means using a dictionary. If there's another way of guaranteeing that I get Float64
everytime I call img["EchoTime"]
then we could just require that syntax.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's another way of guaranteeing that I get
Float64
everytime I callimg["EchoTime"]
then we could just require that syntax.
Sure. Very basic sketch:
struct Metadata
d::Dict{Symbol,Any}
end
function Base.getproperty(m::Metadata, s::Symbol)
if s === :EchoTime
getfield(m, :d)[:EchoTime]::Float64 # Inferrable
else
getfield(m, :d)[s] # Fallback (uninferrable)
end
end
julia> m = Metadata(Dict(:EchoTime=>1.0, :Foo=>"hi"))
Metadata(Dict{Symbol,Any}(:EchoTime => 1.0,:Foo => "hi"))
julia> m.EchoTime
1.0
julia> m.Foo
"hi"
julia> @code_warntype (m->m.EchoTime)(m)
Variables
#self#::Core.Compiler.Const(var"#15#16"(), false)
m::Metadata
Body::Float64
1 ─ %1 = Base.getproperty(m, :EchoTime)::Float64
└── return %1
julia> @code_warntype (m->m.Foo)(m)
Variables
#self#::Core.Compiler.Const(var"#17#18"(), false)
m::Metadata
Body::Any
1 ─ %1 = Base.getproperty(m, :Foo)::Any
└── return %1
Then provide a generic function metadata(x)
which for any x
should return something with the same property-like interface as Metadata
(it could literally be a concrete Metadata
, or some other package-specific type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clearly, that type assertion for ::Float64
is a simplification and you may want to dispatch to extra logic there.
The main point is that getproperty
can be used to look up properties in a flexible backing dictionary and certain blessed property names can be treated in a special way which makes them inferrable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like that approach a lot. I would need to create a unique array type that reimplements all of the dictionary traits in JuliaImages that using String
as the keytype though. That or convince JuliaImages to change to a Symbol
keytype.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, it will be interesting to see how it goes.
Note also that Metadata
in my sketch could be implemented with actual fields for a subset of "expected" metadata, and a dict as a fallback for the rest. This would avoid the dict lookup for the set of blessed attributes, and would therefore be more efficient provided x
keeps the Metadata in this format when it's constructed so that no work is required to compute metadata(x)
. Of course, you may want to call it MRIMetadata
or some such, given most of these fields are very MRI-specific. Other types of x
would have their own type with the same API but different blessed attributes.
struct MRIMetadata
echo_time::Union{Float64,Nothing}
# ...
d::Dict{Symbol,Any}
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This leads me to a somewhat ignorant question. When I call sizeof
on your structure it doesn't seem to matter if I use Nothing
or Float64
. Would it make sense to do something like this instead:
struct MRIMetadata{ET<:Union{Float64,Nothing}}
echo_time::ET
d::Dict{Symbol,Any}
end
This would prevent allocating memory for things that only have a couple fields that aren't nothing. I know this may be trivial right now, but if we want to have this metadata be flexible enough to work with something like individual fibers tracking algorithms, then it could get pretty expensive.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that parameterizing MRIMetadata
on the presence or absence of metadata fields would be unnecessarily hard on the compiler and confer no practical advantage (it will need to create a new type each time the metadata changes). If you're worried about storage, you could go with a pure-Dict based solution instead. That would be simpler anyway so it might be for the best. Dict lookup is quite fast and is unlikely to be done in any inner analysis loop.
(Besides this, I would have imagined the relative sizeof(MRIMetdata)
to sizeof(image_data)
could easily be 1:1000 in which case it's not worth worrying about.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that parameterizing MRIMetadata on the presence or absence of metadata fields would be unnecessarily hard on the compiler and confer no practical advantage
I should probably clarify. I imagined that most users wouldn't ever construct these and that it would serve as a common way of organizing and accessing metadata for those actually creating packages.
struct Metadata{ET<:Union{Float64,Nothing}}
echo_time::ET
d::Dict{Symbol,Any}
end
const EmptMetadata =Metadata{Nothing}
This would allow the person implementing an EEG file reader to not worry about echo time but still use the same interface. As you said, it may be simpler to just go with a dictionary.
I would have imagined the relative sizeof(MRIMetdata) to sizeof(image_data) could easily be 1:1000 in which case it's not worth worrying about.
I agree with this in so far as it pertains to images, but this should be applicable outside of MRI as well. For example, we could have a small connectome composed of a 13x13 array across 10,000 subjects. This is definitely the extreme but I'd hate to paint myself into a corner early on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's worrying you, I'd suggest go for the Dict
. It will be less code anyway :-)
But I also don't think you're painting yourself into a corner too much by adopting one implementation or another. The public interface to this stuff will be getproperty
, which is allowed to present the metadata as if it came from an actual field. But whether there's an actual field present is an implementation detail.
I suspect what's most important / potentially annoying here is to settle on appropriate names for the metadata fields because these are very much the public interface. For example echo_time
vs EchoTime
. Probably the only viable thing is to stick with the original field names as defined in the BIDS standard, even though they conflict with the Julia naming conventions.
src/time.jl
Outdated
This parameter is REQUIRED if corresponding fieldmap data is present or the | ||
data comes from a multi echo sequence. | ||
|
||
please note that the DICOM term is in milliseconds not seconds |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about just using Unitful
types for all physical measurements? It's reasonable easy (and safe!) to strip off unitful types with the newest variant of ustrip
, eg, ustrip(u"s", echo_time(x))
if users don't want them.
DICOM is not very consistent about SI prefixes and I feel its fairly easy to make a mistake with these.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that the Unitful
route is probably the best way to go with these. I would like the actual units to be consistent though. Do you think it would make sense for there to be something like x -> uconvert(u"s", x)
for the default here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually think it's fine for the values to be returned in a unit which is any multiple of seconds, provided that's reflected in the type. I think this accurately reflects the way that DSP hardware works, which is to count in integer multiples of some internal clock time step.
To the user there's not really much of a cost vs always returning in seconds. They can do ustrip(u"s", x)
or uconvert(u"s", x)
, and do arithmetic between compatible units as desired.
src/modality_agnostic.jl
Outdated
|
||
Returns the version of the BIDS standard that was used. | ||
""" | ||
bids_version(x) = getter(x, "BIDSVersion", String, x -> "1.0") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These attributes make sense for a BIDS dataset as a whole. But maybe not much sense for individual images.
I think I'm missing quite some context about what x
may be here, but I wonder whether some of these attributes belong in this library given they don't make much sense for individual images.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I completely agree. I've included them here for completeness until a more suitable home exists. I'd welcome any ideas on where these should belong.
Oh, I see you've added some extra description while I was writing the above review. Perhaps some of those links answer some questions above... |
Co-Authored-By: Alex Arslan <ararslan@comcast.net>
Sorry for my slow reply here.
Yep, that sounds like the right option.
I don't know enough to comment usefully. My own bias is that it's best to use Unitful. Presumably BIDS is a file format? Thus when doing IO couldn't you convert to seconds? A possible model is NRRD.jl which returns unitful-axes when the file is so annotated. Your BIDS reader could return all times in seconds but of course people could create their own metadata. Your writer could assume seconds if there are no units and otherwise convert to seconds before writing to disk. If some time values come in arrays rather than being single scalars, the other issue is whether you'll ever want to use
First, kudos on a huge amount of work! I don't know DICOM and BIDS enough to be very useful here, but I would tend to lean towards using more of a dictionary/property interface than a getter/setter interface. It might save you a lot of code. I'd also agree with the goal of avoiding type-tags where possible. For the coordinates, I can imagine the type-tags are necessary since I imagine coordinate transformation might be performance-sensitive. |
Actually no, it's a standard for curation of whole datasets containing multiple scans of multiple study subjects. It assumes that the unit of organization is the "session", which would typically be a scanning session in an MRI machine, resulting in multiple image types. The BIDS format does specify that individual images be stored in the NiFTi format which has typically been used in neuroscience as conversion target from DICOM. Hierarchical metadata about individual images is stored in json sidecar files. There are conventions for storing tabular data pertaining to the study. |
I'll see if I can get a basic PR for ImageMetadata going later today.
Hopefully this doesn't matter for mapping arrays because the time units are for axes or parameters. There may be some derived values in the image array (e.g., decay time in PET scans) but I would rather not open up that can of worms right now. What I'm proposing would be that we convert the parameters (eg., echo_time, dwell_time, etc.). to seconds (as unitful types). I will probably implement the same thing with the axes in NIfTI.jl's file reading, but there's no way we could/should ever enforce all axes to be in seconds as that would mean imposing the BIDS standard on JuliaImages. I can't image the audio visual people would like that. |
From a pragmatic point of view I do agree with that. I still find that pervasively using Unitful can be frictionful because not all packages work well with unitful quantities. Also it can be hard on the compiler. But for things like metadata I feel the cost is worth it because there can be such a depth and complex nonuniformity in metadata. So features which help people correctly understand and use metadata can be really worthwhile.
I still feel like this is somewhat unnecessary so I'd like to understand the rationale for the conversion. |
That being said, I imagine that decision was largely driven by a need to accept a raw number with no actual units. We have the unique advantage of being able to programmatically determine the units of measurement. Theoretically developers shouldn't have too much trouble getting at the unit of time they want. I'm starting to agree with your approach more. I'll meditate on it more today. |
I checked the BIDS specification again and saw the following:
https://bids-specification.readthedocs.io/en/stable/02-common-principles.html#units So to the extent that this is actually followed by implementations, let's definitely return seconds! (I'm not suggesting we deviate from what's in the files, just that we limit unnecessary conversion when reading those files.) Unfortunately in practice I've still seen a lot of mixed units in supposedly bids-format metadata files generated by the available tooling (possibly dcm2niix, possibly heudiconv; or possibly some of our own tooling; I'd have to check). I believe the mixed units are generally just inherited from the associated DICOM fields. For example ImagingFrequency expressed in MHz rather than Hz. |
I think this is the motivation for standardizing it in some way. With the current direction doing something like For now I think I'll stick with strictly following the units under the BIDS standard. This is by no means a final statement on how it will be. I think we could make a good argument for relaxing this but I want to see how it does in the wild first. It will be much easier to relax this if it's found to be too burdensome than it will be to require everyone to get use to a more strict standard. So I'll start with the strict units enforced and go from there. |
If you return unitful quantities then you can safely place responsibility for standardization on the consumer, and just report what's in the files. That's how I'd do it. |
* BIDS entities have own dedicated file * access to docstrings through `neurohelp` for non exported functions * Use getproperty and setproperty to access metadata * task_events -> events_table method * BIDS data entities are now `BIDSMetadata` * Added ImageMeta interface. This won't work publicly until #58 PR in ImageMetadata goes through
* Have basics tests for most BIDS entitites now * Enums mapping between for directions b/c many variations: - actual numeric values used in calculating direction are positive and negative integers. - strings are commonly used in JSON files - named directions are derived from this and stored in axes
- fixed some typos - bidsdata.jl now contains what could be the beginnings of a BIDS based datadeps - traits contains non BIDS entities properties
Most BIDS entities are tested. These tests will only pass with the upcoming PR to ImageMetadata implemented (symbol keys).
There's definitely a lot more to do but I think this is a good start for a first PR. I'll create issues for the other things that I think need to be done before a formal release can be made. I'll wait until tomorrow to merge this PR just in case someone else wants to take a look at things. |
This is the first set of traits proposed for NeuroCore. This is all based off of the Brain Imaging Data Standard so the exact traits/properties proposed should remain conceptually the same (return the same parameter). At this point the goals should be:
Outline
getter
andsetter
to enforce type stability and search for theproperties
a la ImageMeta.jl.sform
andqform
methods. This is derived from NIfTI standard but has been problematic in other imaging software working together (ITK and HCP project).getproperty
andsetproperty
for the BIDS entities.Issues
getproperty
interface with a dictionary backend. These map directly to internal functions. This allows for an internal function for quick access to the property without exported a ridiculous amount of methods. It also allows future packages to overload the specific method if faster access is needed for the property. This will be documented to make more clear in a subsequent PR.TODO
CoordinateTransformations.AffineMap