Prairie fixes #508
Merged
This avoids a spurious "Content is not allowed in prolog" error that Java normally emits when attempting to parse an XML document with UTF-8 encoding that includes a BOM: http://www.rgagnon.com/javadetails/java-handle-utf8-file-with-bom.html
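As a rough illustration of the idea (not the code in this branch), a leading UTF-8 BOM can be consumed before the stream reaches the XML parser. The class and method names below are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not the reader's actual code.
class BomSkipper {

  /** Wraps the stream so that a leading UTF-8 BOM (EF BB BF) is consumed if present. */
  static InputStream skipUtf8Bom(InputStream in) throws IOException {
    PushbackInputStream pin = new PushbackInputStream(in, 3);
    byte[] head = new byte[3];
    int n = 0;
    // read() may return fewer bytes than requested, so loop until 3 bytes or EOF.
    while (n < 3) {
      int r = pin.read(head, n, 3 - n);
      if (r < 0) break;
      n += r;
    }
    boolean bom = n == 3
      && (head[0] & 0xFF) == 0xEF
      && (head[1] & 0xFF) == 0xBB
      && (head[2] & 0xFF) == 0xBF;
    // Not a BOM: push the bytes back so the parser still sees them.
    if (!bom && n > 0) pin.unread(head, 0, n);
    return pin;
  }

  /** Convenience for in-memory data: decodes as UTF-8 after dropping any BOM. */
  static String decodeWithoutBom(byte[] data) {
    try (InputStream in = skipUtf8Bom(new ByteArrayInputStream(data))) {
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      byte[] buf = new byte[4096];
      for (int r = in.read(buf); r >= 0; r = in.read(buf)) out.write(buf, 0, r);
      return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
    catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The stream returned by `skipUtf8Bom` could then be handed to an XML parser, for example `DocumentBuilder.parse(InputStream)`.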
The usage of parallel arrays makes it very difficult to guarantee robust behavior in the case of missing and malformed data. The code is also harder than necessary to understand. The new Prairie metadata structure provides a "1st class citizen" for accessing Prairie-specific information. It will also ease later migration to the SCIFIO structure.
This change utilizes the new PrairieMetadata data structure, improving
on the behavior of the previous reader implementation as follows:
1. PrairieView is capable of acquiring multiple stage positions (i.e.,
multiple Images), but does not differentiate them in the metadata in
any way. Rather, all time points at all positions are simply numbered
sequentially as "cycles" and conflated.
However, we can disentangle stage positions from time points by
comparing stage positions. If we notice a repeating pattern of stage
positions, we record them as multiple series in the core metadata.
PrairieView acquires planes for all positions at a given time point
before moving on to the next time point. Hence, the rasterization
order is always P faster, T slower; e.g.:
P1T1, P2T1, P3T1, P1T2, P2T2, P3T2
2. PrairieView numbers each Sequence (i.e., stage position / time point
pair) with a "cycle" value, each Frame (i.e., focal plane of a
Sequence) with an "index" value, and each File (i.e., channel of a
Frame) with a "channel" value. These values are 1-based.
In our experience, the XML is always recorded in sequential order.
However, the new implementation does not assume that will always be
the case. Rather, it attempts to gracefully handle missing Sequences,
Frames and Files, using the cycle, index and channel metadata as the
canonical definition of how each TIFF file fits into the dataset. One
useful consequence of this flexibility is that the reader now
supports partial Prairie datasets: if the run is interrupted before
completion, PrairieView produces a partial dataset, with no XML
elements or TIFF files beyond the point of interruption. This is
useful if, for example, the phenomenon of interest is observed before
the acquisition is complete; there may be no reason to continue the
run after that point.
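One possible way to sketch the stage position disentangling from point 1: treat the per-cycle stage coordinates as a sequence, find its smallest repeating period, and use that period as the number of positions. This is a simplified sketch, not necessarily how the reader does it. The names are hypothetical, and it assumes coordinates repeat exactly; real data may need a tolerance.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; assumes exact coordinate equality between repeated positions.
class StagePattern {

  /**
   * Returns the smallest period p such that positions.get(i) equals
   * positions.get(i % p) for every i. With no repetition, p is the list size.
   */
  static int positionCount(List<double[]> positions) {
    int n = positions.size();
    for (int p = 1; p < n; p++) {
      boolean repeats = true;
      for (int i = p; i < n && repeats; i++) {
        repeats = Arrays.equals(positions.get(i), positions.get(i % p));
      }
      if (repeats) return p;
    }
    return n;
  }

  /** 0-based position index of the k-th cycle (0-based), with P varying fastest. */
  static int positionIndex(int k, int sizeP) {
    return k % sizeP;
  }

  /** 0-based time index of the k-th cycle (0-based), with T varying slowest. */
  static int timeIndex(int k, int sizeP) {
    return k / sizeP;
  }
}
```

With three positions visited over two time points, the fifth cycle (k = 4) maps to position index 1 and time index 1, i.e. P2T2 in the ordering shown above.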
If linesPerFrame or pixelsPerLine is missing, we fall back to using the first TIFF file's Y or X size, respectively.
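The fallback could look roughly like the following. The helper name is hypothetical, and treating an unparseable value the same as a missing one is an assumption of this sketch rather than something stated above.

```java
// Hypothetical helper; not the reader's actual code.
class SizeFallback {

  /**
   * Returns the size recorded in the XML (linesPerFrame for Y, pixelsPerLine
   * for X), or the first TIFF's size along that axis if the value is missing.
   * Assumption: a malformed value is handled like a missing one.
   */
  static int sizeOrFallback(String xmlValue, int tiffSize) {
    if (xmlValue == null) return tiffSize;
    try {
      return Integer.parseInt(xmlValue.trim());
    }
    catch (NumberFormatException e) {
      return tiffSize;
    }
  }
}
```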
This method returns the list of Sequences ordered by key. It will be useful to handle cases where the cycle numbering does not increment one at a time (e.g., for metadata with cycle=2,5,8,11,... or even for non-linear increment patterns).
PrairieView is capable of producing data where the cycle numbering does not increment one at a time (e.g., for metadata with cycle=2,5,8,11,...). We have not yet observed any datasets with variable, non-linear increment patterns (e.g., cycle=2,4,5,11,12,...), but this new approach should handle that too. The new approach works by explicitly creating and sorting a list of existing sequences, then using that list as the basis for sizeT & sizeP, rather than assuming that Sequence#getCycleCount() will be sensible.
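A minimal sketch of the "sort the existing sequences" approach, using a `TreeMap` keyed by cycle so that gaps and irregular increments do not matter. The class is hypothetical, and `String` stands in for the real Sequence type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical sketch; String stands in for the real Sequence type.
class SequenceTable {

  // TreeMap keeps its keys sorted, so iteration order is ascending cycle order.
  private final TreeMap<Integer, String> byCycle = new TreeMap<>();

  void put(int cycle, String sequence) {
    byCycle.put(cycle, sequence);
  }

  /** Sequences ordered by cycle, regardless of insertion order or gaps. */
  List<String> orderedSequences() {
    return new ArrayList<>(byCycle.values());
  }

  /** The cycle values that actually exist, in ascending order. */
  List<Integer> cycles() {
    return new ArrayList<>(byCycle.keySet());
  }
}
```

The size of the ordered list, rather than the largest cycle value, is then the number of sequences to distribute across sizeT and sizeP.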
In the case where a Sequence is flagged as a TSeries, the Frames of that Sequence must be treated as time points rather than focal planes. This logic was previously not fully propagated through the code.
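To make the distinction concrete, here is a small sketch of how a Frame's 1-based index might map onto Z or T within its Sequence depending on the TSeries flag. The names are hypothetical, and it only covers the axes inside a single Sequence.

```java
// Hypothetical sketch; covers only the Z/T axes within a single Sequence.
class FrameAxis {

  /** Returns {sizeZ, sizeT} contributed by one Sequence with the given frame count. */
  static int[] sizes(int frameCount, boolean timeSeries) {
    return timeSeries ? new int[] {1, frameCount} : new int[] {frameCount, 1};
  }

  /** Returns 0-based {z, t} for a Frame, given its 1-based "index" value. */
  static int[] frameToZT(int frameIndex, boolean timeSeries) {
    int i = frameIndex - 1; // Prairie index values are 1-based
    return timeSeries ? new int[] {0, i} : new int[] {i, 0};
  }
}
```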
Sadly, this branch does not currently build. Investigating...
Thanks to Melissa Linkert for noticing.
Thanks to Melissa Linkert for pointing this out.
I somehow missed this before, so it was always null.
This is required to properly handle cases where some channels are active and others are not, as defined in the CFG metadata. For example, we have datasets with "*_Ch1_*.tif" and "*_Ch3_*.tif" planes acquired, but no "*_Ch2_*.tif" files.
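For illustration, the mapping from sparse Prairie channel numbers to dense channel indices could be sketched as below. The input is assumed to be a per-channel active flag already parsed from the CFG; how those flags are stored in the CFG is not shown here, and the names are hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch; flags[i] is assumed to describe Prairie channel i + 1.
class ActiveChannels {

  /** Returns the 1-based channel numbers flagged active, in ascending order. */
  static int[] activeChannels(boolean[] flags) {
    int count = 0;
    for (boolean f : flags) if (f) count++;
    int[] result = new int[count];
    int j = 0;
    for (int i = 0; i < flags.length; i++) {
      if (flags[i]) result[j++] = i + 1;
    }
    return result;
  }

  /** Dense 0-based index of a Prairie channel number, or -1 if it is inactive. */
  static int channelIndex(int[] active, int channel) {
    // active is sorted ascending, so binary search is valid here.
    int i = Arrays.binarySearch(active, channel);
    return i >= 0 ? i : -1;
  }
}
```

With channels 1 and 3 active, Ch3 planes land at channel index 1 and a lookup for Ch2 reports that no such channel is expected.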
We no longer care about which channel min/maxes exist per frame, as we
rely totally on the CFG metadata to determine the desired channels.
To be absolutely sure that all of our sample Prairie datasets have
channel metadata given in CFG, I did a quick check:
find . -name '*.cfg' -print0 | xargs -0 grep -L channel_
And indeed they do. If we come across a dataset in the future with this
information missing, it would be pretty easy to synthesize based on the
per-frame channel min/maxes again, but until then, we don't need it.
More specifically, we ignore the X stage position inversion flag, because our sample data does not appear to respect it anyway.
Build errors fixed, and the code is working in my manual tests (using Fiji).
--test prairie
hflynn pushed a commit to hflynn/bioformats that referenced this pull request on Oct 11, 2013: Ldap filters (rebased onto dev_4_4)
This PR is the same as #217, but for the dev_4_4 branch.