from_tiff convenience function #51

Merged
merged 5 commits into from
Sep 15, 2020

tlambert03
Owner

Adds a from_tiff convenience function that generates an OME object from a TIFF file, using tifffile.
I stopped short of adding tifffile to the dependencies, though (it is added as an extra). Ideas for extracting XML without tifffile welcome.
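
For readers unfamiliar with the "extra" pattern, here is a minimal sketch of how an optional dependency can be guarded at call time. The helper name `require_optional` and the package and extra names passed to it are hypothetical, and this is not the code merged in this PR.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, package: str, extra: str) -> ModuleType:
    """Import an optional dependency, or fail with an install hint.

    Hypothetical helper: `package` and `extra` are placeholders supplied
    by the caller, not names taken from this project.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"{module_name} is required for this feature; "
            f"install it with `pip install {package}[{extra}]`"
        ) from err
```

A function such as from_tiff could call a helper like this on first use, so that importing the package itself never requires tifffile.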

@jmuhlich
Collaborator

I got you fam: https://github.com/labsyspharm/ome-tiff-pyramid-upgrade/blob/master/pyramid_upgrade.py#L36
I wrote that to read, manipulate, and write IFDs so that legacy OME-TIFF "Faas" pyramids can be upgraded in place to the new Bio-Formats 6 format without reading or writing any of the pixel data, so it runs in an instant (and doesn't require space for a copy). With a little more work it could become a general-purpose IFD reading library (e.g. by adding classic TIFF support and the ability to refer to tags by name instead of number), but I could easily extract just enough code to read only the ImageDescription from the first IFD. It uses only pure Python.
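
To make the idea of reading IFD entries and referring to tags by name concrete, here is a small sketch. It is not taken from pyramid_upgrade.py: the function name `list_first_ifd_tags` and the `TAG_NAMES` table are illustrative assumptions, and it handles only classic (non-Big) TIFF headers.

```python
import io
import struct

# Small illustrative subset of baseline TIFF tag names (assumption: a real
# library would carry a much larger table).
TAG_NAMES = {256: "ImageWidth", 257: "ImageLength", 270: "ImageDescription"}


def list_first_ifd_tags(fh):
    """Return (code, name) pairs for the first IFD of a classic TIFF."""
    header = fh.read(4)
    if header == b"II*\0":
        byteorder = "<"  # little-endian
    elif header == b"MM\0*":
        byteorder = ">"  # big-endian
    else:
        raise ValueError("not a classic TIFF (BigTIFF is not handled here)")
    (ifd_offset,) = struct.unpack(byteorder + "I", fh.read(4))
    fh.seek(ifd_offset)
    (count,) = struct.unpack(byteorder + "H", fh.read(2))
    tags = []
    for _ in range(count):
        # Each classic entry is 12 bytes: code, type, count, value/offset.
        code, _type, _n, _value = struct.unpack(byteorder + "HHI4s", fh.read(12))
        tags.append((code, TAG_NAMES.get(code, f"Unknown({code})")))
    return tags


# Build a tiny two-tag TIFF header in memory; no pixel data is involved.
data = b"II*\0" + struct.pack("<I", 8) + struct.pack("<H", 2)
data += struct.pack("<HHI4s", 256, 3, 1, struct.pack("<H", 64) + b"\0\0")
data += struct.pack("<HHI4s", 270, 2, 3, b"ab\0\0")
data += struct.pack("<I", 0)  # offset of the next IFD (none)
tags = list_first_ifd_tags(io.BytesIO(data))
```

Only the header and the tag entries are read, which is why this kind of approach stays fast regardless of image size.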

@tlambert03
Owner Author

> I got you fam

lol... I had to google that 😂

wow 😱 that looks amazing. I think I'm all for it if you are! Though I do want a bit more thorough background on exactly what the differences are: what the limitations of tifffile are, what motivated you to do this, what potential limitations there might be here...

but yeah. this looks crazy

@jmuhlich
Collaborator

For now, if we just want to extract the ImageDescription it's probably 10-20 lines of code. I can omit all the defensive code (which I needed since I didn't want to corrupt files on write), writing code, and general purpose Tag/TagSet data structures. I'll see what I can scrape together. PR against this branch?

The rest of the code in TiffSurgeon would only be required for a full OME-TIFF read/write library, which is out of scope for ome_types. Or is it...?

@tlambert03
Owner Author

> PR against this branch?

yeah, I'd like to see what that looks like ... (also feel free to open a new PR if you want)

> The rest of the code in TiffSurgeon would only be required for a full OME-TIFF read/write library, which is out of scope for ome_types. Or is it...?

To me, it really doesn't matter if it's in ome-types or some other package that plays nicely with ome-types... but it would be great to have a pure Python library with a nice API to write a numpy array to ome.tiff. Again, I wouldn't want to reinvent any wheels (à la tifffile), but if there are new (Bio-Formats 6?) features that don't have a pure Python library yet, and if you've already got a start on that sort of thing, it would be great to hook it up here (whether directly or indirectly).

@cgohlke
Contributor

cgohlke commented Sep 15, 2020

> Ideas for extracting XML without tifffile welcome.

Here's a small standalone function to read the first ImageDescription tag value (as NULL terminated bytes) from an open TIFF file handle:

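def tiff_image_description(fh):
    """Return value of first ImageDescription tag from open TIFF file."""
    from struct import unpack
    # Map the 4-byte header (byte order + version) to struct formats and sizes:
    # classic TIFF uses 4-byte offsets and 12-byte tags, BigTIFF 8 and 20.
    offsetsize, offsetformat, tagnosize, tagnoformat, tagsize, codeformat = {
        b'II*\0': (4, '<I', 2, '<H', 12, '<H'),
        b'MM\0*': (4, '>I', 2, '>H', 12, '>H'),
        b'II+\0': (8, '<Q', 8, '<Q', 20, '<H'),
        b'MM\0+': (8, '>Q', 8, '>Q', 20, '>H'),
    }[fh.read(4)]
    # BigTIFF headers carry 4 extra bytes (offset size and a reserved field).
    fh.read(4 if offsetsize == 8 else 0)
    # Jump to the first IFD and scan its tags.
    fh.seek(unpack(offsetformat, fh.read(offsetsize))[0])
    for _ in range(unpack(tagnoformat, fh.read(tagnosize))[0]):
        tagstruct = fh.read(tagsize)
        if unpack(codeformat, tagstruct[:2])[0] == 270:  # ImageDescription
            # For ASCII data the value count is also the size in bytes.
            size = unpack(offsetformat, tagstruct[4:4+offsetsize])[0]
            if size <= offsetsize:
                # Short values are stored inline in the tag entry.
                return tagstruct[4+offsetsize:4+offsetsize+size]
            # Longer values live at the offset stored in the last field.
            fh.seek(unpack(offsetformat, tagstruct[-offsetsize:])[0])
            return fh.read(size)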

See also https://docs.openmicroscopy.org/ome-model/6.1.2/ome-tiff/code.html#extracting-a-tiff-comment
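
Because the function above returns NUL-terminated bytes, a caller still has to trim and decode the value before handing it to an XML parser. The following is a minimal sketch of that step, assuming the description is UTF-8 encoded OME-XML; the helper name `description_to_ome_xml` is hypothetical and not part of this PR.

```python
import xml.etree.ElementTree as ET


def description_to_ome_xml(raw: bytes) -> str:
    """Trim a NUL-terminated ImageDescription and check that it is OME-XML."""
    # Keep only the bytes before the first NUL terminator.
    xml_bytes = raw.split(b"\0", 1)[0]
    root = ET.fromstring(xml_bytes)
    # ElementTree renders namespaced tags as "{namespace}localname".
    if root.tag.rsplit("}", 1)[-1] != "OME":
        raise ValueError("ImageDescription does not contain an OME root element")
    # Assumption: the OME-XML block is UTF-8 encoded.
    return xml_bytes.decode("utf-8")
```

Checking the root element guards against TIFF files whose ImageDescription holds something other than OME-XML, such as free-text comments written by other software.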

@evamaxfield
Contributor

Resolves #50

@jmuhlich
Collaborator

Thanks @cgohlke! That's about what I was envisioning. :)

@tlambert03
Owner Author

Thanks @cgohlke!

Also ignore flake8's E203, since Black imposes some patterns that are incompatible
with it (and Black enforces formatting for all cases that E203 covers, anyway).

Co-authored-by: Christoph Gohlke <cgohlke@uci.edu>
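
As an illustration of the E203 conflict mentioned in the commit message, the sketch below shows the same slice written compactly and in the spaced form Black is expected to produce for slices with non-trivial bounds. The variable values are made up for the example; E203 is pycodestyle's "whitespace before ':'" check.

```python
# Made-up stand-ins for the values used in the TIFF reader.
tagstruct = bytes(range(20))
offsetsize = 4

# Compact form, as written in the review comment: no whitespace before ':'.
compact = tagstruct[4:4+offsetsize]

# Spaced form: Black treats ':' like a low-priority binary operator when the
# bounds are expressions, and the space before ':' is what E203 reports.
blackened = tagstruct[4 : 4 + offsetsize]

# Both spellings select the same bytes; only the formatting differs.
assert compact == blackened
```

Since the two forms are semantically identical, disabling E203 changes nothing about behavior; it only stops the linter and the formatter from disagreeing.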

@tlambert03 tlambert03 merged commit da07448 into master Sep 15, 2020
@tlambert03 tlambert03 deleted the from_tiff branch September 15, 2020 12:42
@tlambert03 tlambert03 mentioned this pull request Sep 15, 2020