Description
Whole slide image (WSI) data plays a significant role in the digital pathology field. However, integrating WSI into SpatialData is quite challenging.
What makes WSI different:
- Large file size: WSI data is typically large on disk, ranging from approximately 300 MB to 2 GB per slide, even with JPEG compression.
- Proprietary formats: Most WSI formats are not plain TIFF, let alone OME-TIFF. Many can only be read through a library such as OpenSlide or Bio-Formats.
- Read-only: In 99% of cases, users only need to read the WSI data, not modify it.
So far, there have been a few attempts to integrate WSI into SpatialData:
- DVP image readers lucas-diedrich/spatialdata-io#1
- SOPA's reader: https://github.com/gustaveroussy/sopa
The idea in both is to wrap OpenSlide behind xarray or a Zarr store to mimic the image interface in SpatialData. The issue is that this approach creates an unnecessary copy of the WSI data when the SpatialData object is serialized to disk. Without proper compression, this can lead to substantial disk usage. While that is feasible for small datasets, such as spatial transcriptomics (ST) with a few slides, it becomes impractical in digital pathology, which often deals with thousands of slides.
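To give a rough sense of scale, here is a back-of-the-envelope sketch of what one slide costs if its full-resolution level is written out as uncompressed 8-bit RGB. The slide dimensions and the slide count are assumptions picked only for illustration, not measurements from any dataset, and a chunked, compressed store would land well below this upper bound.

```python
def uncompressed_bytes(width, height, channels=3, bytes_per_sample=1):
    """Size in bytes of one uncompressed image level."""
    return width * height * channels * bytes_per_sample


# Hypothetical full-resolution dimensions (assumption, for illustration only).
width, height = 100_000, 80_000
raw = uncompressed_bytes(width, height)
print(f"{raw / 1e9:.1f} GB uncompressed per slide")

# Hypothetical cohort size (assumption).
n_slides = 1_000
total = raw * n_slides
print(f"{total / 1e12:.1f} TB uncompressed for {n_slides} slides")
```

Lower-resolution pyramid levels would add to this, so the figure covers the base level only.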
I currently have a solution that extends SpatialData with WSI readers: rendeirolab/wsidata. wsidata holds a reader object with extra APIs to access the WSI image, but it does not mount the image in the images slot of SpatialData like the previous solutions do. This way, we avoid unnecessary data copies during serialization. The main drawback is that it does not comply with the rest of the scverse ecosystem whenever anything image-related is involved.
Another potential solution is to have SpatialData create soft links to the WSI files on disk, so that when a user saves a SpatialData object, the WSI data does not have to be copied.
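A minimal sketch of the soft-link idea, using only the Python standard library. The helper name `link_wsi` and the `sample.zarr/wsi` directory layout are hypothetical and are not part of the SpatialData on-disk format; the point is only that the store ends up holding a link rather than a copy of the slide.

```python
import tempfile
from pathlib import Path


def link_wsi(wsi_path, store_dir, name=None):
    """Create a symlink to `wsi_path` inside `store_dir` and return the link path.

    Hypothetical helper: nothing here is SpatialData API.
    """
    target = Path(wsi_path).resolve()
    store_dir = Path(store_dir)
    store_dir.mkdir(parents=True, exist_ok=True)
    link = store_dir / (name or target.name)
    if link.is_symlink():
        # Replace a stale link; a real file at this path is left alone
        # and makes symlink_to raise FileExistsError.
        link.unlink()
    link.symlink_to(target)
    return link


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Stand-in for a real slide file.
        slide = Path(tmp) / "slide.svs"
        slide.write_bytes(b"\x00" * 1024)

        link = link_wsi(slide, Path(tmp) / "sample.zarr" / "wsi")
        print(link.is_symlink())   # the store holds a link, not a copy
        print(link.stat().st_size) # reads go through to the original file
```

The trade-off is that the saved object is no longer self-contained: if the original slide is moved or deleted, the link dangles, so a reader would need to check for that when loading.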
Hi @LucaMarconato, I discussed this with you a few months ago at the scverse conference. Hope we can find a graceful solution soon!