Metadata (#93)
* Replace methods in dataset.py with classes

* Two types of samples

* Image metadata class not from opentile

* Use DicomAttribute to specify dicom attribute requirements

* Use FieldFactory for creating fields with json serialization properties

* Add from_dataset()-methods

* Specimen samples with preparation steps

* Extended optical path

* Initial use of marshmallow

* Remove dataclass field factory

* Update dependencies

* Improve restriction on sampling chain

* Use cached property for uids

* Add tests

* Run codespell

* Spell fixes

* Fix serialization and deserialization of specimen

* Class and instance for default values

* Read specimen from dataset

* Refactor

* Test slide sample from dataset

* Merge base, user, and default model

* Serialize staining instead of only compound

* Add default metadata to main inits

* Add license to new files

* Rename module for json serialization

* Dump/load dicom using marshmallow

* Rename json schemas

* Remove additional attributes

* Remove old comments and organize imports

* Tests for default models

* Add generic type

* Deserialize label

* Simplify sample json

* Simplify step json

* Move subclass behavior to subclass

* Separate dicom dataset methods from model

* Method for creating WsiMetadata from multiple datasets

* Lut implementation

* Allow separate start for lut components

* Move metadata to wsidicom

* Enable defining pixel spacing

* Add icc profile to test

* Rename test file

* Merge optical path lists

* Remove dateutil, update highdicom

* Fix empty `tile` fetching

* Update changelog

* Relax pydicom requirement

* Update for changes in wsidicom metadata

* Use remove confidential methods

* Release 0.12.0

* Update lock

* Update lock

* Fix tests
erikogabrielsson authored Jan 12, 2024
1 parent 8302ffd commit 7846b5d
Showing 36 changed files with 2,343 additions and 1,637 deletions.
3 changes: 2 additions & 1 deletion .gitignore
@@ -4,4 +4,5 @@ Pipfile.lock*
.pytest_cache*
*__pycache__**
.vscode
typings
typings
.coverage
16 changes: 14 additions & 2 deletions CHANGELOG.md
@@ -5,7 +5,17 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased] -
## [Unreleased]

## [0.12.0] - 2024-01-12

### Changed

- Replaced the `Dataset`-based metadata passed through the `modules`-parameter of `open()` and `convert()` with metadata models from `WsiDicom`. Use the `metadata`-parameter to define metadata that should override any metadata found in the source file, and the `default_metadata`-parameter to define metadata that should be used if no other metadata is defined.

### Fixed

- Fixed fetching empty tile regions with tiffslide and openslide.

## [0.11.0] - 2023-12-10

@@ -194,7 +204,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

- Initial release of wsidicomizer

[Unreleased]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.10.2..HEAD
[Unreleased]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.12.0..HEAD
[0.12.0]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.11.0..0.12.0
[0.11.0]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.10.2..0.11.0
[0.10.2]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.10.1..0.10.2
[0.10.1]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.10.0..0.10.1
[0.10.0]: https://github.com/imi-bigpicture/wsidicomizer/compare/0.9.3..0.10.0
119 changes: 90 additions & 29 deletions README.md
@@ -64,13 +64,19 @@ wsidicomizer -i 'path_to_wsi_file' -o 'path_to_output_folder'
-i, --input, path to input wsi file
-o, --output, path to output folder
-t, --tile-size, required depending on input format
-d, --dataset, optional path to json file defining base dataset
-m, --metadata, optional path to json file defining metadata
-d, --default-metadata, optional path to json file defining default metadata
-l, --levels, optional levels to include
-w, --workers, number of threads to use
--label, optional label image to use instead of label found in file
--no-label, if not to include label image
--no-overview, if not to include overview image
--no-confidential, if to not include confidential metadata
--chunk-size, number of tiles to give each worker at a time
--format, encoding format to use if re-encoding. 'jpeg' or 'jpeg2000'
--quality, quality to use if re-encoding.
--subsampling, subsampling option to use if re-encoding.
--offset-table, offset table to use, 'bot', 'eot', or 'None'
~~~~
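
As a rough illustration of how the options listed above combine, the following sketch assembles an argument list for the `wsidicomizer` command from Python. The helper `build_command` and the file names are hypothetical and not part of wsidicomizer; only flags from the listing above are used, and passing the tile size and worker count as integers is an assumption.

```python
from typing import List, Optional


def build_command(
    input_path: str,
    output_path: str,
    metadata_path: Optional[str] = None,
    default_metadata_path: Optional[str] = None,
    tile_size: Optional[int] = None,
    workers: Optional[int] = None,
    no_confidential: bool = False,
) -> List[str]:
    """Build an argument list for the wsidicomizer CLI (hypothetical helper)."""
    command = ["wsidicomizer", "--input", input_path, "--output", output_path]
    if metadata_path is not None:
        # Json file with metadata that overrides metadata from the source file.
        command += ["--metadata", metadata_path]
    if default_metadata_path is not None:
        # Json file with metadata used when no other metadata is defined.
        command += ["--default-metadata", default_metadata_path]
    if tile_size is not None:
        # Required depending on input format.
        command += ["--tile-size", str(tile_size)]
    if workers is not None:
        command += ["--workers", str(workers)]
    if no_confidential:
        command.append("--no-confidential")
    return command


# Example with made-up file names.
command = build_command(
    "slide.ndpi", "output", metadata_path="metadata.json", tile_size=1024
)
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` when the `wsidicomizer` command is installed.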

### Flags
@@ -86,48 +92,103 @@ Using the no-confidential-flag properties according to [DICOM Basic Confidential
- Acquisition DateTime
- Device Serial Number

## Basic notebook-usage
## Basic usage

***Create module datasets (Optional)***
***Create metadata (Optional)***

```python
from wsidicomizer.dataset import create_device_module, create_sample, create_specimen_module, create_brightfield_optical_path_module, create_patient_module, create_study_module
device_module = create_device_module(
    manufacturer='Scanner manufacturer',
    model_name='Scanner model name',
    serial_number='Scanner serial number',
    software_versions=['Scanner software versions']
from wsidicom.conceptcode import (
    AnatomicPathologySpecimenTypesCode,
    ContainerTypeCode,
    SpecimenCollectionProcedureCode,
    SpecimenEmbeddingMediaCode,
    SpecimenFixativesCode,
    SpecimenSamplingProcedureCode,
    SpecimenStainsCode,
)
sample = create_sample(
    sample_id='sample id',
    embedding_medium='Paraffin wax',
    fixative='Formalin',
    stainings=['hematoxylin stain', 'water soluble eosin stain']
from wsidicom.metadata import (
    Collection,
    Embedding,
    Equipment,
    Fixation,
    Label,
    Patient,
    Sample,
    Series,
    Slide,
    SlideSample,
    Specimen,
    Staining,
    Study,
)
specimen_module = create_specimen_module(
    slide_id='slide id',
    samples=[sample]
from wsidicomizer.metadata import WsiDicomizerMetadata

study = Study(identifier="Study identifier")
series = Series(number=1)
patient = Patient(name="FamilyName^GivenName")
label = Label(text="Label text")
equipment = Equipment(
    manufacturer="Scanner manufacturer",
    model_name="Scanner model name",
    device_serial_number="Scanner serial number",
    software_versions=["Scanner software versions"],
)
optical_module = create_brightfield_optical_path_module()
patient_module = create_patient_module()
study_module = create_study_module()

specimen = Specimen(
    identifier="Specimen",
    extraction_step=Collection(method=SpecimenCollectionProcedureCode("Excision")),
    type=AnatomicPathologySpecimenTypesCode("Gross specimen"),
    container=ContainerTypeCode("Specimen container"),
    steps=[Fixation(fixative=SpecimenFixativesCode("Neutral Buffered Formalin"))],
)

block = Sample(
    identifier="Block",
    sampled_from=[specimen.sample(method=SpecimenSamplingProcedureCode("Dissection"))],
    type=AnatomicPathologySpecimenTypesCode("tissue specimen"),
    container=ContainerTypeCode("Tissue cassette"),
    steps=[Embedding(medium=SpecimenEmbeddingMediaCode("Paraffin wax"))],
)

slide_sample = SlideSample(
    identifier="Slide sample",
    sampled_from=block.sample(method=SpecimenSamplingProcedureCode("Block sectioning")),
)

slide = Slide(
    identifier="Slide",
    stainings=[
        Staining(
            substances=[
                SpecimenStainsCode("hematoxylin stain"),
                SpecimenStainsCode("water soluble eosin stain"),
            ]
        )
    ],
    samples=[slide_sample],
)
metadata = WsiDicomizerMetadata(
    study=study,
    series=series,
    patient=patient,
    equipment=equipment,
    slide=slide,
    label=label,
)
```

***Convert a wsi-file into DICOM using python-interface***

```python
from wsidicomizer import WsiDicomizer
created_files = WsiDicomizer.convert(
    path_to_wsi_file,
    path_to_output_folder,
    [device_module, specimen_module, optical_module, patient_module, study_module],
    tile_size
    filepath=path_to_wsi_file,
    output_path=path_to_output_folder,
    metadata=metadata,
    tile_size=tile_size
)
```

tile_size is required for Ndpi- and OpenSlide-files.
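
The changelog entry for this release states that `metadata` overrides metadata found in the source file, while `default_metadata` is only used when no other metadata is defined. The sketch below illustrates that precedence with plain dictionaries. It is not wsidicomizer's implementation: the function `resolve_metadata` and the dictionary keys are hypothetical, and treating `None` as "not defined" is an assumption.

```python
from typing import Any, Dict


def resolve_metadata(
    user: Dict[str, Any],
    source: Dict[str, Any],
    default: Dict[str, Any],
) -> Dict[str, Any]:
    """Combine three metadata layers (hypothetical illustration).

    Layers are applied from lowest to highest priority, so a value from a
    later layer replaces a value from an earlier one. Values that are None
    are treated as undefined and never replace an existing value.
    """
    resolved: Dict[str, Any] = {}
    for layer in (default, source, user):
        for key, value in layer.items():
            if value is not None:
                resolved[key] = value
    return resolved


# Made-up values: the source file defines a manufacturer but no model name.
source = {"manufacturer": "From file", "model_name": None}
user = {"manufacturer": "User override"}
default = {"manufacturer": "Default", "model_name": "Default model"}

resolved = resolve_metadata(user, source, default)
print(resolved)
```

Here the user-supplied manufacturer wins over the value from the file, and the default model name is used because neither the user nor the file defines one.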

***Import a wsi file as a WsiDicom object.***

```python
@@ -147,7 +208,7 @@ Support for reading images using Openslide c library can optionally be enabled b
pip install wsidicomizer[openslide]
```

The OpenSlide extra requires the OpenSlide library to be installed separately. Instructions for how to install OpenSlide is avaiable on <https://openslide.org/download/>
The OpenSlide extra requires the OpenSlide library to be installed separately. Instructions for installing OpenSlide are available at <https://openslide.org/download/>
On Windows, you also need to add OpenSlide's bin-folder to the environment variable 'Path'

## Bioformats support
@@ -164,7 +225,7 @@ The `bioformats` extra enables usage of the `bioformats` module and the `bioform

### Using

As the Bioformats library is a java library it needs to run in a java virtual machine (JVM). A JVM is started automatically when the `bioformats` module is imported. The JVM can´t be restarted in the same Python inteprenter, and is therfore left running once started. If you want to shutdown the JVM (without closing the Python inteprenter) you can call the shutdown_jvm()-method:
As the Bioformats library is a Java library, it needs to run in a Java virtual machine (JVM). A JVM is started automatically when the `bioformats` module is imported. The JVM can't be restarted in the same Python interpreter, and is therefore left running once started. If you want to shut down the JVM (without closing the Python interpreter) you can call the shutdown_jvm()-method:

```python
import scyjava
@@ -175,7 +236,7 @@ Due to the need to start a JVM, the `bioformats` module is not imported when usi

### Bioformats version

The Bioformats java library is avaiable in two versions, one with BSD and one with GPL2 license, and can read several [WSI formats](https://bio-formats.readthedocs.io/en/v6.12.0/supported-formats.html). However, most formats are only avaible in the GPL2 version. Due to the licensing incompatibility between Apache 2.0 and GPL2, *wsidicomizer* is distributed with a default setting of using the BSD licensed library. The loaded Biformats version can be changed by the user by setting the `BIOFORMATS_VERSION` environmental variable from the default value `bsd:6.12.0`.
The Bioformats Java library is available in two versions, one with BSD and one with GPL2 license, and can read several [WSI formats](https://bio-formats.readthedocs.io/en/v6.12.0/supported-formats.html). However, most formats are only available in the GPL2 version. Due to the licensing incompatibility between Apache 2.0 and GPL2, *wsidicomizer* is distributed with a default setting of using the BSD licensed library. The loaded Bioformats version can be changed by the user by setting the `BIOFORMATS_VERSION` environment variable from the default value `bsd:6.12.0`.
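
A minimal sketch of how the `BIOFORMATS_VERSION` value could be read and split is shown below. The helper `read_bioformats_version` is hypothetical and not part of wsidicomizer. Reading the value as `license:version` is an assumption based on the default `bsd:6.12.0`, and the value `gpl:6.12.0` in the example is likewise assumed rather than taken from the documentation.

```python
import os
from typing import Mapping, Tuple

# Default value as stated in the README.
DEFAULT_BIOFORMATS_VERSION = "bsd:6.12.0"


def read_bioformats_version(
    environ: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (license, version) from BIOFORMATS_VERSION (hypothetical helper)."""
    value = environ.get("BIOFORMATS_VERSION", DEFAULT_BIOFORMATS_VERSION)
    # Split on the first colon only; assumes a "license:version" layout.
    license_name, _, version = value.partition(":")
    return license_name, version


# Without the variable set, the default is used.
print(read_bioformats_version({}))

# Assumed value for selecting the GPL2 licensed library.
print(read_bioformats_version({"BIOFORMATS_VERSION": "gpl:6.12.0"}))
```

Since the JVM is started when the `bioformats` module is imported, the environment variable presumably needs to be set before that import takes place.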

## Limitations

6 changes: 6 additions & 0 deletions package-lock.json