Commit fd538c5

Merge branch 'dev' into dev_peso
2 parents be8ddad + 5856d80 commit fd538c5

66 files changed: +3892 -2161 lines changed

.coveragerc

Lines changed: 6 additions & 2 deletions
@@ -2,9 +2,13 @@
 
 [run]
 source = pathml
-omit =
-    pathml/preprocessing/base_*
 command_line = -m pytest
 
 [html]
 directory = coverage_report_html
+
+[report]
+exclude_lines =
+    pragma: no cover
+    if self.debug:
+    raise NotImplementedError
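
To illustrate what the new `[report]` section does, here is a minimal sketch of code whose lines would match the three `exclude_lines` patterns. The class `ExampleTransform` is hypothetical and not part of PathML. In coverage.py each pattern is a regular expression, and when a matching line opens a block, the whole block is left out of the report. Setting `exclude_lines` replaces the default patterns, which is presumably why `pragma: no cover` is listed explicitly.

```python
class ExampleTransform:
    """Hypothetical class, used only to show which lines the patterns match."""

    def __init__(self, debug=False):
        self.debug = debug

    def apply(self, values):
        if self.debug:  # matches "if self.debug:", so this block is excluded
            print(f"applying to {len(values)} values")
        return [v * 2 for v in values]

    def inverse(self, values):
        # matches "raise NotImplementedError", so the line below is excluded
        raise NotImplementedError

    def __repr__(self):  # pragma: no cover
        # the pragma on the def line excludes the whole method
        return f"ExampleTransform(debug={self.debug})"


print(ExampleTransform().apply([1, 2, 3]))
```

With this configuration, only `__init__`, the `return` in `apply`, and the `def` lines need to be exercised by tests to report full coverage for the class.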

.github/workflows/python-package-conda.yml renamed to .github/workflows/tests-conda.yml

Lines changed: 11 additions & 2 deletions
@@ -1,14 +1,16 @@
 name: Python Package using Conda
 
-on: [pull_request]
+on:
+  pull_request:
+    branches: [dev, master]
 
 jobs:
   build-linux:
     runs-on: ubuntu-latest
     strategy:
       max-parallel: 5
       matrix:
-        python-version: [3.7]
+        python-version: [3.8]
 
     steps:
     - uses: actions/checkout@v2
@@ -45,3 +47,10 @@ jobs:
       run: |
         conda install pytest
         python -m pytest
+    - name: Compile docs
+      shell: bash -l {0}
+      run: |
+        sudo apt-get install pandoc
+        pip install ipython sphinx nbsphinx nbsphinx-link sphinx-rtd-theme
+        cd docs
+        make html

CONTRIBUTING.rst

Lines changed: 9 additions & 4 deletions
@@ -47,14 +47,16 @@ Here's how to contribute code, documentation, etc.
 6. Write new tests as needed to maintain code coverage
 7. Ensure that all tests still pass
 8. Commit your changes and submit a pull request reference the corresponding issue
+9. Respond to discussion/feedback about the pull request. Make changes as necessary.
 
 Documentation Standards
 =======================
 
 All code should be documented, including docstrings for users AND inline comments for
 other developers whenever possible! Both are crucial for ensuring long-term usability and maintainability.
-Documentation is automatically generated using the Sphinx `autodoc`_ extension from properly formatted docstrings.
-All documentation (including docstrings) are written in `reStructuredText`_ format.
+Documentation is automatically generated using the Sphinx `autodoc`_ and `napoleon`_ extensions from
+properly formatted Google-style docstrings.
+All documentation (including docstrings) is written in `reStructuredText`_ format.
 See this `docstring example`_ to get started.
 
 To build documentation:
@@ -65,10 +67,12 @@ To build documentation:
     cd docs # enter docs directory
     make html # build docs in html format
 
+Open ``docs/build/html/index.html`` in your favorite web browser.
+
 Testing Standards
 =================
 
-All new code should be accompanied by tests, whenever possible, to maintain good code coverage.
+All new code should be accompanied by tests, whenever possible, to maintain good code coverage (target >90%).
 We use the `pytest`_ testing framework.
 All tests should pass for new code, and new tests should be added as necessary when fixing bugs.
 
@@ -91,4 +95,5 @@ Thank you for helping make ``PathML`` better!
 .. _pytest: https://docs.pytest.org/en/stable/
 .. _autodoc: https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html
 .. _reStructuredText: https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html
-.. _docstring example: https://sphinx-rtd-tutorial.readthedocs.io/en/latest/docstrings.html
+.. _docstring example: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html
+.. _napoleon: https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html
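
Since CONTRIBUTING.rst now asks for Google-style docstrings, a short sketch of that layout may help. The function `count_tiles` below is a made-up example, not a PathML function; the `Args`, `Returns`, and `Raises` sections are the ones napoleon parses.

```python
def count_tiles(slide_width, slide_height, tile_size=256):
    """Count the non-overlapping square tiles that fit inside a slide.

    Args:
        slide_width (int): Slide width in pixels.
        slide_height (int): Slide height in pixels.
        tile_size (int, optional): Side length of each tile in pixels.
            Defaults to 256.

    Returns:
        int: Number of whole tiles that fit. Partial tiles at the right
        and bottom edges are not counted.

    Raises:
        ValueError: If ``tile_size`` is not positive.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    # Integer division drops the partial tiles along each edge.
    return (slide_width // tile_size) * (slide_height // tile_size)


print(count_tiles(1024, 512))
```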

README.md

Lines changed: 36 additions & 5 deletions
@@ -24,14 +24,45 @@ A toolkit for computational pathology and machine learning.
 
 ## Installation
 
+1. Clone repo
+
+````
+git clone https://github.com/Dana-Farber/pathml.git
+cd pathml
+````
+
+2. Set Up Conda Environment
+
+````
+conda create --name pathml
+conda activate pathml
+````
+
+3. Install CUDA. This step only applies if you want to use GPU acceleration for model training or other tasks. This guide should work, but for the most up-to-date instructions, refer to the [official PyTorch installation instructions](https://pytorch.org/get-started/locally/).
+
+- Check the version of CUDA:
+
+````
+nvidia-smi
+````
+
+- Install correct version of `cudatoolkit`:
+
+````
+# update this command with your CUDA version number
+conda install cudatoolkit=11.0
+````
+
+
+4. Install PathML
+
 ````
-git clone https://github.com/Dana-Farber/pathml.git # clone repo
-cd pathml # enter repo directory
-conda env create -f environment.yml # create conda environment
-conda activate pathml # activate conda environment
-pip install -e . # install pathml in conda environment
+conda env update -f environment.yml # install dependencies
+pip install -e . # install pathml
 ````
 
+>> to verify PyTorch installation with GPU support: `python -c "import torch; print(torch.cuda.is_available())"`
+
 ## Generate Documentation
 
 This repo is not yet open to the public. Once we open source it, we will host documentation online.
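
The one-line GPU check added to the README raises `ImportError` if PyTorch is missing. A slightly more forgiving variant could look like the sketch below; the helper name `describe_gpu_support` is an assumption for illustration and is not part of PathML or PyTorch.

```python
import importlib.util


def describe_gpu_support():
    """Return a one-line summary of whether PyTorch can see a CUDA device."""
    # find_spec avoids an ImportError when PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"

    import torch

    if torch.cuda.is_available():
        # Device index 0 is the first GPU visible to this process.
        return f"CUDA available: {torch.cuda.get_device_name(0)}"
    return "PyTorch is installed, but no CUDA device is visible"


print(describe_gpu_support())
```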

docs/source/api_reference_full.rst

Lines changed: 20 additions & 14 deletions
@@ -1,30 +1,34 @@
 Full API Reference
 ==================
 
-Preprocessing
--------------
+Core
+----
 
-.. automodule:: pathml.preprocessing.wsi
+.. automodule:: pathml.core.slide_data
     :members:
-.. automodule:: pathml.preprocessing.multiparametricslide
+.. automodule:: pathml.core.slide_classes
     :members:
-.. automodule:: pathml.preprocessing.slide_data
+.. automodule:: pathml.core.slide_backends
     :members:
-.. automodule:: pathml.preprocessing.pipeline
+.. automodule:: pathml.core.tile
     :members:
-.. automodule:: pathml.preprocessing.tiling
+.. automodule:: pathml.core.tiles
     :members:
-.. automodule:: pathml.preprocessing.stains
+.. automodule:: pathml.core.masks
     :members:
-.. automodule:: pathml.preprocessing.transforms
+.. automodule:: pathml.core.h5managers
     :members:
-.. automodule:: pathml.preprocessing.transforms_HandE
+
+Preprocessing
+-------------
+
+.. automodule:: pathml.preprocessing.pipeline
     :members:
-.. automodule:: pathml.preprocessing.utils
+.. automodule:: pathml.preprocessing.tiling
     :members:
-.. automodule:: pathml.preprocessing.base
+.. automodule:: pathml.preprocessing.transforms
     :members:
-
+    :undoc-members:
 
 Datasets
 --------
@@ -39,5 +43,7 @@ Datasets
 ML
 --
 
-.. automodule:: pathml.ml
+.. automodule:: pathml.ml.hovernet
+    :members:
+.. automodule:: pathml.ml.utils
     :members:

docs/source/conf.py

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@
 # -- Project information -----------------------------------------------------
 
 project = 'PathML'
-copyright = '2020, DFCI'
+copyright = '2021, Dana-Farber Cancer Institute'
 author = 'Jacob Rosenthal'
 
 version = '0.0.1'

docs/source/custom_pipelines.rst

Lines changed: 51 additions & 90 deletions
@@ -1,113 +1,74 @@
 Custom Preprocessing Pipelines
 ==============================
 
-``PathML`` comes with a set of pre-made pipelines ready to use out of the box.
-However, it may also be necessary in many cases to create custom preprocessing pipelines tailored to the specific
-application at hand.
+``PathML`` makes designing preprocessing pipelines easy. In this section we will walk through how to define a
+:class:`~pathml.preprocessing.pipeline.Pipeline` object by composing pre-made
+:class:`~pathml.preprocessing.transforms.Transform`s, and how to implement a
+new custom :class:`~pathml.preprocessing.transforms.Transform`.
 
 Pipeline basics
 ---------------
 
-Preprocessing pipelines are defined in objects that inherit from the BasePipeline abstract class.
-The preprocessing logic for a single slide is defined in the ``run_single()`` method.
-Then, when the ``run()`` method is called, the input is checked to see whether it is a single slide or a dataset of
-slides. The ``run_single()`` method is then called as appropriate, and multiprocessing is automatically handled in the
-case of processing an entire dataset.
+Preprocessing pipelines are defined in :class:`~pathml.preprocessing.pipeline.Pipeline` objects.
+When :meth:`~pathml.core.slide_data.SlideData.run`
+is called, tiles are lazily extracted from the slide by
+:meth:`~pathml.core.slide_data.SlideData.generate_tiles` and passed to the
+:class:`~pathml.preprocessing.pipeline.Pipeline`, which modifies the :class:`~pathml.core.tile.Tile` object in place.
+Finally, the processed tile is saved.
+This design facilitates preprocessing of gigapixel-scale whole-slide images, because :class:`~pathml.core.tile.Tile`
+objects are small enough to fit in memory.
 
-To define a new pipeline, all that is necessary is to define the ``run_single()`` method.
-The method should take a ``BaseSlide`` object as input (or a specific type of slide inheriting from the ``BaseSlide``
-class), and should write the processed output to disk. Because the ``run()`` method is just a wrapper around the
-``run_single()`` method, there is no need to override the default ``run()``.
+Composing a Pipeline
+--------------------
 
-A ``SlideData`` object can be used to hold intermediate outputs, so that a preprocessing step can have access to
-outputs from earlier steps.
+In many cases, a preprocessing pipeline can be thought of as a sequence of transformations.
+:class:`~pathml.preprocessing.pipeline.Pipeline` objects can be created by composing
+a list of :class:`~pathml.preprocessing.transforms.Transform`:
 
-Interacting with slides
-------------------------
+.. code-block:: python
 
-Pipelines must take ``BaseSlide`` objects as input.
-This interaction between Pipelines and Slides is very important - design choices here can affect pipeline execution
-times by orders of magnitude!
-This is because whole-slide images can be very large, even exceeding the amount of available memory in most machines!
+    pipeline = Pipeline([
+        BoxBlur(kernel_size=15),
+        TissueDetectionHE(mask_name = "tissue", min_region_size=500,
+                          threshold=30, outer_contours_only=True)
+    ])
+..
 
-.. note::
+In this example, the preprocessing pipeline will first apply a box blur kernel, and then apply tissue detection.
+It is that easy to compose pipelines by mixing and matching :class:`~pathml.preprocessing.transforms.Transform` objects!
 
-    Naively loading an entire WSI into memory at high-resolution should therefore be avoided in most cases!
 
-Consider these best-practices when designing custom pipelines:
+Custom Transforms
+-----------------
 
-- Make use of the ``BaseSlide.chunks()`` method to process the WSI in smaller chunks
-- Perform operations on lower-resolution image levels, when possible (i.e. when the slide has multiple resolutions
-  available and the operation will not suffer from decreased resolution)
-- Be cognizant of memory requirements at each step in the pipeline
-- Avoid loading entire slides into memory at high-resolution!
+A :class:`~pathml.preprocessing.pipeline.Pipeline` is a special case of
+a :class:`~pathml.preprocessing.transforms.Transform` which makes it easy
+to compose several :class:`~pathml.preprocessing.transforms.Transform`s sequentially.
+However, in some cases, you may want to implement a :class:`~pathml.preprocessing.transforms.Transform` from scratch.
+For example, you may want to apply a transformation which is not already implemented in ``PathML``.
+Or, perhaps you want to apply a preprocessing pipeline which is not perfectly sequential.
 
-Using Transforms
--------------------
+To define a new custom :class:`~pathml.preprocessing.transforms.Transform`,
+all you need to do is create a class which inherits from :class:`~pathml.preprocessing.transforms.Transform` and
+implements an ``apply()`` method which takes a :class:`~pathml.core.tile.Tile` as an argument and modifies it in place.
+You may also implement a functional method ``F()``, although that is not strictly required.
 
-``PathML`` provides a set of modular Transformation objects to make it easier to define custom preprocessing pipelines.
-Individual low-level operations are implemented in ``Transform`` objects, through the ``apply()`` method.
-This consistent API makes it convenient to use complex operations in pipelines, and combine them modularly.
-There are several types of Transforms, as defined by their inputs and outputs:
+For example, let's take a look at how :class:`~pathml.preprocessing.transforms.BoxBlur` is implemented:
 
-================== ========== ===========
-Transform type     Input      Output
-================== ========== ===========
-ImageTransform     image      image
-Segmentation       image      mask
-MaskTransform      mask       mask
-================== ========== ===========
+.. code-block:: python
 
-Some things to consider when implementing a custom pipeline:
-
-- Use existing Transforms when possible! This will save time compared to implementing the entire pipeline from scratch.
-- If implementing a new transformation or pipeline operation, consider contributing it to ``PathML`` so that other
-  users in the community can benefit from your hard work! See: contributing
-- Be aware of memory and computation requirements of your pipeline.
-
-
-Examples
---------
-
-In this example we'll define a Pipeline which reads chunks of the input slide, applies a box blur with a given kernel
-size, and then writes the blurred image to disk.
-
-.. code-block::
-
-    import os
-    import cv2
-    from pathml.preprocessing.base import BasePipeline
-    from pathml.preprocessing.transforms import BoxBlur
-    from pathml.preprocessing.wsi import HESlide
-
-    class ExamplePipeline(BasePipeline):
-        def __init__(self, kernel_size):
+    class BoxBlur(Transform):
+        """Box (average) blur kernel."""
+        def __init__(self, kernel_size=5):
             self.kernel_size = kernel_size
 
-        def run_single(self, slide, output_dir):
-            blur = BoxBlur(kernel_size)
-            for i, chunk in enumerate(slide.chunks(level = 0, size = 1000)):
-                blurred_chunk = blur.apply(chunk)
-                fname = os.path.join(output_dir, f"chunk{i}.jpg")
-                cv2.imwrite(fname, blurred_chunk)
-
-    # usage
-    wsi = HESlide("/path/to/wsi.svs")
-    ExamplePipeline(kernel_size = 11).run(wsi)
-
-
-In this example, we define a Transform which changes the order of the channels in the input RGB image.
-
-.. code-block::
+        def F(self, image):
+            return cv2.boxFilter(image, ksize = (self.kernel_size, self.kernel_size), ddepth = -1)
 
-    from pathml.preprocessing.base import ImageTransform
+        def apply(self, tile):
+            tile.image = self.F(tile.image)
+..
 
-    class ChannelSwitch(ImageTransform):
-        def apply(self, image):
-            # make sure that the input image has 3 channels
-            assert image.shape[2] == 3
-            out = image
-            out[:, :, 0] = image[:, :, 2]
-            out[:, :, 1] = image[:, :, 0]
-            out[:, :, 2] = image[:, :, 1]
-            return out
+That's it! Once you define your custom :class:`~pathml.preprocessing.transforms.Transform`,
+you can plug it in with any of the other :class:`~pathml.preprocessing.transforms.Transform`s,
+compose :class:`~pathml.preprocessing.pipeline.Pipeline`, etc.
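
The pattern described in the new docs (a `Transform` with `F()` and an in-place `apply()`, and a `Pipeline` that is itself a `Transform`) can be tried without installing PathML. The sketch below uses small stand-in classes. `Tile`, `Transform`, and `Pipeline` here are simplified assumptions that only mirror the documented interface, not PathML's implementation, and `Invert` and `Threshold` are hypothetical transforms operating on nested lists rather than image arrays.

```python
class Tile:
    """Stand-in for pathml.core.tile.Tile: holds only an ``image`` attribute."""

    def __init__(self, image):
        self.image = image


class Transform:
    """Stand-in base class: subclasses implement apply(tile) in place."""

    def apply(self, tile):
        raise NotImplementedError


class Pipeline(Transform):
    """Stand-in pipeline: a Transform that applies other Transforms in order."""

    def __init__(self, transforms):
        self.transforms = list(transforms)

    def apply(self, tile):
        for transform in self.transforms:
            transform.apply(tile)


class Invert(Transform):
    """Hypothetical custom transform: invert 8-bit pixel values."""

    def F(self, image):
        # Functional form: takes an image, returns a new image.
        return [[255 - px for px in row] for row in image]

    def apply(self, tile):
        tile.image = self.F(tile.image)


class Threshold(Transform):
    """Hypothetical custom transform: binarize pixels at a cutoff."""

    def __init__(self, cutoff=128):
        self.cutoff = cutoff

    def F(self, image):
        return [[255 if px >= self.cutoff else 0 for px in row] for row in image]

    def apply(self, tile):
        tile.image = self.F(tile.image)


tile = Tile([[0, 100], [200, 255]])
Pipeline([Invert(), Threshold(cutoff=128)]).apply(tile)
print(tile.image)  # [[255, 255], [0, 0]]
```

Keeping the pixel logic in `F()` and the tile bookkeeping in `apply()` means the same operation can be unit-tested on a bare image and reused inside a pipeline.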

docs/source/examples.rst

Lines changed: 0 additions & 13 deletions
This file was deleted.

docs/source/examples/link_advanced_HE_chunks.nblink

Lines changed: 0 additions & 3 deletions
This file was deleted.

docs/source/examples/link_basic_HE.nblink

Lines changed: 0 additions & 3 deletions
This file was deleted.
