@shaneahmed

1.4.0 (2023-04-24)

Major Updates and Feature Improvements

Changes to API

Bug Fixes and Other Changes

Development related changes

  • Upgrades dependencies which are dependent on Python 3.7
  • Moves requirements*.txt files to requirements folder
  • Removes tox
  • Uses pyproject.toml for bdist_wheel, pytest and isort
  • Adds joblib and numba as dependencies.

shaneahmed and others added 30 commits February 1, 2022 11:53
- Update `tiatoolbox` according to black v22.1.0
- Add badge for biorxiv paper.
Thanks @sarthakpati for helping in release of conda package for TIAToolbox.
- Fix flake8 errors and typos in `stainextract.py`

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
# Main Changes

1. Add a new `DICOMWSIReader` for reading DICOM WSIs. Very similar to `OpenSlideWSIReader`.
   - Add new sample DICOM image to the toolbox samples server by converting CMU-1.svs with [wsidicomizer](https://github.com/sectra-medical/wsidicomizer) and zipping it.
   - Update function for fetching remote samples to enable downloading of a zip (as the DICOM WSI is a directory) and unzipping it after download.
2. Refactor some code to fix linter issues, reduce complexity, and move private functions
from `WSIReader` to `WSIMeta` where appropriate.
    - Move `level_downsample` to `WSIMeta` to calculate the downsample for a level as it only requires access to information in the metadata object (sizes of levels in the pyramid).
    - Move `relative_level_scales` to `WSIMeta` as it only depends on `level_downsample` and data from the metadata object.
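
As a rough illustration of why these helpers only need metadata, the sketch below computes downsample factors from nothing but the pyramid level sizes. The function names and signatures are hypothetical simplifications for this example, not the actual `WSIMeta` API.

```python
from typing import List, Sequence, Tuple

Size = Tuple[int, int]  # (width, height) of one pyramid level


def sketch_level_downsamples(level_dimensions: Sequence[Size]) -> List[float]:
    """Downsample factor of every level relative to level 0 (hypothetical helper)."""
    base_width, base_height = level_dimensions[0]
    factors = []
    for width, height in level_dimensions:
        # Average the per-axis ratios so rounding of level sizes has less effect.
        factors.append((base_width / width + base_height / height) / 2)
    return factors


def sketch_relative_scales(
    level_dimensions: Sequence[Size], target_downsample: float
) -> List[float]:
    """Scale of each level relative to a target downsample (1.0 means exact match)."""
    return [d / target_downsample for d in sketch_level_downsamples(level_dimensions)]


dims = [(4096, 4096), (2048, 2048), (1024, 1024)]
print(sketch_level_downsamples(dims))
print(sketch_relative_scales(dims, 2.0))
```

Because both functions take only the list of level sizes, they can live on a metadata object without needing an open slide handle.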

### Additional Changes

- DICOM terms added to whitelist.
- This PR now also updates some requirements constraints in response to comments.
- Pre-commit config for black updated due to CI issues.

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: George Hadjigeorgiou <31721507+ghadjigeorghiou@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
- Update installation instructions
- Update required dependencies
- Update README badges

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: David Epstein <22086916+DavidBAEpstein@users.noreply.github.com>
- Fix some issues in graph.py

1. Fix typos in docstrings.
2. Fix typo in function name triangle -> triangle
3. Fix a bug in graph visualization where some edges would not be drawn.

Some other things have been fixed as they were discovered during this PR:
1. Fix a bug in function to convert edge index to triangles (it would find the same triangle twice in a different order).
2. Add tests.
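
To illustrate the duplicate-triangle problem, here is a minimal sketch (not the code in `graph.py`) that builds triangles from an undirected edge list. The function name is hypothetical. Sorting each triangle's vertices before storing it in a set means the same triangle found in a different order is kept only once.

```python
from typing import Dict, Iterable, List, Set, Tuple


def sketch_edges_to_triangles(
    edges: Iterable[Tuple[int, int]],
) -> List[Tuple[int, int, int]]:
    """Return each triangle exactly once as a sorted vertex triple."""
    neighbours: Dict[int, Set[int]] = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    triangles: Set[Tuple[int, int, int]] = set()
    for a, linked in neighbours.items():
        for b in linked:
            if a >= b:
                continue  # visit each undirected edge once
            # Any vertex adjacent to both ends of the edge closes a triangle.
            for c in neighbours[a] & neighbours[b]:
                triangles.add(tuple(sorted((a, b, c))))
    return sorted(triangles)


# The edge (1, 0) repeats (0, 1) in the other direction on purpose.
print(sketch_edges_to_triangles([(0, 1), (1, 2), (2, 0), (1, 0), (2, 3)]))
```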

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
- Enhances the error messages to be more informative.

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
Co-authored-by: Mostafa Jahanifar <74412979+mostafajahanifar@users.noreply.github.com>
Co-authored-by: Simon Graham <20071401+simongraham@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
- Fix flake8 errors and typos in `patchextraction.py`
- Replace ValueError with a warning if empty list of points is passed.

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
Co-authored-by: Mostafa Jahanifar <74412979+mostafajahanifar@users.noreply.github.com>
- tox.ini modified to pass Travis variables which fixes the tests.
- Updated tests to run locally for Travis detection.
- Reduces Travis runtime by up to 5 mins per run.

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
- Setting up dependabot.
- Update the installation docs with instructions on how to set up the tiatoolbox Docker container.

Co-authored-by: George Hadjigeorgiou <31721507+ghadjigeorghiou@users.noreply.github.com>
Co-authored-by: David Epstein <22086916+DavidBAEpstein@users.noreply.github.com>
Co-authored-by: Srijay Deshpande <58081136+Srijay-lab@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Update models abc to use input_tensor.
- Remove variable input and skip deepsource test.

Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
- [DEP: Bump flake8 from 3.7.8 to 4.0.1](4ba0c3b)
- [DEP: Bump twine from 1.14.0 to 3.8.0](1e63d9a)
- [DEP: Bump bump2version from 0.5.11 to 1.0.1](d33fefb)
- [DEP: Remove watchdog from dependency list](0ac3216)
- [DEP: Bump coverage from 5.1 to 6.3.2](8c2d97f)
- [DEP: Bump wheel from 0.33.6 to 0.37.1](0e2251b)
- [DEP: Bump pytest-cov from 2.9.0 to 3.0.0](bf3d630)
- Update `requirements.dev.conda.yml` accordingly.

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
- Update scikit image version to 0.19.1 

Co-authored-by: Srijay Deshpande <58081136+Srijay-lab@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
- Add `micronet` architecture.
- Add Consep trained model.
- Restructure cli to use lazy imports
- Minimize duplication for consistency in input arguments using common decorators.

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
- Make line wrapping consistent
- Make indentation consistent
- Fix typos

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Make line wrapping consistent
- Make indentation consistent

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Update HoVerNetPlus Metrics


Co-authored-by: Adam Shephard <39619155+adamshephard@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
- Make line wrapping consistent.
- Make indentation consistent
- Remove unused import

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
- Fix the stain normalisation CLI test which would dump output into the project root directory by specifying an output to a temporary test directory.
- Update micronet.py docstring

Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Update micronet.py docstring
- Update micronet.py metrics
- Update MicroNet tests
There was a bug in `make docs` where it tried to remove the `docs/_autosummary` directory with `rm` but without the required `-r` option. This PR fixes that. It also adds a `clean-docs` command and adds generated files to the `.gitignore` file.

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
- Wrap Lines
- Fix typos
- Make indentation consistent

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
- Update dependencies
Bumps [twine](https://github.com/pypa/twine) from 3.8.0 to 4.0.0.
- [Release notes](https://github.com/pypa/twine/releases)
- [Changelog](https://github.com/pypa/twine/blob/main/docs/changelog.rst)
- [Commits](pypa/twine@3.8.0...4.0.0)

---
updated-dependencies:
- dependency-name: twine
  dependency-type: direct:development
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Carry on from #285 with a dedicated branch for this fix.
Linked to #287 

Check list:
- [x] Check feature extractor integrity
- [x] Performance gain measurement: 21.67s (new) vs 45.564s (old) using a 4k x 4k WSI

@ByteHexler To add clarification on the origin of this performance issue. 

https://github.com/TissueImageAnalytics/tiatoolbox/blob/ca0ece69e0f1cd110687a391b9aed882597c46d9/tiatoolbox/models/engine/semantic_segmentor.py#L670-L673
`self._process_predictions` is called every time `cum_output` is updated, so predictions are processed `n**2` times rather than the normal `n` times. Depending on the size of the WSI, this can cause a significant slowdown and may crash the system, because many machines cannot hold an entire WSI in memory. In the initial design, we assumed that

https://github.com/TissueImageAnalytics/tiatoolbox/blob/ca0ece69e0f1cd110687a391b9aed882597c46d9/tiatoolbox/models/engine/semantic_segmentor.py#L681 which calls https://github.com/TissueImageAnalytics/tiatoolbox/blob/ca0ece69e0f1cd110687a391b9aed882597c46d9/tiatoolbox/models/engine/semantic_segmentor.py#L736 with `free_prediction=True` would free up the prediction in place; however, this did not work as expected.
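
The cost pattern described above can be shown with a toy example. This is a sketch under simplifying assumptions, not the engine code: `ItemCounter` and both merge functions are hypothetical, and "processing" is reduced to counting how many items are touched.

```python
from typing import Callable, List, Sequence


class ItemCounter:
    """Stand-in for a processing step; records how many items it touches."""

    def __init__(self) -> None:
        self.items = 0

    def __call__(self, items: Sequence[int]) -> None:
        self.items += len(items)


def merge_reprocessing_everything(
    batches: Sequence[List[int]], process: Callable[[Sequence[int]], None]
) -> List[int]:
    cum_output: List[int] = []
    for batch in batches:
        cum_output.extend(batch)
        process(cum_output)  # touches everything accumulated so far
    return cum_output


def merge_processing_new_only(
    batches: Sequence[List[int]], process: Callable[[Sequence[int]], None]
) -> List[int]:
    cum_output: List[int] = []
    for batch in batches:
        process(batch)  # touches only the newly arrived batch
        cum_output.extend(batch)
    return cum_output


batches = [[i] for i in range(4)]
slow, fast = ItemCounter(), ItemCounter()
merge_reprocessing_everything(batches, slow)
merge_processing_new_only(batches, fast)
print(slow.items, fast.items)
```

With four single-item batches the first variant touches 1 + 2 + 3 + 4 items while the second touches 4, so the gap grows quadratically with the number of batches.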

Co-authored-by: Dang Vu <24943262+vqdang@users.noreply.github.com>
Co-authored-by: ByteHexler <92634454+ByteHexler@users.noreply.github.com>
Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: Simon Graham <20071401+simongraham@users.noreply.github.com>
- Ignore all DeepSource warnings about generating paths in test files. This is an issue with PyTest `tmp_path`. This might not be the only/best way to do this, but is the best fix I can think of for now to prevent PRs from being blocked.
- Remove GPL 3.0 License
- Add BSD 3-clause license

Co-authored-by: Shan Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Srijay Deshpande <58081136+Srijay-lab@users.noreply.github.com>
adamshephard and others added 23 commits April 18, 2023 10:33
Add engine for the multi-task segmentor to allow simultaneous instance segmentation of nuclei and the semantic segmentation of regions. This replaces commit #216 due to history error in old PR.

TODO
====

- [x] Update unit testing
- [x] Add argument to store instance segmentation outputs as .dat files and semantic segmentation outputs as .npy
- [x] Update Notebook

---------

Co-authored-by: John Pocock <John-P@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Use `logger` Instead Of `warnings` for `wsireader.py`
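
A minimal sketch of this kind of change, using only the standard library. The logger name, function names, and message are assumptions made for illustration and are not taken from `wsireader.py`.

```python
import logging
import warnings

# Hypothetical module-level logger; the name is an assumption for this sketch.
logger = logging.getLogger("sketch.wsireader")


def check_mpp_with_warnings(mpp):
    """Old style: emit a Python warning when metadata is missing."""
    if mpp is None:
        warnings.warn("MPP metadata is missing.", stacklevel=2)
    return mpp


def check_mpp_with_logger(mpp):
    """New style: route the same message through the logging system."""
    if mpp is None:
        logger.warning("MPP metadata is missing.")
    return mpp


if __name__ == "__main__":
    logging.basicConfig(format="%(levelname)s %(name)s: %(message)s")
    check_mpp_with_logger(None)
```

Routing messages through a logger lets callers control verbosity and destinations with handlers and levels, instead of relying on warning filters.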

---------
Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com>
- Add a script to update notebook URLs to match the current git ref.

This can be run with `make`, which passes in all files in `examples` and uses the current branch name as the 'to' ref and "develop" as the 'from' ref.
```bash
make update-notebook-urls
```

You can also run the script manually. If run manually, you can specify the target ref and input files.

```bash
python pre-commit/notebook_urls.py --from develop --to dev-notebook-urls-hook examples/*
```
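
For readers curious what such a rewrite involves, here is a simplified sketch. It is not the contents of `pre-commit/notebook_urls.py`: the function name, the regular expression, and the example URL path are assumptions, and it only handles plain `github.com` and `githubusercontent.com` links.

```python
import re


def sketch_retarget_urls(text: str, from_ref: str, to_ref: str) -> str:
    """Replace the git ref in GitHub URLs found in `text` (hypothetical helper)."""
    pattern = re.compile(
        # host, owner and repository, then an optional blob/tree/raw segment
        r"(github(?:usercontent)?\.com/[\w.-]+/[\w.-]+/(?:blob/|tree/|raw/)?)"
        + re.escape(from_ref)
        + r"(?=/)"  # the ref must be a whole path segment
    )
    # A function replacement avoids treating backslashes in `to_ref` specially.
    return pattern.sub(lambda match: match.group(1) + to_ref, text)


cell = (
    "https://github.com/TissueImageAnalytics/tiatoolbox/"
    "blob/develop/examples/example.ipynb"
)
print(sketch_retarget_urls(cell, "develop", "dev-notebook-urls-hook"))
```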

---------

Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
- Move requirements*.txt  to requirements folder

---------

Co-authored-by: Mostafa Jahanifar <74412979+mostafajahanifar@users.noreply.github.com>
- Fix Multi-line docstring closing quotes should be on a separate line FLK-D209
- Fix No blank lines allowed after function docstring FLK-D202
- Fix Doc line too long FLK-W505
- Fix Attribute defined outside __init__ PYL-W0201
- Fix Function contains unused argument PYL-W0613
- Fix Consider decorating method with @staticmethod PYL-R0201
- Fix Missing module/function docstring PY-D0003
- Update HISTORY.md with release notes
- Update HISTORY.md with release notes
- Update Jupyter notebook links
- Pin dependencies for long term stable release
- Fix Typo `stain_norm_target`
- Fix `test_validate_docstring_examples`.
- Revert to `test_validate_docstring_examples` execution in CI.

---------

Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>
- `np.int`, `np.bool`, and `np.float` have been deprecated since NumPy v1.20 in favour of `np.int_`, `np.bool_`, and `np.float_`
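
A short sketch of the replacement pattern, assuming NumPy is installed. The array names are illustrative. Note that the old aliases were removed outright in NumPy 1.24, and that `np.float_` was itself removed in NumPy 2.0, so this example uses `np.float64`, which is the same type.

```python
import numpy as np

# Old spellings such as dtype=np.int or dtype=np.float raise AttributeError
# on NumPy 1.24, where the deprecated aliases were removed.
counts = np.zeros(3, dtype=np.int_)     # platform default integer
mask = np.zeros(3, dtype=np.bool_)      # NumPy boolean scalar type
probs = np.zeros(3, dtype=np.float64)   # same type np.float_ referred to

# The Python builtins are also accepted as dtypes and map to the same types.
assert np.zeros(3, dtype=float).dtype == probs.dtype
assert np.zeros(3, dtype=bool).dtype == mask.dtype

print(counts.dtype.kind, mask.dtype, probs.dtype)
```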

Waiting on some dependencies to update:
- numba not yet compatible with numpy 1.24
  >   from numba.np.ufunc import _internal
  E   SystemError: initialization of _internal failed without raising an exception

   - numba/numba#8464
   - numba/numba#8691
   - numba/numba#8841
   - https://github.com/numba/numba/milestone/63

- [x] Waiting for next numba release with numpy 1.24 support.

---------
Co-authored-by: John Pocock <John-P@users.noreply.github.com>
# Conflicts:
#	requirements/requirements.txt
- Update `numpy` pin
Enable easy and efficient querying of annotations within a neighbourhood of other annotations. This is planned to be executed in the query domain (e.g. by sqlite, outside of the Python GIL) where possible which should be faster than a user writing a for loop and performing many queries for this common use case.

## Docs

<a href="https://tia-toolbox.readthedocs.io/en/feature-neighbourhood-query/_autosummary/tiatoolbox.annotation.storage.AnnotationStore.html#tiatoolbox.annotation.storage.AnnotationStore.nquery"><img src="https://readthedocs.org/projects/tia-toolbox/badge/?version=feature-neighbourhood-query&style=for-the-badge"></a>


## Initial Concept

Pseudo query example

```
SELECT purple WITHIN 10 points of red
```

![Neighbourhood Querying drawio](https://user-images.githubusercontent.com/4615004/225148384-e125151f-3df8-4d2e-9e32-c0a7ea69d269.svg)



This is now implemented as:

```python
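# `store` is an existing AnnotationStore, e.g. a SQLiteStore or DictionaryStore.
results = store.nquery(
    where="props['color'] == 'red'",  # annotations at the centre of each neighbourhood
    n_where="props['color'] == 'purple'",  # neighbours to look for
    distance=10,
    # Use centre points of bounding boxes for distance
    mode="boxpoint-boxpoint",
)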
```

### Query Modes

<table style="td {text-align: center;}">
<tr>
<td>
<img src="https://user-images.githubusercontent.com/4615004/224540734-5e47c82b-6f0d-4775-bacf-dd73508225dc.svg" alt="poly-poly">
</td>
<td>
<img src="https://user-images.githubusercontent.com/4615004/224540673-90d4965b-5d4e-44eb-8093-7e666b856c7a.svg" alt="box-box">
</td>
</tr>
<tr>
<td>
<img src="https://user-images.githubusercontent.com/4615004/224540741-6d0c18fe-6f75-47a7-b30e-c532c7e5a196.svg" alt="boxpoint-boxpoint">
</td>
</tr>
</table>

## In this PR

- Fix special "bbox_intersects" for `DictionaryStore`. This special geometry predicate was added as a special optimized case which is true if the bounding box of the query geometry intersects the bounding box of the stored geometry. This was never implemented for `DictionaryStore`.
- Add geometry predicate "bbox_centre_within" which functions similarly to the special "bbox_intersects" predicate. "bbox_centre_within" returns True if the centre of the bounding box of the stored geometry is within the query bounds.
- Add `nquery` function which allows for querying for annotations which are within the neighbourhood of other annotations. This is done by supplying a query geometry and a where predicate to select the initial annotations to use as the centre of each neighbourhood. A second `n_where` predicate, `distance`, and `mode` argument are used to find neighbours.
  - `distance` is the size of the neighbourhood to search (see also `mode`). For a polygon-polygon check, this is via a buffer applied to the geometry, for point-point checks this is a radius, for box-box checks this is added to each side of the bounding box.
  - `n_where` is a predicate, similar to `where`, but used for filtering neighbours after selecting the annotations to query around via `where`.
  - `mode` may be "poly-poly" (full, expensive polygon-polygon intersection), "box-box" (bounding box intersection), or "boxpoint-boxpoint" (centre of bounding box within `distance`). These are currently the only supported modes. Others are possible, but would complicate implementation of these two sets. This would lose information about exactly which annotation was within the neighbourhood of another, but would be more efficient at finding intersections of two groups. Perhaps this could simply be shown as a 'recipe' in the documentation.
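
The cheaper checks described above can be sketched in plain Python. This is an illustration of the geometry only, not the store implementation: the `Box` alias and all function names are hypothetical, and the "poly-poly" mode is left out because it needs a geometry library.

```python
from math import hypot
from typing import Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def centre(box: Box) -> Tuple[float, float]:
    min_x, min_y, max_x, max_y = box
    return ((min_x + max_x) / 2, (min_y + max_y) / 2)


def box_box(a: Box, b: Box, distance: float) -> bool:
    """'box-box': grow `a` by `distance` on each side, then test overlap with `b`."""
    min_x, min_y = a[0] - distance, a[1] - distance
    max_x, max_y = a[2] + distance, a[3] + distance
    return min_x <= b[2] and b[0] <= max_x and min_y <= b[3] and b[1] <= max_y


def boxpoint_boxpoint(a: Box, b: Box, distance: float) -> bool:
    """'boxpoint-boxpoint': bounding box centres within a radius of `distance`."""
    (ax, ay), (bx, by) = centre(a), centre(b)
    return hypot(ax - bx, ay - by) <= distance


def bbox_centre_within(stored: Box, query: Box) -> bool:
    """Predicate sketch: centre of the stored box lies inside the query bounds."""
    cx, cy = centre(stored)
    return query[0] <= cx <= query[2] and query[1] <= cy <= query[3]


red, purple = (0, 0, 2, 2), (5, 0, 7, 2)  # 3 units apart, centres 5 apart
print(box_box(red, purple, 3), boxpoint_boxpoint(red, purple, 4))
```

The two boxes count as neighbours under "box-box" with a distance of 3, but not under "boxpoint-boxpoint" with a distance of 4, because their centres are 5 units apart.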

## Benchmarking

Some initial benchmarking shows that `SQLiteStore` significantly outperforms `DictionaryStore`.

In this test an $n \times n$ grid of cell boundaries is generated and overlaid with another $n \times n$ grid. A query is then performed to find overlapping geometries via either any overlap of the bounding boxes (box-box) or the centre of the bounding boxes (boxpoint-boxpoint) within a distance $k$. Therefore, the largest test here where $n=100$ is a grid of $100 \times 100$ artificial cell boundaries labelled as class "A", overlaid with another $100 \times 100$ grid of cell boundaries labelled with class "B" for a total of $20,000$ geometries.
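
A small sketch of how such a test set could be generated, confirming the total of $2n^2$ geometries. The spacing, cell size, and offset values are assumptions, and axis-aligned boxes stand in for the artificial cell boundaries used in the actual benchmark.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


def make_grid(
    n: int, label: str, spacing: float = 10.0, size: float = 8.0, offset: float = 0.0
) -> List[Tuple[str, Box]]:
    """Return n * n labelled boxes laid out on a regular grid (hypothetical helper)."""
    cells = []
    for row in range(n):
        for col in range(n):
            x = col * spacing + offset
            y = row * spacing + offset
            cells.append((label, (x, y, x + size, y + size)))
    return cells


n = 100
grid_a = make_grid(n, "A")
grid_b = make_grid(n, "B", offset=4.0)  # second grid overlaid with a small shift
total = len(grid_a) + len(grid_b)
print(total)
```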

This plot is produced from runs on a 6-core Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz:


![nquery2](https://user-images.githubusercontent.com/4615004/225102896-3cbd603c-bb7b-4582-ab0a-ac5e86d3b7e5.png)

It is curious that the "boxpoint-boxpoint" query mode is slower than "poly-poly". I have a feeling that it may have something to do with Shapely 2.0 optimisations where it can vectorise batch geometry operations, or maybe there is some condition for the polygon-polygon intersection that allows for fast failing of tests.

## Notes

- It is already much faster for `SQLiteStore`, but this could likely be improved further by using a subquery.
- The implementation currently performs one query to get all annotations which will be the centre of $m$ secondary queries performed via a loop in Python. This could be refactored to be a nested sqlite subquery which will likely be faster. I think this optimisation could be a new PR though as this basic functionality is already usable.
- This is currently implemented in the base class, so all subclasses get it for 'free'.
- Another approach which may be faster for some cases (but more limited) is to query for initial annotations, then apply a distance buffer, merge overlapping geometries (reducing the total number, assuming many overlaps), query for all candidate matches once, and compute the intersection.
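
The difference between looping in Python and pushing the work into SQLite can be sketched with the standard `sqlite3` module. Everything here is an assumption for illustration: the toy `annotations` table is not the `SQLiteStore` schema, and a self-join is used in place of the nested subquery mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE annotations (id INTEGER PRIMARY KEY, color TEXT, cx REAL, cy REAL)"
)
conn.executemany(
    "INSERT INTO annotations (color, cx, cy) VALUES (?, ?, ?)",
    [("red", 0, 0), ("red", 100, 100),
     ("purple", 3, 4), ("purple", 50, 50), ("purple", 100, 106)],
)


def neighbours_loop(conn, distance):
    """One query for the centres, then one query per centre from Python."""
    centres = conn.execute(
        "SELECT id, cx, cy FROM annotations WHERE color = 'red' ORDER BY id"
    ).fetchall()
    results = {}
    for red_id, cx, cy in centres:
        rows = conn.execute(
            "SELECT id FROM annotations WHERE color = 'purple' "
            "AND (cx - ?) * (cx - ?) + (cy - ?) * (cy - ?) <= ? * ? ORDER BY id",
            (cx, cx, cy, cy, distance, distance),
        ).fetchall()
        results[red_id] = [row[0] for row in rows]
    return results


def neighbours_join(conn, distance):
    """A single self-join, so the pairing happens inside SQLite."""
    rows = conn.execute(
        "SELECT a.id, b.id FROM annotations AS a "
        "LEFT JOIN annotations AS b ON b.color = 'purple' "
        "AND (a.cx - b.cx) * (a.cx - b.cx) + (a.cy - b.cy) * (a.cy - b.cy) <= ? * ? "
        "WHERE a.color = 'red' ORDER BY a.id, b.id",
        (distance, distance),
    ).fetchall()
    results = {}
    for red_id, purple_id in rows:
        results.setdefault(red_id, [])
        if purple_id is not None:  # LEFT JOIN yields None when there is no neighbour
            results[red_id].append(purple_id)
    return results


print(neighbours_loop(conn, 10))
print(neighbours_join(conn, 10))
```

Both functions return the same mapping from each red annotation to its purple neighbours; the second issues one statement regardless of how many centres there are.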

## To-Do

- [x] Test coverage (it's as good as I can get it for now)
- [x] Fix linter issues
- [x] Add diagrams to docstrings?
- [x] Add support for um units as distance? (Decided to do in a follow up PR)
- [x] Improve docstrings and add more examples? (suggestions welcome)
- [x] More tests? (suggestions welcome)

---------

Co-authored-by: Mark Eastwood <20169086+measty@users.noreply.github.com>
Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Adds support for Python 3.11 (beta)

- [x] Add 3.11 to the GitHub Actions CI workflow.
- [x] Add 3.11 to setup.py
- [x] Update pip install workflow.
- [x] Update conda requirements python versions.


Waiting on some dependencies to update:

- [x] Shapely (closed) shapely/shapely#1584
  - PyGeos pygeos/pygeos#457
- [x] PyTorch pytorch/pytorch#86566 
    - [x] PyTorch 2.0 supports Python 3.11 
    - [ ] `torch.compile` is not fully supported yet.
- [x] Scipy 1.9.2 not supported by Scikit-image scikit-image/scikit-image#6773
- [x] numba numba/numba#8304 support https://github.com/numba/numba/milestone/63
  - [x] numba/numba#8841
- [x] OpenSlide support for Python 3.11 openslide/openslide-python#189 
  - [ ] openslide/openslide-python#188

---------

Co-authored-by: Shan E Ahmed Raza <13048456+shaneahmed@users.noreply.github.com>
- Use `logger` Instead Of `warnings` for `models` Package

---------

Co-authored-by: adamshephard <39619155+adamshephard@users.noreply.github.com>
- Use `logger` Instead Of `warnings` for `wsi_registration.py`
- Refactor duplicate code fragments
# Conflicts:
#	requirements/requirements.txt
- Update HISTORY.md
- Pin dependencies
@shaneahmed shaneahmed self-assigned this May 5, 2023
@shaneahmed shaneahmed added this to the Release v1.4.0 milestone May 5, 2023
@codecov

codecov bot commented May 5, 2023

Codecov Report

Merging #599 (1524323) into master (d548b33) will increase coverage by 0.20%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #599      +/-   ##
==========================================
+ Coverage   99.57%   99.77%   +0.20%     
==========================================
  Files          64       63       -1     
  Lines        6584     6782     +198     
  Branches     1079     1117      +38     
==========================================
+ Hits         6556     6767     +211     
+ Misses         16        7       -9     
+ Partials       12        8       -4     

| Impacted Files | Coverage Δ |
|---|---|
| `tiatoolbox/data/__init__.py` | 100.00% <ø> (ø) |
| `tiatoolbox/utils/transforms.py` | 100.00% <ø> (ø) |
| `tiatoolbox/visualization/tileserver.py` | 100.00% <ø> (ø) |
| `tiatoolbox/__init__.py` | 100.00% <100.00%> (+21.87%) ⬆️ |
| `tiatoolbox/annotation/dsl.py` | 100.00% <100.00%> (ø) |
| `tiatoolbox/annotation/storage.py` | 99.63% <100.00%> (-0.24%) ⬇️ |
| `tiatoolbox/cli/common.py` | 100.00% <100.00%> (ø) |
| `tiatoolbox/cli/save_tiles.py` | 100.00% <100.00%> (ø) |
| `tiatoolbox/cli/slide_info.py` | 100.00% <100.00%> (ø) |
| `tiatoolbox/models/__init__.py` | 100.00% <100.00%> (ø) |

... and 22 more

@shaneahmed shaneahmed merged commit 5231f9d into master May 5, 2023