
@shaneahmed
Member

@shaneahmed shaneahmed commented Jan 8, 2026

Summary

This PR introduces a new MultiTaskSegmentor engine that supports multi‑head models (e.g., HoVerNet/HoVerNetPlus) across both patch and WSI inference workflows. It adds a dedicated CLI entry point, implements memory‑aware processing with automatic Zarr spillover for large slides, and refactors HoVerNet/HoVerNetPlus post‑processing into a unified, task‑centric structure.


Key Changes

1) New engine: tiatoolbox.models.engine.multi_task_segmentor

  • Adds the full MultiTaskSegmentor engine implementing:
    • Patch‑mode and WSI‑mode inference paths
    • Horizontal/vertical patch stitching with instance de‑duplication
    • Output types: dict, zarr, or annotationstore
    • Memory‑aware merging with Zarr spillover
  • Internal helpers for:
    • Constructing per‑task outputs
    • Multi‑row and multi‑tile merges
    • Saving multi‑task Zarr structures with task‑level grouping
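
To illustrate the instance de-duplication step during stitching, here is a minimal sketch. It is not the engine's actual algorithm: it assumes each instance is a dict with a "centroid" entry in slide coordinates, and the helper name `merge_instances` and the pixel tolerance are hypothetical. The idea is that an instance detected twice in the overlap between neighbouring patches is kept only once.

```python
from math import hypot


def merge_instances(kept, incoming, tol=2.0):
    """Append incoming instances unless one already kept lies within `tol` pixels.

    Hypothetical helper: both arguments are lists of dicts holding a
    "centroid" (x, y) tuple in a shared coordinate frame.
    """
    merged = list(kept)
    for inst in incoming:
        cx, cy = inst["centroid"]
        is_duplicate = any(
            hypot(cx - other["centroid"][0], cy - other["centroid"][1]) <= tol
            for other in merged
        )
        if not is_duplicate:
            merged.append(inst)
    return merged


# Two neighbouring patches that both detected the same nucleus in their overlap.
left = [{"centroid": (10.0, 5.0)}]
right = [{"centroid": (10.5, 5.2)}, {"centroid": (40.0, 8.0)}]
stitched = merge_instances(left, right)
```

A centroid test is the simplest possible criterion; a real implementation may compare contours or bounding boxes instead.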

2) CLI: tiatoolbox multitask-segmentor

  • Introduces a new CLI command to run multi‑task models directly from the terminal.
  • Mirrors engine functionality with options for input paths, resolutions, patch/tile configuration, output type, predictions/probabilities, device, stride, and more.
  • Ensures consistent UX with the rest of the TIAToolbox CLI ecosystem.

3) Model architecture updates (HoVerNet & HoVerNetPlus)

  • Refactors postproc to return task‑centric dictionaries with keys such as "task_type", "predictions", and "info_dict".
  • Adds tasks and class_dict attributes to both architectures for unified downstream handling.
  • Renames contour → contours and ensures Dask compatibility for large outputs.
  • HoVerNetPlus now also provides bounding boxes for layer predictions.

4) EngineABC enhancements

  • Allows Zarr output grouping by task_name for multi‑task layouts.
  • Validates save_dir for zarr and annotationstore outputs; missing directories now raise a clear ValueError.
  • class_dict defaults to model.class_dict when missing.
  • Refactors dictionary assembly and improves handling of dask output types.
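
The save_dir rule above can be sketched as follows. This is an illustration only, not the code in EngineABC: the function name `validate_save_dir` and the error wording are assumptions.

```python
from pathlib import Path

# Output types that must be written to disk (taken from the description above).
DISK_OUTPUT_TYPES = {"zarr", "annotationstore"}


def validate_save_dir(output_type, save_dir):
    """Return save_dir as a Path, or raise if a disk-backed output has none.

    Hypothetical helper mirroring the behaviour described for EngineABC.run.
    """
    if output_type.lower() in DISK_OUTPUT_TYPES and save_dir is None:
        msg = f"save_dir must be provided when output_type is '{output_type}'."
        raise ValueError(msg)
    return None if save_dir is None else Path(save_dir)


in_memory = validate_save_dir("dict", None)   # allowed: nothing is written
on_disk = validate_save_dir("zarr", "out/")   # converted to a Path
```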

5) Dependency updates

  • Updates Dask requirement:
    dask>=2025.12.0 → dask>=2026.1.2.
    Required to support newer dask APIs used in Zarr spillover and array operations.

Breaking / Behavioral Changes

  • HoVerNet & HoVerNetPlus postproc now return task dictionaries instead of positional tuples; downstream consumers must update accordingly.
  • Instance dictionaries now use "contours" consistently.
  • EngineABC.run requires save_dir for zarr and annotationstore outputs; omitting it raises a ValueError.
  • CLI users should now use the dedicated multitask-segmentor command for multi‑task models.
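
For downstream code affected by the first two points, a small migration shim might look like the sketch below. The keys "task_type", "predictions", "info_dict" and "contours" come from this PR. The assumption that postproc returns a sequence of such dictionaries, the helper names, and the sample values are all hypothetical.

```python
from typing import Any


def rename_contour_key(instance: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of an instance dict with legacy "contour" renamed to "contours"."""
    out = dict(instance)
    if "contour" in out and "contours" not in out:
        out["contours"] = out.pop("contour")
    return out


def index_by_task(task_outputs):
    """Index task dictionaries by their "task_type" (assumes a sequence of dicts)."""
    return {task["task_type"]: task for task in task_outputs}


# Hypothetical post-processing output with two tasks.
outputs = [
    {"task_type": "nuclei", "predictions": [[0, 1]], "info_dict": {1: {"contour": [(0, 0)]}}},
    {"task_type": "layer", "predictions": [[2, 2]], "info_dict": {}},
]
by_task = index_by_task(outputs)
nuclei_info = {
    key: rename_contour_key(value)
    for key, value in by_task["nuclei"]["info_dict"].items()
}
```

Looking up results by task name rather than tuple position keeps consumer code stable if a model later adds another head.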

Usage examples

Patch mode:

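from tiatoolbox.models.engine.multi_task_segmentor import MultiTaskSegmentor

# `patches` is assumed to be an already-loaded batch of image patches.
mt = MultiTaskSegmentor(model="hovernetplus-oed", device="cuda")
out = mt.run(images=patches, patch_mode=True, output_type="dict",
             return_probabilities=True, return_labels=False)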

WSI mode (Zarr output):

from pathlib import Path
from tiatoolbox.models.engine.multi_task_segmentor import MultiTaskSegmentor

mt = MultiTaskSegmentor(model="hovernetplus-oed", batch_size=64)
out = mt.run(
    images=[slide],
    patch_mode=False,
    output_type="zarr",
    save_dir=Path("out/"),
    return_predictions=(False, True),  # per-task flags
)

WSI + AnnotationStore:

from pathlib import Path
from tiatoolbox.models.engine.multi_task_segmentor import MultiTaskSegmentor

mt = MultiTaskSegmentor(model="hovernetplus-oed")
out = mt.run(
    images=[slide],
    patch_mode=False,
    output_type="annotationstore",
    save_dir=Path("ann/"),
    return_probabilities=True,
)

CLI:

tiatoolbox multitask-segmentor \
  --img-input slides/ \
  --output-path results/ \
  --model hovernetplus-oed \
  --output-type annotationstore \
  --return-predictions False, True

@shaneahmed shaneahmed self-assigned this Jan 8, 2026
@shaneahmed shaneahmed added the enhancement New feature or request label Jan 8, 2026
@shaneahmed shaneahmed added this to the Release v2.0.0 milestone Jan 8, 2026
@codecov

codecov bot commented Jan 8, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 96.79%. Comparing base (d6ef926) to head (f7e35be).

Additional details and impacted files
@@                    Coverage Diff                     @@
##           dev-define-engines-abc     #981      +/-   ##
==========================================================
+ Coverage                   95.33%   96.79%   +1.46%     
==========================================================
  Files                          79       80       +1     
  Lines                       10001    10513     +512     
  Branches                     1290     1378      +88     
==========================================================
+ Hits                         9534    10176     +642     
+ Misses                        431      295     -136     
- Partials                       36       42       +6     


@shaneahmed
Member Author

The new dask release changed the API for to_zarr; see dask/dask#12205.

https://github.com/dask/dask/releases/tag/2026.1.2

Contributor

Copilot AI left a comment


Pull request overview

This PR introduces a comprehensive MultiTaskSegmentor engine for TIAToolbox that handles multi-head segmentation models (HoVerNet/HoVerNetPlus) with both patch and WSI inference capabilities. The implementation adds memory-aware processing with automatic Zarr spillover for large slides and refactors model post-processing into a unified, task-centric structure.

Changes:

  • Adds new MultiTaskSegmentor engine with patch/WSI modes, memory-aware Zarr caching, and support for dict/zarr/annotationstore outputs
  • Introduces CLI command tiatoolbox multitask-segmentor for terminal-based multi-task model execution
  • Refactors HoVerNet/HoVerNetPlus postprocessing to return task-centric dictionaries and renames contour → contours
  • Updates EngineABC to support Zarr task grouping and adds save_dir validation for zarr/annotationstore outputs
  • Adds create_smart_array utility for memory-aware NumPy/Zarr allocation
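
As a rough illustration of the memory-aware allocation idea, the sketch below decides between an in-memory array and a disk-backed Zarr array. It is not the signature or logic of create_smart_array: the function name `choose_backend` and the 50% safety fraction are assumptions. In practice the available memory could be read with `psutil.virtual_memory().available`; it is passed in here to keep the example dependency-free.

```python
import math


def choose_backend(shape, itemsize, available_bytes, fraction=0.5):
    """Pick "numpy" if the array fits in a fraction of free memory, else "zarr".

    Hypothetical helper: `itemsize` is the size of one element in bytes and
    `fraction` is an assumed safety margin, not a value taken from the PR.
    """
    required_bytes = math.prod(shape) * itemsize
    return "numpy" if required_bytes <= available_bytes * fraction else "zarr"


# A small float32 patch easily fits in 1 GB of free memory.
small = choose_backend((1000, 1000, 3), itemsize=4, available_bytes=10**9)
# A 100k x 100k x 3 float32 canvas (~120 GB) does not fit in 8 GiB.
large = choose_backend((100_000, 100_000, 3), itemsize=4, available_bytes=8 * 2**30)
```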

Reviewed changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 10 comments.

File | Description
tiatoolbox/models/engine/multi_task_segmentor.py | New multi-task segmentation engine with comprehensive inference, post-processing, and saving capabilities
tiatoolbox/models/engine/engine_abc.py | Enhanced base engine with Zarr task grouping, save_dir validation, and dask config changes
tiatoolbox/models/architecture/hovernet.py | Updated postproc to return task dictionaries, renamed contour → contours, added dask support
tiatoolbox/models/architecture/hovernetplus.py | Extended hovernet changes with layer segmentation task and bounding boxes
tiatoolbox/cli/multitask_segmentor.py | New CLI command for multi-task segmentation
tiatoolbox/cli/common.py | Added parse_bool_list callback and cli_return_predictions for tuple[bool] parsing
tiatoolbox/cli/__init__.py | Registered new multitask-segmentor CLI command
tiatoolbox/utils/misc.py | Added create_smart_array for memory-aware array allocation with psutil
tests/* | Comprehensive test coverage for new functionality
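
For readers unfamiliar with the tuple[bool] parsing mentioned for tiatoolbox/cli/common.py, a comma-separated boolean parser could look like the sketch below. This is not the parse_bool_list implementation from the PR: the name `parse_bool_csv` and the accepted spellings are assumptions.

```python
def parse_bool_csv(value: str) -> tuple[bool, ...]:
    """Parse a string such as "False, True" into a tuple of booleans.

    Hypothetical helper; the accepted spellings below are assumptions.
    """
    truthy = {"true", "1", "yes"}
    falsy = {"false", "0", "no"}
    flags = []
    for token in value.split(","):
        word = token.strip().lower()
        if word in truthy:
            flags.append(True)
        elif word in falsy:
            flags.append(False)
        else:
            msg = f"Cannot interpret {token!r} as a boolean."
            raise ValueError(msg)
    return tuple(flags)


per_task_flags = parse_bool_csv("False, True")
```

Raising on unknown tokens, rather than silently treating them as False, makes a mistyped per-task flag fail early.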

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 13 out of 14 changed files in this pull request and generated 7 comments.

