PyPathomics is open-source software for gigapixel whole-slide image analysis. It extracts features directly from Hover-Net's .json file.

HaoyuCui/PyPathomics

PyPathomics

PyPathomics is open-source software for gigapixel whole-slide image analysis. It is off-the-shelf and easy to use.

Currently under development. This is a simplified version of [sc_MTOP] [Paper]

Support for:

  • Hover-Net [Repo] [Paper] (Graham et al., 2019)

  • Cerberus (In Beta) [Repo] [Paper] (Graham et al., 2023)

Feature-sets

Figure: examples of the supported feature sets, panels [a] to [d].

PyPathomics supports [a] Cell Ratio, [b] Morphological Features, [c] Texture Features, and [d] Spatial Features (Delaunay Triangle).

How it works

PyPathomics computes all cell-level and slide-level features directly from a cell segmentation file, such as Hover-Net's .json output.
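As a rough illustration of the input, the sketch below reads a Hover-Net-style .json payload and collects each nucleus' name, centroid, and type. The `nuc`, `centroid`, `contour`, and `type` keys follow Hover-Net's WSI output as it is commonly described; the sample values and the helper name `load_nuclei` are invented for this example, and this is not PyPathomics' own loader.

```python
import json

# A tiny stand-in for a Hover-Net WSI segmentation file (all values are invented).
SAMPLE = """
{"mag": 40,
 "nuc": {
   "7":  {"centroid": [1676.85, 12851.68],
          "contour": [[1670, 12845], [1683, 12845], [1683, 12858], [1670, 12858]],
          "type": 1},
   "42": {"centroid": [2010.10, 12900.50],
          "contour": [[2005, 12895], [2015, 12895], [2015, 12906], [2005, 12906]],
          "type": 0}
 }}
"""


def load_nuclei(text):
    """Return (name, centroid, cell_type) tuples from Hover-Net-style JSON text."""
    data = json.loads(text)
    nuclei = []
    for name, record in data["nuc"].items():
        nuclei.append((name, tuple(record["centroid"]), record["type"]))
    return nuclei


nuclei = load_nuclei(SAMPLE)
for name, centroid, cell_type in nuclei:
    print(name, centroid, cell_type)
```

Each tuple carries the same three fields (name, centroid, numeric cell type) that the all-cell info in the buffer directory is described as storing.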

Installation

conda create -n pypathomics python
conda activate pypathomics
pip install -r requirements.txt

Options and Usage

Options:

Options for config.yaml

   openslide-home: path\to\openslide-home # for Windows only
   feature-set: ['Morph', 'Texture', 'Triangle']
   cell_types: ['I', 'S', 'T']
   statistic-types: ['basic', 'distribution']

Where,

  • openslide-home: Path to the OpenSlide library (Windows only; leave it empty on Linux).
  • feature-set: A list of feature sets to extract. Options: Morph, Texture, Triangle.
  • cell_types: Types of cells to analyze. Options: I (inflammatory), S (stromal), T (tumor).
  • statistic-types: Types of statistics to calculate. Options: basic, distribution.

| statistic-types | e.g. |
| --- | --- |
| basic | mean, std |
| distribution | Q25, Q75, median, IQR, range, skewness, kurtosis |
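
The two statistic types can be sketched for a single feature column as follows. This is a minimal standard-library illustration, not the project's implementation: the function names are hypothetical, and the choice of population standard deviation, inclusive quartile interpolation, and excess kurtosis are assumptions that may differ from what PyPathomics computes.

```python
import statistics


def basic_stats(values):
    """Mean and standard deviation (population std is an assumption)."""
    return {"mean": statistics.fmean(values), "std": statistics.pstdev(values)}


def distribution_stats(values):
    """Quartile-based and shape statistics for one feature column."""
    # 'inclusive' treats the data as containing its own min and max.
    q25, median, q75 = statistics.quantiles(values, n=4, method="inclusive")
    mean = statistics.fmean(values)
    n = len(values)
    # Central moments used for skewness and kurtosis.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    skewness = m3 / m2 ** 1.5 if m2 > 0 else 0.0
    kurtosis = m4 / m2 ** 2 - 3.0 if m2 > 0 else 0.0  # excess kurtosis
    return {
        "Q25": q25,
        "Q75": q75,
        "median": median,
        "IQR": q75 - q25,
        "range": max(values) - min(values),
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


areas = [1.0, 2.0, 3.0, 4.0, 5.0]  # e.g. the Area feature of five cells
basic = basic_stats(areas)
dist = distribution_stats(areas)
print(basic)
print(dist)
```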

Options for main.py

Required Arguments:

    --config     Specify the configuration file path
    --seg        Path to the segmentation directory or file from Hover-Net (.json) or Cerberus (.dat)
    --wsi        Path to the WSI directory or file
    --ext        WSI file extension (default: .svs or svs)
    --buffer     Specify the output buffer dir for preprocessing
    --output     Set the output directory for the analysis

Optional Arguments:

    -f           Run for a single file (default: run for directory) 
    --auto_skip  Skip existing directories automatically (default: True)
    --level      Detail level of the WSI to analyze (default: 0)
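
For readers who want to script around these options, here is a hypothetical argparse declaration that mirrors the documented flags. It is a sketch, not the project's main.py: the default of config.yaml for --config (assumed because the usage examples omit the flag), the boolean parsing of --auto_skip, and the helper names are all assumptions.

```python
import argparse


def normalize_ext(ext):
    """Accept both '.svs' and 'svs' and return the dotted form."""
    return ext if ext.startswith(".") else "." + ext


def str_to_bool(text):
    """Parse common spellings of true/false (assumed behavior for --auto_skip)."""
    return text.strip().lower() in {"1", "true", "yes", "y"}


def build_parser():
    parser = argparse.ArgumentParser(description="Hypothetical PyPathomics-style CLI")
    # Assumed default: the documented usage examples do not pass --config.
    parser.add_argument("--config", default="config.yaml", help="configuration file path")
    parser.add_argument("--seg", required=True, help="segmentation directory or file")
    parser.add_argument("--wsi", required=True, help="WSI directory or file")
    parser.add_argument("--ext", type=normalize_ext, default=".svs", help="WSI file extension")
    parser.add_argument("--buffer", required=True, help="buffer dir for preprocessing")
    parser.add_argument("--output", required=True, help="output location for the analysis")
    parser.add_argument("-f", action="store_true", help="run for a single file")
    parser.add_argument("--auto_skip", type=str_to_bool, default=True)
    parser.add_argument("--level", type=int, default=0, help="WSI detail level")
    return parser


args = build_parser().parse_args(
    ["--seg", "seg_dir", "--wsi", "wsi_dir", "--buffer", "buf", "--output", "out.csv", "--ext", "svs"]
)
print(args)
```

Normalizing the extension in one place lets later code compare against a single dotted form regardless of how the user typed it.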

Usage:

  1. Run Hover-Net's or Cerberus' WSI segmentation first to obtain the segmentation files.

  2. Review and edit config.yaml before running.

  3. Analyze a Directory:

    python main.py --seg /path/to/seg_dir --wsi /path/to/wsi_dir --buffer /path/to/buffer --ext .svs --output /path/to/output.csv

  4. Analyze a Single File:

    python main.py -f --seg /path/to/seg_file --wsi /path/to/wsi_file --buffer /path/to/buffer --ext .svs --output /path/to/output.csv

All-cell Info, stored in /path/to/buffer

| Feature | Description | e.g. |
| --- | --- | --- |
| Name | Identifier of the cell | 7, 42, 113, ... |
| Centroid | Position of the cell | [1676.85, 12851.68], ... |
| Cell Type | Cell type in numeric coding | 0, 1, 0, ... |

Each cell type is stored in a separate file. A single file can reach hundreds or thousands of MB because it stores the information for every cell.

Feature-sets Explanation

Slide Cell Ratio
  • Ratio: Reflects the proportion of cells of this type.
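
A minimal sketch of the ratio, assuming the denominator is the count of all cells passed in (whether PyPathomics divides by all cells or only by the configured cell_types is not stated here). The function name is hypothetical.

```python
from collections import Counter


def cell_ratios(cell_types):
    """Map each cell type to its share of all cells in the slide."""
    counts = Counter(cell_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cell_type: count / total for cell_type, count in counts.items()}


ratios = cell_ratios(["T", "T", "I", "S"])
print(ratios)
```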
Morphological Features
  • Area: Area of the cell, indicating cell size.
  • AreaBbox: Area of the minimum bounding rectangle around the cell.
  • CellEccentricities: Eccentricity of the cell.
  • Circularity: Roundness of the cell.
  • Elongation: Elongation rate of the cell.
  • Extent: Proportion of the cell occupying its bounding rectangle.
  • MajorAxisLength / MinorAxisLength: Lengths of the major and minor axes of the fitted ellipse for the cell.
  • Perimeter: Perimeter of the cell boundary.
  • Solidity: Ratio of the cell area to its convex hull area.
  • CurvMean / Std / Max / Min: Mean, standard deviation, maximum, and minimum of the cell boundary curvature.
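
Several of these features can be derived from a cell contour alone. The sketch below uses common textbook definitions (shoelace area, circularity as 4πA/P², an axis-aligned bounding box); these are assumptions, and PyPathomics may use different formulas, for example a rotated minimum bounding rectangle. The function names are hypothetical.

```python
import math


def closed_edges(contour):
    """Pair each vertex with the next one, wrapping around to close the polygon."""
    return zip(contour, contour[1:] + contour[:1])


def polygon_area(contour):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in closed_edges(contour):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def polygon_perimeter(contour):
    """Sum of edge lengths along the closed contour."""
    return sum(math.dist(p, q) for p, q in closed_edges(contour))


def bbox_area(contour):
    """Area of the axis-aligned bounding box (an assumption for AreaBbox)."""
    xs = [x for x, _ in contour]
    ys = [y for _, y in contour]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def morph_features(contour):
    area = polygon_area(contour)
    perimeter = polygon_perimeter(contour)
    box = bbox_area(contour)
    return {
        "Area": area,
        "AreaBbox": box,
        "Perimeter": perimeter,
        "Circularity": 4.0 * math.pi * area / perimeter ** 2,  # 1.0 for a circle
        "Extent": area / box,
    }


square = [(0, 0), (2, 0), (2, 2), (0, 2)]  # a toy contour, not real data
features = morph_features(square)
print(features)
```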
Texture Features
  • ASM (Angular Second Moment): Texture uniformity; higher values indicate fewer distinct gray-level transitions between neighboring pixels.
  • Contrast: Texture contrast, describing the local intensity variation between neighboring pixels.
  • Correlation: Texture correlation, measuring the linear dependency between the gray levels of neighboring pixels.
  • Entropy: Texture entropy, representing the randomness of the gray-level distribution; higher values indicate more complex textures.
  • Homogeneity: Texture homogeneity, assessing how similar the intensities of neighboring pixels are.
  • IntensityMean / Std / Max / Min: Mean, standard deviation, maximum, and minimum of the texture intensity.
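
ASM, Contrast, Homogeneity, and Entropy are classically computed from a gray-level co-occurrence matrix (GLCM). The following pure-Python sketch shows that computation on a toy image; the single horizontal offset, the symmetric normalization, the base-2 logarithm, and the function names are assumptions, and Correlation is omitted for brevity. It is not the project's implementation.

```python
import math


def glcm(image, levels, dx=1, dy=0):
    """Symmetric, normalized co-occurrence matrix for one pixel offset."""
    counts = [[0.0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                a, b = image[r][c], image[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts)) or 1.0  # avoid dividing by zero on tiny images
    return [[value / total for value in row] for row in counts]


def texture_features(p):
    """Haralick-style features from a normalized co-occurrence matrix p."""
    asm = contrast = homogeneity = entropy = 0.0
    for i, row in enumerate(p):
        for j, value in enumerate(row):
            asm += value ** 2
            contrast += value * (i - j) ** 2
            homogeneity += value / (1.0 + (i - j) ** 2)
            if value > 0:
                entropy -= value * math.log2(value)
    return {"ASM": asm, "Contrast": contrast, "Homogeneity": homogeneity, "Entropy": entropy}


checkerboard = [[0, 1], [1, 0]]  # every horizontal neighbor pair differs
flat = [[0, 0], [0, 0]]          # every horizontal neighbor pair is identical
busy = texture_features(glcm(checkerboard, levels=2))
calm = texture_features(glcm(flat, levels=2))
print(busy)
print(calm)
```

The flat patch gives the maximum ASM and zero entropy, while the checkerboard gives higher contrast and entropy, which matches the interpretation of the features listed above.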
Delaunay Triangle Spatial Features
  • Area: Area of the Delaunay triangle around the cell.
  • Perimeter: Perimeter of the Delaunay triangle around the cell.
  • Angle_Range: Difference between the maximum and minimum angles of the Delaunay triangle around the cell.
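
For one triangle of an already-computed Delaunay triangulation of cell centroids, the three quantities can be sketched as below. Reporting angles in degrees and the function name are assumptions; how PyPathomics aggregates the several triangles that touch a cell is not covered here.

```python
import math


def triangle_features(a, b, c):
    """Area, perimeter, and angle range (degrees) of a non-degenerate triangle."""
    ab, bc, ca = math.dist(a, b), math.dist(b, c), math.dist(c, a)
    # Half the absolute cross product of two edge vectors.
    area = abs((b[0] - a[0]) * (c[1] - a[1]) - (c[0] - a[0]) * (b[1] - a[1])) / 2.0

    def angle(opposite, side1, side2):
        # Law of cosines; clamp to [-1, 1] to absorb floating-point error.
        cos_value = (side1 ** 2 + side2 ** 2 - opposite ** 2) / (2.0 * side1 * side2)
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_value))))

    angles = [angle(bc, ab, ca), angle(ca, ab, bc), angle(ab, bc, ca)]
    return {
        "Area": area,
        "Perimeter": ab + bc + ca,
        "Angle_Range": max(angles) - min(angles),
    }


# Toy centroids forming a 3-4-5 right triangle.
tri = triangle_features((0.0, 0.0), (3.0, 0.0), (0.0, 4.0))
print(tri)
```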

Citation

If you find our work useful in your research, please consider citing our Zenodo record:

@software{pypathomics,
  author       = {HY Cui and XX Wang and J Xu and DP Chen},
  title        = {PyPathomics},
  year         = 2024,
  publisher    = {GitHub},
  url          = {https://github.com/HaoyuCui/PyPathomics},
  version      = {1.0},
  doi          = {10.5281/zenodo.15164919},
  howpublished = {\url{https://zenodo.org/record/15164919}},
}
