Whole slide image understanding in pathology: what is the salient scale of analysis?

## Whole slide image understanding in pathology: what is the salient scale of analysis?
- Author: Jenkinson et al.
- Journal: BioMedInformatics
- Year: 2024
- Link: https://www.mdpi.com/2673-7426/4/1/28/pdf


### Abstract
- Digitised pathology slides, otherwise known as whole slide images, can be analysed by pathologists with the same methods used to analyse traditional glass slides.
- The digitisation of pathology slides has also led to the possibility of using these whole slide images to train machine learning models to detect tumours.
  - Patch-based methods are common in the analysis of whole slide images as these images are too large to be processed using normal machine learning methods.
- The study found that the most successful configuration uses a patch size of 256 × 256 pixels with the informed sampling method, which uses the locations of tumour regions to sample a balanced dataset.


### Introduction
- Automated Analysis of Whole Slide Images (WSIs) aims to replicate how a pathologist analyzes a WSI to determine the presence of cancerous tissue.
- The main objective is to detect tumorous tissues within the WSI, which requires deep learning methods that can model the cognitive processes of pathologists effectively.
  - Deep learning methods are significantly more effective than traditional machine learning models in recognizing complex patterns within high-dimensional data.
  - They learn to identify and precisely locate tumorous areas in WSIs by training on annotated images.
- WSIs often contain several gigapixels, making it impractical to feed these full-sized images directly into deep learning models.
  - **Downsampling** | Reducing the image resolution so that it can be processed in its entirety, albeit with a loss of detail.
  - **Patch Extraction** | Dividing the large WSIs into smaller, manageable patches that can be analyzed independently, allowing for efficient processing without losing vital information.
    - Only a selected subset of these patches is used as input for neural networks, which are algorithms designed to process data in a way inspired by the human brain.
    - The patch-based method is generally preferred to downsampling because it retains more detail.
    - The work assesses how patch size affects the analysis by testing several sizes, from 256 × 256 pixels up to larger sizes such as 1024 × 1024 pixels.


### Related Works
#### Introduction to Digital Pathology
- **Whole Slide Image (WSI)**
  - In digital pathology, WSIs serve as high-resolution images that replace traditional glass slides, generated by scanning pathological samples. They are used by pathologists to diagnose diseases.
  - **Features**
    - Contains billions of pixels and can be around 2 gigabytes in size.
    - Allows for large screen displays, remote viewing, and sharing among pathologists.
- **Benefits of WSIs**
  - Improves diagnostic accuracy.
  - Facilitates collaboration between medical professionals.
  - Offers a flexible work environment.
- **Challenges with WSIs**
  - The size and complexity of WSIs, along with the variation in morphology and possible artifacts, make conventional deep learning methods less effective.

- **Digital Pathology and WSI**
  - The use of WSIs eliminates the need for expensive storage solutions for glass slides and enables remote analysis, enhancing collaboration among healthcare professionals while speeding up the diagnostic process.
  - However, manual analysis of WSIs remains time-consuming, and there is a lack of standardization between analyses performed by different pathologists. This can lead to inconsistencies in the results.

- **Current Barriers**
  - Many regulations and ethical issues slow down progress in implementing digital pathology.
  - Despite its potential, integration into diagnostics will be gradual, with algorithms helping pathologists by prioritizing slides that likely indicate disease.

#### Analysis of Whole Slide Images
- **Challenges**
  - WSIs are extremely large, often containing several gigapixels of data, making them difficult to analyze using standard deep learning techniques.
  - There is a significant shortage of annotated training data since the process of annotating images is laborious and time-consuming for pathologists.
    - This lack of data limits the development and effectiveness of deep learning models, as these models require a substantial amount of training data to perform well.
- **Research Focus** | Two challenges: the inherent size of WSIs and the limited availability of annotated training datasets.

- **Problems with Computational Analysis of Whole Slide Images**
  - WSIs are large digital images of pathology slides, containing billions to trillions of pixels.
    - **Average size** | 1 to 4 GB, which makes them challenging for conventional deep learning algorithms.
  - **Common Solutions**
    - **Downsampling**
      - Involves scaling down the WSI to reduce pixel count.
      - Not ideal because it loses fine details, impacting classification accuracy.
    - **Patch Extraction**
      - Splits WSIs into smaller, manageable patches for analysis.
      - Each patch is analyzed individually, with classifications combined to produce a slide-level classification.
      - **Problems** | Some patches may not contain diseased tissue, potentially misinforming the model. The loss of spatial information between patches may also diminish understanding of relationships in the data.
      - **Advantages** | Retains more morphological detail than downsampling.
  - **Computational Analysis Bottlenecks**
    - Lack of annotated data for training due to the time-consuming annotation process performed by pathologists.
    - Training should ideally involve patch-level annotations to enhance model accuracy, but manual annotations are resource-intensive.
  - **Additional Issues**
    - Stain variation and artefacts across different WSIs lead to classification challenges.
    - Differences in tissue characteristics complicate learning algorithms trying to distinguish between normal and diseased tissue.
    - **Variation Among Scanners** | Different laboratories and equipment can produce significant staining differences and artifacts, complicating model training.
    - **AI Limitations** | Unlike pathologists, AI struggles to overlook these variations in image data, which can lead to misclassifications.
    - **Feature Extraction Difficulties** | Variability in WSIs can make it challenging for models to distinguish between disease and healthy tissue.
    - **Class Imbalance** | Benign (normal) classes often have significantly more samples than malignant (disease) classes, complicating model training.

- **Levels of Annotation**
  - Pixel-Level: Detailed annotations, marking exact disease locations.
  - Patch-Level: Annotations based on small image segments (patches).
    - Ideally, training data would include patch-level annotations, which closely emulate expert results and help predict disease presence accurately.
  - Slide-Level: Annotations for the entire slide.
  - Lesion-Level: Specific areas of disease within a slide.
  - Patient-Level: Annotations based on overall patient information.

- **Weakly Supervised Learning**
  - The time-consuming nature of manual annotation limits the data available for training fully supervised models, and models trained on patch-level labels can still lack comprehensive knowledge of disease location.
  - Many researchers are now using weakly supervised learning methods that rely on slide-level annotations only.
    - These annotations indicate whether disease is present but do not specify where, helping to train the model with less detailed information.


#### Patch-Based Whole Slide Image Analysis
- **1. Pre-processing**
  - A. **Tissue Segmentation** identifies and removes unwanted areas, such as background or blurry sections that don’t contain meaningful tissue data.
    - Helps reduce the computational load by avoiding processing large irrelevant areas.
  - B. **Colour Normalisation** adjusts the distribution of colour values in an image.
    - Keeps colour consistent across multiple slides, minimizing stain variation that could bias training data and skew results.
  - C. **Patch Extraction** involves cutting the WSI into square patches, commonly sized at 256 × 256 pixels.
    - This is crucial for manageable data sizes and optimizes computational processing, allowing for focused analysis of smaller, relevant sections of the WSI. 
    - Various factors such as patch size, resolution, and sampling methods can be optimized in this step (overlapping vs tiled, etc.).
  - D. **Data Augmentation** creates new training data by transforming existing training data.
    - Helps prevent overfitting during model training and addresses class imbalances by making the model robust to variations in data.
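
As an illustration of the data augmentation step, here is a minimal sketch that generates the eight rotations and mirror images of a square patch with NumPy. The choice of transforms and the function name `dihedral_augmentations` are assumptions for illustration; the notes do not say which augmentations the paper applied.

```python
import numpy as np


def dihedral_augmentations(patch: np.ndarray) -> list[np.ndarray]:
    """Return the 8 rotation/flip variants of a square H x W x C patch."""
    variants = []
    for k in range(4):  # rotations by 0, 90, 180 and 270 degrees
        rotated = np.rot90(patch, k=k, axes=(0, 1))
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))  # mirror image of each rotation
    return variants


# Toy 4 x 4 RGB patch; real patches would be e.g. 256 x 256 x 3.
patch = np.arange(4 * 4 * 3, dtype=np.uint8).reshape(4, 4, 3)
augmented = dihedral_augmentations(patch)
print(len(augmented), augmented[0].shape)
```

Rotations and flips are a common choice for histology because tissue has no canonical orientation, so the label of a patch is unchanged by them.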

- **2. Architectures**
  - **CNN**s are a type of deep learning model commonly used for processing images, including whole slide images (WSIs).
  - Often in cases where training data is limited, **CNNs** leverage **weakly supervised learning**.
    - **Multiple Instance Learning (MIL)** is a specific approach within weakly supervised learning. It allows a single label (e.g., classifying a slide as having cancer) to be applied across multiple sections (patches) of the slide.
    - **Max pooling** is a technique that aggregates the predictions made for individual patches. 
      - If any patch predicts the presence of disease, the entire slide is labeled as containing disease.
  - **Instance Assignment** is suitable for situations like pathology slides, where you might have labels for the overall slide as 'disease present' but no specific indications of where the disease is located in the different patches.
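
The max-pooling rule described above can be sketched in a few lines. This is a minimal illustration, assuming each patch already has a predicted tumour probability; the function name `mil_max_pool` and the 0.5 threshold are hypothetical.

```python
def mil_max_pool(patch_probs: list[float], threshold: float = 0.5) -> tuple[float, bool]:
    """Score a slide (a "bag" of patches) by its most suspicious patch."""
    if not patch_probs:
        raise ValueError("a bag needs at least one patch")
    slide_score = max(patch_probs)  # max pooling over patch predictions
    return slide_score, slide_score >= threshold


normal_slide = [0.02, 0.10, 0.07, 0.31]
tumour_slide = [0.02, 0.10, 0.93, 0.31]
print(mil_max_pool(normal_slide))
print(mil_max_pool(tumour_slide))
```

A single confident patch is enough to flag the slide, which matches the rule that a slide is diseased if any patch is, but it also makes the slide score sensitive to one false-positive patch.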

- **3. Classification**
  - **Patch-Level Classification** analyzes small sections (patches) of a Whole Slide Image (WSI) to determine if they contain disease.
  - **Slide-Level Classification** combines the results of patch-level classifications to determine the overall state of the slide.
  - Individual patches are classified first and the predictions from these patches are then aggregated to generate a final decision about the entire slide.
  - Heatmaps are often used to depict the distribution of results across patches. They can show where disease is likely present, typically aligning with a pathologist’s annotations.
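
To make the link between patch predictions and a heatmap concrete, here is a small sketch that places per-patch tumour probabilities on a grid and renders it as text. The `(row, col, probability)` input format, the function names, and the 0.5 threshold are assumptions for illustration.

```python
def build_heatmap(predictions, n_rows, n_cols):
    """Place (row, col, probability) patch predictions on a 2D grid.

    Cells with no prediction (e.g. discarded background) stay None.
    """
    heatmap = [[None] * n_cols for _ in range(n_rows)]
    for row, col, prob in predictions:
        heatmap[row][col] = prob
    return heatmap


def render(heatmap, threshold=0.5):
    """Text rendering: '#' likely tumour, 'o' likely normal, '.' no tissue."""
    lines = []
    for row in heatmap:
        lines.append("".join(
            "." if p is None else ("#" if p >= threshold else "o") for p in row
        ))
    return "\n".join(lines)


predictions = [(0, 0, 0.05), (0, 1, 0.10), (1, 1, 0.85), (1, 2, 0.92), (2, 2, 0.40)]
heatmap = build_heatmap(predictions, n_rows=3, n_cols=3)
print(render(heatmap))
```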

#### WSI Analysis Checklist
1. **Hardware & Software** | Document the system's hardware and software used for training and testing.
2. **Data Source** | Clearly state the source of the data and the method for accessing it.
3. **Data Splitting** | Explain how the data is divided into training, validation, and testing sets.
4. **Normalization** | Describe whether and how the slides were normalized (adjusted for consistent color and contrast).
5. **Background Removal** | State how background noise and artefacts were eliminated from the slides.
6. **Patch Extraction** | Detail how patches were extracted from the images and any data augmentation applied.
7. **Patch Labeling** | Specify how patches were classified or labeled for analysis.
8. **Patch Classifier Training** | Explain how the classifier for patches was trained, including the technique, architecture, and hyperparameters.
9. **Slide Classifier Training** | Describe how the classifier for whole slides was trained, including pre-processing, techniques, architecture, and hyperparameters.
10. **Lesion Detection** | Detail how lesions were detected within the images.
11. **Patient Classifier Training** | Describe how the patient classifier was trained, including pre-processing, techniques, architectures, and hyperparameters.
12. **Performance Metrics** | List all relevant metrics applicable to the tasks.

- Wang et al., winners of the Camelyon16 challenge, evaluated several architectures for WSI analysis and found that **GoogLeNet** performed best in their tests.


### Proposed Methodology
#### Camelyon16 Winning Paper
- The training set consists of 160 WSIs labeled as "normal" and 111 WSIs labeled as "tumour"; the test set consists of 129 WSIs.
- Image Pre-processing:
  - Tissue Segmentation: Removes irrelevant backgrounds from WSIs.
  - Patch Extraction: Millions of patches of size 256 × 256 pixels were extracted from training WSIs to train the model.
- Patch-Level Classification: The trained model predicts if a specific patch contains any tumors.
- Post-Processing: Overlapping patches from testing WSIs are used to generate tumor probability heatmaps corresponding to the images.
- Slide-Level Classification: Features from **heatmaps** are input into a slide-level classification model (a random forest classifier) to determine the probability of the presence of tumors in the WSI.

#### System Structure
- The training structure
<img src='https://github.com/user-attachments/assets/46c649e1-dbc8-467f-9a41-cbd26d3f3966' width=60%>
- The final structure
<img src='https://github.com/user-attachments/assets/711518ab-faf1-4b6c-a633-3794f4376640' width=60%>

### Load WSIs
- The WSIs are stored in a format that resembles a multi-tiered **pyramid** which allows access to images at varying resolutions.
- **OpenSlide Library**
  - It is a specialized C library designed for reading WSIs.
  - It includes a Python binding that has additional features, such as a Deep Zoom generator.
  - Functions Offered by OpenSlide:
    - **Read WSIs** | Retrieve the slide images.
    - **Get Dimensions** | Access information regarding the size of each resolution layer of the WSI.
    - **Fetch Regions** | Extract specific areas of the WSI at a desired resolution level.
    - **Tile Splitting** | The Deep Zoom generator can split the WSI into smaller tiles of a specified size, which is especially useful for patch-based analysis.
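
The pyramid layout can be illustrated with a little arithmetic. The sketch below computes the size of each resolution level from a list of downsample factors and picks the level to read for a target downsample. The slide dimensions and factors are made-up example values, and the two helpers are hypothetical. `read_patch` shows how the same idea would look with the OpenSlide Python binding (`OpenSlide`, `read_region`); it imports `openslide` lazily and is not called here because it needs a real slide file.

```python
def level_dimensions(base_size, downsamples):
    """Size (width, height) of each pyramid level given its downsample factor."""
    width, height = base_size
    return [(int(width // d), int(height // d)) for d in downsamples]


def best_level_for_downsample(downsamples, target):
    """Index of the coarsest level whose downsample does not exceed target."""
    best = 0
    for level, factor in enumerate(downsamples):
        if factor <= target:
            best = level
    return best


def read_patch(slide_path, x, y, level, size=256):
    """Read one RGB patch; (x, y) are pixel coordinates in level 0."""
    import openslide  # lazy import so the arithmetic above runs without it

    slide = openslide.OpenSlide(slide_path)
    try:
        region = slide.read_region((x, y), level, (size, size))  # RGBA PIL image
        return region.convert("RGB")
    finally:
        slide.close()


# Hypothetical slide: 97792 x 221184 pixels at full resolution, six levels.
downsamples = [1, 2, 4, 8, 16, 32]
dims = level_dimensions((97792, 221184), downsamples)
print(dims[-1])                                    # lowest-resolution level
print(best_level_for_downsample(downsamples, 10))  # level to read for ~10x downsample
```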

### Pre-processing
- The WSIs are divided into hundreds of thousands of patches; the exact number depends on the patch size specified for the split.
  - Different patch sizes will affect the results, and this investigation is a part of the study.
- Patches can **overlap** by a defined number of pixels. 
  - This overlap helps retain spatial information that might be lost when patches are generated.
- After generating all patches, only a **subset** is chosen for specific tasks. 
  - For test data, the entire set of patches is used, while for training, only a sample is utilized.
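
The effect of patch size and overlap on the number of patches can be sketched as follows. This is an illustrative grid computation, not the paper's code: the function name `patch_origins` is hypothetical, and partial patches at the right and bottom edges are simply dropped.

```python
def patch_origins(width, height, patch_size=256, overlap=0):
    """Top-left corners of all full patches in a width x height image.

    Neighbouring patches share `overlap` pixels, so the stride between
    origins is patch_size - overlap.
    """
    if not 0 <= overlap < patch_size:
        raise ValueError("overlap must be in [0, patch_size)")
    stride = patch_size - overlap
    xs = range(0, width - patch_size + 1, stride)
    ys = range(0, height - patch_size + 1, stride)
    return [(x, y) for y in ys for x in xs]


tiled = patch_origins(1024, 512, patch_size=256, overlap=0)
overlapping = patch_origins(1024, 512, patch_size=256, overlap=128)
print(len(tiled), len(overlapping))
```

With a 50% overlap the same region yields 21 patches instead of 8, which is why overlap is usually reserved for cases where the extra spatial context is worth the extra compute.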

- **Background Removal**
  - Ensures that only relevant tissue patches are included in the training dataset.
  - **Thresholding** | A defined threshold is used to decide whether a patch should be discarded: 
    - If the mean pixel value is above the threshold, it is considered background.
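
A minimal sketch of this thresholding rule, assuming 8-bit RGB patches where background is near-white. The threshold of 220 and the function name `is_background` are assumptions for illustration; the notes do not give the value used in the paper.

```python
import numpy as np


def is_background(patch: np.ndarray, threshold: float = 220.0) -> bool:
    """A patch counts as background if its mean pixel value exceeds threshold."""
    return bool(patch.mean() > threshold)


# Two synthetic 256 x 256 RGB patches: near-white glass and pink/purple tissue.
white = np.full((256, 256, 3), 245, dtype=np.uint8)
tissue = np.full((256, 256, 3), (180, 90, 160), dtype=np.uint8)

kept = [name for name, p in [("white", white), ("tissue", tissue)] if not is_background(p)]
print(kept)
```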

- **Sampling Methods**
  - **Random Sampling** | Selects patches completely at random. This can lead to an imbalance in class labels (e.g., too many "normal" patches compared to "tumour" patches).
    - Each patch is selected without replacement to ensure no duplicates appear in the training data.
    - There is a maximum number of patches that can be selected per image.
  - **Informed Sampling** | Accounts for the actual locations of tumours, ensuring a more balanced dataset by selecting patches based on their labels.
    - Extracts tumor patches from lesion annotation files that specify tumor locations.
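
The difference between the two sampling methods can be sketched with a toy patch list. This is an illustration under assumptions: patches are dictionaries with a precomputed `label` (in practice derived from the lesion annotation files), and the function names, seed, and sample sizes are hypothetical.

```python
import random
from collections import Counter


def random_sample(patches, n, seed=0):
    """Pick up to n patches uniformly at random, without replacement."""
    rng = random.Random(seed)
    return rng.sample(patches, min(n, len(patches)))


def informed_sample(patches, n_per_class, seed=0):
    """Pick up to n_per_class patches from each label for a balanced set."""
    rng = random.Random(seed)
    by_label = {}
    for patch in patches:
        by_label.setdefault(patch["label"], []).append(patch)
    sample = []
    for group in by_label.values():
        sample.extend(rng.sample(group, min(n_per_class, len(group))))
    return sample


# Toy slide: 95 normal patches and 5 tumour patches.
patches = [{"id": i, "label": "normal"} for i in range(95)]
patches += [{"id": 95 + i, "label": "tumour"} for i in range(5)]

print(Counter(p["label"] for p in random_sample(patches, 10)))
print(Counter(p["label"] for p in informed_sample(patches, 5)))
```

Random sampling tends to mirror the slide's own imbalance, whereas the informed variant returns equal counts per class whenever each class has enough patches.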

### Patch-level Classification
- The model aims to determine if a tissue patch contains tumor tissue by outputting probabilities for two classes: “normal” and “tumor”.
- The **input** consists of pre-processed patches of tissue sampled from training data and the model **outputs** a probability value for each class (normal and tumor).
- Unlike slide-level analysis, each input patch is treated independently, so the model lacks spatial information: it doesn't account for spatial relationships in the original Whole Slide Image (WSI).
- The patch-level classification is performed using a neural network based on the **GoogLeNet** architecture, trained with categorical cross-entropy loss and a softmax output activation.
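
For reference, the softmax activation and categorical cross-entropy loss mentioned above can be written out in plain Python for the two-class case. This is a generic sketch of the standard definitions, not the paper's training code; the logit values are made up.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(probs, true_index):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[true_index])


# Class order assumed to be [normal, tumour].
uncertain = softmax([0.0, 0.0])
confident = softmax([-1.0, 2.0])
print(uncertain, categorical_cross_entropy(uncertain, true_index=1))
print(confident, categorical_cross_entropy(confident, true_index=1))
```

The loss is ln 2 for a patch the model cannot decide on and shrinks towards zero as the probability of the true class approaches 1.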

### Slide-level Classification
- This step determines the overall classification of a whole slide image (WSI) based on extracted features.
- **Input Features** come from tumor probability heatmaps, such as:
  - Percentage of the slide that is tumor.
  - Average probability values from the heatmaps.
  - Frequency of areas identified as high probability for tumors.
- A Random Forest model is employed for this classification task.

#### Input Features
- **Tumor Percentage**
  - This feature is the percentage of tissue predicted to be tumor.
  - It is derived by totalling the positive predictions (tumor patches) and the negative predictions (normal patches).
  - The percentage is computed relative to the total area (tumor + normal patches) in the tissue region.
- **Tumor Regions**
  - Number of Tumor Regions: This is determined by creating a mask of the tumor regions and counting them.
  - Size of the Largest Tumor Region: The largest connected area of tumor patches is found and its size is measured in pixels.
- **Statistical Features from Probabilities**
  - Probability values are categorized into positive (tumor) and negative (normal).
  - From these values, the following statistical metrics are derived:
    - Mean, Median, Mode, Variance, SD, Minimum, Maximum, Range and Sum
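
The features above can be sketched on a toy heatmap. This is an illustration under several assumptions: the heatmap is a grid of patch probabilities with `None` for background, a patch is "tumour" at probability 0.5 or higher, regions are 4-connected, region size is counted in patches rather than pixels, and statistics are computed over all tissue probabilities rather than separately for the positive and negative groups. The function names are hypothetical.

```python
import statistics
from collections import deque


def region_sizes(mask):
    """Sizes of 4-connected regions of True cells in a 2D boolean grid."""
    n_rows, n_cols = len(mask), len(mask[0])
    seen = [[False] * n_cols for _ in range(n_rows)]
    sizes = []
    for r in range(n_rows):
        for c in range(n_cols):
            if not mask[r][c] or seen[r][c]:
                continue
            queue, size = deque([(r, c)]), 0
            seen[r][c] = True
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                size += 1
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < n_rows and 0 <= nx < n_cols \
                            and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def heatmap_features(heatmap, threshold=0.5):
    """Slide-level features from a grid of patch tumour probabilities."""
    probs = [p for row in heatmap for p in row if p is not None]
    mask = [[p is not None and p >= threshold for p in row] for row in heatmap]
    sizes = region_sizes(mask)
    n_tumour = sum(1 for p in probs if p >= threshold)
    return {
        "tumour_percentage": 100.0 * n_tumour / len(probs),
        "n_tumour_regions": len(sizes),
        "largest_region": max(sizes, default=0),
        "mean": statistics.mean(probs),
        "median": statistics.median(probs),
        "variance": statistics.pvariance(probs),
        "min": min(probs),
        "max": max(probs),
        "range": max(probs) - min(probs),
        "sum": sum(probs),
    }


heatmap = [
    [0.1, 0.9, 0.8, None],
    [0.2, 0.7, 0.1, None],
    [0.1, 0.1, 0.1, 0.6],
]
features = heatmap_features(heatmap)
print(features["tumour_percentage"], features["n_tumour_regions"], features["largest_region"])
```

A feature dictionary like this, one per slide, is the kind of input a Random Forest slide-level classifier would be trained on.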

### Downsampling
- Downsampling aims to make Whole Slide Images (WSIs) efficient to analyze by reducing their size and complexity.
- Downsampling can be done alone or alongside patch extraction.
- If downsampled enough, the image can be used directly for tumor probability predictions without breaking it into patches.
- Generally, **patch-based methods are preferred** since downsampling can lose important details, especially the morphological information that is crucial for accurate model predictions.
- **Process**
  1. **Inputting into models** | Downsampled images can be fed into a model for slide-level classification.
  2. **Model used** | The GoogLeNet network, which typically requires inputs of a specific size.
  3. **Data preparation** | A new dataset of the lowest resolution WSIs is created from original training/testing WSIs.
  4. **Pre-processing** | Downsampling -> Resizing to 256 × 256 pixels -> Transforming to tensors -> Normalisation.
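
The pre-processing chain in step 4 can be sketched with NumPy alone. This is an illustration under assumptions: nearest-neighbour resizing stands in for whatever interpolation the paper used, the "tensor" is a channels-first float array, and the mean and standard deviation of 0.5 per channel are placeholder values, not the paper's.

```python
import numpy as np


def preprocess_thumbnail(image, size=256,
                         mean=(0.5, 0.5, 0.5), std=(0.5, 0.5, 0.5)):
    """H x W x 3 uint8 image -> normalised 3 x size x size float32 array."""
    h, w, _ = image.shape
    rows = np.arange(size) * h // size           # nearest-neighbour source rows
    cols = np.arange(size) * w // size           # nearest-neighbour source columns
    resized = image[rows][:, cols]               # size x size x 3
    tensor = resized.astype(np.float32) / 255.0  # scale to [0, 1]
    tensor = tensor.transpose(2, 0, 1)           # channels first, as CNNs expect
    mean_arr = np.asarray(mean, dtype=np.float32).reshape(3, 1, 1)
    std_arr = np.asarray(std, dtype=np.float32).reshape(3, 1, 1)
    return (tensor - mean_arr) / std_arr


# Stand-in for a lowest-resolution WSI level: an all-white 1000 x 600 image.
thumbnail = np.full((1000, 600, 3), 255, dtype=np.uint8)
model_input = preprocess_thumbnail(thumbnail)
print(model_input.shape, model_input.dtype)
```

Squeezing a whole slide into 256 × 256 pixels discards most cellular detail, which is consistent with the notes' observation that the patch-based method performed better.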


### Conclusion
- **An optimal patch size of 256 × 256 pixels** was identified when using an informed sampling method, meaning patches were selected based on the locations of tumors.
- A **downsampling** method was also tested (reducing the resolution of WSIs). However, the patch-based method performed better, aligning with previous literature that suggested patch-based methods retain more detail compared to downsampling.
- One of the significant achievements was creating tumor probability **heatmaps**. These visually represent probable tumor locations, aiding pathologists in diagnosing the specimens by providing finer detail at smaller patch sizes.
