This is our implementation of the nuclei detection method proposed in [Peikari et al.].
The objective is to assess the cellularity of any given hematoxylin and eosin (H&E) stained histopathology image of breast cancer.
For our method of direct cellularity assessment, please visit https://github.com/hbk16/BreastPathQ.
The detection consists of two parts: segmentation and classification. The segmentation is based on color deconvolution, multi-level Otsu thresholding and marker-controlled watershed. Individual nuclei are then classified by SVMs trained on shape, intensity and texture features.
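As a rough illustration of the first segmentation step, the sketch below separates the hematoxylin channel by color deconvolution using NumPy only. The stain vectors are the commonly cited Ruifrok and Johnston values for hematoxylin, eosin and DAB, and the function name `separate_stains` is hypothetical; `peikari.py` may use different vectors or a library routine.

```python
import numpy as np

# Assumed stain vectors (rows: hematoxylin, eosin, DAB) in optical-density space.
STAIN_VECTORS = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])


def separate_stains(rgb):
    """Return per-pixel stain concentrations (H, E, DAB) for a uint8 RGB image."""
    # Normalize each stain vector to unit length.
    stains = STAIN_VECTORS / np.linalg.norm(STAIN_VECTORS, axis=1, keepdims=True)
    # Beer-Lambert: convert intensities to optical density; +1 avoids log(0).
    optical_density = -np.log10((rgb.astype(np.float64) + 1.0) / 256.0)
    # OD = concentrations @ stains, so multiply by the inverse to unmix.
    return optical_density @ np.linalg.inv(stains)


# A synthetic pixel stained with hematoxylin only (concentration 1.0).
h_unit = STAIN_VECTORS[0] / np.linalg.norm(STAIN_VECTORS[0])
pixel = np.round(256.0 * 10.0 ** (-h_unit) - 1.0).astype(np.uint8)
image = pixel.reshape(1, 1, 3)
concentrations = separate_stains(image)[0, 0]
print(concentrations)  # the hematoxylin channel (index 0) dominates
```

In a full pipeline, the hematoxylin channel would then be thresholded (for example with multi-level Otsu) and the foreground split into individual nuclei with a marker-controlled watershed.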
The data is provided as a part of the SPIE-AAPM-NCI BreastPathQ: Cancer Cellularity Challenge (http://spiechallenges.cloudapp.net/competitions/14 or https://breastpathq.grand-challenge.org/).
The ROC curves for classifying lymphocyte vs. epithelial nuclei (L vs. BM) and benign vs. malignant nuclei (B vs. M):
Examples of nuclei segmentation and detection:
The accuracy, sensitivity and specificity of nuclei classification on the test set:
Class | ACC | SEN | SPE |
---|---|---|---|
Lymphocyte | 0.960 | 0.840 | 0.981 |
Benign | 0.878 | 0.577 | 0.928 |
Malignant | 0.890 | 0.927 | 0.802 |
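
For reference, per-class metrics like those in the table can be computed from a confusion matrix, assuming a one-vs-rest definition. The sketch below is a minimal illustration with made-up counts; the function name `one_vs_rest_metrics` and the numbers are hypothetical, and the repository may compute the metrics differently.

```python
def one_vs_rest_metrics(confusion, class_index):
    """Accuracy, sensitivity and specificity of one class against all others.

    confusion[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    tp = confusion[class_index][class_index]
    fn = sum(confusion[class_index]) - tp
    fp = sum(confusion[i][class_index] for i in range(n)) - tp
    tn = total - tp - fn - fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Made-up counts for illustration only (rows: true class, columns: predicted class).
example = [
    [8, 1, 1],   # class 0
    [2, 6, 2],   # class 1
    [0, 2, 18],  # class 2
]
for k in range(3):
    acc, sen, spe = one_vs_rest_metrics(example, k)
    print(k, round(acc, 3), round(sen, 3), round(spe, 3))
```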
For cellularity assessment, this implementation achieved an intraclass correlation coefficient (ICC) of 0.76 [0.69, 0.81], a Kendall's tau-b of 0.57 [0.49, 0.63] and a prediction probability (the metric adopted by the challenge organizers) of 0.79 [0.75, 0.82] on the validation set.
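
Kendall's tau-b measures rank agreement between predicted and reference cellularity while correcting for ties. A minimal pure-Python sketch of the standard definition follows; the function name `kendall_tau_b` and the sample scores are hypothetical, and this is not necessarily how the repository or the challenge computes the statistic.

```python
from math import sqrt


def kendall_tau_b(x, y):
    """Kendall's tau-b for two equally long sequences (O(n^2) pair scan)."""
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied in both sequences: excluded from every term
            if dx == 0:
                ties_x += 1  # tied only in x
            elif dy == 0:
                ties_y += 1  # tied only in y
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + ties_x) * (untied + ties_y))
    return (concordant - discordant) / denom if denom else float("nan")


# Made-up scores for illustration only.
reference = [0.0, 0.1, 0.5, 0.5, 0.9]
predicted = [0.05, 0.2, 0.4, 0.6, 0.8]
print(kendall_tau_b(reference, predicted))
```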
Requirements:

- Python 3.5
- scikit-image
- OpenCV
- MATLAB Engine API for Python
- GLCM_Features4 (make sure that the function `GLCM_Features4` is on your MATLAB path)
- scikit-learn
- Standard scientific Python stack: NumPy, pandas, SciPy, Matplotlib (make sure that you are using the latest versions compatible with Python 3.5)

To run the pipeline:

- Download the dataset from http://spiechallenges.cloudapp.net/competitions/14 and copy the images to the corresponding directories in `data/`, as instructed by the `readme.md` files. Please note that the training images should be copied to `data/corr/` (for the purpose of linear correction).
- To segment the nuclei in all the images, run:

      python3 peikari.py --phase cells val corr --precomputed_threshold

  The segmentation mask of each phase will be stored in `segmentation&classification/PHASE/seg.npy`. In each mask, the background is denoted by 0 and the region of each nucleus is denoted by a distinct positive integer.
- To stack the images into arrays to speed up loading, run:

      python3 make_data.py --opt stack --phase cells
      python3 make_data.py --opt stack --phase val
      python3 make_data.py --opt stack --phase corr

  The images will be stacked in `data/PHASE/x.npy` and the ground truth of cellularity will be stored in `data/PHASE/y.npy`.
- To get the annotations of nuclei from the `xml` files and store them in an array, run:

      python3 make_data.py --opt get_annotation

  The annotations will be stored in `segmentation&classification/cells/annotation.npy`. Lymphocyte, benign and malignant epithelial nuclei are denoted by 2, 1 and 0, respectively, in the third column.
- To generate centroids from the segmentation masks, run:

      python3 make_data.py --opt get_centroid --phase cells
      python3 make_data.py --opt get_centroid --phase val
      python3 make_data.py --opt get_centroid --phase corr

  The centroids will be stored in `segmentation&classification/PHASE/centroid.npy`.
- To match the segmented nuclei to the manual annotations, run:

      python3 make_data.py --opt match

  The result of matching will be stored in `segmentation&classification/cells/anno_match.npy`.
- To manually check the segmentation results, run:

      python3 make_data.py --opt mark_countour --phase cells
      python3 make_data.py --opt mark_countour --phase val
      python3 make_data.py --opt mark_countour --phase corr

  You can browse the images in `segmentation&classification/PHASE/seg_contour` to review your segmentation.
- To extract features describing shape, intensity and texture, run:

      python3 make_data.py --opt "extract_nuclei&intensity" --phase cells
      python3 make_data.py --opt "extract_nuclei&intensity" --phase val
      python3 make_data.py --opt "extract_nuclei&intensity" --phase corr
      python3 make_data.py --opt "extract_shape&lbp" --phase cells
      python3 make_data.py --opt "extract_shape&lbp" --phase val
      python3 make_data.py --opt "extract_shape&lbp" --phase corr
      python3 make_data.py --opt extract_haralick --phase cells
      python3 make_data.py --opt extract_haralick --phase val
      python3 make_data.py --opt extract_haralick --phase corr

  The features will be stored in `segmentation&classification/PHASE/features/`. (Option names containing `&` are quoted so that the shell does not interpret `&` as a control operator.)
- To train SVMs to classify lymphocyte vs. epithelial and benign vs. malignant nuclei, run:

      python3 classify.py --opt lvsbm
      python3 classify.py --opt bvsm

  The SVMs will be pickled to `segmentation&classification/model/`, along with the ROC curves of 5-fold cross-validation.
- To validate the models on the validation set, run:

      python3 classify.py --opt validate

- To predict the cellularity of the validation set, run:

      python3 classify.py --opt predict

  The prediction on the validation set will be stored in `segmentation&classification/pred_val.csv`, along with the ground truth.
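
The label-mask convention described above (0 for background, a distinct positive integer per nucleus) makes it straightforward to derive centroids and pair them with annotated points. The sketch below is a hedged illustration using NumPy on a toy mask; the helper names `label_centroids` and `match_to_annotations`, the `(row, col, class)` annotation layout and the nearest-centroid rule with a distance cutoff are all assumptions, not the logic of `make_data.py`.

```python
import numpy as np


def label_centroids(mask):
    """Return {label: (row, col)} centroids for every positive label in the mask."""
    centroids = {}
    for label in np.unique(mask):
        if label == 0:
            continue  # 0 is background
        rows, cols = np.nonzero(mask == label)
        centroids[int(label)] = (rows.mean(), cols.mean())
    return centroids


def match_to_annotations(centroids, annotations, max_distance=5.0):
    """Assign each annotated point (row, col, class) to the nearest centroid.

    Returns {label: class} for annotations within max_distance pixels
    (hypothetical cutoff); later annotations overwrite earlier ones.
    """
    if not centroids:
        return {}
    labels = list(centroids)
    points = np.array([centroids[k] for k in labels], dtype=float)
    matches = {}
    for row, col, cls in annotations:
        distances = np.hypot(points[:, 0] - row, points[:, 1] - col)
        nearest = int(np.argmin(distances))
        if distances[nearest] <= max_distance:
            matches[labels[nearest]] = cls
    return matches


# Toy mask with two nuclei labelled 1 and 2.
mask = np.zeros((8, 8), dtype=np.int32)
mask[1:3, 1:3] = 1
mask[5:8, 4:7] = 2
centroids = label_centroids(mask)
# Annotations as (row, col, class); class 2 = lymphocyte, 1 = benign, 0 = malignant.
annotations = [(1, 2, 2), (6, 5, 0), (0, 7, 1)]
matches = match_to_annotations(centroids, annotations, max_distance=2.0)
print(matches)
```

The third annotation lies farther than the cutoff from every centroid, so it is left unmatched in this toy example.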
If you find this repository useful for your publication, please star it and consider citing the following papers:
The original idea
Peikari, M., Salama, S., Nofech‐Mozes, S., & Martel, A. L. (2017).
Automatic cellularity assessment from post‐treated breast surgical specimens.
Cytometry Part A, 91(11), 1078-1087.
This implementation
Pei Z., Cao S., Lu L., & Chen W-F. (2019).
Direct cellularity estimation on breast cancer histopathology images using transfer learning.
Computational and Mathematical Methods in Medicine, 2019, 3041250.