Computer Vision application to diagnose diverse cytology samples using medical imaging data from a virtual microscope. Showcased at the Ironhack Hackshows and the European Congress of Cytology, and published in the Cytopathology journal.

isi-mube/Digital-Cytology-ML

Published in Cytopathology Journal - Oct 2023

About the Project

Primary Objectives:

  • Develop multiple multiclass classification models capable of diagnosing cytological image samples from diverse sites, including salivary gland, gynecological, thyroid and effusion specimens.
  • Develop an AI algorithm using single-layer cytological slide scans, challenging the need for full-slide multi-layer scanning with z-stack, a process that is both costly and time-consuming.

Secondary Objectives:

  • Implement a web-based application using Streamlit that lets users predict diagnoses from their own image inputs.
  • Provide informative feedback on the image features using the OpenAI API.
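
As a rough idea of the last step of such an app, the sketch below turns a model's class probabilities into the diagnosis shown to the user. It is a minimal illustration, not the code of this repository: `top_diagnosis`, the class names and the probability values are all hypothetical placeholders.

```python
def top_diagnosis(probabilities, class_names):
    """Return the most probable class name together with its probability."""
    if len(probabilities) != len(class_names):
        raise ValueError("one probability is required per class")
    # Index of the highest probability in the model output
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


# Placeholder labels and scores, not the project's real categories or outputs
CLASS_NAMES = ["category_a", "category_b", "category_c"]
label, confidence = top_diagnosis([0.08, 0.85, 0.07], CLASS_NAMES)
print(f"Predicted: {label} ({confidence:.0%})")
```

In a web app, the probabilities would come from the trained model's prediction on the uploaded image, and the returned label would be what the page displays.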

About Cytology

Glossary

Let's first define a few key terms:

  • Cytology: The study of individual cells to detect abnormalities, including cancer. As a sampling method, it is a less invasive alternative to biopsy, enabling early diagnosis and treatment initiation, and improving health outcomes.
  • Cytopathology: A specialized field within pathology that studies disease at the cellular level. Cytopathology professionals include cytotechnologists and cytopathologists, who focus on the screening, interpretation, and diagnosis of diverse cell samples.
  • Digital Pathology: The digitization of pathology slides, allowing image-based information to be used for diagnosis, research, and teaching. Digital Pathology covers not only the digitization of histology and cytology slides but also the automation, technology, and tools of all pre-analytical, analytical and post-analytical processes in a pathology department.

Challenges in Digital Cytology

In digital cytology, we face a unique challenge. Unlike in histology, where cells maintain their flat structure (like a single layer of bricks on a wall), cytology samples can be more like a pile of bricks dumped out of a bucket. These cells in suspension no longer hold their original formation, which makes diagnosis more complex and time-consuming because it requires mastery of pattern recognition. Furthermore, because of this additional depth, digitizing these cell images requires even more storage space.



Thyroid, papillary carcinoma. Same tumor, different methods and different features. On the left, histology (two-dimensional thin layer), and on the right, cytology (three-dimensional cells in suspension).

Personal Journey and Perspectives on Cytology

My past five years of work have centered on cytology: screening and diagnosing numerous cytology specimens, quality control, and both teaching and research, including Digital Pathology publications.

One significant barrier to the digitization of cytological samples is the final file size. As previously explained, the cells in cytology are not flat, as they are in histology, but three-dimensional. This complexity typically requires Z-stack scanning of the slides to capture all focal planes, resulting in large digital files.
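
To give a feel for the scale, here is a back-of-the-envelope estimate of uncompressed scan size. The slide dimensions, the 24-bit color depth and the number of focal planes are assumptions chosen for illustration, not measurements from this project, and real scanners compress their output.

```python
def scan_size_gb(width_px, height_px, z_layers=1, bytes_per_pixel=3):
    """Uncompressed size, in gigabytes (10**9 bytes), of a scan with z_layers focal planes."""
    return width_px * height_px * bytes_per_pixel * z_layers / 1e9


# Hypothetical scan: 50,000 x 50,000 pixels, 24-bit RGB (3 bytes per pixel)
single = scan_size_gb(50_000, 50_000)
stack = scan_size_gb(50_000, 50_000, z_layers=7)
print(f"single layer: {single:.1f} GB, 7-layer z-stack: {stack:.1f} GB")
# prints "single layer: 7.5 GB, 7-layer z-stack: 52.5 GB"
```

The size grows linearly with the number of focal planes, which is why a model that works on a single layer would remove most of the storage cost.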

Despite this challenge, I firmly believe that Machine Learning and Deep Learning models can be applied to cytology images without a complete multi-layer scan, which is one of the most challenging aspects of the digitization process.

Results and Conclusions

  1. The convolutional neural network (CNN) model demonstrated high accuracy in the multiple multiclass classification of cytology images, reaching approximately 90-95% accuracy at around 20-25 epochs.
  2. The lack of available data was addressed by synthetically generating new cytology images, an approach known as data augmentation. This technique was crucial for minimizing false negatives across all diagnostic categories.
  3. This model has the potential for real-world implementation, opening the door for the creation of AI algorithms using single-layer cytological slide scans or even phone-captured images, thereby challenging the need for full slide multi-layer scanning with z-stack, a process that is both costly and time-consuming.
  1. Salivary gland specimens:
  2. Gynecological specimens:
  3. Thyroid specimens:
  4. Effusion specimens:
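
The data augmentation mentioned in the conclusions above can be sketched as follows. This is a simplified stand-in written with plain NumPy (random flips and 90-degree rotations), not the project's actual pipeline, which relies on Keras' ImageDataGenerator; the `augment` helper and the random 64 x 64 tile are hypothetical.

```python
import numpy as np


def augment(image, rng):
    """Return a randomly flipped and 90-degree-rotated copy of an H x W x C image."""
    if rng.random() < 0.5:
        image = np.flip(image, axis=1)  # horizontal flip
    if rng.random() < 0.5:
        image = np.flip(image, axis=0)  # vertical flip
    quarter_turns = int(rng.integers(0, 4))  # 0, 1, 2 or 3 quarter turns
    return np.rot90(image, k=quarter_turns, axes=(0, 1)).copy()


rng = np.random.default_rng(seed=0)
# Random pixels standing in for a real cytology image tile
tile = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
augmented = [augment(tile, rng) for _ in range(8)]
print(len(augmented), augmented[0].shape)
```

Flips and right-angle rotations are a reasonable fit for cytology because cells have no canonical orientation on the slide, so each variant is still a plausible image of the same diagnostic category.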

For specific metric results, please refer to the specific Python folder:

About the Data:

Data info

Libraries:

  • Pandas: Data manipulation and analysis.
  • NumPy: Arrays and mathematical functions; images are read and handled as arrays.
  • Os: File management.
  • Matplotlib: 2D data visualization.
  • Seaborn: Runs on top of Matplotlib, high-level statistical data visualization.
  • PIL: Python Imaging Library to manipulate images.
  • Warnings: Roses are red, violets are blue --> Warnings are annoying.
  • Shutil: File operations (copying, deleting...).
  • Random: To generate random subsets of data.
  • TensorFlow: Machine Learning for Computer Vision.
  • Keras: High-level neural networks API for Deep Learning, running on top of TensorFlow.
  • ImageDataGenerator: To generate random data augmentation (flips, zoom...).
  • Sklearn: Machine Learning metrics.
  • Confusion Matrix: To evaluate true and false positives and negatives.
  • Confusion Matrix Display: To easily display the matrix.
  • Classification Report: For a detailed breakdown of each metric (precision, recall, f1-score, support).
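
To make the link between the confusion matrix, false negatives and recall concrete, here is a small pure-Python sketch. The counts and class labels are invented for illustration and are not results from this project; the helper `false_negatives_and_recall` is hypothetical and only mirrors part of what Sklearn's classification report computes.

```python
def false_negatives_and_recall(matrix):
    """matrix[i][j] = number of samples of true class i predicted as class j."""
    stats = []
    for i, row in enumerate(matrix):
        total = sum(row)                         # all samples of true class i
        true_positives = row[i]                  # diagonal entry
        false_negatives = total - true_positives # class i samples predicted as something else
        recall = true_positives / total if total else 0.0
        stats.append((false_negatives, recall))
    return stats


# Invented counts for three placeholder classes
example = [
    [18, 1, 1],
    [2, 16, 2],
    [0, 1, 19],
]
results = false_negatives_and_recall(example)
for label, (fn, recall) in zip(["class A", "class B", "class C"], results):
    print(f"{label}: {fn} false negatives, recall {recall:.2f}")
```

In a diagnostic setting, a false negative is a missed diagnosis, so per-class recall is usually the first number worth checking.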

Bibliography:
