Developed a deep learning model using a customized ResNet-50 architecture to automate the identification and classification of white blood cell (WBC) subtypes in microscopic blood smear images. This model aimed to reduce diagnostic errors and workload in clinical pathology by accurately detecting eosinophils, lymphocytes, monocytes, and neutrophils. Leveraged advanced CNN techniques like skip connections and residual blocks. Evaluated performance using a publicly available blood cell image dataset.
A white blood cell (WBC), also called a leukocyte or white corpuscle, is a cellular component of the blood that lacks haemoglobin, has a nucleus, and is capable of motility. It defends the body against infection and disease by ingesting foreign materials and cellular debris, by destroying infectious agents and cancer cells, or by producing antibodies. WBCs are made in the bone marrow and are found in the blood and lymph tissue. As part of the body’s immune system, they help the body fight infection and other diseases. WBCs are observed through blood smear analysis, in which a pathologist examines a smear under a microscope. This analysis helps doctors diagnose patients, but as a manual process it is prone to error. With the advancement of deep learning, image classification and object detection techniques have become useful for automating this process and reducing human error in blood smear analysis.
In the last few years, deep learning has increasingly shown the potential to improve healthcare by aiding medical professionals with diagnostic processes and patient interactions. In particular, Convolutional Neural Networks (CNNs), a class of deep learning algorithms, have been successfully applied to classify images of biological features that are often used to monitor overall health and detect disorders in patients. This project uses a CNN-based ResNet architecture, one of the most powerful deep neural network designs. The classic variant is ResNet50, i.e., a ResNet with 50 layers; here we use a customized 24-layer ResNet derived from ResNet50. ResNet uses batch normalization, which normalizes the inputs to each layer to stabilize training and improve the performance of the network. ResNet also uses identity connections, which help protect the network from the vanishing gradient problem: a “shortcut” or “skip connection” adds a block's input to its output, allowing the gradient to be backpropagated directly to earlier layers. Deeper residual networks such as ResNet50 use a bottleneck residual block design, in which 1×1 convolutions reduce and then restore the channel dimension around a 3×3 convolution, to keep the network efficient as depth grows. In this project, the ResNet identifies the different types of WBCs in blood smear images.
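The effect of a skip connection on gradients can be illustrated with a scalar toy model. This is a minimal sketch and not the project's network: each "layer" is assumed to be a single multiplication by a weight w, so a plain layer has local derivative w, while a residual layer y = x + w*x has local derivative 1 + w. The function names and the weight value 0.1 are illustrative assumptions.

```python
def plain_chain_gradient(weights):
    """Gradient of the output w.r.t. the input for stacked layers y = w * x."""
    grad = 1.0
    for w in weights:
        grad *= w  # local derivative of a plain layer
    return grad


def residual_chain_gradient(weights):
    """Same stack, but each layer computes y = x + w * x (identity skip)."""
    grad = 1.0
    for w in weights:
        grad *= 1.0 + w  # the identity path contributes the constant 1
    return grad


# Assumed toy setting: 20 layers, each with a small weight of 0.1.
weights = [0.1] * 20
print(plain_chain_gradient(weights))     # about 1e-20: the gradient vanishes
print(residual_chain_gradient(weights))  # about 6.73: the gradient survives
```

With small weights, the plain stack shrinks the gradient geometrically, while the identity path keeps every local derivative at or above 1, which is the intuition behind the skip connection described above.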
The dataset used to train and evaluate the model is the Blood Cell Images dataset from Kaggle. It contains 5106 images in total: 2548 for training, 2487 for testing, and 71 for validation. White blood cells are an important part of the immune system, and different types perform different functions in the body. Overall, they help protect us against bacteria, viruses, and parasites. There are five main types of WBC: neutrophils, lymphocytes, eosinophils, basophils, and monocytes. This project detects only four of them: eosinophils, lymphocytes, monocytes, and neutrophils. By identifying the type of WBC, the model helps doctors diagnose patients more easily. The normal total WBC count is between 4,000 and 10,000 cells per microliter (mcL). On its own, a low WBC count does not cause symptoms, but it often leads to infection because not enough white cells are present to fight off invading pathogens. A high WBC count can be a symptom of an underlying disorder, such as autoimmune or inflammatory disease, bacterial or viral infection, leukaemia, Hodgkin's disease, or an allergic reaction.
The Blood Cell Images dataset contains 5106 smear images in 4 classes: eosinophil, lymphocyte, monocyte, and neutrophil. There are 2548 images for training, 2487 for testing, and 71 for validation. In the training set, 2497 images are eosinophil, 2483 are lymphocyte, 2478 are monocyte, and 2499 are neutrophil. In the test set, 623 images are eosinophil, 620 are lymphocyte, 620 are monocyte, and 624 are neutrophil. In the validation set, 13 images are eosinophil, 6 are lymphocyte, 4 are monocyte, and 48 are neutrophil.
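The per-class figures can be tallied with a few lines of Python. This is only an arithmetic sketch over the numbers quoted above, and the function names are illustrative. The test and validation counts sum to the stated 2487 and 71, while the training counts as listed sum to 9957 rather than 2548, so the snippet reports the computed sums instead of assuming either figure.

```python
# Per-class image counts exactly as listed in the text.
counts = {
    "train": {"eosinophil": 2497, "lymphocyte": 2483, "monocyte": 2478, "neutrophil": 2499},
    "test": {"eosinophil": 623, "lymphocyte": 620, "monocyte": 620, "neutrophil": 624},
    "validation": {"eosinophil": 13, "lymphocyte": 6, "monocyte": 4, "neutrophil": 48},
}


def split_totals(counts):
    """Total number of images in each split."""
    return {split: sum(per_class.values()) for split, per_class in counts.items()}


def class_shares(per_class):
    """Fraction of a split's images that belongs to each class."""
    total = sum(per_class.values())
    return {name: n / total for name, n in per_class.items()}


totals = split_totals(counts)
for split, per_class in counts.items():
    shares = class_shares(per_class)
    summary = ", ".join(f"{name} {share:.1%}" for name, share in shares.items())
    print(f"{split}: {totals[split]} images ({summary})")
```

The shares show that the training and test sets are close to balanced across the four classes, whereas the validation set is dominated by neutrophils (48 of 71 images), which is worth keeping in mind when reading validation accuracy.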
This is a deep learning project that aims to detect WBCs and identify their subtype. The idea of developing this application as my academic project came to me after reading some research papers, which suggested that the task can be done using a deeper CNN. While going through CNN papers, I found ResNet to be the most efficient among CNN architectures. As my contribution, I decided to customize ResNet. Because this project has a small number of classes, I chose a ResNet with fewer layers: the first 23 layers of ResNet50 followed by a fully connected layer. I based the design on ResNet50 because it performs better than other ResNet variants such as ResNet18 and is the most commonly used ResNet version. There are 4 WBC subtype classes: eosinophil, neutrophil, monocyte, and lymphocyte. Predicting the correct subtype is difficult because the cells are structurally similar. The best model achieved 90% test accuracy and 96% training accuracy. I used Google Colab and Jupyter Notebook to develop the application, which made the work easy and efficient; using a Graphics Processing Unit (GPU) accelerates training and yields results faster. The models were trained six to ten times with different numbers of epochs, and 20 epochs was finally chosen for the final training stage. Saving the model after each epoch makes it possible to choose the best model based on validation loss and validation accuracy. This is how I achieved the accuracy reported above.
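Choosing the best saved model from per-epoch checkpoints can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's training code: the `EpochResult` record, the `best_checkpoint` helper, the tie-breaking rule, and all metric values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EpochResult:
    epoch: int
    val_loss: float
    val_accuracy: float


def best_checkpoint(history):
    """Pick the epoch with the lowest validation loss.

    Ties on loss are broken in favour of higher validation accuracy
    (an assumed rule; other selection criteria are equally possible).
    """
    return min(history, key=lambda r: (r.val_loss, -r.val_accuracy))


# Hypothetical metrics for illustration only; not results from this project.
history = [
    EpochResult(epoch=1, val_loss=0.92, val_accuracy=0.61),
    EpochResult(epoch=2, val_loss=0.55, val_accuracy=0.80),
    EpochResult(epoch=3, val_loss=0.41, val_accuracy=0.87),
    EpochResult(epoch=4, val_loss=0.47, val_accuracy=0.86),
]

best = best_checkpoint(history)
print(f"keep checkpoint from epoch {best.epoch}")
```

Because a checkpoint is saved after every epoch, the selected epoch's weights can be reloaded even when later epochs start to overfit, as epoch 4 does in this made-up history.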