Deep learning for medicinal plant leaf disease classification with ResNet, EfficientNetV2-S, ConvNeXt-Tiny, and explainability using Grad-CAM & t-SNE.
This project focuses on medicinal plant leaf disease classification using deep learning on the AI-MedLeafX dataset. In addition to classification performance, the project also emphasizes model interpretability through Explainable AI (XAI) techniques such as Grad-CAM and t-SNE.

Objectives:
- Classify diseases on medicinal plant leaves from RGB images.
- Compare multiple CNN architectures on the same dataset.
- Analyze model decisions using XAI.
- Build a solid baseline for future real-world agricultural applications.

Highlights:
- ✅ Fine-tuning pretrained CNN models on AI-MedLeafX
- ✅ Comparison of ResNet50, EfficientNetV2-S, and ConvNeXt-Tiny
- ✅ Explainability with Grad-CAM and t-SNE
- ✅ Strong performance with 98.85% accuracy from EfficientNetV2-S
- ✅ Organized notebooks for training, evaluation, and XAI experiments
Early detection of plant diseases is important for reducing crop damage and supporting decision-making in smart agriculture. Unlike many previous works that focus on common agricultural crops, this project targets medicinal plants, which are less studied and have fewer public datasets available.
The task is to classify leaf images into disease/healthy categories using deep learning models and provide visual explanations for model predictions.
The project uses AI-MedLeafX, a medicinal plant disease dataset containing 4 plant species and 13 classes.

Plant species:
- Camphor
- HariTaki
- Neem
- Sojina

Leaf condition categories:
- Bacterial Spot
- Shot Hole
- Powdery Mildew
- Yellow Leaf
- Healthy Leaf

Input format:
- Image size: 224 × 224
- Data format: RGB images
- Preprocessing based on ImageNet normalization

The notebook uses the augmented image directory and splits the dataset as follows:
- Train: 70%
- Validation: 20%
- Test: 10%

Dataset statistics reported in the notebook:
- 65,178 images
- 13 classes
- Train: 45,624
- Validation: 13,035
- Test: 6,519
Note: The original report mentions the base AI-MedLeafX dataset at around 10,858 images. The notebook appears to use an augmented version, resulting in a much larger number of samples.
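The split sizes listed above are consistent with flooring the train and validation counts and giving the remainder to the test set. The following is a minimal sketch of such a split using only the standard library; the function name `split_indices`, the seed, and the plain random (non-stratified) shuffle are assumptions, not details taken from the notebook.

```python
import random


def split_indices(n_samples, train_frac=0.7, val_frac=0.2, seed=42):
    """Shuffle sample indices and cut them into train/val/test lists.

    Train and validation sizes are floored; the test split takes the remainder.
    The seed value is an arbitrary choice for reproducibility.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)

    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)

    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train_idx, val_idx, test_idx = split_indices(65_178)
print(len(train_idx), len(val_idx), len(test_idx))  # 45624 13035 6519
```

If class balance matters, a stratified split (for example, splitting each class separately) may be a better fit than a single global shuffle.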
Three representative CNN architectures were studied:
- ResNet50: a strong baseline CNN whose residual connections help stabilize deep training.
- EfficientNetV2-S: a more compute-efficient architecture balancing model size and classification performance.
- ConvNeXt-Tiny: a modern ConvNet inspired by design ideas from Vision Transformers, and one of the two strongest performers in this project.
The overall pipeline follows these main steps:
Input Images
↓
Preprocessing & Augmentation
↓
Train / Validation / Test Split
↓
Fine-tune Pretrained CNN Models
↓
Evaluation Metrics
↓
XAI Analysis (Grad-CAM, t-SNE)
Preprocessing and augmentation are implemented with `torchvision.transforms`:
- Resize to `224x224`
- Random horizontal flip
- Random rotation
- Convert to tensor
- Normalize with ImageNet mean/std
Training configuration, based on the report:
- Pretrained ImageNet weights
- AdamW optimizer
- Cross-Entropy Loss
- Early stopping
- ReduceLROnPlateau
- Two-stage training strategy for stable convergence
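The configuration above can be sketched as a small training helper. The report does not describe what the two stages are, so this example assumes a common pattern: train only the new classifier head first, then unfreeze the whole network at a lower learning rate. All names (`EarlyStopping`, `set_trainable`, `run_stage`) and all hyperparameter values are illustrative assumptions.

```python
import torch
from torch import nn


class EarlyStopping:
    """Signal a stop once validation loss has not improved for `patience` epochs."""

    def __init__(self, patience=5, min_delta=0.0):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience


def set_trainable(module, flag):
    """Freeze (flag=False) or unfreeze (flag=True) every parameter of a module."""
    for param in module.parameters():
        param.requires_grad = flag


def run_stage(model, train_batches, val_batches, lr, max_epochs, patience=5):
    """Train one stage; batches are re-iterable collections of (inputs, labels)."""
    trainable = [p for p in model.parameters() if p.requires_grad]
    optimizer = torch.optim.AdamW(trainable, lr=lr, weight_decay=1e-2)
    scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, mode="min", factor=0.5, patience=2
    )
    criterion = nn.CrossEntropyLoss()
    stopper = EarlyStopping(patience=patience)
    val_losses = []

    for _ in range(max_epochs):
        model.train()
        for inputs, labels in train_batches:
            optimizer.zero_grad()
            loss = criterion(model(inputs), labels)
            loss.backward()
            optimizer.step()

        model.eval()
        with torch.no_grad():
            total = sum(criterion(model(x), y).item() for x, y in val_batches)
        val_loss = total / len(val_batches)
        val_losses.append(val_loss)

        scheduler.step(val_loss)      # lower the LR when validation loss plateaus
        if stopper.step(val_loss):    # stop when it has stalled for too long
            break
    return val_losses
```

Under the stated assumption, stage one would call `set_trainable(model, False)` followed by `set_trainable(head, True)` and then `run_stage(...)` with a larger learning rate; stage two would call `set_trainable(model, True)` and `run_stage(...)` again with a smaller one.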

Evaluation metrics:
- Accuracy
- Precision
- Recall
- F1-score
- ROC-AUC
- Confusion Matrix
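To make the relationship between the confusion matrix and the other metrics concrete, here is a small pure-Python sketch. It assumes macro averaging (the unweighted mean over classes); the source does not state which averaging was used, and ROC-AUC is left out because it needs predicted probabilities rather than hard labels.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true_label, pred_label in zip(y_true, y_pred):
        matrix[true_label][pred_label] += 1
    return matrix


def macro_metrics(matrix):
    """Accuracy plus macro-averaged precision, recall, and F1 from a confusion matrix."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(n))

    precisions, recalls, f1_scores = [], [], []
    for i in range(n):
        tp = matrix[i][i]
        predicted = sum(matrix[r][i] for r in range(n))  # column sum
        actual = sum(matrix[i])                          # row sum
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1_scores.append(f1)

    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1_scores) / n,
    }


# Toy example with three classes.
cm = confusion_matrix([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0], num_classes=3)
print(cm)  # [[1, 1, 0], [0, 2, 0], [1, 0, 1]]
print(macro_metrics(cm))
```

In practice, `sklearn.metrics.classification_report` and `sklearn.metrics.confusion_matrix` provide the same quantities with more options.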
| Model | Accuracy (%) | Precision | Recall | F1-score | Params |
|---|---|---|---|---|---|
| ResNet50 | 95.03 | 0.95 | 0.95 | 0.95 | ~24M |
| EfficientNetV2-S | 98.85 | 0.99 | 0.99 | 0.99 | ~21M |
| ConvNeXt-Tiny | 98.16 | 0.98 | 0.98 | 0.98 | ~28M |
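- EfficientNetV2-S reached the highest accuracy in the table above (98.85%) with the fewest parameters (~21M), giving a strong balance between accuracy and efficiency.
- ConvNeXt-Tiny followed closely at 98.16% accuracy with ~28M parameters.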
- ResNet50 remained a reliable baseline for comparison.
This project does not stop at prediction performance. It also investigates why the model makes a prediction.
Grad-CAM helps identify the image regions that most influence the final prediction.
Findings from the report:
- Good localization on diseases such as Bacterial Spot and Shot Hole.
- Distributed attention over the leaf surface for Healthy Leaf samples.
- More difficulty with Powdery Mildew, where symptoms are diffuse and spread across the leaf.
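For readers unfamiliar with how Grad-CAM produces these heatmaps, the sketch below implements the core computation by hand with PyTorch hooks: the gradients of the target class score are averaged per channel and used to weight the activation maps of a chosen convolutional layer. This is an illustrative re-implementation, not the `grad-cam` package the project installs, and the function name `grad_cam` and the choice of target layer are assumptions.

```python
import torch
import torch.nn.functional as F


def grad_cam(model, target_layer, image, class_idx=None):
    """Compute a Grad-CAM heatmap for one image tensor of shape (1, C, H, W).

    Returns (heatmap, class_idx), where heatmap has shape (H, W) and values in [0, 1].
    """
    activations, gradients = [], []

    def forward_hook(_module, _inputs, output):
        # Keep the feature maps and capture their gradient during backward.
        activations.append(output)
        output.register_hook(lambda grad: gradients.append(grad))

    handle = target_layer.register_forward_hook(forward_hook)
    try:
        model.eval()
        # Requiring grad on the input guarantees a gradient path even if
        # the model's parameters are frozen.
        image = image.clone().requires_grad_(True)
        logits = model(image)
        if class_idx is None:
            class_idx = int(logits.argmax(dim=1))
        model.zero_grad()
        logits[0, class_idx].backward()
    finally:
        handle.remove()

    acts = activations[0].detach()                    # (1, K, h, w)
    grads = gradients[0]                              # (1, K, h, w)
    weights = grads.mean(dim=(2, 3), keepdim=True)    # channel importance
    cam = F.relu((weights * acts).sum(dim=1, keepdim=True))
    cam = F.interpolate(cam, size=image.shape[-2:], mode="bilinear",
                        align_corners=False)[0, 0]
    cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)
    return cam, class_idx
```

For a torchvision ResNet50, a common choice of target layer is the last block of `layer4`; the resulting heatmap can then be overlaid on the input image to inspect which leaf regions drive the prediction.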
t-SNE is used to visualize learned feature embeddings in 2D.
Observations:
- ResNet50 forms reasonably separated clusters but still shows overlap.
- EfficientNetV2-S produces the clearest feature separation.
- ConvNeXt-Tiny forms denser clusters while still reaching high classification accuracy (98.16%).
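A projection like the ones described above can be produced with scikit-learn's `TSNE`. The sketch below runs on synthetic features so that it is self-contained; in the project, the input would presumably be embeddings taken from a late layer of each model, which is an assumption, as are the helper name `embed_2d` and the perplexity values.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_2d(features, perplexity=30.0, seed=0):
    """Project an (n_samples, n_features) array to 2D with t-SNE.

    Perplexity must be smaller than the number of samples.
    """
    features = np.asarray(features, dtype=np.float32)
    tsne = TSNE(n_components=2, perplexity=perplexity, init="pca",
                random_state=seed)
    return tsne.fit_transform(features)


# Synthetic stand-in for model embeddings: three well-separated groups.
rng = np.random.default_rng(0)
fake_features = np.concatenate(
    [rng.normal(loc=center, scale=0.5, size=(30, 16)) for center in (0.0, 4.0, 8.0)]
)
fake_labels = np.repeat([0, 1, 2], 30)

points = embed_2d(fake_features, perplexity=10.0)
print(points.shape)  # (90, 2)
```

The 2D points can then be drawn with `matplotlib.pyplot.scatter`, colored by `fake_labels` (or by the true class labels), to compare cluster separation across models.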
Recommended environment: Python 3.10+
Install dependencies:

    pip install torch torchvision scikit-learn matplotlib seaborn pillow opencv-python tqdm jupyter lime grad-cam kaggle

Place your Kaggle API file at:

    ~/.kaggle/kaggle.json

Set permissions:

    chmod 600 ~/.kaggle/kaggle.json

Launch Jupyter:

    jupyter notebook

Then open `medleaf-disease-detection (2).ipynb`.

Inside the notebook, a command similar to this is used:

    kaggle datasets download -d mrlocbap/ai-medleafx

The notebook extracts the archive into:

    content/ai-medleafx

Main notebook tasks include:
- loading images by class directory
- creating train/val/test splits
- fine-tuning pretrained models
- saving checkpoints
- evaluating classification results
- visualizing XAI outputs
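The first task in the list, loading images by class directory, follows the usual one-folder-per-class layout. Here is a standard-library sketch of that convention; the function name and the set of accepted extensions are assumptions, and `torchvision.datasets.ImageFolder` offers the same behavior ready-made.

```python
from pathlib import Path

# Assumed set of image file extensions.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def list_images_by_class(root):
    """Collect (path, label) pairs from a root folder with one subfolder per class.

    Class indices are assigned in alphabetical order of the folder names.
    """
    root = Path(root)
    class_names = sorted(d.name for d in root.iterdir() if d.is_dir())
    class_to_idx = {name: idx for idx, name in enumerate(class_names)}

    samples = []
    for name in class_names:
        for path in sorted((root / name).rglob("*")):
            if path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, class_to_idx[name]))
    return samples, class_names
```

The returned `samples` list can be shuffled and split into train, validation, and test subsets before being wrapped in a dataset class.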
The test.ipynb notebook includes helper functions for:
- single-image prediction
- Grad-CAM visualization
- LIME explanation
- loading trained `.pth` weights for inference
Typical inference flow:
- Load model checkpoint
- Apply the same test transform
- Run forward pass
- Return predicted class and confidence score
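The inference flow above might look like the following sketch. It assumes the `.pth` file stores a plain `state_dict` (the checkpoint format is not described in the source) and that the input tensor has already gone through the test transform; `load_checkpoint` and `predict` are hypothetical helper names rather than the functions in `test.ipynb`.

```python
import torch


def load_checkpoint(model, path, device="cpu"):
    """Load a saved state_dict into an already constructed model."""
    state = torch.load(path, map_location=device)
    model.load_state_dict(state)
    return model.to(device).eval()


@torch.no_grad()
def predict(model, image_tensor, class_names):
    """Return (class name, confidence) for one preprocessed (C, H, W) tensor."""
    model.eval()
    device = next(model.parameters()).device
    logits = model(image_tensor.unsqueeze(0).to(device))  # add batch dimension
    probs = torch.softmax(logits, dim=1)[0]
    confidence, index = probs.max(dim=0)
    return class_names[int(index)], float(confidence)
```

The confidence here is the softmax probability of the top class, which is convenient for display but is not a calibrated estimate of correctness.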
This can be extended into:
- a simple desktop app
- a Streamlit/Gradio web app
- a mobile app
- a field-support diagnostic tool
Despite strong results, several limitations remain:
- The training data was mostly collected under controlled conditions and does not fully reflect field environments.
- Diffuse diseases such as powdery mildew remain harder to localize and explain.
- The system has not yet been deployed as a real-time production application.
Possible next steps include:
- training on real-world field images for better generalization;
- experimenting with ViT, Swin Transformer, or hybrid CNN-Transformer models;
- using additional XAI tools such as SHAP and deeper LIME analysis;
- building a more user-friendly diagnosis interface;
- deploying the model on mobile devices or agricultural drones.
- Supervisor: Dr. Lê Thị Vĩnh Thanh
- Course: Computer Vision and Applications (Thị giác máy tính và ứng dụng)
- Institution: Industrial University of Ho Chi Minh City (Trường Đại học Công nghiệp TP.HCM)
This README was consolidated from:
- `Nhom_01_NhanDienBenhTrenLaCay.docx`
- `medleaf-disease-detection (2).ipynb`
- `test.ipynb`

Suggested additions that would make this repository more complete:
- a clean architecture diagram;
- confusion matrix images with labels;
- Grad-CAM comparison figure per model;
- a `requirements.txt` file;
- a `demo.ipynb` or `app.py` for quick demonstration.








