LeafDisease-AI: Deployable Deep Learning for Cross-Domain Plant Leaf Disease Detection via Ensemble Learning, Knowledge Distillation, and Quantization
LeafDisease-AI is the first comprehensive framework for cross-domain tomato leaf disease detection, bridging the gap between laboratory research and real-world agricultural deployment. This repository implements a unified optimization approach integrating ensemble learning, knowledge distillation, and quantization for edge-compatible disease detection.
- First Open Cross-Domain Benchmark: Unifying PlantVillage and TomatoVillage datasets into 15 harmonized disease classes
- Unified Optimization Framework: Integrating ensemble learning, knowledge distillation, and quantization
- Edge-Compatible Deployment: Achieving 671× compression (1.46 MB) with 97.46% accuracy
- Real-World Validation: Cross-domain evaluation on field condition datasets
- Explainable AI: Grad-CAM++ and LIME-based interpretability analysis
| Model | Accuracy | F1-Score | Parameters | Size | Inference Time |
|---|---|---|---|---|---|
| Ensemble (Teacher) | 99.15% | 97.07% | 163M | 652 MB | 12.6ms |
| ShuffleNetV2 (Student) | 98.53% | 96.12% | 1M | 4.2 MB | 0.29ms |
| Quantized INT8 | 97.46% | 95.36% | 1M | 1.46 MB | 0.29ms |
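To make the FP32-to-INT8 step in the table more concrete, here is a minimal sketch of per-tensor affine 8-bit quantization in plain Python. It is not the code in `src/quantization/mobile_quantization_pipeline.py`; the function names `quantize_int8` and `dequantize` and the sample weights are illustrative assumptions.

```python
def quantize_int8(values):
    """Map floats to int8 codes using one scale and zero point for the tensor."""
    lo, hi = min(values), max(values)
    lo, hi = min(lo, 0.0), max(hi, 0.0)      # keep 0.0 exactly representable
    scale = (hi - lo) / 255.0 or 1.0         # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)    # int8 code that represents 0.0
    codes = [max(-128, min(127, round(v / scale) + zero_point)) for v in values]
    return codes, scale, zero_point


def dequantize(codes, scale, zero_point):
    """Recover approximate float values from int8 codes."""
    return [(c - zero_point) * scale for c in codes]


weights = [-0.62, -0.1, 0.0, 0.27, 0.91]     # made-up FP32 weights
codes, scale, zero_point = quantize_int8(weights)
restored = dequantize(codes, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(codes, max_error)
```

Each code occupies one byte instead of four, and the rounding error per weight stays within one quantization step (`scale`), which is consistent with the small accuracy drop between the student and the quantized model.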
LeafDisease-AI/
├── src/ # Source code
│   ├── configurations/ # Configuration files
│   ├── data_augmentation/ # Data augmentation strategies
│   ├── data_balancing/ # ADASYN-based balancing
│   ├── datasets/ # Dataset loaders and utilities
│   ├── distillation/ # Knowledge distillation
│   ├── ensemble/ # Ensemble learning
│   ├── evaluation/ # Model evaluation scripts
│   ├── explainable_ai/ # Interpretability analysis
│   ├── hyperparameter_tuning/ # Hyperparameter optimization
│   ├── models/ # Model architectures
│   ├── quantization/ # Model quantization
│   └── utils/ # Utility functions
├── data/ # Datasets (available under data/)
│   ├── combined/ # Unified dataset
│   ├── plantvillage/ # PlantVillage dataset (lab conditions)
│   └── tomatovillage/ # TomatoVillage dataset (field conditions)
├── docs/ # Documentation
├── scripts/ # Training and evaluation scripts
├── outputs/ # Model outputs and results
└── checkpoints/ # Trained model checkpoints
# Python 3.8+ required
pip install -r requirements.txt
pip install -r requirements_balancing.txt
Download Datasets:
- PlantVillage: https://www.kaggle.com/datasets/abdallahalidev/plantvillage-dataset
- TomatoVillage: https://github.com/mamta-joshi-gehlot/Tomato-Village
Extract the datasets to data/plantvillage/ and data/tomatovillage/, respectively.
- The datasets are also available under the data/ directory.
python scripts/train.py --model densenet121 --dataset combined --epochs 100
python src/ensemble/train_best_ensemble.py
python src/distillation/train_best_kd.py
python src/quantization/mobile_quantization_pipeline.py
# Cross-domain evaluation
python src/evaluation/evaluate_kd_on_test_datasets.py
# Interpretability analysis
python src/explainable_ai/interpretability_analysis.py
Our framework addresses three fundamental challenges:
- Cross-Domain Generalization: Unified PlantVillage (lab conditions) and TomatoVillage (field conditions) datasets
- Class Imbalance: ADASYN-based balancing for a 75:1 imbalance ratio
- Computational Constraints: Knowledge distillation and quantization for edge deployment
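As an illustration of the class-imbalance point, the sketch below shows only the allocation step of ADASYN: deciding how many synthetic samples to generate around each minority sample, with more going to samples that have many majority-class neighbors. It follows the standard ADASYN formulation, not necessarily the code in `src/data_balancing/`; the function name `adasyn_allocation`, the class sizes, and the neighbor counts are hypothetical.

```python
def adasyn_allocation(majority_count, minority_count, majority_neighbors, k, beta=1.0):
    """Return the number of synthetic samples to create near each minority sample.

    majority_neighbors[i] is the number of majority-class points among the k
    nearest neighbors of minority sample i (assumed to be precomputed).
    """
    total = (majority_count - minority_count) * beta   # samples needed overall
    ratios = [n / k for n in majority_neighbors]       # local difficulty per sample
    norm = sum(ratios)
    if norm == 0:                                      # no hard samples: spread evenly
        return [round(total / len(ratios))] * len(ratios)
    return [round(total * r / norm) for r in ratios]


# Hypothetical class sizes chosen to give a 75:1 imbalance ratio.
majority, minority = 300, 4
imbalance_ratio = majority / minority

# Toy neighbor counts for the four minority samples, with k = 5.
per_sample = adasyn_allocation(majority, minority, [0, 1, 4, 5], k=5)
print(imbalance_ratio, per_sample)
```

A minority sample surrounded only by its own class receives no synthetic neighbors, while one surrounded by majority samples receives the most. The actual synthesis step (interpolating toward minority neighbors) is omitted here.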
- Data Preprocessing: Strategic augmentation and ADASYN balancing
- Hyperparameter Tuning: Systematic optimization across 24 architectures
- Ensemble Learning: Four-model soft voting ensemble
- Knowledge Distillation: Teacher-student framework with temperature scaling
- Quantization: INT8 quantization for mobile deployment
- Grad-CAM++: Attention visualization for model interpretability
- LIME: Local interpretable model-agnostic explanations
- Biological Validation: Alignment with plant pathology principles
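The ensemble and distillation stages listed above can be sketched together: soft voting averages the class probabilities of the teacher models, and the student is trained to match those softened probabilities. This is a minimal plain-Python sketch of a common temperature-scaled distillation loss, assuming a weighted sum of a KL term and a cross-entropy term. The temperature `T = 4.0`, the weight `alpha = 0.7`, the logits, and all function names are illustrative assumptions, not values taken from this repository.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature gives softer probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)                        # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_vote(all_logits, temperature=1.0):
    """Average the class probabilities of several models (soft voting)."""
    probs = [softmax(logits, temperature) for logits in all_logits]
    return [sum(column) / len(probs) for column in zip(*probs)]


def distillation_loss(student_logits, teacher_probs, label, temperature=4.0, alpha=0.7):
    """alpha * T^2 * KL(teacher || softened student) + (1 - alpha) * cross-entropy."""
    student_soft = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(teacher_probs, student_soft) if t > 0)
    ce = -math.log(softmax(student_logits)[label])
    return alpha * temperature ** 2 * kl + (1 - alpha) * ce


# Four hypothetical teacher outputs for one image over three classes.
teacher_logits = [[4.0, 1.0, 0.5], [3.5, 1.5, 0.2], [4.2, 0.8, 0.9], [3.8, 1.2, 0.4]]
T = 4.0
teacher_probs = soft_vote(teacher_logits, temperature=T)
loss = distillation_loss([2.0, 1.0, 0.1], teacher_probs, label=0, temperature=T)
print(teacher_probs, loss)
```

The `T ** 2` factor keeps the gradient magnitude of the soft-target term comparable across temperatures, which is the usual reason it appears in this formulation.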
| Dataset | Accuracy | F1-Score | Precision | Recall |
|---|---|---|---|---|
| Combined Test | 99.15% | 97.07% | 97.23% | 96.91% |
| PlantVillage | 98.87% | 96.45% | 96.78% | 96.12% |
| TomatoVillage | 95.70% | 93.62% | 94.01% | 93.23% |
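As a quick consistency check on the table, the F1-score can be recomputed as the harmonic mean of the reported precision and recall. The snippet below is a small sketch using only the numbers from the table; how the repository averages these metrics across the 15 classes is not stated here, so treat the match as a sanity check and not as a description of the evaluation code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (inputs in percent)."""
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
rows = {
    "Combined Test": (97.23, 96.91, 97.07),
    "PlantVillage": (96.78, 96.12, 96.45),
    "TomatoVillage": (94.01, 93.23, 93.62),
}

for name, (precision, recall, reported) in rows.items():
    computed = f1_score(precision, recall)
    print(f"{name}: computed {computed:.2f}, reported {reported:.2f}")
```

For all three datasets the recomputed value agrees with the reported F1-score to two decimal places.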
| Model | Parameters | FLOPs | Memory | Inference Time |
|---|---|---|---|---|
| DenseNet-121 | 7.98M | 2.87G | 32.4 MB | 3.2ms |
| ResNet-101 | 44.55M | 7.83G | 179.8 MB | 4.1ms |
| ShuffleNetV2 | 1.26M | 146M | 5.1 MB | 0.29ms |
| Quantized INT8 | 1.26M | 146M | 1.46 MB | 0.29ms |
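The Memory column can be related to the parameter counts with a rough estimate: number of parameters times bytes per parameter. The sketch below assumes 4 bytes per FP32 weight, 1 byte per INT8 weight, and MB meaning 10^6 bytes. These are assumptions about how the table was produced, and the estimate covers raw weights only.

```python
def weight_megabytes(num_params, bytes_per_param):
    """Size of the raw weights alone, in MB (10**6 bytes)."""
    return num_params * bytes_per_param / 1e6


# (parameter count, assumed bytes per parameter, reported memory in MB)
models = {
    "DenseNet-121": (7.98e6, 4, 32.4),
    "ResNet-101": (44.55e6, 4, 179.8),
    "ShuffleNetV2": (1.26e6, 4, 5.1),
    "Quantized INT8": (1.26e6, 1, 1.46),
}

for name, (params, width, reported) in models.items():
    estimate = weight_megabytes(params, width)
    print(f"{name}: weights alone {estimate:.2f} MB, reported {reported} MB")
```

Under these assumptions every estimate falls slightly below the reported figure, so the weights account for most of the reported memory in each row.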
The framework detects 15 tomato leaf disease classes:
- Healthy
- Bacterial Spot
- Early Blight
- Late Blight
- Leaf Mold
- Septoria Leaf Spot
- Spider Mites
- Target Spot
- Yellow Leaf Curl Virus
- Mosaic Virus
- Powdery Mildew
- Nutrient Deficiency
- Pest Damage
- Environmental Stress
- Other Diseases
Detailed documentation is available in the docs/ directory.
We welcome contributions! Please see our Contributing Guidelines:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
If you use this work in your research, please cite: [citation coming soon]
Mohammad Junayed Hasan
- Email: junayedhasan100@gmail.com
- GitHub: @junayed-hasan
- LinkedIn: LinkedIn Profile
This project is licensed under the MIT License - see the LICENSE file for details.
MIT License
Copyright (c) 2024 Mohammad Junayed Hasan
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
- PlantVillage dataset contributors
- TomatoVillage dataset contributors
- PyTorch and timm library developers
- Agricultural research community
