Comparative Analysis of Ensemble Distillation and Mutual Learning: A Unified Framework for Uncertainty-Calibrated Vision Systems
This repository contains the complete implementation of "Comparative Analysis of Ensemble Distillation and Mutual Learning: A Unified Framework for Uncertainty-Calibrated Vision Systems", a comprehensive research study investigating advanced knowledge transfer techniques for computer vision tasks.
Our research introduces CALM (Comparative Analysis of Learning Methods), a unified framework that systematically compares and evaluates multiple knowledge transfer paradigms:
- Ensemble Distillation: Six teacher models transfer knowledge to a lightweight student
- Mutual Learning: Collaborative training where models learn from each other simultaneously
- Meta-Student Learning: Advanced knowledge fusion using adaptive protocols
- Uncertainty Calibration: Focus on both accuracy and prediction reliability
- Six Teacher Models: ViT-B16, EfficientNet-B0, Inception-V3, MobileNet-V3, ResNet-50, DenseNet-121
- Student Model: Scaled EfficientNet-B0
- Advanced Features:
  - Calibration-aware weighting of teacher contributions (see the loss sketch below)
  - Heterogeneous Feature Integration (HFI)
  - Dynamic temperature scaling
  - Adaptive teacher gating
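A minimal sketch of how these pieces can fit together: temperature-softened teacher outputs are fused with calibration-aware weights and blended with the hard-label loss via `alpha_weight`. The inverse-ECE weighting rule here is an illustrative assumption, not necessarily the repository's exact formulation:

```python
import torch
import torch.nn.functional as F

def ensemble_distillation_loss(student_logits, teacher_logits_list, labels,
                               teacher_eces, T=4.0, alpha=0.7):
    # Calibration-aware weights: better-calibrated teachers (lower ECE)
    # contribute more. The inverse-ECE rule is an illustrative choice.
    ece = torch.tensor(teacher_eces)
    weights = 1.0 / (ece + 1e-6)
    weights = weights / weights.sum()

    # Fuse temperature-softened teacher distributions into one soft target.
    soft_target = sum(w * F.softmax(t / T, dim=1)
                      for w, t in zip(weights, teacher_logits_list))

    # Temperature-scaled KL (scaled by T^2, as in Hinton et al., 2015),
    # blended with the hard-label cross-entropy via alpha.
    kd = F.kl_div(F.log_softmax(student_logits / T, dim=1),
                  soft_target, reduction="batchmean") * (T * T)
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1.0 - alpha) * ce
```

The defaults mirror the configuration below (`soft_target_temp = 4.0`, `alpha_weight = 0.7`).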
- Collaborative Training: All models learn simultaneously
- Knowledge Exchange: KL divergence-based peer learning (sketched below)
- Calibration Awareness: Temperature-scaled probability distributions
- Curriculum Learning: Gradual increase in mutual learning weights
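A minimal sketch of a deep-mutual-learning-style peer loss with the curriculum ramp, assuming a linear warm-up of the mutual weight (the repository's exact schedule may differ):

```python
import torch.nn.functional as F

def mutual_learning_loss(logits_a, logits_b, labels, epoch,
                         mutual_weight=0.5, warmup_epochs=5, T=1.0):
    # Curriculum: linearly ramp the mutual-learning weight over warm-up.
    w = mutual_weight * min(1.0, (epoch + 1) / warmup_epochs)

    # Supervised term for model A.
    ce = F.cross_entropy(logits_a, labels)

    # Peer term: pull A's temperature-scaled distribution toward B's.
    # B is detached so gradients flow only into A on this step.
    kl = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                  F.softmax(logits_b.detach() / T, dim=1),
                  reduction="batchmean") * (T * T)
    return ce + w * kl
```

Each peer computes the symmetric loss on its own logits, so all models learn simultaneously; `mutual_weight = 0.5` and `warmup_epochs = 5` match the configuration below.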
- Stage 1: Train baseline models (Sb), distilled student (Sd), and mutual student (Sm)
- Stage 2: Meta-student (EfficientNet-B1) learns from fused knowledge
  - AKTP (Adaptive Knowledge Transfer Protocol): dynamic CE/KD loss weighting (see the sketch below)
  - ACP (Adaptive Curriculum Protocol): progressive calibration loss integration
- Stage 3: Cross-dataset evaluation (CIFAR-10 → CIFAR-100)
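One plausible reading of AKTP's dynamic weighting, built from the `entropy_weight` and `disagreement_weight` hyperparameters listed below; the actual protocol may combine these signals differently:

```python
import math
import torch

def aktp_kd_weight(fused_teacher_probs, student_probs,
                   entropy_weight=0.3, disagreement_weight=0.7):
    # Normalized entropy of the fused teacher distribution, in [0, 1]:
    # low entropy = confident teachers = trust the KD signal more.
    num_classes = fused_teacher_probs.size(1)
    ent = -(fused_teacher_probs
            * fused_teacher_probs.clamp_min(1e-12).log()).sum(dim=1)
    ent = ent / math.log(num_classes)

    # Total-variation-style student/teacher disagreement, in [0, 1]:
    # high disagreement = the student still has something to learn.
    dis = 0.5 * (student_probs - fused_teacher_probs).abs().sum(dim=1)

    # Blend the two signals into a per-batch KD weight.
    w = entropy_weight * (1.0 - ent) + disagreement_weight * dis
    return w.clamp(0.0, 1.0).mean()
```

In this sketch the CE weight would be `1 - kd_weight`, keeping the blend normalized.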
- Expected Calibration Error (ECE): measures prediction reliability (computed as sketched below)
- Calibration Loss: MSE-based calibration training
- Temperature Scaling: Post-hoc calibration refinement
- Reliability Diagrams: Visual calibration assessment
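ECE bins predictions by confidence and averages the |accuracy − confidence| gap per bin, weighted by bin occupancy. A standard implementation (the bin count of 15 is an assumption):

```python
import torch

def expected_calibration_error(probs, labels, n_bins=15):
    # Confidence = max predicted probability; accuracy = top-1 correctness.
    confidences, predictions = probs.max(dim=1)
    accuracies = predictions.eq(labels).float()

    ece = 0.0
    edges = torch.linspace(0.0, 1.0, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        prop = in_bin.float().mean().item()
        if prop > 0:
            # |bin accuracy - bin confidence|, weighted by bin occupancy.
            gap = (accuracies[in_bin].mean()
                   - confidences[in_bin].mean()).abs().item()
            ece += prop * gap
    return ece
```

Post-hoc temperature scaling then fits a single scalar T on held-out data so that `softmax(logits / T)` minimizes NLL, refining calibration without changing accuracy.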
- Target Hardware: RTX 3060 Laptop GPU (6GB VRAM) + Ryzen 7 6800H
- Optimizations (see the training-loop sketch below):
  - Automatic Mixed Precision (AMP)
  - Gradient accumulation
  - Memory-efficient attention
  - Dynamic GPU cache clearing
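A self-contained sketch of how AMP and gradient accumulation combine in PyTorch; the model and loader here are stand-ins, not the repo's trainer. Dividing the loss by the accumulation steps keeps the effective gradient equal to that of one large batch:

```python
import torch
from torch import nn
from torch.cuda.amp import GradScaler, autocast

# Stand-in model/data so the sketch runs on its own; the repo's trainer
# uses the real student/teacher models and CIFAR loaders instead.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()
scaler = GradScaler()
accum_steps = 8  # gradient_accumulation_steps from the config below
loader = [(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
          for _ in range(16)]

optimizer.zero_grad(set_to_none=True)
for step, (images, labels) in enumerate(loader):
    images, labels = images.cuda(), labels.cuda()
    with autocast():  # mixed-precision forward pass
        loss = criterion(model(images), labels) / accum_steps
    scaler.scale(loss).backward()  # grads accumulate across micro-batches

    if (step + 1) % accum_steps == 0:
        scaler.step(optimizer)   # unscales grads, then steps
        scaler.update()
        optimizer.zero_grad(set_to_none=True)
        torch.cuda.empty_cache()  # dynamic GPU cache clearing
```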
- Memory Optimization: Supports training on 6GB VRAM
- Distributed Training: Multi-GPU support
- Automatic Hyperparameter Tuning: dynamic batch-size finding (see the sketch below)
- Comprehensive Logging: TensorBoard integration
- Robust Checkpointing: Resume training from interruptions
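One common way to implement a dynamic batch-size finder is to double the batch until CUDA raises an out-of-memory error, then keep the last size that fit. This is an illustrative implementation, not necessarily the repository's:

```python
import torch

def find_max_batch_size(model, input_shape=(3, 32, 32),
                        start=8, limit=1024):
    """Double the batch until CUDA runs out of memory, then back off."""
    model = model.cuda().train()
    best, batch = start, start
    while batch <= limit:
        try:
            x = torch.randn(batch, *input_shape, device="cuda")
            model(x).sum().backward()  # include backward to stress memory
            best, batch = batch, batch * 2
        except torch.cuda.OutOfMemoryError:
            break
        finally:
            model.zero_grad(set_to_none=True)
            torch.cuda.empty_cache()
    return best
```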
| Method | Accuracy | F1 Score | ECE ↓ | Parameters |
|---|---|---|---|---|
| Baseline | 93.24% | 0.9318 | 0.0864 | 4.0M |
| Ensemble Distillation | 95.47% | 0.9545 | 0.0234 | 4.0M |
| Mutual Learning | 94.83% | 0.9481 | 0.0198 | 4.0M |
| Meta-Student (AKTP) | 95.12% | 0.9509 | 0.0211 | 6.2M |
| Teacher Model | Accuracy | Parameters | Computational Cost |
|---|---|---|---|
| DenseNet-121 | 95.07% | 7.0M | High |
| EfficientNet-B0 | 94.94% | 5.3M | Medium |
| MobileNet-V3 | 94.98% | 5.5M | Low |
| ResNet-50 | 94.08% | 25.6M | High |
| ViT-B16 | 93.89% | 86.6M | Very High |
| Inception-V3 | 83.17% | 23.8M | High |
```
├── Models/                       # Trained model checkpoints and exports
│   ├── Baseline/                 # Standard supervised learning models
│   ├── DenseNet121/              # DenseNet-121 teacher models
│   ├── EfficientNetB0/           # EfficientNet-B0 teacher models
│   ├── EfficientNetB1/           # EfficientNet-B1 models
│   ├── EnsembleDistillation/     # Ensemble distillation models
│   ├── InceptionV3/              # Inception-V3 teacher models
│   ├── MetaStudent_AKTP/         # Meta-student with AKTP/ACP
│   ├── MobileNetV3/              # MobileNet-V3 teacher models
│   ├── MutualLearning/           # Mutual learning models
│   ├── ResNet50/                 # ResNet-50 teacher models
│   └── ViT/                      # Vision Transformer models
├── Scripts/                      # Training and evaluation scripts
│   ├── Baseline/                 # Standard training scripts
│   ├── EnsembleDistillation/     # Ensemble distillation implementation
│   ├── MetaStudent/              # Meta-student training (AKTP/ACP)
│   ├── Mutual Learning/          # Mutual learning implementation
│   └── [Individual Model Scripts] # Teacher model training
├── Results/                      # Experimental results and analysis
│   ├── Analysis/                 # Comparative analysis notebooks
│   └── [Method Results]/         # Results per method
└── temp/                         # Temporary files and logs
```
```python
# Ensemble Distillation
batch_size = 64
gradient_accumulation_steps = 8
lr = 1e-4
soft_target_temp = 4.0
alpha_weight = 0.7  # KD weight

# Mutual Learning
mutual_learning_weight = 0.5
calibration_weight = 0.1
warmup_epochs = 5

# Meta-Student (AKTP)
lr_meta_student = 5e-5
lr_combiner_aktp = 1e-4
entropy_weight = 0.3
disagreement_weight = 0.7
```
- Unified Framework: First comprehensive comparison of ensemble distillation vs. mutual learning
- Calibration Focus: Emphasis on both accuracy and uncertainty quantification
- Novel Protocols: Introduction of AKTP and ACP for adaptive knowledge transfer
- Hardware Optimization: Practical implementation for resource-constrained environments
- Cross-Dataset Transfer: Systematic evaluation of knowledge transfer across datasets
If you use CALM in your work, please cite:
```bibtex
@inproceedings{Perdana2025CALM,
  title     = {CALM: Calibrated Adaptive Learning via Mutual-Ensemble Fusion},
  author    = {G. A. Perdana and M. A. Ghazali and I. A. Iswanto and S. Joddy},
  booktitle = {Proceedings of the 10th International Conference on Computer Science and Computational Intelligence (ICCSCI 2025)},
  series    = {Procedia Computer Science},
  year      = {2025},
  doi       = {...},
  url       = {https://github.com/cujoramirez/CALM}
}
```
Contributions welcome via issues and PRs. Released under the MIT License.
Acknowledgments
- Datasets: CIFAR-10 and CIFAR-100 by Alex Krizhevsky; STL-10 by Adam Coates, Andrew Ng, and Honglak Lee (accessed dynamically via torchvision.datasets; no checked-in data).
- Pretrained Models: Torchvision and TIMM model libraries
- Hardware Support: NVIDIA RTX 3060 optimization
- Framework: PyTorch ecosystem
For questions, suggestions, or collaborations:
- GitHub Issues: Create an issue
- Email: gading.perdana@binus.ac.id
Keywords: Knowledge Distillation, Mutual Learning, Ensemble Methods, Computer Vision, Uncertainty Quantification, Deep Learning, PyTorch, CIFAR-10, Model Calibration, Transfer Learning