MLDiff provides a novel approach to comparing learned classifiers by translating them into SMT formulas and systematically analyzing their decision boundaries. This enables precise identification of input regions where classifiers agree or disagree, offering insights into model behavior and reliability. The repository has the following structure:
MLDiff/
├── mldiff/ # Core library
│ ├── diff.py # Main comparison framework
│ ├── dt2smt.py # Decision Tree to SMT translation
│ ├── svm2smt.py # SVM to SMT translation
│ ├── logReg2smt.py # Logistic Regression to SMT translation
│ ├── mlp2smt.py # MLP to SMT translation
│ └── ... # Additional translation
├── examples/ # Usage examples
│ ├── iris_diff.py # Iris dataset comparison
│ ├── digits_diff.py # Digits dataset comparison
│ └── ... # Additional examples
├── sefm2025/ # SEFM 2025 paper
│ ├── rq1.py # Research Question 1
│ ├── rq2-4.py # Research Questions 2-4
│ └── results/ # Experimental results for SEFM 2025 paper
│ └── ... # Additional scripts
├── tests/ # Test suite
├── models/ # Pre-trained models
└── pyproject.toml # Project configuration
- Multi-Classifier Support: Compare Decision Trees, SVMs, Logistic Regression, and Multi-Layer Perceptrons
- SMT-Based Analysis: Leverages Z3 solver for formal verification and exhaustive comparison
- Comprehensive Evaluation: Generates difference matrices showing all possible agreement/disagreement patterns
- Research Framework: Includes experimental setup for empirical studies on classifier comparison
- Multiple Datasets: Built-in support for Iris, Digits, Breast Cancer, CIFAR, and Olivetti Faces datasets
git clone https://github.com/yourusername/MLDiff.git
cd MLDiff
poetry install
poetry shellgit clone https://github.com/yourusername/MLDiff.git
cd MLDiff
pip install -r requirements.txt- Python 3.12+
- scikit-learn 1.4.2
- NumPy, Matplotlib, z3-solver
- Additional dependencies listed in
pyproject.toml
Here's a simple example comparing a Decision Tree and SVM on the Iris dataset:
from z3 import *
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import LinearSVC
import mldiff.dt2smt as dt2smt
import mldiff.svm2smt as svm2smt
from mldiff.diff_matrix import evaluateDiffMatrix
# Load data and train classifiers
iris = load_iris()
X, y = iris.data, iris.target
dt = DecisionTreeClassifier(max_depth=3)
dt.fit(X, y)
svm = LinearSVC()
svm.fit(X, y)
# Convert to SMT formulas
clDt = Int("classDT")
s = dt2smt.toSMT(dt, str(clDt))
clSvm = Int("classSVM")
svm2smt.toSMT(svm, str(clSvm), s)
# Check for disagreements
s.push()
s.add(clDt != clSvm)
if s.check() == sat:
print("Classifiers disagree on:", s.model())
else:
print("Classifiers agree on all inputs")
s.pop()The examples/ directory contains comprehensive usage examples:
iris_diff.py: Basic classifier comparison on Iris datasetdigits_diff.py: Comparison on handwritten digit classificationcifar_diff_dt_svm.py: CIFAR-10 dataset comparisonolivetti_diff_dt_svm.py: Face recognition comparison
Run any example:
python -m examples.iris_diffThe sefm2025/ directory contains evaluation artifacts of the paper "On the Comparison of Learned Classifiers" submitted to SEFM 2025.
- Iris: Classic 3-class flower classification
- Digits: Handwritten digit recognition (0-9)
- Breast Cancer: Binary cancer diagnosis
- Olivetti Faces: Face recognition
The framework supports various configuration options in diff.py:
VIZ_FLAG: Enable visualization of resultsPCA_FLAG: Apply PCA preprocessingFEATURE_CON_FLAG: Enable domain constraintsEPSILON_*: Tolerance parameters for different classifiers
- Fork the repository
- Create a feature branch (
git checkout -b feature/amazing-feature) - Commit your changes (
git commit -m 'Add amazing feature') - Push to the branch (
git push origin feature/amazing-feature) - Open a Pull Request
If you use MLDiff in your research, please cite:
@inproceedings{
}This project is licensed under the Apache License 2.0 - see the LICENSE file for details.