Awesome XAI

A curated list of XAI and Interpretable ML papers, methods, critiques, and resources.

Explainable AI (XAI) is a branch of machine learning research which seeks to make various ML techniques more understandable.

Papers

Surveys

Methods

Ada-SISE - Adaptive semantic input sampling for explanation
ALE - Accumulated local effects plot
ALIME - Autoencoder Based Approach for Local Interpretability
Anchors - High-Precision Model-Agnostic Explanations
Auditing - Auditing black-box models
BayLIME - Bayesian local interpretable model-agnostic explanations
Break Down - Break down plots for additive attributions
CAM - Class activation mapping
CDT - Confident interpretation of Bayesian decision tree ensembles
CICE - Centered ICE plot
CMM - Combined multiple models metalearner
Conj Rules - Using sampling and queries to extract rules from trained neural networks
CP - Contribution propagation
DecText - Extracting decision trees from trained neural networks
DeepLIFT - Learning important features through propagating activation differences
DTD - Deep Taylor decomposition
ExplainD - Explanations of evidence in additive classifiers
FIRM - Feature importance ranking measure
Fong et al. - Meaningful perturbations model
G-REX - Rule extraction using genetic algorithms
Gibbons et al. - Explain random forest using decision tree
GoldenEye - Exploring classifiers by randomization
GPD - Gaussian process decisions
GPDT - Genetic program to evolve decision trees
GradCAM - Gradient-weighted Class Activation Mapping
GradCAM++ - Generalized gradient-based visual explanations
Hara et al. - Making tree ensembles interpretable
ICE - Individual conditional expectation plots
IG - Integrated gradients
inTrees - Interpreting tree ensembles with inTrees
IOFP - Iterative orthogonal feature projection
IP - Information plane visualization
KL-LIME - Kullback-Leibler Projections based LIME
Krishnan et al. - Extracting decision trees from trained neural networks
Lei et al. - Rationalizing neural predictions with generator and encoder
LIME - Local Interpretable Model-Agnostic Explanations
LOCO - Leave-one-covariate-out
LORE - Local rule-based explanations
Lou et al. - Accurate intelligible models with pairwise interactions
LRP - Layer-wise relevance propagation
MES - Model explanation system
MFI - Feature importance measure for non-linear algorithms
NID - Neural interpretation diagram
OptiLIME - Optimized LIME
PALM - Partition aware local model
PDA - Prediction Difference Analysis: Visualize deep neural network decisions
PDP - Partial dependence plots
POIMs - Positional oligomer importance matrices for understanding SVM signal detectors
ProfWeight - Transfer information from deep network to simpler model
Prospector - Interactive partial dependence diagnostics
QII - Quantitative input influence
REFNE - Extracting symbolic rules from trained neural network ensembles
RETAIN - Reverse time attention model
RISE - Randomized input sampling for explanation
RxREN - Reverse engineering neural networks for rule extraction
SHAP - A unified approach to interpreting model predictions
SIDU - Similarity, difference, and uniqueness input perturbation
Simonyan et al. - Visualizing CNN classes
Singh et al. - Programs as black-box explanations
STA - Interpreting models via Single Tree Approximation
Strumbelj et al. - Explanation of individual classifications using game theory
SVM+P - Rule extraction from support vector machines
TCAV - Testing with concept activation vectors
Tolomei et al. - Interpretable predictions of tree-ensembles via actionable feature tweaking
Tree Metrics - Making sense of a forest of trees
TreeSHAP - Consistent feature attribution for tree ensembles
TreeView - Feature-space partitioning
TREPAN - Extracting tree-structured representations of trained networks
TSP - Tree space prototypes
VBP - Visual back-propagation
VEC - Variable effect characteristic curve
VIN - Variable interaction network
X-TREPAN - Adapted extraction of comprehensible decision tree in ANNs
Xu et al. - Show, attend, tell attention model

Critiques

Do Not Trust Additive Explanations - Authors argue that additive explanations (e.g. LIME, SHAP, Break Down) fail to take feature interactions into account and are thus unreliable.
Please Stop Permuting Features: An Explanation and Alternatives - Authors demonstrate why permuting features is misleading, especially where there is strong feature dependence. They offer several previously described alternatives.
Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead - Authors present a number of issues with explainable ML and challenges to interpretable ML: (1) constructing optimal logical models, (2) constructing optimal sparse scoring systems, (3) defining interpretability and creating methods for specific domains. They also offer an argument for why interpretable models might exist in many different domains.
The (Un)reliability of Saliency Methods - Authors demonstrate that the attributions produced by saliency methods can change when a constant shift is added to the input data. They argue that methods should fulfill input invariance: a saliency method should mirror the sensitivity of the model with respect to transformations of the input.
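
The input-invariance critique in the last entry can be illustrated with a toy case. The following is a minimal sketch, not the paper's experiment: it assumes a hand-written linear model and two simple attribution rules (plain gradient and gradient times input), and every name in it (`predict`, `gradient_attribution`, `gradient_times_input`) is hypothetical. Two models are built so that they give the same output on an input and on its shifted copy; an input-invariant method should then give both the same attribution.

```python
# Toy illustration of input invariance (assumed setup, pure Python).

def predict(weights, bias, x):
    """Linear model: f(x) = w . x + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def gradient_attribution(weights, x):
    """Plain gradient: for a linear model this is the weight vector."""
    return list(weights)

def gradient_times_input(weights, x):
    """Gradient multiplied elementwise by the input."""
    return [w * xi for w, xi in zip(weights, x)]

weights = [2.0, -1.0, 0.5]
bias = 0.25
x = [1.0, 2.0, 3.0]

# Shift every input feature by a constant and compensate in the bias,
# so the second model sees shifted data but computes the same function.
shift = [10.0, 10.0, 10.0]
x_shifted = [xi + s for xi, s in zip(x, shift)]
bias_shifted = bias - sum(w * s for w, s in zip(weights, shift))

out_original = predict(weights, bias, x)
out_shifted = predict(weights, bias_shifted, x_shifted)

grad_original = gradient_attribution(weights, x)
grad_shifted = gradient_attribution(weights, x_shifted)

gxi_original = gradient_times_input(weights, x)
gxi_shifted = gradient_times_input(weights, x_shifted)

print("same prediction:      ", abs(out_original - out_shifted) < 1e-9)
print("gradient unchanged:   ", grad_original == grad_shifted)
print("grad*input unchanged: ", gxi_original == gxi_shifted)
```

In this sketch the plain gradient is identical for both models, while gradient times input changes with the shift even though the predictions agree, which is the kind of behavior the critique flags.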

Books

Open Courses

Repositories

EthicalML/xai - A toolkit for XAI which is focused exclusively on tabular data. It implements a variety of data and model evaluation techniques.
PAIR-code/what-if-tool - A tool for TensorBoard or notebooks which allows investigating model performance and fairness.
slundberg/shap - A Python module for using SHapley Additive exPlanations.

Follow

The Institute for Ethical AI & Machine Learning - A UK-based research center that performs research into ethical AI/ML, which frequently involves XAI.

Who else should we be following!?

Contributing

Contributions of any kind welcome, just follow the guidelines!

Contributors

Thanks go to these contributors!

License

CC0 License
