
Evaluating Neuron Explanations: A Unified Framework with Sanity Checks

This is the official repository for our ICML'25 paper; the project website is available at this [link].

  • In this work, we unify many existing neuron-level explanation evaluation methods under one mathematical framework: Neuron Eval.
  • The unified framework allows us to compare and contrast existing evaluation metrics, understand the evaluation pipeline more clearly, and apply existing statistical concepts to the evaluation (a minimal sketch of this view follows the list).
  • In addition, we propose two simple sanity tests for evaluation metrics and show that many commonly used metrics fail these tests. Our proposed tests serve as necessary conditions for a reliable evaluation metric.
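As a concrete illustration of this unified view: every metric covered by the framework scores how well a concept's label vector matches a neuron's activation vector over a shared set of probe inputs. The sketch below is our paraphrase of that framing, not code from this repo; the names neuron_active and concept_present are illustrative, and IoU stands in for any covered metric.

    import numpy as np

    def evaluate_explanation(neuron_active, concept_present):
        # Both arguments are binary vectors over the same probe inputs:
        # neuron_active[i] = 1 if the neuron fires on input i,
        # concept_present[i] = 1 if the explanation's concept holds for input i.
        # IoU (Jaccard similarity) is shown as one example of such a metric.
        intersection = np.logical_and(neuron_active, concept_present).sum()
        union = np.logical_or(neuron_active, concept_present).sum()
        return intersection / union if union > 0 else 0.0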

Overview figure

Setup

  1. Install Python (3.11), PyTorch (tested with 2.0.1), and torchvision (0.15.2). More recent versions should also work.

  2. Install the other requirements:

    • pip install -r requirements.txt

Downloading data and models (Required for some settings):

  • ResNet-18 (Places365): bash download_rn18_places.sh
  • CUB200 dataset: bash download_cub.sh (Linux) or download_cub.bat (Windows)
  • CUB200 CBM model: bash download_cub_cbm.sh
  • ImageNet (validation): we do not provide a download link; add the path to your copy to DATASET_ROOTS in data_utils.py

Quickstart

  1. Theoretical Missing/Extra Labels Test (Sec. 4) - run theoretical_sanity_check.ipynb.

    • No downloads required
    • To test your own metric, implement it in metrics.py and evaluate it on a new line of theoretical_sanity_check.ipynb (see the sketch after this list).
    • activation_frequencies determines what fraction of the inputs activate the simulated neuron/concept. The test is run with each value in the list.
  2. Experimental Missing/Extra Labels Test (Sec. 4) - run experimental_sanity_check.ipynb.

    • Test different settings (defined in Appendix F.1) by changing the setting parameter
  3. Known Neuron AUPRC Test (Sec. 5) - run known_neuron_auprc_eval.ipynb.

    • Test different settings (defined in Appendix F.2) by changing the setting parameter
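To add your own metric for step 1, a minimal sketch of what an entry in metrics.py could look like is shown below. The function name and signature are assumptions for illustration; match the argument order and conventions of the existing metrics in metrics.py before evaluating it in theoretical_sanity_check.ipynb.

    import numpy as np

    def my_balanced_accuracy(concepts, activations):
        # Hypothetical example metric: balanced accuracy between binary
        # concept labels and binarized neuron activations. The signature
        # is illustrative; mirror the existing functions in metrics.py.
        concepts = np.asarray(concepts).astype(bool)
        activations = np.asarray(activations).astype(bool)
        tpr = (concepts & activations).sum() / max(concepts.sum(), 1)
        tnr = (~concepts & ~activations).sum() / max((~concepts).sum(), 1)
        return (tpr + tnr) / 2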

Main parameters:

  • setting: which test setup to use for the experimental results; see Appendix F for details.
  • activations_dir: directory for saving neuron activations.
  • epsilon: minimal (normalized) decrease required for the missing and extra labels tests, default 0.001 (an example configuration follows the list).
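For example, a configuration cell at the top of one of the notebooks might be set as follows (the values are illustrative only; check each notebook for the valid options):

    setting = "rn18_places"          # hypothetical setting name; see Appendix F for the real options
    activations_dir = "activations"  # directory where neuron activations are cached
    epsilon = 0.001                  # minimal normalized decrease for the missing/extra labels tests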

Additional experiments, such as ablations, can be found in the additional_experiments directory.

Results

  • We find that most existing evaluation metrics fail at least one of our simple sanity checks.
  • Only Correlation, Cosine similarity, AUPRC, F1-score, and IoU pass both sanity checks (a quick sketch of these metrics follows the list).
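For reference, all five passing metrics compare a neuron's activation vector against a concept's label vector over the same inputs. Below is a quick illustrative sketch using standard libraries; this is our own illustration on simulated data, not this repo's implementation.

    import numpy as np
    from sklearn.metrics import average_precision_score, f1_score

    rng = np.random.default_rng(0)
    concept = rng.integers(0, 2, size=1000)          # binary concept labels
    acts = 0.8 * concept + rng.normal(0, 0.3, 1000)  # simulated neuron activations
    binarized = (acts > 0.5).astype(int)             # thresholded activations

    correlation = np.corrcoef(acts, concept)[0, 1]
    cosine = acts @ concept / (np.linalg.norm(acts) * np.linalg.norm(concept))
    auprc = average_precision_score(concept, acts)   # average precision ~= AUPRC
    f1 = f1_score(concept, binarized)
    iou = (binarized & concept).sum() / (binarized | concept).sum()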


Cite this work

T. Oikarinen, G. Yan, and T.-W. Weng, Evaluating Neuron Explanations: A Unified Framework with Sanity Checks, ICML 2025.

@inproceedings{oikarinen2025evaluating,
  title={Evaluating Neuron Explanations: A Unified Framework with Sanity Checks},
  author={Oikarinen, Tuomas and Yan, Ge and Weng, Tsui-Wei},
  booktitle={International Conference on Machine Learning},
  year={2025}
}
