FIV, Sep 2024, v.2
Repository to replicate the experiments and results in:
"What do anomaly scores actually mean? Key characteristics of algorithms' dynamics beyond accuracy" by F. Iglesias, H. O. Marques, A. Zimek, T. Zseby
Comparison of score dynamics and accuracy (S-curves, accuracy, discriminant power, stability, robustness, confidence, coherence, variance) generated by different outlier detection algorithms subjected to different types of perturbations.
Experiments have been tested with Python 3.9.6.
Create a new virtual environment and install dependencies with the following commands:
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
Synthetic data is generated with:
python generate_data.py
This creates the folder [data/synthetic_data] with datasets used for the experiments.
It additionally creates the [plots/synthetic_data] folder with selected plots included in the paper.
Note that the folder [data/real_data] contains 4 real datasets downloaded from: https://www.dbs.ifi.lmu.de/research/outlier-evaluation/DAMI/. Specifically, they are:
- Cardiotocography (nodups, norm, 22%)
- Shuttle (nodups, norm, v10)
- Waveform (nodups, norm, v10)
- Wilt (nodups, norm, 05%)
All datasets (both synthetic and real) are in .CSV format, with the first row as header; the last column 'y' is the binary label: '1' for outliers, '0' for inliers.
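Any of these files can be loaded and split into features and labels with a few lines of pandas, for instance (the file name below is only a placeholder):

import pandas as pd

csv_path = "data/synthetic_data/example.csv"   # placeholder name: use any generated or real dataset
df = pd.read_csv(csv_path)
X = df.drop(columns=["y"]).to_numpy()   # feature columns
y = df["y"].to_numpy()                  # binary label: 1 = outlier, 0 = inlier
print(X.shape, int(y.sum()), "outliers")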
To extract outlier scores and accuracy performances, run:
python outdet.py data/synthetic_data/ minmax
This script takes the datasets in [data/synthetic_data] and generates the [scores/minmax] folder, which contains files with the point-wise outlierness scores output by each algorithm under test. It also creates the file performances/perf_minmax.csv, a summary table with the overall performances (accuracy metrics). The minmax argument selects the type of normalization applied to the outlierness scores.
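For orientation, min-max normalization rescales each algorithm's raw scores linearly to the [0, 1] range. A minimal sketch of the idea (outdet.py defines the exact implementation):

import numpy as np

def minmax_normalize(scores):
    # rescale raw outlierness scores linearly to [0, 1]
    scores = np.asarray(scores, dtype=float)
    smin, smax = scores.min(), scores.max()
    if smax == smin:
        return np.zeros_like(scores)   # constant scores: no contrast to preserve
    return (scores - smin) / (smax - smin)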
For proba-normalization:
python outdet.py data/synthetic_data/ proba
It will generate scores ([scores/proba]) and summaries (performances/perf_proba.csv) for probability (Gaussian) normalization.
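Gaussian (proba) normalization instead maps scores to [0, 1] by interpreting their deviation from the score mean as a probability. The sketch below shows one common formulation and is only illustrative of what outdet.py does:

import numpy as np
from scipy.special import erf

def gaussian_scaling(scores):
    # interpret each score's deviation from the mean as a probability (sketch only)
    scores = np.asarray(scores, dtype=float)
    mu, sigma = scores.mean(), scores.std()
    if sigma == 0:
        return np.zeros_like(scores)
    return np.maximum(0.0, erf((scores - mu) / (sigma * np.sqrt(2.0))))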
Repeat the process for real datasets with:
python outdet.py data/real_data/ minmax
python outdet.py data/real_data/ proba
Warning! When running scripts, information saved in performance files is appended (not overwritten).
To extract S-curves and dynamic measurements:
python compare_scores_group.py data/synthetic_data scores/minmax minmax
python compare_scores_group.py data/synthetic_data scores/proba proba
python compare_scores_group.py data/real_data scores/minmax minmax
python compare_scores_group.py data/real_data scores/proba proba
This will generate plots with S-curves in the [plots/minmax/S-curves] and [plots/proba/S-curves] folders, as well as the files performances/dynamic_minmax.csv and performances/dynamic_proba.csv. Note that the compare_scores_group.py script pairs each dataset with its corresponding score file by matching file names.
Warning! When running scripts, information saved in performance files is appended (not overwritten).
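Illustratively, the pairing of datasets and score files can be thought of as matching shared base file names between the two folders (a sketch, assuming score files reuse the dataset's file name):

from pathlib import Path

data_files = {p.stem: p for p in Path("data/synthetic_data").glob("*.csv")}
score_files = {p.stem: p for p in Path("scores/minmax").glob("*.csv")}

# pair each dataset with its score file via the shared base name (assumed convention)
pairs = [(data_files[name], score_files[name]) for name in data_files if name in score_files]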
To extract Perini's metrics (Stability & Confidence):
python perini_tests.py data/synthetic_data minmax
python perini_tests.py data/synthetic_data proba
python perini_tests.py data/real_data minmax
python perini_tests.py data/real_data proba
This will create the files: performances/peri_stab_minmax.csv and performances/peri_stab_proba.csv for the Stability measurement, and performances/peri_conf_minmax.csv and performances/peri_conf_proba.csv for the Confidence measurement.
Note that Perini's Confidence is defined element-wise. To obtain a Confidence value per solution we use the 1% quantile.
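That is, the per-solution Confidence reported in the .csv files is the 1% quantile of the per-point confidence values, e.g. (sketch with placeholder values):

import numpy as np

confidence_per_point = np.random.rand(1000)                         # placeholder for actual per-point confidences
confidence_of_solution = np.quantile(confidence_per_point, 0.01)    # pessimistic per-solution summary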
Warning!! This step can take considerable time on a desktop computer (several days).
Warning!! When running scripts, information saved in performance files is appended (not overwritten).
Original scripts are obtained from the following repositories:
- Confidence [1]: https://github.com/Lorenzo-Perini/Confidence_AD
- Stability [2]: https://github.com/Lorenzo-Perini/StabilityRankings_AD
[1] Perini, L., Vercruyssen, V., Davis, J.: Quantifying the confidence of anomaly detectors in their example-wise predictions. In: The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases. Springer Verlag (2020).
[2] Perini, L., Galvin, C., Vercruyssen, V.: A Ranking Stability Measure for Quantifying the Robustness of Anomaly Detection Methods. In: 2nd Workshop on Evaluation and Experimental Design in Data Mining and Machine Learning @ ECML/PKDD (2020).
To merge all dynamic and accuracy indices into a single file (for accuracy we only keep ROC and AAP), run:
python merge_indices.py performances/dynamic_minmax.csv performances/perf_minmax.csv performances/peri_stab_minmax.csv performances/peri_conf_minmax.csv minmax
python merge_indices.py performances/dynamic_proba.csv performances/perf_proba.csv performances/peri_stab_proba.csv performances/peri_conf_proba.csv proba
Generated outputs are performances/all_minmax.csv and performances/all_proba.csv.
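Conceptually, the merge joins the rows of the four tables on their dataset/algorithm identifiers. A simplified pandas sketch (the join keys here are assumptions for illustration; merge_indices.py defines the actual columns and keeps only ROC and AAP from the accuracy file):

import pandas as pd
from functools import reduce

files = ["performances/dynamic_minmax.csv", "performances/perf_minmax.csv",
         "performances/peri_stab_minmax.csv", "performances/peri_conf_minmax.csv"]
keys = ["dataset", "algorithm"]   # assumed join keys, for illustration only

tables = [pd.read_csv(f) for f in files]
merged = reduce(lambda a, b: a.merge(b, on=keys, how="inner"), tables)
merged.to_csv("performances/all_minmax.csv", index=False)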
To generate scatter plots comparing measurements and algorithms, run:
python scatterplots.py performances/all_minmax.csv minmax
python scatterplots.py performances/all_proba.csv proba
Additional plots will be generated in the [plots/minmax/performance] and [plots/proba/performance] folders.
To create a table in .TEX format (performances/perf_table.tex) with an overall comparison, run:
python latex_table.py performances/all_minmax.csv performances/all_proba.csv performances/perf_table.tex
Correlation plots (plots/corr_lin.pdf and plots/corr_gaus.pdf) are generated with:
python metric_corr.py performances/all_minmax.csv performances/all_proba.csv plots/
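In essence, the correlation analysis computes a pairwise correlation matrix over the merged indices and plots it. A simplified sketch (the correlation method and output name below are assumptions, not necessarily what metric_corr.py uses):

import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("performances/all_minmax.csv")
corr = df.select_dtypes("number").corr(method="spearman")   # assumed: rank correlation between indices

plt.figure(figsize=(6, 5))
plt.imshow(corr, vmin=-1, vmax=1, cmap="coolwarm")
plt.xticks(range(len(corr)), corr.columns, rotation=90)
plt.yticks(range(len(corr)), corr.columns)
plt.colorbar()
plt.tight_layout()
plt.savefig("plots/corr_example.pdf")   # hypothetical output name, not the paper's corr_lin.pdf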
The file performances.zip contains tables with summary results obtained from conducting all previous steps.