
MultiMEDQA Contamination Detection

Description

We provide a script that implements Membership Inference Attacks (MIA) from six papers to determine whether a given HuggingFace model was trained on any of the test sets in the MultiMedQA benchmark. The MultiMedQA benchmark consists of the following QA datasets:

  • PubMedQA
  • MedMCQA
  • MedQA
  • MMLU - Anatomy, Clinical Knowledge, College Biology, College Medicine, Medical Genetics, Professional Medicine

The script is easily extensible to other QA datasets.

The implemented methods are drawn from six prior papers on membership inference and contamination detection.
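For intuition, the sketch below shows one common loss-based membership signal from this literature: the average log-probability of a text's least likely tokens, which tends to be higher for memorized test items. It is illustrative only and not necessarily one of the six implemented methods; the model name is a placeholder.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def min_k_percent_score(model, tokenizer, text, k=0.2):
    """Average log-probability of the k% least likely tokens.
    Higher values suggest the text may have been seen during training."""
    enc = tokenizer(text, return_tensors="pt")
    input_ids = enc.input_ids
    with torch.no_grad():
        logits = model(input_ids).logits
    # Log-probability of each actual next token under the model.
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
    token_log_probs = log_probs.gather(1, input_ids[0, 1:].unsqueeze(-1)).squeeze(-1)
    n = max(1, int(k * token_log_probs.numel()))
    lowest = torch.topk(token_log_probs, n, largest=False).values
    return lowest.mean().item()

# Hypothetical usage; any causal LM on the Hub works the same way.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
print(min_k_percent_score(lm, tok, "Example test-set question and answer."))
```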

Setup

pip install -r requirements.txt

Running

If testing on a dataset not in MultiMedQA, first add it to the DATASET_CONFIGS dictionary in configs.py.

If testing on a HuggingFace model not in MODEL_CONFIGS, add it as well.
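As a rough sketch of what such entries might look like (the exact keys expected by configs.py may differ; all field names below are assumptions):

```python
# Hypothetical entries; check configs.py for the keys the scripts actually expect.
DATASET_CONFIGS = {
    "my_qa_dataset": {
        "hf_path": "org/my_qa_dataset",   # HuggingFace dataset identifier
        "split": "test",                  # split to check for contamination
        "question_field": "question",     # column holding the question text
        "answer_field": "answer",         # column holding the reference answer
    },
}

MODEL_CONFIGS = {
    "my-model-7b": {
        "hf_path": "org/my-model-7b",     # HuggingFace model identifier
        "dtype": "bfloat16",
    },
}
```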

Generating Paraphrases (Neighbors)

Two of the contamination methods involve pre-computing paraphrased versions of test-set instances. If you want to run these tests, you must first run python paraphrase.py, which will generate --num_neighbors paraphrases of each test-set instance and save them inside ./results/neighbors.

The main script can be run with --no_neighbors if you want to disable these paraphrase-based scores.
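For intuition, the neighbor-based tests compare how the model scores an original test instance against its paraphrases; the sketch below illustrates the general idea, not the repository's exact implementation.

```python
import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def lm_loss(model, tokenizer, text):
    """Per-token cross-entropy of `text` under the model."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(enc.input_ids, labels=enc.input_ids)
    return out.loss.item()

def neighbor_gap(model, tokenizer, original, neighbors):
    """Loss on the original minus mean loss on its paraphrases.
    A strongly negative gap (the original is much easier for the model
    than its paraphrases) is evidence the original was memorized."""
    neighbor_losses = [lm_loss(model, tokenizer, n) for n in neighbors]
    return lm_loss(model, tokenizer, original) - float(np.mean(neighbor_losses))

# Hypothetical usage with any causal LM and pre-generated paraphrases.
tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
print(neighbor_gap(lm, tok, "Original test question ...",
                   ["Paraphrase one ...", "Paraphrase two ..."]))
```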

Running Contamination Metrics

python main.py --model qwen-72b --dataset pubmedqa --max_examples 100

will run the full suite of metrics for qwen-72b on a random sample of 100 examples from the pubmedqa test set.

Results are printed to the console and saved to a .csv file under ./results.
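To inspect a saved run afterwards, something like the following works (the file name and column layout are assumptions):

```python
import pandas as pd

# Hypothetical file name; main.py writes one .csv per run under ./results.
df = pd.read_csv("results/qwen-72b_pubmedqa.csv")
print(df.head())
print(df.describe())
```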

Statistical Significance

Running python significance_test.py will print whether the ROUGE-L score for the "Guided" prompt is statistically significantly higher (p < 0.05) than for the "General" prompt, following the guided-prompting method described in the corresponding paper.
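As a sketch of what such a one-sided comparison can look like (significance_test.py may use a different procedure; the scores below are placeholders), here is a paired bootstrap over per-example ROUGE-L differences:

```python
import numpy as np

def paired_bootstrap_pvalue(guided, general, n_resamples=10_000, seed=0):
    """One-sided p-value for the hypothesis that mean(guided) > mean(general),
    estimated by bootstrapping per-example ROUGE-L differences."""
    rng = np.random.default_rng(seed)
    diffs = np.asarray(guided) - np.asarray(general)
    means = np.array([
        rng.choice(diffs, size=diffs.size, replace=True).mean()
        for _ in range(n_resamples)
    ])
    return float((means <= 0).mean())

# Placeholder per-example ROUGE-L scores for the two prompt types.
guided = [0.42, 0.55, 0.61, 0.38, 0.70]
general = [0.35, 0.50, 0.58, 0.40, 0.52]
p = paired_bootstrap_pvalue(guided, general)
print(f"p = {p:.4f}  (significant at p < 0.05: {p < 0.05})")
```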
