andyzorigin/data_contamination
Usage:

    python compute_contamination_metrics.py --input-data <input_data> --scenario-data <scenario_data> --output-stats <output_stats> --input-format <input_format>

For instance, you can run this on The Pile with the following values:

    input_data    = 00.jsonl (download from https://pile.eleuther.ai/)
    scenario_data = (an example is included with the repo, but you can use HELM to generate one)
    output_stats  = an arbitrary output file name, e.g. "output_stats"
    input_format  = the_pile
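
To make the the_pile input format more concrete, here is a minimal sketch of reading such a file. It assumes each line of the shard is a JSON object with a "text" field, as in the published Pile shards. The helper name iter_pile_texts is hypothetical and is not taken from compute_contamination_metrics.py, whose actual loader may differ.

```python
import io
import json
from typing import Iterable, Iterator


def iter_pile_texts(lines: Iterable[str]) -> Iterator[str]:
    """Yield the "text" field of each non-empty JSON line.

    Hypothetical helper: assumes one JSON object per line with a "text" key.
    """
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        yield record["text"]


# Small in-memory stand-in for a shard such as 00.jsonl.
sample = io.StringIO(
    '{"text": "first document", "meta": {"pile_set_name": "demo"}}\n'
    "\n"
    '{"text": "second document", "meta": {"pile_set_name": "demo"}}\n'
)
texts = list(iter_pile_texts(sample))
print(len(texts))
```

For a real shard, pass an open file handle instead of the in-memory sample, for example open("00.jsonl", encoding="utf-8"). Reading line by line keeps memory use flat even for large files.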