Closed
Problem Description
Currently, the monthly multi-table benchmark runs on 7 demo datasets. We aim to expand this by including all datasets from the following list:
['WebKP',
'DCG',
'UW_std',
'Same_gen',
'CORA',
'got_families',
'SalesDB',
'UTube',
'Student_loan',
'Hepatitis_std',
'Elti',
'Bupa',
'Toxicology',
'imdb_ijs',
'ftp',
'imdb_small',
'imdb_MovieLens',
'Pima',
'university',
'legalActs',
'Dunur',
'Mesh',
'world',
'airbnb-simplified',
'trains',
'FNHK',
'fake_hotels',
'SAT',
'genes',
'Biodegradability',
'Pyrimidine',
'mutagenesis',
'restbase',
'Triazine',
'Carcinogenesis',
'fake_hotels_extended',
'Mooney_Family',
'PTE',
'Facebook',
'multi_table_ID_demo_dataset',
'SAP',
'Chess',
'Countries',
'NCAA',
'Atherosclerosis',
'nations',
'TubePricing',
'financial',
'Accidents',
'MuskSmall',
'NBA',
'AustralianFootball',
'PremierLeague',
'OMOP_CDM_dayz']
Expected behavior
Add the 'sdv_datasets' parameter with the list of datasets when running the benchmark.
for synthesizer_group in MODALITY_TO_SETUP[modality]['synthesizers_split']:
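As a rough illustration of the expected behavior, the sketch below builds one set of benchmark arguments per synthesizer group and attaches the dataset list under `sdv_datasets`. Only `MODALITY_TO_SETUP`, the `'synthesizers_split'` key, and the `sdv_datasets` parameter name come from this issue. The contents of the mapping, the synthesizer names, and the `build_benchmark_kwargs` helper are hypothetical, and the dataset list is shortened to the first few entries for brevity.

```python
# Hypothetical stand-in for the setup mapping referenced in the issue.
# Only the 'synthesizers_split' key is taken from the source; the modality
# name and the synthesizer names are illustrative assumptions.
MODALITY_TO_SETUP = {
    'multi_table': {
        'synthesizers_split': [['HMASynthesizer'], ['PlaceholderSynthesizer']],
    },
}

# First few names from the list in this issue; the full list would go here.
SDV_DATASETS = ['WebKP', 'DCG', 'UW_std', 'Same_gen', 'CORA']


def build_benchmark_kwargs(modality, sdv_datasets):
    """Return one keyword-argument dict per synthesizer group (hypothetical helper)."""
    kwargs_per_group = []
    for synthesizer_group in MODALITY_TO_SETUP[modality]['synthesizers_split']:
        kwargs_per_group.append({
            'synthesizers': list(synthesizer_group),
            # Same explicit dataset list for every group, instead of the demo default.
            'sdv_datasets': list(sdv_datasets),
        })
    return kwargs_per_group


benchmark_kwargs = build_benchmark_kwargs('multi_table', SDV_DATASETS)
for kwargs in benchmark_kwargs:
    print(kwargs['synthesizers'], len(kwargs['sdv_datasets']))
```

Each resulting dict could then be passed to whichever function launches the benchmark for that group. Copying the list per group keeps one group's run from mutating the datasets seen by another.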
Additional context
All of these datasets are publicly available from SDV.