NOTE: The code for our method is in src, and the Python script for running experiments using our method is fairtabddpm_opt.py.
The PyTorch version we used in this project is 2.3.0+cu121. You can create the environment and install the required packages by running the following commands:
conda create -n ai python=3.10
source activate ai
pip install -r requirements.txt
pip install dgl -f https://data.dgl.ai/wheels/torch-2.3/cu121/repo.html
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.3.0+cu121.html

To download and preprocess the datasets, run the following command:
python build.py

Under the root directory, run the following commands to reproduce the results of our method:
# run experiments for our method
bash fairtabddpm.sh

To reproduce the results of baseline methods, run the following commands:
# go to baselines directory
cd baselines
# run experiments for baselines
bash codi.sh
bash fairsmote.sh
bash fairtabgan.sh
bash goggle.sh
bash great.sh
bash smote.sh
bash stasy.sh
bash tabddpm.sh
bash tabsyn.sh

The datasets we used in this project are as follows:
- Adult
- COMPAS
- German Credit
- Bank Marketing
The baseline methods we used in this project are as follows (sorted alphabetically):
- CoDi
- Goggle
- GReaT
- SMOTE
- STaSy
- TabDDPM
- TabSyn
- Fair Class Balancing (FCB)
- FairTGAN
Avoid repetition to improve the code quality:
- Replace exp_config['home'] with EXPS_PATH imported from constant.py in all running scripts
- Replace data_config['path'] with DB_PATH imported from constant.py in all running scripts
- Delete the experiment home and dataset path entries from all config.toml files
- Add a new argument --method to the optimization scripts and merge all optimization scripts into one
- Find commonly used functions in all running scripts and move them to utils.py
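
The path constants and the --method argument from the list above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the repository's code: only the names EXPS_PATH, DB_PATH, constant.py, and --method come from the list. The directory values, the entries in METHODS, the --dataset argument, and the helpers build_parser, experiment_dir, and dataset_dir are hypothetical and chosen only for illustration.

```python
import argparse
from pathlib import Path

# These two constants would live in constant.py.
# The directory names are assumptions, not the project's actual layout.
EXPS_PATH = Path("exps")
DB_PATH = Path("db")

# Hypothetical set of methods a merged optimization script could dispatch on.
METHODS = ("fairtabddpm", "tabddpm", "tabsyn")


def build_parser() -> argparse.ArgumentParser:
    """Build the command-line interface of a single, merged optimization script."""
    parser = argparse.ArgumentParser(description="Optimize one method on one dataset.")
    parser.add_argument("--method", choices=METHODS, required=True,
                        help="method to optimize")
    parser.add_argument("--dataset", default="adult",
                        help="dataset name (hypothetical argument)")
    return parser


def experiment_dir(method: str, dataset: str) -> Path:
    """Derive the experiment directory from EXPS_PATH rather than exp_config['home']."""
    return EXPS_PATH / method / dataset


def dataset_dir(dataset: str) -> Path:
    """Derive the dataset directory from DB_PATH rather than data_config['path']."""
    return DB_PATH / dataset


def main(argv=None) -> tuple:
    """Parse arguments and return the resolved experiment and dataset directories."""
    args = build_parser().parse_args(argv)
    return experiment_dir(args.method, args.dataset), dataset_dir(args.dataset)
```

With a layout like this, every running script resolves its paths from the same two constants, so the config.toml files no longer need to carry them, and one script can serve all methods by switching on --method.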
Organize the code:
- Move fairtabddpm.sh, fairtabddpm_run.py, and fairtabddpm_opt.py to the baselines directory, rename the baselines directory to methods, and edit readme.md accordingly
- Move src/evaluate/metrics.py out to the root directory because it is specific to the project
Automate the experiments and evaluations:
- Refactor and reorganize assess/present.ipynb with functional programming
- Rewrite all the code in the assess directory with functional programming
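
One possible reading of "functional programming" for the assessment code is to build it from small pure functions that are composed into pipelines. The sketch below illustrates that style only. The helpers pipeline, where, and average, the record fields, and the values in records are all hypothetical placeholders, not the project's code or results.

```python
from functools import reduce
from statistics import mean
from typing import Callable


def pipeline(*steps: Callable) -> Callable:
    """Compose steps left to right into a single function."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def where(key: str, expected) -> Callable:
    """Return a function that keeps the records whose `key` equals `expected`."""
    return lambda records: [r for r in records if r[key] == expected]


def average(metric: str) -> Callable:
    """Return a function that averages `metric` over a list of records."""
    return lambda records: mean(r[metric] for r in records)


# Placeholder records with made-up values, not experimental results.
records = [
    {"method": "method_a", "seed": 0, "score": 0.5},
    {"method": "method_a", "seed": 1, "score": 1.0},
    {"method": "method_b", "seed": 0, "score": 0.25},
]

# Build a reusable summary step, then apply it to the data.
mean_score_a = pipeline(where("method", "method_a"), average("score"))
print(mean_score_a(records))
```

Because each step takes its input as an argument and returns a new value, the same steps can be reused across notebooks and scripts without shared state.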
Correct the errors:
- The implementation of TabSyn in baselines is incorrect
fair-tab-diffusion is released under the GPL 3.0. See LICENSE for details.