SABAF: Removing Strong Attribute Bias from Neural Networks with Adversarial Filtering

The full paper is available on arXiv (arXiv:2311.07141). A shorter workshop version appeared at Algorithmic Fairness through the Lens of Time, NeurIPS 2023, and is also on arXiv (arXiv:2310.04955). If you use this code, please cite the following:

Long paper:

@article{li2023sabaf,
  title={SABAF: Removing Strong Attribute Bias from Neural Networks with Adversarial Filtering},
  author={Li, Jiazhi and Khayatkhoei, Mahyar and Zhu, Jiageng and Xie, Hanchen and Hussein, Mohamed E and AbdAlmageed, Wael},
  journal={arXiv preprint arXiv:2311.07141},
  year={2023}
}

Workshop paper at Algorithmic Fairness through the Lens of Time, NeurIPS 2023 (see poster and presentation at https://nips.cc/virtual/2023/77745):

@article{li2023information,
  title={Information-Theoretic Bounds on The Removal of Attribute-Specific Bias From Neural Networks},
  author={Li, Jiazhi and Khayatkhoei, Mahyar and Zhu, Jiageng and Xie, Hanchen and Hussein, Mohamed E and AbdAlmageed, Wael},
  journal={arXiv preprint arXiv:2310.04955},
  year={2023}
}

Setup

This repo depends on the packages listed in requirements.txt. To set up the environment, run:

pip install -r requirements.txt

Datasets

Colored MNIST dataset
- Download the dataset, or pass --color_dataset generated to generate Colored MNIST with a chosen color variance, set by --biased_var BIASED_VAR
CelebA dataset
- Download
- Place images in ./data/CelebA/raw_data/img_align_celeba/*.jpg and other files in ./data/CelebA/raw_data/
- Preprocess the data with python prepare_data.py --dataset CelebA
Adult Income dataset
- Download
- Place data in ./data/Adult/raw_data/adult.csv
- Preprocess the data with python prepare_data.py --dataset Adult
IMDB Face dataset
- Download
- Place data in ./data/IMDB/raw_data/ and unzip auxiliary.zip into ./data/IMDB/raw_data/
- Preprocess the data with python prepare_data.py --dataset IMDB
FFHQ dataset
- Download
- Place data in ./data/FFHQ/raw_data/images and labels in ./data/FFHQ/raw_data/labels
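Before preprocessing, it can help to confirm that the raw files are where the steps above expect them. The following is a minimal sketch and not part of the repository: the helper name missing_paths and the EXPECTED table are hypothetical, and the paths are copied from the list above.

```python
from pathlib import Path

# Expected raw-data locations, copied from the dataset list above.
# This table is illustrative; it is not read by the repository's code.
EXPECTED = {
    "CelebA": ["data/CelebA/raw_data/img_align_celeba"],
    "Adult": ["data/Adult/raw_data/adult.csv"],
    "IMDB": ["data/IMDB/raw_data"],
    "FFHQ": ["data/FFHQ/raw_data/images", "data/FFHQ/raw_data/labels"],
}


def missing_paths(root=".", datasets=None):
    """Return {dataset: [missing relative paths]} for incomplete datasets."""
    root = Path(root)
    names = datasets if datasets is not None else list(EXPECTED)
    report = {}
    for name in names:
        missing = [p for p in EXPECTED[name] if not (root / p).exists()]
        if missing:
            report[name] = missing
    return report


if __name__ == "__main__":
    # Run from the repository root; prints nothing if everything is in place.
    for name, paths in missing_paths().items():
        for p in paths:
            print(f"{name}: missing {p}")
```

The check only tests that each path exists; it does not validate file contents.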

Custom

In addition to the datasets listed above, you can train and apply the adversarial filter on other datasets of interest.

Dataset

Place the data in ./data/Custom/raw_data. Implement the __len__ and __getitem__ methods in ./dataloader/Custom.py, and the set_data method in ./models_train/Custom_filter.py and ./models_eval/set_Custom.py.
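As a rough illustration of the two dataset methods, here is a minimal sketch of an indexable dataset class (what PyTorch calls a map-style dataset). Everything specific in it is an assumption rather than the repository's interface: the class name CustomDataset, the hypothetical labels.csv file with columns file, target and bias, and the (sample, target, bias) return order. Adapt it to whatever ./dataloader/Custom.py actually expects.

```python
import csv
from pathlib import Path


class CustomDataset:
    """Map-style dataset sketch: supports len(ds) and ds[i].

    Assumptions (not taken from the repository): a hypothetical labels.csv
    with columns `file,target,bias` sits next to the raw files, and each
    item is returned as (sample, target, bias).
    """

    def __init__(self, root="./data/Custom/raw_data", loader=None):
        self.root = Path(root)
        # `loader` turns a file path into a sample. The default returns raw
        # bytes; for images you would pass an image-decoding function instead.
        self.loader = loader or (lambda path: path.read_bytes())
        with open(self.root / "labels.csv", newline="") as f:
            self.rows = list(csv.DictReader(f))

    def __len__(self):
        # Number of labeled samples.
        return len(self.rows)

    def __getitem__(self, idx):
        row = self.rows[idx]
        sample = self.loader(self.root / row["file"])
        # target: downstream label; bias: the attribute the filter should remove.
        return sample, int(row["target"]), int(row["bias"])
```

Keeping file decoding behind a loader argument lets the same skeleton serve image and tabular data.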

Training

To train the adversarial filter on the custom dataset, run:

python train.py --experiment Custom_filter --name NAME

Applying

To apply the adversarial filter to a downstream task of interest, run:

python eval.py --experiment Custom_downstream_our --name NAME

Experiments

The pre-defined experiments can be found in run.sh. You can either run them directly with the chosen hyper-parameters to reproduce the results in the paper, or train and evaluate your own model.

Training

To train the adversarial filter, use train.py. To list all command-line arguments, run:

python train.py --help

Colored MNIST dataset

To train the adversarial filter on the universal dataset, run:

python train.py --experiment CMNIST_filter --name NAME --biased_var -1 \
    --mi 10.0 --gc 100.0 --dc 100.0 --gr 100.0

CelebA dataset

To train the adversarial filter on the universal dataset, run:

python train.py --experiment CelebA_filter --name NAME --CelebA_train_mode CelebA_FFHQ \
    --mi 50.0 --gc 50.0 --dc 50.0 --gr 100.0 \
    --shortcut_layers 1 --inject_layers 1 --enc_layers 5 --dec_layers 5 --dis_layers 5

Adult Income dataset

To train the adversarial filter on the universal dataset, run:

python train.py --experiment Adult_filter --name NAME --Adult_train_mode all \
    --mi 5.0 --gc 5.0 --dc 5.0 --gr 10.0 --epochs 100

IMDB Face dataset

To train the adversarial filter on the universal dataset, run:

python train.py --experiment IMDB_filter --name NAME --IMDB_train_mode all \
    --mi 50.0 --gc 50.0 --dc 50.0 --gr 100.0 \
    --shortcut_layers 1 --inject_layers 1 --enc_layers 5 --dec_layers 5 --dis_layers 5

Evaluation

To evaluate the adversarial filter, use eval.py. To list all command-line arguments, run:

python eval.py --help

To load a trained filter for evaluation, either:

(1) pass --filter_name FILTER_NAME --filter_hp FILTER_HP --filter_idx FILTER_IDX. For example, --filter_name reproduce --filter_hp mi10.0_gc100.0_dc100.0_gr100.0 --filter_idx 19 loads the filter saved as result/reproduce/mi10.0_gc100.0_dc100.0_gr100.0/weights.19.pth; or

(2) pass --filter_path FILTER_PATH to give the absolute path of the filter explicitly.
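The mapping from the three flags in option (1) to a weights file can be sketched as follows. This is an illustration inferred from the single example above, not the repository's own code: the function names are hypothetical, and the assumption that FILTER_HP is built from the --mi, --gc, --dc and --gr values in that order is a guess based on the example string.

```python
from pathlib import Path


def hp_string(mi, gc, dc, gr):
    """Build a hyper-parameter tag such as 'mi10.0_gc100.0_dc100.0_gr100.0'.

    Assumption: the tag concatenates the four loss weights in this order,
    as the example above suggests.
    """
    return f"mi{mi}_gc{gc}_dc{dc}_gr{gr}"


def filter_weights_path(filter_name, filter_hp, filter_idx, result_dir="result"):
    """Return result/FILTER_NAME/FILTER_HP/weights.FILTER_IDX.pth."""
    return Path(result_dir) / filter_name / filter_hp / f"weights.{filter_idx}.pth"


if __name__ == "__main__":
    path = filter_weights_path("reproduce", hp_string(10.0, 100.0, 100.0, 100.0), 19)
    print(path.as_posix())
    # result/reproduce/mi10.0_gc100.0_dc100.0_gr100.0/weights.19.pth
```

Calling path.resolve() on the result gives an absolute path suitable for --filter_path.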

Colored MNIST dataset

To reproduce the results of the adversarial filter at the extreme bias point, run:

python eval.py --experiment CMNIST_downstream_our --name NAME --biased_var 0 \
    --filter_train_mode universal

To reproduce the results of the baseline model at the extreme bias point, run:

python eval.py --experiment CMNIST_downstream_baseline --name NAME --biased_var 0

CelebA dataset

To reproduce the results of the adversarial filter at the extreme bias point, run:

python eval.py --experiment CelebA_downstream_our --name NAME --CelebA_train_mode CelebA_train_ex \
    --attributes Blond_Hair --CelebA_test_mode unbiased_ex --filter_train_mode universal
python eval.py --experiment CelebA_downstream_our --name NAME --CelebA_train_mode CelebA_train_ex \
    --attributes Blond_Hair --CelebA_test_mode conflict_ex --filter_train_mode universal

To reproduce the results of the baseline model at the extreme bias point, run:

python eval.py --experiment CelebA_downstream_baseline --name NAME \
    --CelebA_test_mode unbiased_ex --attributes Blond_Hair
python eval.py --experiment CelebA_downstream_baseline --name NAME \
    --CelebA_test_mode conflict_ex --attributes Blond_Hair

Adult Income dataset

To reproduce the results of the adversarial filter at the extreme bias point, run:

python eval.py --experiment Adult_downstream_our --name NAME \
    --Adult_train_mode eb1_balanced --Adult_test_mode eb2_balanced --filter_train_mode universal
python eval.py --experiment Adult_downstream_our --name NAME \
    --Adult_train_mode eb1_balanced --Adult_test_mode balanced --filter_train_mode universal

To reproduce the results of the baseline model at the extreme bias point, run:

python eval.py --experiment Adult_downstream_baseline --name NAME \
    --Adult_train_mode eb1_balanced --Adult_test_mode eb2_balanced
python eval.py --experiment Adult_downstream_baseline --name NAME \
    --Adult_train_mode eb1_balanced --Adult_test_mode balanced

IMDB Face dataset

To reproduce the results of the adversarial filter at the extreme bias point, run:

python eval.py --experiment IMDB_downstream_our --name NAME \
    --IMDB_train_mode eb1 --IMDB_test_mode eb2 --filter_train_mode universal
python eval.py --experiment IMDB_downstream_our --name NAME \
    --IMDB_train_mode eb1 --IMDB_test_mode unbiased --filter_train_mode universal
python eval.py --experiment IMDB_downstream_our --name NAME \
    --IMDB_train_mode eb2 --IMDB_test_mode eb1 --filter_train_mode universal
python eval.py --experiment IMDB_downstream_our --name NAME \
    --IMDB_train_mode eb2 --IMDB_test_mode unbiased --filter_train_mode universal

To reproduce the results of the baseline model at the extreme bias point, run:

python eval.py --experiment IMDB_downstream_baseline --name NAME \
    --IMDB_train_mode eb1 --IMDB_test_mode eb2
python eval.py --experiment IMDB_downstream_baseline --name NAME \
    --IMDB_train_mode eb1 --IMDB_test_mode unbiased
python eval.py --experiment IMDB_downstream_baseline --name NAME \
    --IMDB_train_mode eb2 --IMDB_test_mode eb1
python eval.py --experiment IMDB_downstream_baseline --name NAME \
    --IMDB_train_mode eb2 --IMDB_test_mode unbiased
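The IMDB commands above differ only in the train/test mode pair and in whether the filter is used, so they can be generated rather than typed. The sketch below is a convenience wrapper and not part of the repository: the function name imdb_eval_commands is hypothetical, and the flags are exactly those shown in the commands above.

```python
import shlex

# (train_mode, test_mode) pairs used in the IMDB Face commands above.
IMDB_PAIRS = [("eb1", "eb2"), ("eb1", "unbiased"), ("eb2", "eb1"), ("eb2", "unbiased")]


def imdb_eval_commands(name, ours=True):
    """Return one argv list per (train_mode, test_mode) pair."""
    experiment = "IMDB_downstream_our" if ours else "IMDB_downstream_baseline"
    commands = []
    for train_mode, test_mode in IMDB_PAIRS:
        argv = ["python", "eval.py", "--experiment", experiment, "--name", name,
                "--IMDB_train_mode", train_mode, "--IMDB_test_mode", test_mode]
        if ours:
            # Only the filtered runs pass the filter's training mode.
            argv += ["--filter_train_mode", "universal"]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    # Print the commands; pass each argv to subprocess.run to execute them.
    for argv in imdb_eval_commands("NAME"):
        print(shlex.join(argv))
```

Building argument lists instead of shell strings avoids quoting and line-continuation mistakes.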
