This is the repository for "Adversarial Robustness via Runtime Masking and Cleansing" by Yi-Hsuan Wu, Chia-Hung Yuan, and Shan-Hung Wu, published in the Proceedings of ICML 2020. Our code is implemented in TensorFlow 2.0.
We devise a new defense method, called runtime masking and cleansing (RMC), to improve adversarial robustness. RMC adapts the network at runtime before making a prediction to dynamically mask network gradients and cleanse the model of the non-robust features inevitably learned during the training process due to the size limit of the training set.
The defense mechanism in RMC consists of the following steps (see the sketch after the list):
- Augment dataset with adversarial examples
- Find K-nearest neighbors (KNN) of test data from the augmented dataset
- Adapt the network with KNN
- Make predictions
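For intuition, here is a minimal sketch of this test-time adaptation loop in TensorFlow 2. The helper names (`rmc_predict`, `feature_extractor`, `aug_features`, etc.) are hypothetical; `evaluate.py` contains the actual implementation:

```python
# A rough sketch of RMC's adapt-then-predict loop (hypothetical helpers).
import numpy as np
import tensorflow as tf

def rmc_predict(model, feature_extractor, x_test,
                aug_images, aug_labels, aug_features,
                k=2048, epochs=100, lr=1e-5):
    # 1. Retrieve the K nearest neighbors of the test point in feature space.
    f = feature_extractor(x_test[None, ...]).numpy()          # shape (1, d)
    idx = np.argsort(np.linalg.norm(aug_features - f, axis=1))[:k]

    # 2. Adapt the network on the retrieved neighbors
    #    (early stopping omitted for brevity).
    optimizer = tf.keras.optimizers.Adam(learning_rate=lr)
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    ds = tf.data.Dataset.from_tensor_slices(
        (aug_images[idx], aug_labels[idx])).batch(64)
    for _ in range(epochs):
        for xb, yb in ds:
            with tf.GradientTape() as tape:
                loss = loss_fn(yb, model(xb, training=True))
            grads = tape.gradient(loss, model.trainable_variables)
            optimizer.apply_gradients(zip(grads, model.trainable_variables))

    # 3. Predict with the adapted weights.
    return tf.argmax(model(x_test[None, ...], training=False), axis=-1)
```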
For more details, please refer to our main paper, supplementary materials, video, or slides.
Clone the repository and install the requirements:

```bash
git clone https://github.com/nthu-datalab/Runtime-Masking-and-Cleansing.git
cd Runtime-Masking-and-Cleansing
pip install -r requirements.txt
```
RMC works well with any existing model architecture. The following command evaluates a pretrained ResNet-152v2 (downloaded from the TensorFlow website) on ImageNet:

```bash
python evaluate.py
```
Note that before running `evaluate.py`, we have to create the augmented dataset and the adversarial examples for evaluation, and extract features (hidden representations) from those data. All corresponding code can be found in the `/prepare` folder. For example, the following commands create the perturbed training dataset with the PGD (Projected Gradient Descent) attack:

```bash
cd prepare
python augment_dataset.py
```
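The core of this step is an L∞ PGD attack. Below is a minimal sketch of such an attack in TensorFlow 2, assuming images in the [0, 1] range; the helper name `pgd_attack` is ours, not the repository's:

```python
# A minimal L-infinity PGD sketch (hypothetical helper; see
# prepare/augment_dataset.py for the actual implementation).
import tensorflow as tf

def pgd_attack(model, x, y, epsilon=16/255, step_size=1/255, nb_iters=100):
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    x_adv = tf.identity(x)
    for _ in range(nb_iters):
        with tf.GradientTape() as tape:
            tape.watch(x_adv)
            loss = loss_fn(y, model(x_adv, training=False))
        grad = tape.gradient(loss, x_adv)
        # Ascend the loss, then project back into the epsilon-ball around x.
        x_adv = x_adv + step_size * tf.sign(grad)
        x_adv = tf.clip_by_value(x_adv, x - epsilon, x + epsilon)
        x_adv = tf.clip_by_value(x_adv, 0.0, 1.0)  # keep a valid pixel range
    return x_adv
```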
To visualize the test data and its corresponding nearest neighbors, please refer to `visualize.ipynb`.
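Roughly, the notebook plots each test image next to its retrieved neighbors, along these lines (variable names are hypothetical):

```python
# Show a test image next to a few of its nearest neighbors
# (hypothetical variables; visualize.ipynb contains the full version).
import matplotlib.pyplot as plt

def show_neighbors(test_image, neighbor_images, n=5):
    fig, axes = plt.subplots(1, n + 1, figsize=(2 * (n + 1), 2))
    axes[0].imshow(test_image)
    axes[0].set_title("test")
    for i in range(n):
        axes[i + 1].imshow(neighbor_images[i])
        axes[i + 1].set_title(f"NN {i + 1}")
    for ax in axes:
        ax.axis("off")
    plt.show()
```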
Before running any code, please set the following directory paths first:
- `BASE_DIR`: Path to the "Runtime-Masking-and-Cleansing" folder.
- `TRAIN_DATA_DIR`: Path to the training dataset.
- `TRAIN_LABEL_DIR`: Path to the labels of the training dataset.
- `AUG_DATA_DIR`: Path to the augmented dataset.
- `AUG_FEATURES`: Path to the features of the augmented dataset.
- `EVAL_DATA_DIR`: Path to the evaluation dataset.
- `EVAL_LABEL_DIR`: Path to the labels of the evaluation dataset.
- `EVAL_FEATURES`: Path to the features of the evaluation dataset.
- `ATTACK_DATA_DIR`: Path to the perturbed evaluation dataset.
- `ATTACK_LABEL_DIR`: Path to the target labels of the perturbed evaluation data. Only used when evaluating targeted attacks.
- `ATTACK_FEATURES`: Path to the features of the perturbed evaluation dataset.
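For reference, here is a minimal sketch of how these paths might be used when extracting features from the augmented dataset. The file layout and layer choice are assumptions; the scripts under `/prepare` define the actual logic:

```python
# Extract hidden representations from the augmented dataset and save them
# (hypothetical file layout; images are assumed to be already preprocessed).
import numpy as np
import tensorflow as tf

AUG_DATA_DIR = "/path/to/augmented_dataset.npy"
AUG_FEATURES = "/path/to/augmented_features.npy"

model = tf.keras.applications.ResNet152V2(weights="imagenet")
# Use the penultimate layer (global average pooling) as the feature extractor.
feature_extractor = tf.keras.Model(model.input, model.layers[-2].output)

images = np.load(AUG_DATA_DIR)
features = feature_extractor.predict(images, batch_size=64)
np.save(AUG_FEATURES, features)
```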
You can evaluate with different configurations by adjusting the following hyperparameters:
```python
K = 2048
EPOCHS = 100
EARLY_STOP = 5
LEARNING_RATE = 1e-5
BUFFER_SIZE = 10000
IMG_SIZE = 224
RESIZE_SIZE = 256
BATCH_SIZE = 64
IMG_SHAPE = (IMG_SIZE, IMG_SIZE, 3)
EPSILON = 16/255
EPS_ITERS = 1/255
NB_ITERS = 100
```
- `K`: Number of nearest neighbors (k) in k-NN.
- `EARLY_STOP`: Early-stopping criterion.
- `EPSILON`: Allowable perturbation when computing adversarial examples.
- `EPS_ITERS`: Step size used in the PGD attack.
- `NB_ITERS`: Number of iterations used in the PGD attack.
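`RESIZE_SIZE` and `IMG_SIZE` correspond to the usual ImageNet-style preprocessing (resize, then center-crop). A sketch, which may differ in detail from the repository's loaders:

```python
# ImageNet-style preprocessing implied by RESIZE_SIZE / IMG_SIZE (a sketch).
import tensorflow as tf

def preprocess(image, resize_size=256, img_size=224):
    image = tf.image.resize(image, (resize_size, resize_size))
    # Center-crop to the final input resolution.
    offset = (resize_size - img_size) // 2
    image = tf.image.crop_to_bounding_box(image, offset, offset,
                                          img_size, img_size)
    return tf.cast(image, tf.float32) / 255.0  # assumes uint8-valued input
```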
We use the MNIST, CIFAR-10, and ImageNet datasets in our paper. The first two can be downloaded through the TensorFlow API.
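For example, both can be loaded in a couple of lines:

```python
# Load MNIST and CIFAR-10 through the TensorFlow/Keras datasets API.
import tensorflow as tf

mnist_train, mnist_test = tf.keras.datasets.mnist.load_data()
cifar_train, cifar_test = tf.keras.datasets.cifar10.load_data()
```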
In the table below, the Clean Images column reports clean accuracy (%); entries under the PGD columns are Error Rate / Attack Success Rate (%).

| Model | Clean Images | 10-step PGD (8/255) | 10-step PGD (16/255) | 100-step PGD (16/255) |
|---|---|---|---|---|
| None | 72.9 | 8.5 / 54.69 | 5.2 / 61.7 | 0.6 / 98.1 |
| Adv. Trained | 62.3 | N/A | 52.5 / 5.5 | 41.7 / 31.0 |
| Denoising Block | 65.3 | N/A | 55.7 / 4.9 | 45.5 / 26.6 |
| DeepNN | 26.6 | 12.9 / 0.16 | 8.7 / 1.2 | 7.8 / 1.2 |
| WebNN | 27.8 | 18.8 / 0.54 | 15.2 / 0.3 | 13.9 / 0.3 |
| RMC | 73.6 | 62.4 / 0.28 | 55.9 / 1.6 | 55.6 / 1.3 |
If you find this code helpful for your research, please cite our ICML 2020 paper:
```bibtex
@inproceedings{wu2020adversarial,
  title={Adversarial Robustness via Runtime Masking and Cleansing},
  author={Wu, Yi-Hsuan and Yuan, Chia-Hung and Wu, Shan-Hung},
  booktitle={International Conference on Machine Learning},
  pages={10399--10409},
  year={2020},
  organization={PMLR}
}
```