Applying noise to Split Neural Networks. Code for Titcombe, T., Hall, A. J., Papadopoulos, P., & Romanini, D. (2021). Practical Defences Against Model Inversion Attacks for Split Neural Networks. arXiv preprint arXiv:2104.05743. (link)
SplitNNs have been shown to be susceptible to, amongst other attacks, black-box model inversion. In this attack, an adversary trains an "inversion" model to turn intermediate data (the data sent between model parts) back into raw input data. The attack is particularly relevant when a computational server colludes with a data holder. Applying differential privacy directly to the model (differentially private stochastic gradient descent, the Abadi method) does not defend against this attack, as the output of a trained model part is deterministic and a decoder model can therefore still be trained.
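As a rough illustration of the attack (a minimal sketch, not the implementation in this repository; the layer sizes and auxiliary data are assumptions), the adversary queries the data holder's model part as a black box and trains a decoder on (intermediate, input) pairs:

```python
# Illustrative sketch of a black-box model inversion attack on a SplitNN.
import torch
import torch.nn as nn

client_part = nn.Sequential(  # stand-in for the data holder's model segment
    nn.Flatten(), nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 64)
)
inversion_model = nn.Sequential(  # attacker's decoder: activations -> image
    nn.Linear(64, 500), nn.ReLU(), nn.Linear(500, 784), nn.Sigmoid()
)

optimiser = torch.optim.Adam(inversion_model.parameters(), lr=1e-3)
aux_images = torch.rand(256, 1, 28, 28)  # attacker's auxiliary data (hypothetical)

for _ in range(10):
    with torch.no_grad():
        intermediate = client_part(aux_images)  # black-box queries
    reconstruction = inversion_model(intermediate)
    loss = nn.functional.mse_loss(reconstruction, aux_images.view(-1, 784))
    optimiser.zero_grad()
    loss.backward()
    optimiser.step()
```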
This project aims to protect SplitNNs from black-box model inversion attacks by adding noise to the data transferred between model parts. The idea is that the stochasticity of the intermediate data can stop a model from learning to invert it back into raw data. Additionally, we combine noise addition with NoPeekNN, in which the model learns to make the intermediate distribution as uncorrelated with the input data as possible. While NoPeekNN, unlike differential privacy, does not provide any guarantees on data leakage, we aim to demonstrate that it can provide some protection against a model inversion attack.
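As a minimal sketch of the noise defence (illustrative only; the real models live in the `dpsnn` package and the layer sizes here are assumptions), Laplacian noise is added to the intermediate tensor before it leaves the data holder's model segment. NoPeekNN additionally adds a distance-correlation term to the training loss, which is not shown here:

```python
# Illustrative sketch: a data-holder model segment that perturbs its output
# with Laplacian noise before sending it to the computational server.
import torch
import torch.nn as nn


class NoisySplitClient(nn.Module):
    """Data-holder segment of a SplitNN that adds noise to its output."""

    def __init__(self, noise_scale: float = 0.5) -> None:
        super().__init__()
        self.layers = nn.Sequential(
            nn.Flatten(), nn.Linear(784, 500), nn.ReLU(), nn.Linear(500, 64)
        )
        self.noise_scale = noise_scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        intermediate = self.layers(x)
        if self.noise_scale > 0:
            # Noise drawn from a Laplacian distribution with the given scale
            noise = torch.distributions.Laplace(0.0, self.noise_scale).sample(
                intermediate.shape
            )
            intermediate = intermediate + noise
        return intermediate
```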
Developed in Python 3.8, but similar minor versions should work. A conda environment, `dpsnn`, has been provided with all packages required to run the experiments, including the local source code (PyTorch is CPU-only; remove `cpuonly` to enable GPU computation).
- Run `conda env create -f environment.yml` to create the environment using the latest packages, OR
- Run `conda env create -f environment-lock.yml` to use fixed package versions (for reproducibility)
- Run `conda activate dpsnn` to activate the environment
To install the local source code only:
- Clone this repo
- In a terminal, navigate to the repo
- Run `pip install -e .`. This installs the local package, `dpsnn`.
Scripts to train a classifier and an attacker can be found in `scripts/`:
- `python scripts/train_model.py --noise_scale <noise_level> --nopeek_weight <weight>` to train a differentially private model using noise drawn from a Laplacian distribution with scale `<noise_level>` and NoPeek loss weighted by `<weight>`
- `python scripts/train_attacker.py --model <name>` to train an attacker on a trained model, `<name>`
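For example (with hypothetical values and checkpoint names), `python scripts/train_model.py --noise_scale 0.5 --nopeek_weight 0.1` would train a classifier with Laplacian noise of scale 0.5 and a NoPeek weight of 0.1, and `python scripts/train_attacker.py --model mnist_05noise_01nopeek_epoch=19` would train an attacker against the resulting checkpoint, assuming `<name>` is the saved classifier's stem.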
Classifiers are stored in `models/classifiers` and are named like `mnist_<noise>noise_<nopeek>nopeek_epoch=<X>.ckpt`, where `<noise>` is the scale of Laplacian noise added to the intermediate tensor during training, written as a decimal (`...05noise` means scale 0.5, `...10noise` means scale 1.0); `<nopeek>` is the weighting of the NoPeek loss in the loss function, using the same decimal scheme as the noise; and `<X>` is the training epoch at which the classifier performed best.
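For example, a hypothetical checkpoint named `mnist_05noise_10nopeek_epoch=19.ckpt` would be a classifier trained with Laplacian noise of scale 0.5 and a NoPeek weight of 1.0, whose best performance was recorded at epoch 19.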
Attack models are stored in `models/attackers` and are named like `mnist_attacker_model<<classifier>>_set<noise>noise.ckpt`, where `<classifier>` is the stem (everything but the `.ckpt` suffix) of the classifier being attacked, and `<noise>` refers to the scale of noise applied to the intermediate tensor after training. The `_set<noise>noise` part is omitted if the noise scale applied to the classifier is unchanged from the one it was trained with.
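For example, an attacker checkpoint that embeds the hypothetical classifier stem `mnist_05noise_00nopeek_epoch=19` and ends in `_set10noise.ckpt` would have been trained against that classifier with noise of scale 1.0 applied to the intermediate tensor at attack time, rather than the 0.5 used during training.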
To replicate all experiments presented in the paper, run `./main.sh <arg>`, where `<arg>` is one of:
- `noise` to train models with noise
- `nopeek` to train models with NoPeek
- `combo` to train models with both NoPeek and noise
- `plain` to train a model without defences
- `performance` to calculate the accuracy and distance correlation of each model in `models/classifiers/`
- `all` to run all experiments
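For example, `./main.sh noise` trains only the noise-defended models, while `./main.sh all` runs every experiment.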
We have provided relevant analysis in the `notebooks/` folder. Be aware that previous exploratory notebooks were removed; look over previous commits for a full history of experimentation.
The `data/` folder is intentionally left empty to preserve the project structure. This project uses the MNIST and EMNIST datasets. Each dataset will be downloaded to `data/` when first used by a script.
If you have a question about the paper, experiments, or results, or have noticed a bug in the code, please open an issue in this repository.
If you are contributing code, please follow these conventions:
- Use `black` to format code
- Use `isort` to format imports
- Add type hints
- Add docstrings to functions and classes
- Use `pytorch_lightning` to build PyTorch models
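As an illustration of these conventions (a hedged sketch, not code taken from this repository), a minimal `pytorch_lightning` module with isort-ordered imports, black-style formatting, type hints, and docstrings might look like:

```python
from typing import Tuple

import pytorch_lightning as pl
import torch
from torch import nn


class LitClassifier(pl.LightningModule):
    """A tiny image classifier written in the project's expected style."""

    def __init__(self, learning_rate: float = 1e-3) -> None:
        super().__init__()
        self.learning_rate = learning_rate
        self.model = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """Return class logits for a batch of images."""
        return self.model(x)

    def training_step(
        self, batch: Tuple[torch.Tensor, torch.Tensor], batch_idx: int
    ) -> torch.Tensor:
        """Compute the cross-entropy loss for one training batch."""
        images, targets = batch
        loss = nn.functional.cross_entropy(self(images), targets)
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self) -> torch.optim.Optimizer:
        """Use Adam with the configured learning rate."""
        return torch.optim.Adam(self.parameters(), lr=self.learning_rate)
```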
Titcombe, T., Hall, A. J., Papadopoulos, P., & Romanini, D. (2021). Practical Defences Against Model Inversion Attacks for Split Neural Networks. arXiv preprint arXiv:2104.05743. (link)
You can cite this work using:
```bibtex
@article{titcombe2021practical,
  title={Practical Defences Against Model Inversion Attacks for Split Neural Networks},
  author={Titcombe, Tom and Hall, Adam J and Papadopoulos, Pavlos and Romanini, Daniele},
  journal={arXiv preprint arXiv:2104.05743},
  year={2021}
}
```
Apache 2.0. See the full license.