
Adversarial Attacks on Neural Networks: A Survey

adversarial_perturbations.pdf

This repository contains implementations of various optimization methods for generating adversarial examples against convolutional neural networks. It accompanies the research paper "Adversarial Attacks on Neural Networks: A Survey."

Below are some example visualizations from the paper, illustrating the effects of different adversarial attacks.

Figure 1: Untargeted vs. Targeted Attack Example (paper Figure 1). On ResNet-18, the original image is correctly classified as "saluki"; an untargeted attack changes the prediction to "beagle", while a targeted attack forces the classification to "gorilla".

Untargeted vs Targeted Attack Example

Figure 2: Qualitative Comparison of Untargeted Attacks (paper Figure 2). Untargeted attacks on ResNet-18 generated by the different methods for several input images: the originals are correctly classified, and every attack method induces misclassification to various incorrect labels.

Qualitative Comparison of Untargeted Attacks

Figure 3: Grad-CAM Visualization of Model Attention (paper Figure 6.1 / Figure 3 in the PDF). Comparison of adversarial attacks and their effect on model attention using Grad-CAM, illustrating how iterative methods successfully redirect the model's attention in targeted attacks, while single-iteration methods struggle.

Grad-CAM Visualization

Overview

Adversarial examples are inputs modified by carefully crafted perturbations that cause neural networks to misclassify them while remaining visually imperceptible to humans. This project and the accompanying paper provide a systematic survey and empirical comparison of six foundational adversarial attack strategies:

  1. Fast Gradient Sign Method (FGSM)
  2. Fast FGSM (FFGSM)
  3. DeepFool
  4. Carlini & Wagner (C&W)
  5. Projected Gradient Descent (PGD)
  6. Conjugate Gradient (CG)
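
As a quick illustration of the core idea these methods share, the sketch below implements the single-step FGSM update x_adv = x + epsilon * sign(grad_x L(x, y)) in PyTorch. It is a minimal standalone example, not the repository's implementation in src/attacks/fgsm.py; the model, image batch, and labels are placeholders.

import torch
import torch.nn.functional as F

def fgsm_perturb(model, images, labels, epsilon=8/255):
    """Single-step untargeted FGSM: move each pixel by epsilon in the
    direction of the sign of the loss gradient (an L_inf perturbation)."""
    images = images.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    adv = images + epsilon * images.grad.sign()
    # Clamp back to the valid pixel range.
    return adv.clamp(0, 1).detach()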

Recommended Attack Methods

Based on the findings in our survey, the following attack methods are highlighted:

  1. C&W (Carlini & Wagner): Consistently achieves high success rates with minimal perturbations, particularly for targeted attacks, but incurs high computational overhead.

    # Example for a strong targeted L2 attack (parameters from paper/tests):
    # Assumes test.JPEG exists at the specified path
    python demo.py -i path/to/your/test.JPEG -a cw -n L2 -t -tm least-likely -cv 10 -k 5 -s 500 -lr 0.01 -o results/demo_cw_targeted
    # Example for a default untargeted L2 attack:
    # python demo.py -i path/to/your/test.JPEG -a cw -n L2 -cv 1 -k 0 -s 1000 -lr 0.01 -o results/demo_cw_untargeted
  2. PGD (Projected Gradient Descent): Offers a strong balance between attack effectiveness and computational cost, and is effective for both untargeted and targeted attacks under various L_p norms (a minimal PGD loop is sketched after this list).

    # Example for Linf Untargeted (epsilon=8/255, 40 steps, step_size=eps/4):
    python demo.py -i path/to/your/test.JPEG -a pgd -n Linf -e 0.03137 -s 40 -ss 0.00784 -o results/demo_pgd_linf_untargeted
    # Example for Linf Targeted (epsilon=16/255, 200 steps, step_size=eps/10):
    # python demo.py -i path/to/your/test.JPEG -a pgd -n Linf -t -tm least-likely -e 0.06274 -s 200 -ss 0.00627 -o results/demo_pgd_linf_targeted
  3. DeepFool: Particularly effective at generating untargeted attacks with very small L2 perturbations, though computationally more intensive than FGSM or PGD; DeepFool is inherently an untargeted method.

    # Example (L2 norm is implicit for DeepFool):
    python demo.py -i path/to/your/test.JPEG -a deepfool -s 50 -os 0.02 -o results/demo_deepfool
  4. CG (Conjugate Gradient): Can be more efficient than PGD on certain complex loss landscapes by utilizing approximate second-order information, offering a balance between cost and potency.

    # Example for Linf Untargeted (epsilon=8/255, 40 steps):
    python demo.py -i path/to/your/test.JPEG -a cg -n Linf -e 0.03137 -s 40 -al 0.00784 -o results/demo_cg_linf_untargeted
    # Example for Linf Targeted (epsilon=16/255, 60 steps from paper):
    # python demo.py -i path/to/your/test.JPEG -a cg -n Linf -t -tm least-likely -e 0.06274 -s 60 -al 0.00627 -o results/demo_cg_linf_targeted
  5. FGSM/FFGSM: The fastest methods, suitable for scenarios requiring rapid generation (e.g., adversarial training), but less effective for targeted attacks and against robust models. FFGSM adds a small random initialization step, which can improve on plain FGSM.

    # Example FGSM (Untargeted, Linf, epsilon=4/255):
    python demo.py -i path/to/your/test.JPEG -a fgsm -n Linf -e 0.01568 -o results/demo_fgsm_linf_untargeted
    # Example FFGSM (Untargeted, Linf, epsilon=8/255, alpha=0.1*epsilon):
    python demo.py -i path/to/your/test.JPEG -a ffgsm -n Linf -e 0.03137 -al 0.00313 -o results/demo_ffgsm_linf_untargeted
    # Example FFGSM (Targeted, Linf, epsilon=32/255, alpha=0.02 from paper):
    # python demo.py -i path/to/your/test.JPEG -a ffgsm -n Linf -t -tm least-likely -e 0.12549 -al 0.02 -o results/demo_ffgsm_linf_targeted
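
For reference, the iterative L_inf-bounded update at the heart of PGD (item 2 above) can be written as the minimal loop below. This is an illustrative sketch rather than the code in src/attacks/pgd.py, and the model, inputs, and hyperparameters are placeholders.

import torch
import torch.nn.functional as F

def pgd_linf(model, images, labels, epsilon=8/255, step_size=2/255, steps=40):
    """Minimal untargeted L_inf PGD: repeated signed-gradient ascent steps,
    each followed by projection back into the epsilon-ball around the input."""
    originals = images.clone().detach()
    adv = images.clone().detach()
    for _ in range(steps):
        adv.requires_grad_(True)
        loss = F.cross_entropy(model(adv), labels)
        grad = torch.autograd.grad(loss, adv)[0]
        with torch.no_grad():
            adv = adv + step_size * grad.sign()
            # Project onto the L_inf ball, then onto the valid pixel range.
            adv = originals + (adv - originals).clamp(-epsilon, epsilon)
            adv = adv.clamp(0, 1)
    return adv.detach()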

Repository Structure

adversarial-attacks/
├── data/                   # Directory for dataset storage
├── results/                # Experimental results
├── src/                    # Source code
│   ├── attacks/            # Attack implementations
│   │   ├── base.py         # Base attack class
│   │   ├── fgsm.py         # Fast Gradient Sign Method
│   │   ├── ffgsm.py        # Fast FGSM
│   │   ├── deepfool.py     # DeepFool
│   │   ├── cw.py           # Carlini & Wagner
│   │   ├── pgd.py          # Projected Gradient Descent
│   │   └── cg.py           # Conjugate Gradient Method
│   ├── models/             # Model wrappers
│   ├── plot/               # Visualization tools
│   └── utils/              # Utility functions
│       ├── data.py         # Data loading utilities
│       ├── evaluation.py   # Evaluation metrics
│       ├── metrics.py      # Performance metrics
│       └── projections.py  # Projection operations
├── requirements.txt        # Dependencies
└── README.md               # This file

Installation

# Clone the repository
git clone https://github.com/ali-izhar/adversarial-attacks.git
cd adversarial-attacks

# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

Usage

Running Experiments

# Compare optimization methods
python experiments/compare_optimizers.py

Implementing Your Own Attacks

You can extend the base attack class to implement your own optimization methods:

from src.attacks.base import BaseAttack

class MyAttack(BaseAttack):
    def __init__(self, model, **kwargs):
        super().__init__(model, **kwargs)
        
    def generate(self, images, labels):
        # Implement your attack here
        pass
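
For example, a toy one-step signed-gradient attack could be written against this interface as follows. The sketch assumes only what the template above shows, plus that the base class stores the wrapped model as self.model; adjust attribute names to match the actual BaseAttack implementation.

import torch
import torch.nn.functional as F

from src.attacks.base import BaseAttack

class SignedGradientAttack(BaseAttack):
    """Toy one-step attack: perturb inputs by epsilon * sign(gradient)."""

    def __init__(self, model, epsilon=8/255, **kwargs):
        super().__init__(model, **kwargs)
        self.epsilon = epsilon

    def generate(self, images, labels):
        images = images.clone().detach().requires_grad_(True)
        # Assumes the base class exposes the wrapped model as self.model.
        loss = F.cross_entropy(self.model(images), labels)
        grad = torch.autograd.grad(loss, images)[0]
        adv = images + self.epsilon * grad.sign()
        return adv.clamp(0, 1).detach()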

Evaluation Metrics

We evaluate each optimization method using criteria discussed in the paper:

  • Attack Effectiveness: Percentage of inputs successfully misclassified (Success Rate).
  • Perturbation Efficiency: Measured by $L_2$ norm, $L_\infty$ norm, and Structural Similarity Index (SSIM); see the sketch after this list.
  • Computational Efficiency: Assessed via average iterations to convergence, total gradient computations, and wall-clock runtime per successful attack.
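
As an illustration of how the perturbation metrics can be computed for a single image pair, consider the sketch below. It is not the code in src/utils/metrics.py; it assumes images are float arrays in [0, 1] with shape (H, W, C) and scikit-image >= 0.19 for the channel_axis argument.

import numpy as np
from skimage.metrics import structural_similarity

def perturbation_metrics(original, adversarial):
    """L2 norm, L_inf norm, and SSIM between an original image and its
    adversarial counterpart (float arrays in [0, 1], shape H x W x C)."""
    delta = adversarial - original
    return {
        "l2": float(np.linalg.norm(delta.ravel())),
        "linf": float(np.abs(delta).max()),
        # channel_axis=-1 treats the last axis as color channels.
        "ssim": float(structural_similarity(original, adversarial,
                                            data_range=1.0, channel_axis=-1)),
    }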

Citation

If you use this code or refer to the findings in your research, please cite our paper:

@article{ali2025survey,
  title={Adversarial Attacks on Neural Networks: A Survey},
  author={Ali, Izhar},
  journal={arXiv preprint arXiv:XXXX.XXXXX},
  year={2025}
}

License

This project is licensed under the MIT License - see the LICENSE file for details.
