This project is a replication and modification of experiments from the paper "Black Box Adversarial Prompting for Foundation Models". It focuses on exploring adversarial prompting in foundation models using Google Colab for its GPU capabilities, as the experiments require significant computational resources.
- Create a Hugging Face account and generate an access token.
- Obtain a W&B access token.
Use Google Colab to leverage GPU support for running experiments. Check if GPU is available with:
import torch

# Check if GPU is available
if torch.cuda.is_available():
    gpu_name = torch.cuda.get_device_name(0)
    print(f"GPU: {gpu_name}")
else:
    print("CPU is being used.")
Install required dependencies:
!pip install transformers torch nltk pandas wandb gpytorch botorch diffusers torchvision
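After installation, a quick check can confirm that each package is importable before a long run is launched. The sketch below uses only the standard library. The helper name `find_missing` is illustrative and not part of the repository, and it assumes that each pip package name matches its import name, which holds for the packages listed above.

```python
from importlib.util import find_spec

# Packages from the pip command above; for these, the pip name matches the import name.
REQUIRED = [
    "transformers", "torch", "nltk", "pandas", "wandb",
    "gpytorch", "botorch", "diffusers", "torchvision",
]

def find_missing(packages):
    """Return the packages that cannot be located on the import path."""
    return [name for name in packages if find_spec(name) is None]

missing = find_missing(REQUIRED)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

`find_spec` only locates a package without importing it, so the check stays fast even for heavy libraries such as torch.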
Run the following commands for text-to-text generation. They use smaller models that suit systems with limited RAM:
!time python3 run_text_exp.py --loss_type perplexity --seed 0 --language_model facebook/opt-350m --embedding_model tinybert --seed_text "Explain list comprehension in Python."
!time python3 run_text_exp.py --loss_type perplexity --seed 0 --language_model facebook/opt-125m --embedding_model tinybert --seed_text "Explain list comprehension in Python."
To run the adversarial prompt, prepend tokens taken from the output of the previous command to the seed text. The tokens shown here (usc consumer hen finals) come from one such run:
!time python3 run_text_exp.py --loss_type perplexity --seed 0 --language_model facebook/opt-350m --embedding_model tinybert --seed_text "usc consumer hen finals Explain list comprehension in Python."
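The adversarial run differs from the baseline only in its --seed_text value: tokens from the earlier run's output are placed in front of the original prompt. A minimal sketch of assembling that command follows. `build_text_command` is a hypothetical helper, not part of the repository; the flags are the ones shown in the commands above.

```python
import shlex

def build_text_command(seed_text, prefix_tokens=(), language_model="facebook/opt-350m"):
    """Assemble the run_text_exp.py command line, optionally prepending tokens."""
    # Join the prepended tokens and the original prompt with single spaces.
    full_text = " ".join([*prefix_tokens, seed_text])
    args = [
        "python3", "run_text_exp.py",
        "--loss_type", "perplexity",
        "--seed", "0",
        "--language_model", language_model,
        "--embedding_model", "tinybert",
        "--seed_text", full_text,
    ]
    # shlex.join quotes the seed text so the shell passes it as one argument.
    return shlex.join(args)

prompt = "Explain list comprehension in Python."
print(build_text_command(prompt, ["usc", "consumer", "hen", "finals"]))
```

Calling the helper without `prefix_tokens` yields the baseline command, which makes it easy to compare the two runs side by side.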
For text-to-image generation, adjust the query budget (--max_n_calls and --max_allowed_calls_without_progress) to fit the available hardware. The optimal class (e.g., 'bus') can be changed to suit specific needs or to match a class used in the paper:
!time python3 image_optimization.py --optimal_class bus --max_allowed_calls_without_progress 1000 --max_n_calls 5000 --seed 0
!time python3 image_optimization.py --optimal_class bus --max_allowed_calls_without_progress 1000 --max_n_calls 5000 --seed 0 --exclude_high_similarity_tokens True
!time python3 image_optimization.py --optimal_class bus --max_allowed_calls_without_progress 3000 --max_n_calls 10000 --seed 0 --exclude_high_similarity_tokens True --prepend_task True --prepend_task_version 1
To use the Square Attack optimization method, add --square_attack True to the command.
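The image commands above share a base invocation and differ only in their optional flags, Square Attack included. As a sketch, the variants could be generated from one function. `build_image_command` and its keyword names are illustrative and not part of the repository; the flag names are those used in this README.

```python
import shlex

def build_image_command(optimal_class="bus", max_n_calls=5000,
                        max_calls_without_progress=1000, seed=0, **flags):
    """Assemble an image_optimization.py command; extra flags become '--name value'."""
    args = [
        "python3", "image_optimization.py",
        "--optimal_class", optimal_class,
        "--max_allowed_calls_without_progress", str(max_calls_without_progress),
        "--max_n_calls", str(max_n_calls),
        "--seed", str(seed),
    ]
    # Append any optional flags, e.g. square_attack=True -> "--square_attack True".
    for name, value in flags.items():
        args += [f"--{name}", str(value)]
    return shlex.join(args)

# Baseline run, then the same run with Square Attack enabled.
print(build_image_command())
print(build_image_command(square_attack=True))
```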
Remove PerplexityWithSeedLoss from run_text_exp.py. On machines without an NVIDIA (CUDA) GPU, replace .cuda() with .cpu() in the files that call it.
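Finding every .cuda() call by hand is error-prone, so the replacement can be scripted. The sketch below is one possible approach and is not shipped with the repository: `replace_cuda_calls` is a hypothetical helper that rewrites the literal text `.cuda()` to `.cpu()` in Python files under a directory. It defaults to a dry run that only reports matches, and it does not touch calls that pass arguments, such as `.cuda(0)`.

```python
from pathlib import Path

def replace_cuda_calls(root, dry_run=True):
    """Replace literal '.cuda()' with '.cpu()' in *.py files under root.

    Returns a dict mapping each affected file path to its match count.
    With dry_run=True, files are only inspected, not modified.
    """
    changed = {}
    for path in sorted(Path(root).rglob("*.py")):
        source = path.read_text(encoding="utf-8")
        count = source.count(".cuda()")
        if count == 0:
            continue
        changed[str(path)] = count
        if not dry_run:
            path.write_text(source.replace(".cuda()", ".cpu()"), encoding="utf-8")
    return changed
```

Calling `replace_cuda_calls(".")` from the repository root lists the files that would change; passing `dry_run=False` applies the edit, so it is worth reviewing the dry-run report first.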
Disable W&B logging with:
!wandb disabled
Methodology alterations may lead to different results from the original paper. This README provides a guide for replicating experiments under specific technical constraints and computational resources.