
AgenticRed: Optimizing Agentic Systems for Automated Red-teaming


AgenticRed is an automated pipeline that leverages LLMs' in-context learning to iteratively design and refine red-teaming systems without human intervention, guided by the performance metrics of each candidate system.


Repository Structure

.
├── server/           # Scripts to start attacker, classifier, and defender model servers
├── search.sh         # Main script to run the search process
├── eval.sh           # Main script to run the evaluation process
└── README.md

Prerequisites

  • If you want to host your models locally, you need at least 3 GPUs (e.g., 3× NVIDIA L40) or equivalent capacity:
    • One for the attacker model server
    • One for the classifier / judge model server
    • One for the defender / target model server
  • If you prefer to call API endpoints instead, OpenRouter and OpenAI API clients are provided.
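Before starting local servers, it can help to confirm that enough GPUs are actually visible. A minimal sketch, assuming `nvidia-smi` is installed (it ships with the NVIDIA driver):

```shell
# Count visible GPUs; fall back to 0 if nvidia-smi is unavailable.
NGPU=$(nvidia-smi --list-gpus 2>/dev/null | wc -l)
if [ "$NGPU" -lt 3 ]; then
  echo "only $NGPU GPU(s) visible; consider the API-endpoint route instead"
fi
```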

1. Start Model Servers

You can run local servers (recommended for experiments) or point to external APIs.

1.1 Local servers

The server/ directory contains launch scripts such as:

cd server/

bash attacker_server.sh          # Starts attacker model server
bash classifier_server.sh        # Starts classifier / guardrail model server
bash defender_server.sh          # Starts defender / target model server

Document the actual ports and endpoints so the rest of the pipeline can reference them, e.g., http://127.0.0.1:8000/v1.
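Once a server is up, you can verify it answers before wiring it into the pipeline. A hedged sketch using `curl` against an OpenAI-compatible `/models` route; the port 8000 is an assumption and should match your server script:

```shell
# check_endpoint: return 0 if an OpenAI-compatible server answers at the given base URL
check_endpoint() {
  curl -sf --max-time 2 "$1/models" > /dev/null
}

BASE_URL="http://127.0.0.1:8000/v1"   # assumed port; match your server config
if check_endpoint "$BASE_URL"; then
  echo "server up at $BASE_URL"
else
  echo "no server at $BASE_URL"
fi
```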

1.2 Using external APIs (optional)

Instead of local servers, you can configure the system to call:

  • OpenAI APIs
  • OpenRouter APIs

Update your API keys:

export OPENAI_API_KEY=''
export GEMINI_API_KEY=''
export DEEPSEEK_API_KEY=''
export OPENROUTER_API_KEY=''
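A quick sanity check before launching runs can catch missing keys early. A minimal sketch (the key names are the ones listed above; which ones you actually need depends on the providers you call):

```shell
# missing_keys: print the names of API keys from the list above that are unset or empty
missing_keys() {
  for key in OPENAI_API_KEY GEMINI_API_KEY DEEPSEEK_API_KEY OPENROUTER_API_KEY; do
    # ${!key} is bash indirect expansion: the value of the variable named by $key
    [ -n "${!key:-}" ] || echo "$key"
  done
}

missing_keys
```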

2. Run the Search Process

The search step iteratively designs new red-teaming systems.

Typical usage:

cd _redteam
bash scripts/search.sh

e.g., bash search.sh --expr 1

Key arguments (adapt to your implementation):

  • --expr: Experiment index corresponding to different configurations.
  • --config: Path to a config file specifying:
    • Attacker, classifier, defender endpoints
    • Search hyperparameters (iterations, beam width, budgets, etc.)
    • Output directory for JSON logs: save_dir
    • Whether to enable OpenRouter
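The arguments above can be combined into a single invocation. A dry-run sketch; the `--config` flag and the `configs/search.json` path are assumptions, so point them at whatever config file your setup uses:

```shell
# Assemble the search command from the arguments listed above (dry run).
EXPR=1
CONFIG="configs/search.json"   # hypothetical path; use your own config file
CMD="bash scripts/search.sh --expr $EXPR --config $CONFIG"
echo "$CMD"                    # replace with: eval "$CMD" (run from inside _redteam/)
```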

3. Run the Evaluation Process

After search completes, evaluate the discovered attacks on a separate evaluation dataset and target model / benchmark.

cd _redteam
bash scripts/eval.sh

Key arguments (adapt to your implementation):

  • evaluator_model: Model(s) used for evaluation
  • benchmark: Currently only HarmBench and StrongREJECT are implemented.
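Since only two benchmarks are implemented, validating the choice up front avoids a wasted run. A sketch; the `--benchmark` flag name is an assumption, but the accepted values follow the list above:

```shell
# valid_benchmark: accept only the benchmarks the README says are implemented
valid_benchmark() {
  case "$1" in
    HarmBench|StrongREJECT) return 0 ;;
    *) return 1 ;;
  esac
}

BENCHMARK="HarmBench"
valid_benchmark "$BENCHMARK" || { echo "unsupported benchmark: $BENCHMARK" >&2; exit 1; }
echo "bash scripts/eval.sh --benchmark $BENCHMARK"
```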

Example Workflow

# 1. Start servers if you are hosting local servers
bash server/attacker_server.sh
bash server/defender_server.sh
bash server/classifier_server.sh

# 2. Run search
bash _redteam/search.sh --expr 1

# 3. Run evaluation
bash _redteam/eval.sh

Citation

@misc{yuan2026agenticredoptimizingagenticsystems,
      title={AgenticRed: Optimizing Agentic Systems for Automated Red-teaming}, 
      author={Jiayi Yuan and Jonathan Nöther and Natasha Jaques and Goran Radanović},
      year={2026},
      eprint={2601.13518},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2601.13518}, 
}
