GitHub - GreyDGL/PentestGPT: Automated Penetration Testing Agentic Framework Powered by Large Language Models

PentestGPT

AI-Powered Autonomous Penetration Testing Agent
Published at USENIX Security 2024

Official Website: pentestgpt.com

Research Paper · Report Bug · Request Feature

Demo

Installation

Watch on YouTube

PentestGPT in Action

Watch on YouTube

What's New in v1.0 (Agentic Upgrade)

Autonomous Agent - Agentic pipeline for intelligent, autonomous penetration testing
Session Persistence - Save and resume penetration testing sessions
Docker-First - Isolated, reproducible environment with security tools pre-installed

In Progress: Multi-model support for OpenAI, Gemini, and other LLM providers

Features

AI-Powered Challenge Solver - Uses the advanced reasoning of LLMs to carry out penetration tests and solve CTF challenges
Live Walkthrough - Tracks steps in real-time as the agent works through challenges
Multi-Category Support - Web, Crypto, Reversing, Forensics, PWN, Privilege Escalation
Real-Time Feedback - Watch the AI work with live activity updates
Extensible Architecture - Clean, modular design ready for future enhancements

Quick Start

Prerequisites

Docker (required) - Install Docker
LLM Provider (choose one):
- Anthropic API Key from console.anthropic.com
- Claude OAuth Login (requires Claude subscription)
- OpenRouter for alternative models at openrouter.ai
- Tutorial: Using Local Models with Claude Code

Installation

# Clone and build
git clone --recurse-submodules https://github.com/GreyDGL/PentestGPT.git
cd PentestGPT
make install

# Configure authentication (first time only)
make config

# Connect to container
make connect

Note: The --recurse-submodules flag downloads the benchmark suite. If you already cloned without it, run: git submodule update --init --recursive

Try a Benchmark

cd benchmark/standalone-xbow-benchmark-runner
python3 run_benchmarks.py --range 1-1 --pattern-flag

See Benchmark Documentation for detailed usage.

Commands Reference

| Command | Description |
| --- | --- |
| `make install` | Build the Docker image |
| `make config` | Configure API key (first-time setup) |
| `make connect` | Connect to container (main entry point) |
| `make stop` | Stop container (config persists) |
| `make clean-docker` | Remove everything including config |

Usage

# Interactive TUI mode (default)
pentestgpt --target 10.10.11.234

# Non-interactive mode
pentestgpt --target 10.10.11.100 --non-interactive

# With challenge context
pentestgpt --target 10.10.11.50 --instruction "WordPress site, focus on plugin vulnerabilities"

Keyboard Shortcuts: F1 Help | Ctrl+P Pause/Resume | Ctrl+Q Quit
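
If you want to script several runs, the flags shown above can be assembled programmatically. The following is a minimal sketch, not part of PentestGPT itself: `build_command` is a hypothetical helper, and combining `--non-interactive` with `--instruction` in one invocation is an assumption based on the examples above.

```python
import shlex
from typing import List, Optional


def build_command(target: str, instruction: Optional[str] = None,
                  interactive: bool = True) -> List[str]:
    """Assemble a pentestgpt argv list using only the flags shown above."""
    argv = ["pentestgpt", "--target", target]
    if not interactive:
        argv.append("--non-interactive")
    if instruction:
        argv += ["--instruction", instruction]
    return argv


# Hypothetical job list that reuses the example addresses from this section.
jobs = [
    ("10.10.11.100", None),
    ("10.10.11.50", "WordPress site, focus on plugin vulnerabilities"),
]

for target, instruction in jobs:
    argv = build_command(target, instruction, interactive=False)
    # Print the shell-quoted command; inside the container you could instead
    # pass argv to subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Building an argument list rather than a single string avoids quoting problems when the instruction text contains spaces or commas.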

Using Local LLMs

PentestGPT supports routing requests to local LLM servers (LM Studio, Ollama, text-generation-webui, etc.) running on your host machine.

Prerequisites

Local LLM server with an OpenAI-compatible API endpoint
- LM Studio: Enable server mode (default port 1234)
- Ollama: Run ollama serve (default port 11434)

Setup

# Configure PentestGPT for local LLM
make config
# Select option 4: Local LLM

# Start your local LLM server on the host machine
# Then connect to the container
make connect

Customizing Models

Edit scripts/ccr-config-template.json to customize:

localLLM.api_base_url: Your LLM server URL (default: host.docker.internal:1234)
localLLM.models: Available model names on your server
Router section: Which models handle which operations

| Route | Purpose | Default Model |
| --- | --- | --- |
| `default` | General tasks | openai/gpt-oss-20b |
| `background` | Background operations | openai/gpt-oss-20b |
| `think` | Reasoning-heavy tasks | qwen/qwen3-coder-30b |
| `longContext` | Large context handling | qwen/qwen3-coder-30b |
| `webSearch` | Web search operations | openai/gpt-oss-20b |
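
Before rebuilding, it can help to check that every route points at a model your server actually offers. The sketch below works on a hypothetical, simplified stand-in for the template: only the names `localLLM.api_base_url`, `localLLM.models`, and the Router section come from this README, while the exact nesting and the plain model-name values are assumptions, so compare against the real scripts/ccr-config-template.json before relying on it.

```python
from typing import Dict

# Hypothetical, simplified config. The real template may nest these keys
# differently or encode route values in another format.
example_config = {
    "localLLM": {
        "api_base_url": "host.docker.internal:1234",
        "models": ["openai/gpt-oss-20b", "qwen/qwen3-coder-30b"],
    },
    "Router": {
        "default": "openai/gpt-oss-20b",
        "background": "openai/gpt-oss-20b",
        "think": "qwen/qwen3-coder-30b",
        "longContext": "qwen/qwen3-coder-30b",
        "webSearch": "openai/gpt-oss-20b",
    },
}


def unknown_route_models(config: dict) -> Dict[str, str]:
    """Return the routes whose model is not listed in localLLM.models."""
    available = set(config["localLLM"]["models"])
    return {
        route: model
        for route, model in config["Router"].items()
        if model not in available
    }


# An empty dict means every route refers to a listed model.
print(unknown_route_models(example_config))
```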

Troubleshooting

Connection refused: Ensure your LLM server is running and listening on the configured port
Docker networking: Use host.docker.internal (not localhost) to access host services from Docker
Check CCR logs: Inside the container, run cat /tmp/ccr.log
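
To tell a networking problem apart from a configuration problem, you can test whether the LLM server port is reachable at all. This is a minimal sketch using only the Python standard library; `can_connect` is a hypothetical helper, and the host and port in the usage note are the defaults mentioned above, which may differ in your setup.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name resolution failures.
        return False
```

From inside the container, `can_connect("host.docker.internal", 1234)` would check the LM Studio default port and `can_connect("host.docker.internal", 11434)` the Ollama default. A `False` result only shows that the TCP connection failed; it does not say whether the cause is the server, the port, or Docker networking.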

Telemetry

PentestGPT collects anonymous usage data to help improve the tool. This data is sent to our Langfuse project and includes:

Session metadata (target type, duration, completion status)
Tool execution patterns (which tools are used, not the actual commands)
Flag detection events (that a flag was found, not the flag content)

No sensitive data is collected: command outputs, credentials, and actual flag values are never transmitted.

Opting Out

# Via command line flag
pentestgpt --target 10.10.11.234 --no-telemetry

# Via environment variable
export LANGFUSE_ENABLED=false
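
When PentestGPT is launched from a script rather than an interactive shell, the same opt-out can be applied to the child process environment. The sketch below assumes only what is stated above, namely that `LANGFUSE_ENABLED=false` disables telemetry; `env_without_telemetry` is a hypothetical helper name.

```python
import os
from typing import Dict, Mapping, Optional


def env_without_telemetry(base: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment (os.environ by default) and set LANGFUSE_ENABLED=false."""
    env = dict(os.environ if base is None else base)
    env["LANGFUSE_ENABLED"] = "false"
    return env
```

The returned dictionary can be passed as `env=` to `subprocess.run`, which leaves the parent process environment untouched.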

Benchmarks

PentestGPT includes 104 XBOW validation benchmarks for comprehensive testing and evaluation.

cd benchmark/standalone-xbow-benchmark-runner

python3 run_benchmarks.py --range 1-10 --pattern-flag   # Run benchmarks 1-10
python3 run_benchmarks.py --all --pattern-flag          # Run all 104 benchmarks
python3 run_benchmarks.py --retry-failed                # Retry failed benchmarks
python3 run_benchmarks.py --dry-run --range 1-5         # Preview without executing
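
Running all 104 benchmarks in one go can take a while, so you may prefer to split the suite into batches. The following sketch only generates `--range` commands of the form shown above; `batch_ranges` and `range_command` are hypothetical helpers, the batch size of 25 is arbitrary, and the ranges are assumed to be 1-based and inclusive, as the `--range 1-10` example suggests.

```python
from typing import List, Tuple

TOTAL_BENCHMARKS = 104  # figure stated in this README


def batch_ranges(total: int, batch_size: int) -> List[Tuple[int, int]]:
    """Split 1..total into inclusive (start, end) batches."""
    return [
        (start, min(start + batch_size - 1, total))
        for start in range(1, total + 1, batch_size)
    ]


def range_command(start: int, end: int) -> List[str]:
    """Build the runner command for one inclusive range."""
    return ["python3", "run_benchmarks.py", "--range", f"{start}-{end}", "--pattern-flag"]


for start, end in batch_ranges(TOTAL_BENCHMARKS, 25):
    print(" ".join(range_command(start, end)))
```

With a batch size of 25 this prints five commands, the last one covering benchmarks 101-104.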

Performance Highlights

PentestGPT achieved an 86.5% success rate (90/104 benchmarks) on the XBOW validation suite:

Cost: Average $1.11, Median $0.42 per successful benchmark
Time: Average 6.1 minutes, Median 3.3 minutes per successful benchmark
Success rates by difficulty:
- Level 1: 91.1%
- Level 2: 74.5%
- Level 3: 62.5%
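
As a quick sanity check, the headline figures can be reproduced from the numbers above. The totals computed here cover successful runs only, since this README does not state the cost or duration of failed runs; treat them as rough estimates rather than reported results.

```python
def success_rate(solved: int, total: int) -> float:
    """Percentage of benchmarks solved."""
    return solved / total * 100


SOLVED, TOTAL = 90, 104   # from the README
AVG_COST_USD = 1.11       # average per successful benchmark
AVG_MINUTES = 6.1         # average per successful benchmark

print(f"Success rate: {success_rate(SOLVED, TOTAL):.1f}%")                 # 86.5%
print(f"Spend on successful runs: ${SOLVED * AVG_COST_USD:.2f}")           # $99.90
print(f"Time on successful runs: {SOLVED * AVG_MINUTES / 60:.2f} hours")   # 9.15 hours
```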

For detailed benchmark results, analysis, and automated testing instructions, see the Benchmark Documentation.

Legacy Version

The previous multi-LLM version (v0.15) supporting OpenAI, Gemini, Deepseek, and Ollama is archived in legacy/:

cd legacy && pip install -e . && pentestgpt --reasoning gpt-4o

Citation

If you use PentestGPT in your research, please cite our paper:

@inproceedings{299699,
  author = {Gelei Deng and Yi Liu and Víctor Mayoral-Vilches and Peng Liu and Yuekang Li and Yuan Xu and Tianwei Zhang and Yang Liu and Martin Pinzger and Stefan Rass},
  title = {{PentestGPT}: Evaluating and Harnessing Large Language Models for Automated Penetration Testing},
  booktitle = {33rd USENIX Security Symposium (USENIX Security 24)},
  year = {2024},
  isbn = {978-1-939133-44-1},
  address = {Philadelphia, PA},
  pages = {847--864},
  url = {https://www.usenix.org/conference/usenixsecurity24/presentation/deng},
  publisher = {USENIX Association},
  month = aug
}

License

Distributed under the MIT License. See LICENSE.md for more information.

Disclaimer: This tool is for educational purposes and authorized security testing only. The authors do not condone any illegal use. Use at your own risk.

Acknowledgments

Research supported by Quantstamp and NTU Singapore
