Security Verifiers

A composable suite of security and alignment RL environments with executable, verifiable rewards. Built for Prime Intellect's Verifiers framework.

Vision

Security Verifiers demonstrates how executable rewards can advance both security and alignment research. Rather than relying on LLM-as-judge, our environments use real tools (OPA, Semgrep, test suites) to verify agent behavior, producing rewards that are:

Executable: Rewards come from running actual security tools
Calibrated: Agents are rewarded for well-calibrated confidence
Cost-aware: Asymmetric penalties reflect real operational costs (a missed malware sample costs far more than a false alarm)
Composable: Shared schemas and tools enable transfer across tasks
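
As a rough illustration of how the "calibrated" and "cost-aware" properties above could combine into one scalar reward, here is a minimal sketch. The function name, the label strings, and the cost weights (5.0 and 1.0) are assumptions made for this example, not the reward functions shipped in sv_shared.

```python
def calibrated_cost_aware_reward(
    label: str,
    predicted: str,
    confidence: float,
    fn_cost: float = 5.0,  # hypothetical penalty for a missed threat
    fp_cost: float = 1.0,  # hypothetical penalty for a false alarm
) -> float:
    """Brier-style calibration term minus an asymmetric error cost."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")

    correct = predicted == label
    outcome = 1.0 if correct else 0.0

    # Calibration: 1.0 when confidence matches the outcome, 0.0 when it is
    # maximally wrong (e.g. fully confident and incorrect).
    calibration = 1.0 - (confidence - outcome) ** 2

    if correct:
        cost = 0.0
    elif label == "malicious":
        cost = fn_cost  # false negative: the threat was missed
    else:
        cost = fp_cost  # false positive: benign traffic was flagged

    return calibration - cost
```

Under these assumed weights, a confident miss on a malicious sample scores far lower than an equally confident false alarm, which is the asymmetry the list describes.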

Environments

Environment	Type	Task	Status
E1: network-logs	SingleTurn	Anomaly detection with calibration & abstention	Production
E2: config-verification	ToolEnv	Security auditing with OPA/KubeLinter/Semgrep	Production
E3: code-vulnerability	ToolEnv	Vulnerability detection and repair	Alpha (Q1)
E4: phishing-detection	SingleTurn	Phishing classification with evidence	Alpha (Q1)
E5: redteam-attack	MultiTurn	Red team attack scenarios	Alpha (Q1)
E6: redteam-defense	MultiTurn	Red team defense scenarios	Alpha (Q1)
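
For a tool-backed environment such as E2, one plausible way to turn tool output into a reward is to treat the findings reported by the security tools as the oracle and score the agent's reported findings against them. The sketch below assumes findings are reduced to sets of rule-ID strings; the function name and the rule IDs in the usage lines are hypothetical, and the real E2 reward may be computed differently.

```python
def findings_f1(predicted: set[str], oracle: set[str]) -> float:
    """F1 score of the agent's findings against tool-derived oracle findings."""
    if not predicted and not oracle:
        return 1.0  # nothing to find, and nothing was reported

    true_positives = len(predicted & oracle)
    if true_positives == 0:
        return 0.0

    precision = true_positives / len(predicted)
    recall = true_positives / len(oracle)
    return 2 * precision * recall / (precision + recall)


# Hypothetical rule IDs, for illustration only.
oracle_findings = {"no-root-user", "missing-resource-limits"}
agent_findings = {"no-root-user", "privileged-container"}
score = findings_f1(agent_findings, oracle_findings)
```

Because the oracle comes from running the tools rather than from a judge model, the same inputs always produce the same score.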

Quick Start

# Setup
make setup && source .venv/bin/activate

# Configure API keys
cp .env.example .env  # Edit with your OPENAI_API_KEY
set -a && source .env && set +a

# Run your first evaluation
make eval-e1 MODELS="gpt-5-mini" N=10

See docs/getting-started.md for detailed setup instructions.

Evaluation

# E1: Network log anomaly detection
make eval-e1 MODELS="gpt-5-mini,gpt-4.1-mini" N=100

# E2: Configuration verification (multi-turn with tools)
make eval-e2 MODELS="gpt-5-mini" N=10 INCLUDE_TOOLS=true

# Generate metrics reports
make report-network-logs
make report-config-verification

Results are written to outputs/evals/<env>--<model>/<run_id>/.
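
If you post-process runs yourself, the path layout above can be split back into its parts. This is a small sketch that assumes the directory name contains exactly one `--` separator between environment and model; the helper name and the example path are illustrative, and model names containing a slash would need separate handling.

```python
from pathlib import PurePosixPath


def parse_run_dir(path: str) -> dict[str, str]:
    """Split outputs/evals/<env>--<model>/<run_id>/ into its components."""
    parts = PurePosixPath(path).parts
    if len(parts) < 2:
        raise ValueError(f"expected <env>--<model>/<run_id>, got: {path}")

    run_id = parts[-1]
    env, sep, model = parts[-2].partition("--")
    if not sep:
        raise ValueError(f"missing '--' separator in: {parts[-2]}")

    return {"env": env, "model": model, "run_id": run_id}


# Hypothetical run directory, for illustration only.
info = parse_run_dir("outputs/evals/sv-env-network-logs--gpt-5-mini/abc123/")
```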

Hub Deployment

Deploy environments to Prime Intellect's Environments Hub:

make hub-deploy E=network-logs
vf-eval your-org/sv-env-network-logs --model gpt-5-mini --num-examples 10

See docs/hub-deployment.md for the complete deployment guide.

Project Structure

security-verifiers/
├── environments/       # E1-E6 environment packages
├── sv_shared/          # Shared parsers, rewards, utilities
├── scripts/            # Evaluation and data building scripts
├── docs/               # Documentation
├── plans/              # Roadmap and productionization plans
└── outputs/            # Evaluation results

Documentation

Document	Description
Getting Started	Installation and first evaluation
Development Guide	Contributing, testing, CI
Hub Deployment	Deploy to Prime Intellect Hub
Prime Lab Integration	Hosted RL training and evaluation
Datasets Guide	Dataset access and management
Logging Guide	Weave tracing configuration
CLAUDE.md	Agent/LLM instructions

Baselines

Run quick baselines on the public mini sets:

make baseline-e1 MODEL="gpt-5-mini"
make baseline-e2 MODEL="gpt-5-mini" INCLUDE_TOOLS=true

Scoreboards are written to bench/scoreboards/.

Prime Lab Integration

Prime Lab integration infrastructure is complete in v0.3.0 (WP2.5/WP2.5a), with a hosted-first path and a validated fallback path:

# Check platform compatibility first
make lab-check

# Hosted training/eval (when lab compatibility + access are available)
make lab-run-e1 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team
make lab-run-e2 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team
make lab-eval-e1 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team
make lab-eval-e2 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team

# Fallback: hosted-style eval via prime env (keeps report/metadata parity)
make env-eval-e1 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team N=100
make env-eval-e2 MODEL=Qwen/Qwen3-4B-Instruct-2507 TEAM=your-team N=50

Replace your-team with your Prime Intellect team slug (from prime auth status).

See docs/PRIME-LAB-INTEGRATION.md for the full integration guide.

Roadmap

See plans/ROADMAP-Q1-2026.md for current development priorities:

Work Package	Description	Status
WP0	Benchmark integrity hardening	Complete
WP1	Metrics contracts and report generator	Complete
WP2	Baselines and public mini sets	Complete
WP2.5	Prime Lab integration (v0.3.0)	Complete
WP2.5a	Hosted-eval fallback parity	Complete
WP3a/WP3b	Hosted RL proof on E1 and E2	Next
WP3c	Reward-source comparator (executable vs LLM-judge)	Next
WP4	Multi-reward RL stability research	Planned
WP5	SV-Bench v0.1 release + technical report	Planned

Contributing

See CONTRIBUTING.md for contribution guidelines.

License

MIT License - see LICENSE for details.
