A full-stack AI Red Teaming platform securing AI ecosystems via OpenClaw Security Scan, Agent Scan, Skills Scan, MCP scan, AI Infra scan and LLM jailbreak evaluation.
Updated May 29, 2026 - Python
The fastest Trust Layer for AI Agents
Leaderboard Comparing LLM Agent Security on System Prompt Leakage and Attack Probes
Open-source prompt injection detector: 5 layers, 91.7% F1, ~27 ms, offline, Apache 2.0
Mithra Scanner is an interactive API testing tool for prompt injection, refusal detection, and LLM security benchmarking. It supports YAML-based rule definitions, custom refusal lists, REST API integration, and provides detailed CLI output for security testing of language model endpoints.
LLM Penetration Testing Framework - Discover vulnerabilities in AI applications before attackers do. 100 attacks + AI-powered adaptive mode.
Lightweight AI security framework for prompt validation, output scanning, risk scoring, sensitive data detection, and Zero Trust policy enforcement.
CloakPrompt is a CLI tool that redacts secrets (passwords, API keys, credentials, etc.) before sending data to AI models.
Single-context metacognitive security framework for LLM prompt injection defense
🛡️ Enterprise-grade AI security framework protecting LLMs from prompt injection attacks using ML-powered detection
Industrial LLM agents, prompt safety, and orchestration
Lint your AI coding sessions. Define rules, check compliance, get verdicts.
Secure two-stage RAG orchestration blueprint with deterministic policy routing and fallback controls.
Semantic intent evaluation for agentic systems. Detects instruction-layer risks including authority laundering, consent bypass, concealed execution, and semantic attack patterns in SKILL.md and agent configuration files.
AI Prompt Security Detector - Detect unsolvable traps in prompts (self-reference paradox, undecidable problems, infinite recursion). Protect your LLM applications from prompt vulnerabilities.
SentinelShield: Advanced AI content moderation combining Llama Prompt Guard 2, rule-based filtering, and real-time analysis. Protect your applications from harmful content, prompt injection attacks, and inappropriate material with sub-second response times.
Prompt firewall with rule-based gate decisions for policy enforcement.
Self-hostable, OpenAI-compatible AI gateway with policy-driven PII and secret guardrails. Route sensitive prompts to a local model automatically.
Defensive prompt confidentiality audit for system prompt reconstruction through multi-turn probing
MalPromptSentinel (MPS) is a Claude Code skill that detects malicious prompts in uploaded files before Claude processes them. It provides two-tier scanning to identify prompt injection attacks, role manipulation attempts, privilege escalation, and other adversarial techniques.