πŸ›‘οΈ Automatically catch security vulnerabilities in AI-generated code

kocendavid/guardian

Guardian: AI Code Verification System

Stop the verification tax. Automatically catch vulnerabilities and anti-patterns in AI-generated code.

NEW: 🧠 Learning system that adapts to your project's patterns and reduces false positives over time.

Python 3.8+ License: MIT

The Problem

In 2026, AI-generated code produces 1.7x more problems than human-written code:

  • 📈 23.5% increase in incidents per pull request
  • ⚠️ 66% of developers report inaccurate code suggestions
  • 🐛 45% report longer debugging times
  • 🔐 Security pattern degradation and architectural violations

The "verification tax" (time spent proving AI wrong) often exceeds time saved by generation.

The Solution

Guardian uses specialized AI agents to automatically verify AI-generated code for:

  • 🔒 Security: SQL injection, command injection, secrets exposure, weak crypto, XXE, deserialization
  • 🏗️ Architecture: God classes, SOLID violations, circular dependencies
  • 🎯 Patterns: Code smells, deep nesting, code duplication
  • 📝 Maintainability: Long functions, too many parameters, magic numbers
  • 🧠 Learning: Adapts to your project's patterns, reduces false positives over time

  • Fast: 3-11ms verification time
  • Parallel: Multi-agent concurrent analysis
  • Actionable: Provides specific fixes, not just warnings
  • Adaptive: Learns from your feedback to improve accuracy

Quick Start

Installation

# Clone repository
git clone https://github.com/kocendavid/guardian
cd guardian

# Install (no dependencies required - uses Python stdlib)
pip install -e .

Basic Usage

# Verify a single file
python3 -m guardian.cli verify myfile.py

# Verify entire directory
python3 -m guardian.cli verify src/

# Verify only changed files (git)
python3 -m guardian.cli verify --changed

# CI/CD: Fail build on high+ severity
python3 -m guardian.cli verify src/ --fail-on high

# Save report to file
python3 -m guardian.cli verify src/ -o report.json --format json

# View verification history (learning system)
python3 -m guardian.cli history

# View learning metrics
python3 -m guardian.cli metrics

# Record feedback to improve accuracy
python3 -m guardian.cli feedback <finding-hash> fixed

Python API

from guardian import Guardian
import asyncio

async def verify_code():
    guardian = Guardian()

    # Verify single file
    report = await guardian.verify_file("myfile.py")

    # Verify directory
    report = await guardian.verify_directory("src/")

    # Check results
    print(f"Found {report.total_issues} issues")
    print(f"Critical: {len(report.critical_findings)}")

    # Get markdown report
    print(report.to_markdown())

    # Get JSON report
    print(report.to_json(pretty=True))

asyncio.run(verify_code())

Demo

Try Guardian on example vulnerable code:

python3 guardian_demo.py

Output:

======================================================================
Guardian: AI Code Verification System - Demo
======================================================================

Running Guardian verification...

Total files analyzed: 7
Total issues found: 20
Verification time: 11ms

Issues by severity:
  - CRITICAL: 9
  - HIGH: 6
  - MEDIUM: 3
  - LOW: 2

Learning System 🧠

Guardian learns from your feedback to continuously improve accuracy and reduce false positives.

How It Works

  1. Track History: Every verification is saved locally
  2. Provide Feedback: Mark findings as fixed/dismissed
  3. Learn Patterns: Guardian calculates accuracy metrics per finding type
  4. Adjust Confidence: Future verifications apply learned adjustments

Example

# Run verification
python3 -m guardian.cli verify src/
# Finding: "Potential SQL injection" (confidence: 85%)

# Dismiss false positive
python3 -m guardian.cli feedback abc123... dismissed --reason false_positive

# After 3-5 dismissals, Guardian learns
python3 -m guardian.cli metrics
# Suggested adjustment: ↓ sql_injection: 0.70x

# Next verification applies adjustment
python3 -m guardian.cli verify src/
# Finding: "Potential SQL injection" (confidence: 60%)
# Note: "[Learning: Confidence adjusted 0.70x based on historical feedback]"
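The confidence adjustment in the example above can be sketched in a few lines. This is a minimal illustration, not Guardian's actual implementation: the function names, the minimum sample count, and the 0.5x to 1.2x multiplier range are all assumptions made for the sketch.

```python
def suggested_multiplier(fixed: int, dismissed: int, min_samples: int = 3) -> float:
    """Hypothetical rule: derive a multiplier from how often a finding type was fixed."""
    total = fixed + dismissed
    if total < min_samples:
        return 1.0  # not enough feedback yet, so leave confidence unchanged
    accuracy = fixed / total  # share of findings the user confirmed by fixing them
    # Map accuracy 0.0..1.0 onto an assumed multiplier range of 0.5x..1.2x.
    return round(0.5 + 0.7 * accuracy, 2)


def adjust_confidence(confidence: float, multiplier: float) -> float:
    """Apply a learned multiplier and clamp the result to 0.0..1.0."""
    return max(0.0, min(1.0, confidence * multiplier))


# The README example: 85% confidence with a 0.70x adjustment.
adjusted = adjust_confidence(0.85, 0.70)
print(f"{adjusted:.3f}")
```

With the numbers from the example, 0.85 × 0.70 gives roughly 0.595, which is what the CLI output shows rounded to 60%.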

Key Benefits

  • Reduces Alert Fatigue: Lower confidence for patterns you consistently dismiss
  • Improves Signal: Higher confidence for patterns you consistently fix
  • Project-Aware: Learns your project's conventions over time
  • Privacy-First: All data stored locally, never sent anywhere

📖 Full Learning System Documentation


Features

Security Agent

Detects:

  • SQL injection (f-strings, string concatenation)
  • Command injection (subprocess, os.system)
  • Hardcoded secrets (API keys, passwords, tokens)
  • Weak cryptography (MD5, SHA1)
  • Path traversal vulnerabilities
  • XML External Entity (XXE) attacks
  • Insecure deserialization (pickle, yaml)
  • Dangerous functions (eval, exec, compile)

Example:

# Guardian detects this SQL injection:
query = f"SELECT * FROM users WHERE id={user_id}"  # ❌ CRITICAL

# Suggests this fix:
cursor.execute("SELECT * FROM users WHERE id=?", (user_id,))  # ✅
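A check like this can be done statically with the standard library `ast` module. The sketch below shows one possible approach, not Guardian's actual rule: `find_fstring_sql` and `SQL_KEYWORDS` are hypothetical names, and the heuristic only covers f-strings whose literal text starts with a SQL keyword.

```python
import ast
from typing import List

# Assumed keyword list; a real rule would likely be broader.
SQL_KEYWORDS = ("select ", "insert ", "update ", "delete ")


def find_fstring_sql(source: str) -> List[int]:
    """Return line numbers of f-strings that look like SQL and interpolate values."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.JoinedStr):
            continue  # only f-strings are of interest here
        # Join the literal text parts of the f-string.
        literal = "".join(
            part.value
            for part in node.values
            if isinstance(part, ast.Constant) and isinstance(part.value, str)
        ).lower()
        has_interpolation = any(
            isinstance(part, ast.FormattedValue) for part in node.values
        )
        if has_interpolation and literal.lstrip().startswith(SQL_KEYWORDS):
            hits.append(node.lineno)
    return hits


code = 'query = f"SELECT * FROM users WHERE id={user_id}"\nsafe = "SELECT 1"\n'
print(find_fstring_sql(code))  # prints [1]
```

The source is only parsed, never executed, so the check is safe to run on untrusted code.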

Pattern Agent

Detects:

  • God classes (too many methods/responsibilities)
  • Long functions (>50 lines)
  • Deep nesting (>4 levels)
  • Code duplication
  • Too many parameters (>5)
  • Global state usage
  • Magic numbers (unexplained constants)

Example:

# Guardian detects this god class:
class DataManager:  # ❌ HIGH: 23 methods
    def add_data(self): pass
    def remove_data(self): pass
    # ... 21 more methods

# Suggests:
# Split into focused classes: DataReader, DataWriter, DataValidator
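Several of the checks listed above reduce to counting nodes in the syntax tree. The following sketch assumes the thresholds from the list (more than 50 lines, more than 5 parameters); the god-class limit of 20 methods and the name `check_patterns` are assumptions, since the README does not state them.

```python
import ast
from typing import List

MAX_FUNCTION_LINES = 50  # from the list above
MAX_PARAMETERS = 5       # from the list above
MAX_METHODS = 20         # assumed; the god-class threshold is not documented

FUNCTION_NODES = (ast.FunctionDef, ast.AsyncFunctionDef)


def check_patterns(source: str) -> List[str]:
    """Report long functions, long parameter lists, and classes with many methods."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, FUNCTION_NODES):
            length = node.end_lineno - node.lineno + 1
            if length > MAX_FUNCTION_LINES:
                findings.append(f"{node.name}: long function ({length} lines)")
            args = node.args
            # Counts positional and keyword-only parameters, including self.
            count = len(args.posonlyargs) + len(args.args) + len(args.kwonlyargs)
            if count > MAX_PARAMETERS:
                findings.append(f"{node.name}: too many parameters ({count})")
        elif isinstance(node, ast.ClassDef):
            methods = [n for n in node.body if isinstance(n, FUNCTION_NODES)]
            if len(methods) > MAX_METHODS:
                findings.append(f"{node.name}: god class ({len(methods)} methods)")
    return findings


# Rebuild the DataManager example with 23 trivial methods.
example = "class DataManager:\n" + "".join(
    f"    def method_{i}(self): pass\n" for i in range(23)
)
print(check_patterns(example))  # prints ['DataManager: god class (23 methods)']
```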

CI/CD Integration

GitHub Actions

name: Guardian Code Verification

on: [push, pull_request]

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.10'
      - name: Install Guardian
        run: pip install -e .
      - name: Run Guardian
        run: python3 -m guardian.cli verify src/ --fail-on high --format json -o report.json
      - name: Upload Report
        # Upload the report even when the verify step fails the build
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: guardian-report
          path: report.json

Pre-commit Hook

#!/bin/bash
# .git/hooks/pre-commit (make it executable: chmod +x .git/hooks/pre-commit)
python3 -m guardian.cli verify --changed --fail-on critical

Report Format

Guardian generates detailed reports with:

  • Severity levels: CRITICAL, HIGH, MEDIUM, LOW, INFO
  • Confidence scores: 0.0-1.0 (how certain the finding is)
  • False positive risk: 0.0-1.0 (likelihood of a false alarm)
  • Effective severity: Combines severity × confidence × (1 - FP risk)
  • Actionable suggestions: Specific fixes with code examples
  • References: CWE, OWASP, documentation links
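As a rough illustration of the effective severity formula above, the sketch below multiplies the three factors. The numeric weight assigned to each severity level is an assumption, because the README does not say how severity levels map to numbers; `SEVERITY_WEIGHTS` and `effective_severity` are hypothetical names.

```python
# Assumed numeric weights for each severity level.
SEVERITY_WEIGHTS = {
    "critical": 1.0,
    "high": 0.8,
    "medium": 0.5,
    "low": 0.2,
    "info": 0.1,
}


def effective_severity(severity: str, confidence: float, fp_risk: float) -> float:
    """severity weight × confidence × (1 - false positive risk)."""
    return SEVERITY_WEIGHTS[severity.lower()] * confidence * (1.0 - fp_risk)


# A critical finding with 85% confidence and 10% false positive risk.
score = effective_severity("CRITICAL", 0.85, 0.10)
print(f"{score:.3f}")
```

Under these assumed weights, a low-confidence critical finding can rank below a high-confidence high finding, which is the point of combining the three values into one score.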

Markdown Output

### 1. 🔴 Potential SQL Injection Vulnerability

**Severity**: CRITICAL | **Confidence**: 85% | **Agent**: security

**Location**: `src/api.py:42`

SQL query appears to use string formatting with user-controllable data...

**Recommendation**:
Use parameterized queries instead of string formatting:
- For sqlite3: cursor.execute('SELECT * FROM users WHERE id=?', (user_id,))

**References**: `CWE-89`, `OWASP A03:2021`

**Impact**: Attackers could read, modify, or delete database data.

JSON Output

{
  "summary": {
    "total_issues": 12,
    "critical": 2,
    "high": 4,
    "medium": 5,
    "low": 1,
    "verification_time_ms": 4523
  },
  "findings": [
    {
      "agent": "security",
      "severity": "critical",
      "confidence": 0.85,
      "type": "sql_injection",
      "location": {
        "file": "src/api.py",
        "line": 42
      },
      "title": "Potential SQL Injection Vulnerability",
      "description": "...",
      "suggestion": "...",
      "references": ["CWE-89", "OWASP A03:2021"]
    }
  ]
}
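A JSON report in this shape is easy to post-process with the standard library. The sketch below filters findings at or above a severity threshold, using only the field names from the sample above; the helper name `findings_at_or_above` and the second, low-severity finding in the sample data are hypothetical and added for illustration.

```python
import json
from typing import Dict, List

# Ordered from least to most severe, matching the levels in the report format.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def findings_at_or_above(report: Dict, threshold: str) -> List[Dict]:
    """Return the findings whose severity is at least the given threshold."""
    minimum = SEVERITY_ORDER.index(threshold)
    return [
        finding
        for finding in report["findings"]
        if SEVERITY_ORDER.index(finding["severity"]) >= minimum
    ]


# In practice this would come from report.json; the second finding is made up.
sample = json.loads("""
{
  "findings": [
    {"severity": "critical", "title": "Potential SQL Injection Vulnerability",
     "location": {"file": "src/api.py", "line": 42}},
    {"severity": "low", "title": "Magic number",
     "location": {"file": "src/util.py", "line": 7}}
  ]
}
""")

for finding in findings_at_or_above(sample, "high"):
    location = finding["location"]
    print(f'{location["file"]}:{location["line"]} '
          f'{finding["severity"].upper()} {finding["title"]}')
# prints: src/api.py:42 CRITICAL Potential SQL Injection Vulnerability
```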

Architecture

┌──────────────────────────────────────────────────────────┐
│                   Guardian Orchestrator                  │
│  (Coordinates verification workflow across specialists)  │
└──────────────────┬───────────────────────────────────────┘
                   │
       ┌───────────┼───────────┬──────────────┐
       ▼           ▼           ▼              ▼
┌─────────┐  ┌─────────┐  ┌────────┐  ┌──────────┐
│Security │  │ Pattern │  │ Idiom  │  │  Logic   │
│ Agent   │  │ Agent   │  │ Agent  │  │  Agent   │
└────┬────┘  └────┬────┘  └───┬────┘  └────┬─────┘
     │            │           │            │
     └────────────┴───────────┴────────────┘
                              │
                              ▼
                   ┌────────────────────┐
                   │  Result Aggregator │
                   │   & Report Gen     │
                   └────────────────────┘

Agents run in parallel using asyncio for maximum performance.
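The fan-out and aggregation in the diagram can be sketched with `asyncio.gather`. The two agent coroutines here are simplified stand-ins with made-up detection logic, not Guardian's real agents, and the function names are assumptions.

```python
import asyncio
from typing import List


async def security_agent(source: str) -> List[str]:
    """Stand-in agent: flags eval() calls with a plain substring check."""
    await asyncio.sleep(0)  # yield control, as a real agent would while awaiting I/O
    return ["security: eval() call"] if "eval(" in source else []


async def pattern_agent(source: str) -> List[str]:
    """Stand-in agent: flags use of the global statement."""
    await asyncio.sleep(0)
    return ["pattern: global state"] if "global " in source else []


async def verify(source: str) -> List[str]:
    """Run all agents concurrently and flatten their findings into one list."""
    results = await asyncio.gather(security_agent(source), pattern_agent(source))
    return [finding for agent_findings in results for finding in agent_findings]


findings = asyncio.run(verify("global counter\nvalue = eval(user_input)\n"))
print(findings)  # prints ['security: eval() call', 'pattern: global state']
```

`asyncio.gather` returns results in the order the coroutines were passed, so the aggregated report is deterministic. Note that asyncio interleaves coroutines on a single thread; CPU-bound analysis would need an executor to run truly in parallel.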

Performance

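| Files | Lines of Code | Verification Time | Issues Found |
|-------|---------------|-------------------|--------------|
| 1     | 50            | 3-5ms             | 0-5          |
| 7     | 500           | 11ms              | 20           |
| 50    | 5,000         | 100-200ms         | 50-150       |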

Verification time grows roughly linearly with lines of code; running agents concurrently keeps the per-file cost low.

Roadmap

Phase 1: Core Engine ✅ (Complete)

  • ✅ Guardian orchestrator framework
  • ✅ Security agent (8 vulnerability classes)
  • ✅ Pattern agent (7 anti-pattern checks)
  • ✅ CLI interface
  • ✅ Report generation (JSON, Markdown)
  • ✅ Learning system (history tracking, feedback, confidence adjustment)

Phase 2: Advanced Learning 🚧 (Current)

  • ✅ Verification history tracking
  • ✅ Feedback recording and metrics
  • ✅ Confidence adjustment based on accuracy
  • Convention learning (auto-whitelist patterns)
  • Context agent (project-aware analysis)
  • Team learning (share data across team)

Phase 3: Advanced Agents & Integration 📋 (Next)

  • Idiom agent (repo-specific conventions)
  • Logic agent (control flow analysis)
  • Impact agent (ripple effect detection)
  • GitHub Action / GitLab CI integration
  • VS Code / IDE plugin
  • Metrics dashboard UI

FAQ

Q: Does Guardian require external dependencies?
A: No. Guardian uses only the Python standard library (ast, asyncio, re).

Q: Can Guardian verify languages other than Python?
A: Currently Python-only. Support for JavaScript, TypeScript, and Go is planned.

Q: How accurate is it?
A: High recall (catches most real issues), moderate precision (some false positives in strict mode). Confidence scores and false positive risk help prioritize.

Q: Does it replace code review?
A: No. Guardian augments human review by catching common issues automatically, freeing reviewers to focus on logic and design.

Q: Can I customize rules?
A: Not yet. Custom patterns and thresholds are coming in Phase 2.

Q: Is it safe to run on untrusted code?
A: Yes. Guardian only parses and analyzes code statically; it never executes it.

Contributing

Contributions welcome! Areas of interest:

  1. New Detection Rules: Add more security/pattern checks
  2. Language Support: Parsers for JavaScript, TypeScript, Go
  3. False Positive Reduction: Improve accuracy
  4. Performance: Optimize for large codebases
  5. Documentation: Examples, tutorials, blog posts

License

MIT License - see LICENSE file

Citation

If you use Guardian in research, please cite:

@software{guardian2026,
  title = {Guardian: AI Code Verification System},
  author = {Your Team},
  year = {2026},
  url = {https://github.com/yourorg/guardian}
}

Acknowledgments

Built with insights from:

  • OWASP Top 10 (2021)
  • CWE/SANS Top 25 Most Dangerous Software Errors
  • "Clean Code" by Robert C. Martin
  • Real-world analysis of AI-generated code issues

Made with ❤️ to make AI-assisted development safer and more productive.
