🛡️ AI Code Guard Pro

CI · Python 3.10+ · License: MIT · Code style: ruff

Industry-grade security scanner for AI-generated code with AST analysis, taint tracking, and LLM-specific vulnerability detection.

AI coding assistants (GitHub Copilot, Claude, ChatGPT, Cursor) are revolutionizing development, but they can also introduce security vulnerabilities that slip past code review. AI Code Guard Pro is a next-generation security scanner designed specifically to catch these issues.

🚀 Key Improvements Over Basic Scanners

Feature           | Basic Scanners   | AI Code Guard Pro
------------------|------------------|--------------------------------------
Analysis Method   | Regex matching   | AST parsing + taint tracking
False Positives   | High             | Reduced via context awareness
Secret Detection  | Pattern only     | Pattern + Shannon entropy
Prompt Injection  | ❌ Not detected  | ✅ Direct + indirect detection
Supply Chain      | Basic            | Typosquatting + dependency confusion
Output Formats    | Limited          | Console, JSON, SARIF, Markdown
CI/CD Integration | Basic            | Native SARIF for GitHub Security

🎯 What It Detects

🔐 Secrets & Credentials

  • API Keys: OpenAI, Anthropic, AWS, GCP, GitHub, Stripe, and 15+ providers
  • Private Keys: RSA, SSH, PGP, EC
  • Database Credentials: Connection strings, passwords
  • High-Entropy Strings: hardcoded values caught by Shannon-entropy analysis
  • AI Placeholder Secrets: template credentials AI assistants tend to leave behind
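
As a hedged illustration of the hardcoded-credential pattern these rules target (the key below is fake, and the remediation shown is the generic environment-variable pattern, not output copied from the tool):

import os

# Hardcoded high-entropy literal: the kind of value flagged as a potential secret
api_key = "sk-proj-aB3xK9mL2pQrStUvWxYz1234567890abcd"  # fake key, for illustration only

# Safer: load the credential from the environment at runtime
api_key = os.environ["OPENAI_API_KEY"]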

💉 Injection Vulnerabilities

  • SQL Injection: f-strings, .format(), concatenation in queries
  • Command Injection: os.system, subprocess with shell=True
  • Code Execution: eval(), exec() with user input
  • SSRF: User-controlled URLs in requests
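
To make these categories concrete, here is a small, self-contained sketch of the kinds of statements that would be flagged; the function and parameter names are hypothetical, not taken from the scanner's test suite:

import subprocess

def vulnerable_examples(user_id: str, host: str, expr: str) -> None:
    # SQL injection: user-controlled value interpolated into the query (INJ001)
    query = f"SELECT * FROM users WHERE id = {user_id}"

    # Command injection: shell=True with attacker-influenced input (INJ002)
    subprocess.run(f"ping -c 1 {host}", shell=True)

    # Code execution: evaluating user-supplied text (INJ003)
    result = eval(expr)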

🤖 AI/LLM-Specific Issues

  • Direct Prompt Injection: User input in system prompts
  • Indirect Injection: RAG/retrieval injection risks
  • Unsafe Deserialization: pickle, yaml.load without SafeLoader

📦 Supply Chain Attacks

  • Typosquatting: Similar names to popular packages
  • Dependency Confusion: Internal package name patterns
  • Known Malicious Packages: Database of suspicious packages
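
For intuition, a minimal sketch of how typosquatting detection can work using string similarity against well-known package names; this illustrates the idea only and is not the tool's actual algorithm or package list:

from difflib import SequenceMatcher

# Tiny sample list; a real check would compare against thousands of popular packages
POPULAR_PACKAGES = {"requests", "numpy", "pandas", "django", "flask"}

def possible_typosquat(name: str, threshold: float = 0.85) -> str | None:
    """Return the popular package this name closely resembles, if any."""
    if name in POPULAR_PACKAGES:
        return None  # exact match is the legitimate package
    for popular in POPULAR_PACKAGES:
        if SequenceMatcher(None, name, popular).ratio() >= threshold:
            return popular
    return None

print(possible_typosquat("requets"))   # requests
print(possible_typosquat("requests"))  # None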

📦 Installation

pip install ai-code-guard

Or with development dependencies:

pip install ai-code-guard[dev]

🔧 Quick Start

# Scan a directory
ai-code-guard scan ./src

# Scan with specific output format
ai-code-guard scan ./src --format sarif -o results.sarif

# Quick CI check
ai-code-guard check ./src

# List all rules
ai-code-guard rules

# Create config file
ai-code-guard init

📊 Example Output

🛡️  AI Code Guard Pro v1.0.0
   Scanning ./my-project...

┌──────────────────────────────────────────────────────────────────────┐
│ 🔴 CRITICAL: SQL Injection Vulnerability                             │
├──────────────────────────────────────────────────────────────────────┤
│ 📁 src/db/queries.py:42                                              │
│                                                                      │
│ SQL query constructed using f-string interpolation. User-controlled  │
│ data may be interpolated directly into the query, enabling SQL       │
│ injection attacks.                                                   │
│                                                                      │
│ Code: query = f"SELECT * FROM users WHERE id = {user_id}"            │
│                                                                      │
│ ✅ Fix: Use parameterized queries:                                   │
│    cursor.execute('SELECT * FROM users WHERE id = ?', (user_id,))    │
│                                                                      │
│ CWE: CWE-89                                                          │
└──────────────────────────────────────────────────────────────────────┘

┌──────────────────────────────────────────────────────────────────────┐
│ 🟠 HIGH: Prompt Injection Vulnerability                              │
├──────────────────────────────────────────────────────────────────────┤
│ 📁 src/api/chat.py:23                                                │
│                                                                      │
│ User input directly embedded in LLM prompt via f-string. Attackers   │
│ can inject malicious instructions to manipulate the AI's behavior.   │
│                                                                      │
│ Code: prompt = f"You are a helper. User says: {user_input}"          │
│                                                                      │
│ ✅ Fix:                                                              │
│ 1. Separate system prompts from user content using message roles     │
│ 2. Sanitize user input (remove control characters, limit length)     │
│ 3. Use structured output formats to detect injection attempts        │
│                                                                      │
│ CWE: CWE-74 | OWASP: LLM01                                           │
└──────────────────────────────────────────────────────────────────────┘

────────────────────────────────────────────────────────────────────────
📊 SUMMARY
────────────────────────────────────────────────────────────────────────
Files scanned    47
Issues found     3
Scan time        127ms

🔴 CRITICAL: 1  🟠 HIGH: 2  🟡 MEDIUM: 0  🔵 LOW: 0
────────────────────────────────────────────────────────────────────────

⚙️ Configuration

Create .ai-code-guard.yaml in your project root:

# Minimum severity to report
min_severity: low  # critical, high, medium, low, info

# Patterns to ignore
ignore:
  - "tests/**"
  - "**/test_*.py"
  - "examples/**"
  - "docs/**"

# Rules to disable
disable_rules: []
  # - "SEC001"  # If using example API keys
  # - "PRI001"  # If false positives on prompt construction

# Secret detection tuning
entropy_threshold: 4.5  # Shannon entropy threshold
min_secret_length: 16

# AI-specific detection
detect_placeholder_secrets: true
detect_prompt_injection: true

# Performance
max_file_size_kb: 1024
parallel_workers: 4

🔌 CI/CD Integration

GitHub Actions

name: Security Scan

on: [push, pull_request]

jobs:
  security:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      
      - run: pip install ai-code-guard
      
      - name: Run security scan
        run: ai-code-guard scan . --format sarif -o results.sarif --fail-on high
      
      - name: Upload SARIF to GitHub Security
        uses: github/codeql-action/upload-sarif@v3
        if: always()
        with:
          sarif_file: results.sarif

Pre-commit Hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: ai-code-guard
        name: AI Code Guard Security Scan
        entry: ai-code-guard check
        language: python
        additional_dependencies: [ai-code-guard]
        types: [python]
        pass_filenames: false

GitLab CI

security-scan:
  image: python:3.11
  script:
    - pip install ai-code-guard
    - ai-code-guard scan . --format json -o gl-sast-report.json
  artifacts:
    reports:
      sast: gl-sast-report.json

📋 Rule Reference

Rule ID    | Category         | Severity      | Description
-----------|------------------|---------------|----------------------------------------------
SEC001-015 | Secrets          | CRITICAL/HIGH | API keys (OpenAI, AWS, GitHub, Stripe, etc.)
SEC020-022 | Secrets          | CRITICAL      | Private keys (RSA, SSH, PGP)
SEC030-031 | Secrets          | CRITICAL      | Database credentials
SEC040     | Secrets          | MEDIUM        | JWT tokens
SEC050     | Secrets          | MEDIUM        | AI placeholder secrets
SEC099     | Secrets          | MEDIUM        | High-entropy strings
INJ001     | Injection        | CRITICAL      | SQL injection
INJ002     | Injection        | CRITICAL      | Command injection
INJ003     | Injection        | CRITICAL      | Code execution (eval/exec)
DES001     | Deserialization  | CRITICAL      | Unsafe YAML
DES002     | Deserialization  | CRITICAL      | Unsafe pickle
SSRF001    | SSRF             | HIGH          | Server-side request forgery
PRI001-005 | Prompt Injection | HIGH          | Direct prompt injection
PRI006     | Prompt Injection | MEDIUM        | User input in prompts
PRI010-011 | Prompt Injection | MEDIUM        | Indirect injection
DEP001     | Dependencies     | VARIES        | Known suspicious packages
DEP002     | Dependencies     | HIGH          | Typosquatting detection
DEP003     | Dependencies     | HIGH          | Dependency confusion

🔬 Technical Details

AST-Based Analysis

Unlike regex-based scanners, AI Code Guard Pro parses Python code into an Abstract Syntax Tree, enabling:

  • Taint tracking: Follow user input through variable assignments
  • Context awareness: Understand function calls and their arguments
  • Reduced false positives: Skip patterns in comments and strings
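
As a rough, simplified sketch of what taint tracking over an AST looks like (illustrative only; the real analyzer covers far more sources, sinks, and data flows), Python's ast module can record which names are assigned from user input and then check whether a tainted name reaches a dangerous call:

import ast

# Simplified taint sources and sinks for the sketch; the real tool tracks many more
TAINT_SOURCES = {"input"}
DANGEROUS_SINKS = {"eval", "exec"}

def find_tainted_sinks(source: str) -> list[int]:
    """Return line numbers where a tainted variable reaches a dangerous call."""
    tree = ast.parse(source)
    tainted: set[str] = set()
    hits: list[int] = []
    for node in ast.walk(tree):
        # An assignment from a taint source marks the target name as tainted
        if isinstance(node, ast.Assign) and isinstance(node.value, ast.Call):
            func = node.value.func
            if isinstance(func, ast.Name) and func.id in TAINT_SOURCES:
                for target in node.targets:
                    if isinstance(target, ast.Name):
                        tainted.add(target.id)
        # A dangerous call whose argument is a tainted name is reported
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_SINKS:
                for arg in node.args:
                    if isinstance(arg, ast.Name) and arg.id in tainted:
                        hits.append(node.lineno)
    return hits

print(find_tainted_sinks("data = input()\nresult = eval(data)"))  # [2]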

Entropy-Based Secret Detection

Uses Shannon entropy to distinguish real secrets from placeholders:

# High entropy (likely real secret) - DETECTED
api_key = "sk-proj-aB3xK9mL2pQrStUvWxYz..."

# Low entropy (placeholder) - IGNORED
api_key = "your-api-key-here"

LLM Security Focus

Specifically targets vulnerabilities in AI/LLM applications:

  • Detects prompt injection in OpenAI, Anthropic, and LangChain code
  • Identifies indirect injection risks in RAG pipelines
  • Flags unsafe patterns in agent/tool implementations
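
To show the remediation the prompt-injection rules point toward (role separation plus basic input sanitization), here is a hedged before/after sketch using the OpenAI chat completions client; the model name and function names are placeholders:

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def risky(user_input: str) -> str:
    # Flagged: user text fused into a single prompt string (PRI-style finding)
    prompt = f"You are a helper. User says: {user_input}"
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def safer(user_input: str) -> str:
    # Preferred: system instructions and user content kept in separate roles,
    # with user input stripped of control characters and length-limited
    cleaned = "".join(ch for ch in user_input if ch.isprintable())[:2000]
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "You are a helper."},
            {"role": "user", "content": cleaned},
        ],
    )
    return resp.choices[0].message.content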

🤝 Contributing

Contributions welcome! See CONTRIBUTING.md for guidelines.

Adding Detection Patterns

# ai_code_guard_pro/analyzers/my_analyzer.py
from ai_code_guard_pro.models import Finding, Severity, Category

class MyAnalyzer:
    def analyze(self) -> list[Finding]:
        findings = []
        # Your detection logic
        return findings

📄 License

MIT License - see LICENSE for details.


Built for the AI era by security engineers who use AI coding assistants daily 🛡️