Enhancement: Add GGUF metadata scanning + chat template compliance signal to AI BOM generation #25

@afogel

Description

The generator currently does not extract GGUF model metadata or surface chat template consistency, leaving a major gap in AIBOM coverage as GGUF becomes the dominant distribution format for quantized LLMs. This misses a critical attack surface: Model Execution Configuration (tokenizer + chat template), which can be altered without touching model weights.

Context / Motivation

I’d like to propose a new area of data for the AIBOM that addresses a blind spot in current specifications. The current AIBOM focuses on training data and model weights but misses execution‑time configuration. Formats like GGUF bundle the tokenizer and chat template alongside the weights. Recent research presented at OWASP Global AppSec shows an attacker can poison the chat template to influence model completions under certain conditions, even when weights remain untouched. This can effectively backdoor a model while bypassing weight scanning entirely. The attack targets instructions for execution rather than the model’s core intelligence.

Current Issues Found

  1. GGUF metadata is ignored
    • AIBOMs miss embedded fields like architecture, quantization details, and template metadata already present in GGUF artifacts.
  2. No chat template consistency signal
    • Instruction‑tuned models rely on standardized templates, but mismatches across quantizations or variants are not detected.
  3. Reporting lags registry‑driven scoring
    • Report messaging can drift from current registry logic, weakening clarity for compliance review.
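
To make issue 2 concrete, here is a minimal sketch of a consistency check across variants of one model. It assumes the chat template strings have already been extracted, one per variant. The function name `template_consistency`, the variant labels, and the shape of the returned dict are hypothetical and not part of the existing generator. Comparison is by exact SHA-256 match, so even whitespace-only differences are flagged; whether that is desirable is a design choice left open here.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Optional


def template_consistency(templates: Dict[str, Optional[str]]) -> dict:
    """Group variants by the SHA-256 of their chat template.

    `templates` maps a variant label (hypothetical, e.g. a quantization
    name) to its chat template string, or None if the variant has none.
    """
    groups: Dict[str, List[str]] = defaultdict(list)
    for variant, template in sorted(templates.items()):
        if template is None:
            digest = "missing"
        else:
            digest = hashlib.sha256(template.encode("utf-8")).hexdigest()
        groups[digest].append(variant)
    # Consistent only if every variant carries the same, non-missing template.
    consistent = len(groups) == 1 and "missing" not in groups
    return {"consistent": consistent, "groups": dict(groups)}
```

A report could surface `consistent` as the signal and list `groups` when it is false, so a reviewer can see which variants diverge.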

Proposed Solution

  1. Add GGUF metadata extraction
    • Parse GGUF metadata and fold it into the existing AIBOM output alongside current fields already scanned.
  2. Introduce chat template integrity / consistency reporting
    • Compute and surface a consistency signal across GGUF variants to detect template mismatches.
    • Add fields to track Chat Template Integrity:
      • Template Source/Provenance
      • Chat Template Hash
      • Template Security Status (attestation that the template was scanned for malicious logic)
  3. Align reporting with registry‑driven scoring
    • Ensure the report reflects dynamic max points and current completeness logic.
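
As a rough illustration of step 1 and the Chat Template Hash field, the sketch below reads the key/value metadata section of a GGUF file with the Python standard library and derives a few candidate AIBOM fields. It is not the staged implementation. It assumes a little-endian file of GGUF version 2 or later, and it hashes the decoded template text rather than the raw bytes. The function names and the output field names (`architecture`, `file_type`, `chat_template_sha256`) are hypothetical; the metadata keys `general.architecture`, `general.file_type`, and `tokenizer.chat_template` are standard GGUF keys.

```python
import hashlib
import io
import struct
from typing import Any, BinaryIO, Dict

# GGUF scalar value types: type id -> struct format (little-endian assumed).
_SCALARS = {0: "<B", 1: "<b", 2: "<H", 3: "<h", 4: "<I", 5: "<i",
            6: "<f", 7: "<?", 10: "<Q", 11: "<q", 12: "<d"}
_STRING, _ARRAY = 8, 9


def _read(f: BinaryIO, fmt: str) -> Any:
    size = struct.calcsize(fmt)
    data = f.read(size)
    if len(data) != size:
        raise ValueError("truncated GGUF metadata")
    return struct.unpack(fmt, data)[0]


def _read_string(f: BinaryIO) -> str:
    length = _read(f, "<Q")  # strings are length-prefixed with a uint64
    data = f.read(length)
    if len(data) != length:
        raise ValueError("truncated GGUF string")
    return data.decode("utf-8", errors="replace")


def _read_value(f: BinaryIO, vtype: int) -> Any:
    if vtype in _SCALARS:
        return _read(f, _SCALARS[vtype])
    if vtype == _STRING:
        return _read_string(f)
    if vtype == _ARRAY:
        item_type = _read(f, "<I")
        count = _read(f, "<Q")
        return [_read_value(f, item_type) for _ in range(count)]
    raise ValueError(f"unknown GGUF value type {vtype}")


def read_gguf_metadata(f: BinaryIO) -> Dict[str, Any]:
    """Return the metadata key/value pairs; tensor data is never read."""
    if f.read(4) != b"GGUF":
        raise ValueError("not a GGUF file")
    version = _read(f, "<I")
    if version < 2:
        raise ValueError("GGUF v1 uses 32-bit counts; not handled here")
    _read(f, "<Q")  # tensor count, unused in this sketch
    kv_count = _read(f, "<Q")
    meta: Dict[str, Any] = {}
    for _ in range(kv_count):
        key = _read_string(f)
        meta[key] = _read_value(f, _read(f, "<I"))
    return meta


def chat_template_fields(meta: Dict[str, Any]) -> Dict[str, Any]:
    """Map GGUF metadata to hypothetical AIBOM field names."""
    template = meta.get("tokenizer.chat_template")
    digest = None
    if isinstance(template, str):
        digest = hashlib.sha256(template.encode("utf-8")).hexdigest()
    return {
        "architecture": meta.get("general.architecture"),
        "file_type": meta.get("general.file_type"),
        "chat_template_sha256": digest,
    }
```

Usage would be along the lines of `with open(path, "rb") as f: fields = chat_template_fields(read_gguf_metadata(f))`. Because only the header is parsed, the cost stays small even for multi-gigabyte files; a production parser would also want bounds on string and array lengths before trusting them.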

Benefits

  • Completes AIBOM coverage for a now‑dominant model format
  • Surfaces compliance‑relevant template mismatches and execution‑time tampering risk
  • Improves trust and auditability of generated AIBOMs

Dependencies

  • None beyond existing parser utilities (implementation already staged)
