JSON scanner doesn't really work #211

Open
@n-sviridenko

Describe the bug

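The JSON output scanner doesn't extract the JSON from raw model output when the JSON is followed by extra text. It returns the whole output unchanged and still reports it as valid.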

To Reproduce

Run this script:

#!/usr/bin/env python3

import sys
import json
from llm_guard.output_scanners import JSON

def sanitize_json_output(prompt: str, model_output: str, required_elements: int = 1) -> dict:
    scanner = JSON(required_elements=required_elements)
    sanitized_output, is_valid, risk_score = scanner.scan(prompt, model_output)
    print(f"Sanitized output: {sanitized_output}")
    print(f"Is valid: {is_valid}")
    print(f"Risk score: {risk_score}")
    if not is_valid:
        raise ValueError(f"Invalid JSON output: {risk_score}")
    # Fails with "Extra data" if the scanner left non-JSON text in the output
    return json.loads(sanitized_output)

content = """
{
  "reasoning": "After analyzing the comment against the existing research results, most of the technical claims are actually already covered in the provided research. The H100 power consumption of 700 watts is verified in Research 2. The discussion of FLOPS utilization and efficiency is covered in Research 1. The general implications for data center operations and AI deployment are also addressed in the existing research. The comment mostly provides commentary and analysis on already-verified facts rather than introducing new technical claims that require verification.",
  "topics": null
}

The comment primarily:
1. Draws analogies (TPUs/Sohu to race cars/hypercars) which are subjective comparisons
2. References already-verified technical specs (H100 power consumption, FLOPS utilization)
3. Makes general observations about potential implications that are opinion-based rather than factual claims
4. Discusses theoretical benefits that are already covered in the research findings
5. Expresses personal interest and forward-looking statements that aren't verifiable claims

While the comment discusses important technical aspects, it doesn't introduce any new specific facts or metrics that would require additional verification beyond what's already been researched."""

def main():
    try:
        # Use a dummy prompt since we're just validating JSON
        dummy_prompt = "Validate JSON structure"
        result = sanitize_json_output(dummy_prompt, content)
        print("✅ JSON is valid!")
        print("Validated content:")
        print(json.dumps(result, indent=2))
        return 0
    except ValueError as e:
        print(f"❌ Error: {str(e)}", file=sys.stderr)
        return 1
    except Exception as e:
        print(f"❌ Unexpected error: {str(e)}", file=sys.stderr)
        return 1

if __name__ == "__main__":
    sys.exit(main()) 

This prints:

Sanitized output:
{
  "reasoning": "After analyzing the comment against the existing research results, most of the technical claims are actually already covered in the provided research. The H100 power consumption of 700 watts is verified in Research 2. The discussion of FLOPS utilization and efficiency is covered in Research 1. The general implications for data center operations and AI deployment are also addressed in the existing research. The comment mostly provides commentary and analysis on already-verified facts rather than introducing new technical claims that require verification.",
  "topics": null
}

The comment primarily:
1. Draws analogies (TPUs/Sohu to race cars/hypercars) which are subjective comparisons
2. References already-verified technical specs (H100 power consumption, FLOPS utilization)
3. Makes general observations about potential implications that are opinion-based rather than factual claims
4. Discusses theoretical benefits that are already covered in the research findings
5. Expresses personal interest and forward-looking statements that aren't verifiable claims

While the comment discusses important technical aspects, it doesn't introduce any new specific facts or metrics that would require additional verification beyond what's already been researched.
Is valid: True
Risk score: 0.0
❌ Error: Extra data: line 7 column 1 (char 601)
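
The final error comes from `json.loads`, not from the scanner: the sanitized output still contains the prose after the closing brace, and `json.loads` rejects any non-whitespace text after a complete JSON value. A minimal standard-library sketch of the same failure, without llm_guard (the shortened `raw` string and the `trailing_data_error` helper are made up for illustration):

```python
import json

# Shortened stand-in for the model output: a JSON object followed by prose.
raw = '{"topics": null}\n\nThe comment primarily: ...'


def trailing_data_error(text):
    """Return the JSONDecodeError raised by json.loads, or None if parsing succeeds."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        return err
    return None


err = trailing_data_error(raw)
if err is not None:
    # err.pos is the index where the unexpected trailing text starts
    print(f"{err.msg} at char {err.pos}: {raw[err.pos:]!r}")
```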

Expected behavior

The sanitized output should contain only the JSON part, with the surrounding text removed.
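
For reference, a rough sketch of the behavior I expected, using only the standard library. This is not how llm_guard works internally and not a proposed patch; `extract_first_json` is a hypothetical helper that scans for the first position where `json.JSONDecoder.raw_decode` can parse an object or array and ignores everything after it.

```python
import json
from typing import Any, Optional, Tuple


def extract_first_json(text: str) -> Optional[Tuple[Any, str]]:
    """Return (parsed_value, json_substring) for the first JSON object or
    array found in text, or None if there is none."""
    decoder = json.JSONDecoder()
    for start, char in enumerate(text):
        if char not in "{[":
            continue
        try:
            # raw_decode parses one JSON value starting at `start` and
            # returns the index where it ends, ignoring anything after it.
            value, end = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            continue  # not valid JSON from this position; keep scanning
        return value, text[start:end]
    return None


sample = '\n{\n  "reasoning": "ok",\n  "topics": null\n}\n\nThe comment primarily:\n1. Draws analogies'
result = extract_first_json(sample)
if result is not None:
    value, snippet = result
    print(snippet)
```

With the output from the reproduction above, this would return just the `{ ... }` block, which `json.loads` then accepts. It is a linear scan with a parse attempt at each candidate brace, so it is meant as an illustration rather than something tuned for large outputs.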
