Skip to content

Conversation

@davida-ps
Copy link

image## Title

Relevant issues

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • I have Added testing in the tests/litellm/ directory, Adding at least 1 test is a hard requirement - see details
  • I have added a screenshot of my new test passing locally
  • My PR passes all unit tests on make test-unit
  • My PR's scope is as isolated as possible, it only solves 1 specific problem

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
✅ Test

Changes

pleas align this text

This pull request significantly improves the Prompt Security guardrail integration in prompt_security.py by introducing robust logging, clearer exception handling, more granular control over when guardrails are applied, and better support for complex response types. The changes also refactor how messages and outputs are sanitized, filtered, and updated after guardrail intervention. These improvements enhance reliability, observability, and maintainability of the guardrail logic.

Key changes include:

1. Exception Handling and Logging Enhancements

  • Introduced custom exceptions (PromptSecurityGuardrailAPIError, PromptSecurityBlockedMessage) for clearer error handling and more informative HTTP responses when content is blocked or API errors occur.
  • Added comprehensive logging of guardrail actions and failures, including timing, status, and details of each guardrail invocation, to facilitate debugging and monitoring. [1] [2]

2. Guardrail Hook Improvements and Control Flow

  • Updated all guardrail hooks (async_pre_call_hook, async_moderation_hook, async_post_call_success_hook, and streaming iterator hook) to ensure metadata is present, check if the guardrail should run for each event type, and consistently update applied guardrail headers. [1] [2] [3]
  • Improved input sanitization and message transformation, including a workaround to filter out system-generated metadata before sending messages to the Prompt Security API.

3. Output and Message Handling for Advanced Response Types

  • Added logic to handle and update messages for complex response types (such as ResponsesAPIResponse), including helper methods to normalize and update messages and instructions after guardrail intervention.
  • Implemented _scan_responses_api_output to scan and potentially modify or block outputs in batch response APIs, ensuring consistent guardrail enforcement.

4. Refactoring and Code Quality

  • Refactored code for clarity and maintainability, such as extracting message normalization and update logic into helper methods, and improving exception handling structure throughout. [1] [2]

5. Minor Cleanup

  • Removed an unused comment from the XAI Responses API tests.

@vercel
Copy link

vercel bot commented Nov 25, 2025

@davida-ps is attempting to deploy a commit to the CLERKIEAI Team on Vercel.

A member of the Team first needs to authorize it.

@CLAassistant
Copy link

CLAassistant commented Nov 25, 2025

CLA assistant check
All committers have signed the CLA.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants