BanSubstrings doesn't redact all matching words if case_sensitive=False #210

Open
@aalbersk

Describe the bug
When BanSubstrings is run with case_sensitive=False and redact=True, scanning a prompt detects banned substrings regardless of casing, but only the occurrences whose casing exactly matches the configured substring are redacted.

To Reproduce

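# Assumption: BanSubstrings is the input scanner from the llm-guard package.
from llm_guard.input_scanners import BanSubstrings

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)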

Expected behavior
Actual: The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.
Expected: The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.
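
The mismatch can be reproduced without the scanner. The sketch below assumes the redaction step relies on a plain str.replace; naive_redact is a hypothetical stand-in written for illustration, not the library's code.

```python
prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."


def naive_redact(text: str, substrings: list[str]) -> str:
    # Hypothetical stand-in for the redaction step: str.replace is
    # case-sensitive, so "Virus" is left untouched when "virus" is banned.
    for s in substrings:
        text = text.replace(s, "[REDACTED]")
    return text


redacted = naive_redact(prompt, ["virus", "bug"])
print(redacted)
```

Only the lowercase occurrence is replaced, which matches the "Actual" output above.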

Possible solution
Since str.replace is case-sensitive, the issue might be solved by using regular expressions, for example:

import re


def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # re.escape treats the substring literally; re.IGNORECASE matches any casing.
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
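
A possible variant of the snippet above, as a rough sketch only: it takes the case sensitivity as a parameter so that the redaction stays case-sensitive when the scanner is configured that way, and it builds one combined pattern instead of one pattern per substring. The name redact_substrings and its case_sensitive parameter are hypothetical and are not taken from the library.

```python
import re


def redact_substrings(text: str, substrings: list[str], case_sensitive: bool = False) -> str:
    # Hypothetical helper for illustration; not the library's actual method.
    # Drop empty strings, which would otherwise match at every position.
    candidates = [s for s in substrings if s]
    if not candidates:
        return text
    flags = 0 if case_sensitive else re.IGNORECASE
    # Try longer substrings first so overlapping entries prefer the longer match.
    candidates.sort(key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in candidates), flags)
    return pattern.sub("[REDACTED]", text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
print(redact_substrings(prompt, ["virus", "bug"]))
print(redact_substrings(prompt, ["virus", "bug"], case_sensitive=True))
```

With the default case_sensitive=False both "virus" and "Virus" are replaced; with case_sensitive=True only the lowercase occurrence is.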
