BanSubstrings doesn't redact all matching words if case_sensitive=False #210

**Describe the bug**
When BanSubstrings is run with case_sensitive=False and redact=True, scanning a prompt redacts only the occurrences whose casing exactly matches the banned substring. Occurrences written in a different case are left in the output.

**To Reproduce**
```python
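from llm_guard.input_scanners import BanSubstrings  # assumed import path (llm-guard)

prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
# case_sensitive=False is passed explicitly to match the description above
ban_substrings = BanSubstrings(substrings=["virus", "bug"], case_sensitive=False, redact=True)
sanitized_prompt, results_valid, results_score = ban_substrings.scan(prompt)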
```

**Expected behavior**
Actual: `The user can perform arbitrary [REDACTED] code execution by Virus injecting malicious code.`
Expected: `The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.`

**Possible solution**
Because str.replace is case-sensitive, the issue might be solved by using a regex, e.g. like so:
```python
import re

def _redact_text(text: str, substrings: list[str]) -> str:
    redacted_text = text
    for s in substrings:
        # re.escape treats the substring literally; re.IGNORECASE matches any casing
        regex_redacted = re.compile(re.escape(s), re.IGNORECASE)
        redacted_text = regex_redacted.sub("[REDACTED]", redacted_text)
    return redacted_text
```
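
A possible variant of the same idea, as a minimal sketch rather than the library's actual implementation: build one combined pattern and substitute in a single pass, so that an inserted `[REDACTED]` is never rescanned by a later substring (for example a banned substring such as "red"). The function name `redact_substrings` and its `case_sensitive` parameter are hypothetical and only mirror the scanner option for illustration.

```python
import re


def redact_substrings(text: str, substrings: list[str], case_sensitive: bool = False) -> str:
    """Replace every occurrence of any banned substring with [REDACTED].

    Hypothetical helper for illustration; not part of the library.
    """
    # Drop empty strings (they would match everywhere) and try longer
    # entries first so overlapping entries prefer the longer match.
    ordered = sorted((s for s in substrings if s), key=len, reverse=True)
    if not ordered:
        return text
    pattern = re.compile(
        "|".join(re.escape(s) for s in ordered),
        0 if case_sensitive else re.IGNORECASE,
    )
    # Single pass over the text: replacement text is never matched again.
    return pattern.sub("[REDACTED]", text)


prompt = "The user can perform arbitrary virus code execution by Virus injecting malicious code."
print(redact_substrings(prompt, ["virus", "bug"]))
# The user can perform arbitrary [REDACTED] code execution by [REDACTED] injecting malicious code.
```

With `case_sensitive=True` the same call would leave `Virus` untouched, which matches the behaviour reported as "Actual" above.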

