Profanity Checker is a lightweight and efficient text moderation tool designed to detect and clean offensive language at scale. It helps teams maintain content quality by identifying profanity and transforming unsafe text into clean, publish-ready content. Ideal for platforms that value content safety and user trust.
Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for profanity-checker, you've just found your team. Let's chat!
This project analyzes text inputs to detect profanity, obscenity, and other unwanted language, then replaces the offending terms according to configurable rules. It solves the challenge of maintaining clean and compliant text across user-generated and automated content, and it is built for developers, moderators, and data teams handling large volumes of text. A minimal sketch of the detect-and-replace flow follows the feature list below.
- Detects offensive words using a built-in profanity dictionary
- Supports custom word lists to match domain-specific needs
- Replaces profanity with configurable text or masked characters
- Processes multiple text inputs in a single run
- Produces structured, audit-friendly output
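To make that flow concrete, here is a minimal sketch of a word-boundary detect-and-replace pass. It is an illustration under assumptions: the names `check_text` and `DEFAULT_WORDS`, and the two-word dictionary, are hypothetical stand-ins, not the project's actual API or word list.

```python
import re

# Hypothetical stand-in for the built-in dictionary (the real list is larger).
DEFAULT_WORDS = {"shit", "damn"}

def check_text(text, extra_words=None, replacement=None):
    """Detect listed words and return a result shaped like the output fields below."""
    words = DEFAULT_WORDS | set(extra_words or [])
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(w) for w in sorted(words)) + r")\b",
        re.IGNORECASE,
    )

    def mask(match):
        # Use the fixed replacement text if configured, else length-matched masking.
        return replacement if replacement is not None else "*" * len(match.group(0))

    cleaned = pattern.sub(mask, text)
    return {
        "originalText": text,
        "containsProfanity": cleaned != text,
        "newText": cleaned,
    }

print(check_text("This is another piece of shit"))
# {'originalText': ..., 'containsProfanity': True, 'newText': 'This is another piece of ****'}
```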
| Feature | Description |
|---|---|
| Bulk Text Processing | Analyze and clean multiple text entries in one execution. |
| Built-in Profanity List | Uses a predefined list of common offensive terms. |
| Custom Word Support | Allows adding extra words for stricter moderation. |
| Flexible Replacement | Replace profanity with custom text or masked characters. |
| Detailed Results | Returns original text, cleaned text, and detection status. |
| Character Normalization | Detects obfuscated profanity using character alternates (sketched below the table). |
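Character normalization can be pictured as an alternates map folded over the input before matching. The mapping below is a hedged example; the project's actual alternate table may differ.

```python
# Hypothetical alternates map: common symbol/digit substitutions folded back to letters.
ALTERNATES = str.maketrans({"@": "a", "$": "s", "0": "o", "1": "i", "3": "e"})

def normalize(text):
    """Fold obfuscated characters to letter form before dictionary matching."""
    return text.lower().translate(ALTERNATES)

assert normalize("sh1t") == "shit"
assert normalize("$h1t") == "shit"
```

Normalizing first means a single dictionary entry covers many obfuscated spellings.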
| Field | Description |
|---|---|
| originalText | The raw text provided for analysis. |
| containsProfanity | Boolean flag indicating if profanity was detected. |
| newText | Sanitized version of the text after filtering. |
```json
[
  {
    "originalText": "This is a piece of text",
    "containsProfanity": false,
    "newText": "This is a piece of text"
  },
  {
    "originalText": "This is another piece of shit",
    "containsProfanity": true,
    "newText": "This is another piece of ****"
  }
]
```
```
profanity-checker/
├── src/
│   ├── main.py
│   ├── processor.py
│   ├── profanity/
│   │   ├── default_list.txt
│   │   └── custom_loader.py
│   └── utils/
│       └── normalizer.py
├── data/
│   ├── input.sample.json
│   └── output.sample.json
├── requirements.txt
└── README.md
```
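Assuming `data/input.sample.json` holds a plain list of strings (an assumption about the sample format), a batch run over it might look like the sketch below, reusing the illustrative `check_text` helper from earlier:

```python
import json

# Hypothetical batch driver: reads raw strings, writes the documented output schema.
with open("data/input.sample.json") as f:
    texts = json.load(f)

results = [check_text(t) for t in texts]  # check_text as sketched above

with open("data/output.sample.json", "w") as f:
    json.dump(results, f, indent=2)
```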
- Content platforms use it to moderate user submissions, so they can enforce community guidelines automatically.
- Data teams use it to sanitize text datasets, so analytics and models remain clean and reliable.
- Comment systems use it to filter offensive language, so discussions stay respectful.
- Publishers use it to clean articles and reviews, so content remains brand-safe.
- Developers use it in pipelines to preprocess text, so downstream systems receive safe input.
Can I add my own profanity words? Yes, you can include a custom list of additional words to extend the default detection rules.
How does replacement work? You can replace detected profanity with a fixed text string or a masking character that matches word length.
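Length-matched masking reduces to something like this illustrative helper (not the project's exact behavior):

```python
def mask_word(word, mask_char="*"):
    # Preserve the original word length so sentence shape is unchanged.
    return mask_char * len(word)

assert mask_word("shit") == "****"
```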
Does it handle obfuscated profanity? Yes, character alternates such as symbols replacing letters are normalized during detection.
Is there a limit on custom words? Custom additions are limited to a small, controlled set to maintain performance and accuracy.
- Primary Metric: Processes thousands of text entries per minute with consistent detection accuracy.
- Reliability Metric: Maintains a high success rate across varied text formats and input sizes.
- Efficiency Metric: Minimal memory footprint with fast string processing and low overhead.
- Quality Metric: High precision in profanity detection while minimizing false positives through safe-word handling.
