ML-Powered Profanity Detection for the Modern Web
Most profanity filters are trivially bypassed. Users type f*ck, sh1t, or fսck (with an Armenian 'ս' homoglyph in place of the Latin 'u') and walk right through. Glin Profanity doesn't just check against a word list: it understands evasion tactics.
```
┌─────────────────────────────────────────────────────────────────────────────┐
│                              GLIN PROFANITY v3                              │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  Input Text ──► Unicode       ──► Leetspeak   ──► Dictionary ──► ML         │
│                 Normalization     Detection       Matching       Check      │
│                 (homoglyphs)      (f4ck→fuck)     (23 langs)     (opt)      │
│                                                                             │
│  "fսck"     ──► "fuck"        ──► "fuck"      ──► MATCH      ──► ✓          │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
```
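The pipeline above can be sketched in plain JavaScript. This is an illustrative toy, not Glin Profanity's internal code: the homoglyph map, leetspeak map, and word list below are tiny placeholders.

```javascript
// Toy version of the normalize -> decode -> match pipeline (illustrative only).
const HOMOGLYPHS = { 'ս': 'u', 'і': 'i', 'ƒ': 'f' };               // tiny sample map
const LEETSPEAK = { '1': 'i', '3': 'e', '0': 'o', '5': 's', '@': 'a', '$': 's' };
const DICTIONARY = new Set(['fuck', 'shit']);                       // placeholder word list

function normalize(text) {
  return [...text]
    .map((ch) => HOMOGLYPHS[ch] ?? LEETSPEAK[ch] ?? ch)
    .join('')
    .toLowerCase();
}

function containsProfanity(text) {
  return normalize(text).split(/\s+/).some((word) => DICTIONARY.has(word));
}

containsProfanity('fսck');  // true  (homoglyph 'ս' -> 'u')
containsProfanity('sh1t');  // true  (leetspeak '1' -> 'i')
containsProfanity('hello'); // false
```

A production filter also has to try multiple candidate substitutions (in "f4ck" the '4' stands for 'u', not the usual 'a') and fall back to fuzzy matching, which is where most of the real complexity lives.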
Tested on Node.js 20, M1 MacBook Pro, single-threaded:
| Operation | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Simple check | 21M ops/sec | 890K ops/sec | 1.2M ops/sec | 650K ops/sec |
| With leetspeak | 8.5M ops/sec | N/A | N/A | N/A |
| Multi-language (3) | 18M ops/sec | N/A | 400K ops/sec | N/A |
| Unicode normalization | 15M ops/sec | N/A | N/A | N/A |
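The benchmark methodology isn't spelled out here; a minimal harness of the kind used to produce ops/sec figures looks roughly like the sketch below (warm-up count and duration are arbitrary choices, not the project's actual benchmark script).

```javascript
// Rough ops/sec microbenchmark (illustrative; not the project's benchmark suite).
function opsPerSec(fn, durationMs = 200) {
  // Warm up so the JIT settles before measuring.
  for (let i = 0; i < 1000; i++) fn();
  const start = process.hrtime.bigint();
  let ops = 0;
  while (Number(process.hrtime.bigint() - start) < durationMs * 1e6) {
    fn();
    ops++;
  }
  const elapsedSec = Number(process.hrtime.bigint() - start) / 1e9;
  return Math.round(ops / elapsedSec);
}

// Usage, assuming the library is installed:
// const { checkProfanity } = require('glin-profanity');
// console.log(opsPerSec(() => checkProfanity('some chat message')));
```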
| Feature | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Leetspeak detection (f4ck, sh1t) | Yes | No | No | Partial |
| Unicode homoglyph detection | Yes | No | No | No |
| ML toxicity detection | Yes (TensorFlow.js) | No | No | No |
| Multi-language support | 23 languages | English only | 14 languages | English only |
| Result caching (LRU) | Yes | No | No | No |
| Severity levels | Yes | No | No | No |
| React hook | Yes | No | No | No |
| Python package | Yes | No | No | No |
| TypeScript types | Full | Partial | Partial | Full |
| Bundle size (minified) | 12KB + dictionaries | 8KB | 15KB | 6KB |
| Active maintenance | Yes | Limited | Limited | Limited |
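Result caching (LRU) means repeated checks of identical strings skip re-analysis. The library's internals aren't shown in this README; a Map-based LRU sketch illustrates the idea, relying on the fact that a JavaScript `Map` iterates in insertion order.

```javascript
// Illustrative LRU cache for check results (not the library's implementation).
class LRUCache {
  constructor(limit = 1000) {
    this.limit = limit;
    this.map = new Map(); // Map preserves insertion order
  }
  get(key) {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key);
    this.map.delete(key);     // re-insert to mark as most recently used
    this.map.set(key, value);
    return value;
  }
  set(key, value) {
    if (this.map.has(key)) this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.limit) {
      // Evict the least-recently-used (first) entry.
      this.map.delete(this.map.keys().next().value);
    }
  }
}

const cache = new LRUCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.get('a');    // touch 'a' so 'b' becomes least recently used
cache.set('c', 3); // evicts 'b'
cache.get('b');    // undefined
```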
JavaScript/TypeScript:

```bash
npm install glin-profanity
```

Python:

```bash
pip install glin-profanity
```

JavaScript:
```javascript
import { checkProfanity, Filter } from 'glin-profanity';

// Simple check
const result = checkProfanity("This is f4ck1ng bad", {
  detectLeetspeak: true,
  languages: ['english']
});
result.containsProfanity // true
result.profaneWords      // ['fucking']

// With replacement
const filter = new Filter({
  replaceWith: '***',
  detectLeetspeak: true
});
filter.checkProfanity("sh1t happens").processedText // "*** happens"
```

Python:
```python
from glin_profanity import Filter

filter = Filter({"languages": ["english"], "replace_with": "***"})
filter.is_profane("damn this")       # True
filter.check_profanity("damn this")  # Full result object
```

React:
```jsx
import { useProfanityChecker } from 'glin-profanity';

function ChatInput() {
  const { result, checkText } = useProfanityChecker({
    detectLeetspeak: true
  });
  return (
    <>
      <input onChange={(e) => checkText(e.target.value)} />
      {result?.containsProfanity && <span>Clean up your language</span>}
    </>
  );
}
```

Architecture:

```mermaid
flowchart LR
    subgraph Input
        A[Raw Text]
    end
    subgraph Processing
        B[Unicode Normalizer]
        C[Leetspeak Decoder]
        D[Word Tokenizer]
    end
    subgraph Detection
        E[Dictionary Matcher]
        F[Fuzzy Matcher]
        G[ML Toxicity Model]
    end
    subgraph Output
        H[Result Object]
    end

    A --> B --> C --> D
    D --> E --> H
    D --> F --> H
    D -.->|Optional| G
    G -.-> H
```
Leetspeak detection:

```javascript
const filter = new Filter({
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive' // basic | moderate | aggressive
});

filter.isProfane('f4ck');      // true
filter.isProfane('5h1t');      // true
filter.isProfane('@$$');       // true
filter.isProfane('ph.u" "ck'); // true (aggressive mode)
```

Unicode homoglyph detection:

```javascript
const filter = new Filter({ normalizeUnicode: true });

filter.isProfane('fսck'); // true (Armenian 'ս' → 'u')
filter.isProfane('shіt'); // true (Cyrillic 'і' → 'i')
filter.isProfane('ƒuck'); // true (Latin 'ƒ' → 'f')
```

ML toxicity detection:

```javascript
import { loadToxicityModel, checkToxicity } from 'glin-profanity/ml';

await loadToxicityModel({ threshold: 0.9 });

const result = await checkToxicity("You're the worst player ever");
// { toxic: true, categories: { toxicity: 0.92, insult: 0.87, ... } }
```

23 languages with curated dictionaries:
| Arabic | Chinese | Czech | Danish |
| Dutch | English | Esperanto | Finnish |
| French | German | Hindi | Hungarian |
| Italian | Japanese | Korean | Norwegian |
| Persian | Polish | Portuguese | Russian |
| Spanish | Swedish | Thai | Turkish |
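The `checkToxicity` example above returns per-category scores; turning those scores into a verdict is just a threshold comparison. The category names below mirror that example's output, but the function itself is an illustrative sketch, not the library's API.

```javascript
// Illustrative thresholding over per-category toxicity scores.
function isToxic(categories, threshold = 0.9) {
  return Object.values(categories).some((score) => score >= threshold);
}

isToxic({ toxicity: 0.92, insult: 0.87 }); // true  (toxicity >= 0.9)
isToxic({ toxicity: 0.42, insult: 0.10 }); // false
```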
| Document | Description |
|---|---|
| Getting Started | Installation and basic usage |
| API Reference | Complete API documentation |
| Framework Examples | React, Vue, Angular, Express, Next.js |
| Advanced Features | Leetspeak, Unicode, ML, caching |
| ML Guide | TensorFlow.js integration |
| Changelog | Version history |
Run the interactive playground locally to test profanity detection:
```bash
# Clone the repo
git clone https://github.com/GLINCKER/glin-profanity.git
cd glin-profanity/packages/js

# Install dependencies
npm install

# Start the local testing server
npm run dev:playground
```

Open http://localhost:4000 to access the testing interface with:
- Real-time profanity detection
- Toggle leetspeak, Unicode normalization, ML detection
- Multi-language selection
- Visual results with severity indicators
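Severity indicators (also listed in the feature table) rank matches by how strong a term is. The README doesn't show the severity schema, so the mapping below is purely hypothetical; it only illustrates reducing a list of matched words to an overall severity.

```javascript
// Hypothetical severity mapping -- the library's actual schema isn't shown here.
const SEVERITY = { damn: 1, shit: 2, fuck: 3 }; // 1=mild, 2=moderate, 3=strong

// Overall severity of a match list = severity of its strongest word.
function maxSeverity(words) {
  return words.reduce((max, w) => Math.max(max, SEVERITY[w] ?? 0), 0);
}

maxSeverity(['damn', 'fuck']); // 3
maxSeverity([]);               // 0 (clean text)
```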
| Application | How Glin Profanity Helps |
|---|---|
| Chat platforms | Real-time message filtering with React hook |
| Gaming | Detect obfuscated profanity in player names/chat |
| Social media | Scale moderation with ML-powered detection |
| Education | Maintain safe learning environments |
| Enterprise | Filter internal communications |
| AI/ML pipelines | Clean training data before model ingestion |
MIT License - free for personal and commercial use.
Enterprise licensing with SLA and support available from GLINCKER.
See CONTRIBUTING.md for guidelines. We welcome:
- Bug reports and fixes
- New language dictionaries
- Performance improvements
- Documentation updates