
GLIN PROFANITY

ML-Powered Profanity Detection for the Modern Web

npm version PyPI version npm downloads PyPI downloads

CI Status Bundle Size License TypeScript

GitHub Stars GitHub Forks GitHub Issues Contributors

Glin Profanity - ML-Powered Profanity Detection

Live Demo


Why Glin Profanity?

Most profanity filters are trivially bypassed. Users type f*ck, sh1t, or fսck (with an Armenian 'ս' standing in for the Latin 'u') and walk right through. Glin Profanity doesn't just check against a word list; it understands these evasion tactics.

┌─────────────────────────────────────────────────────────────────────────────┐
│                           GLIN PROFANITY v3                                 │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│   Input Text ──►  Unicode       ──►  Leetspeak    ──►  Dictionary  ──► ML  │
│                   Normalization      Detection         Matching        Check│
│                   (homoglyphs)       (f4ck→fuck)       (23 langs)     (opt) │
│                                                                             │
│   "fսck"     ──►  "fuck"        ──►  "fuck"       ──►  MATCH       ──► ✓   │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘
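
The stages above map directly onto configuration flags shown later in this README. As an illustrative sketch (the expected values are assumptions, not captured output), a single call can exercise the whole pipeline:

import { checkProfanity } from 'glin-profanity';

// Homoglyph normalization + leetspeak decoding + dictionary matching in one call.
const result = checkProfanity('fսck this 5h1t', {
  normalizeUnicode: true,   // maps the Armenian 'ս' back to Latin 'u'
  detectLeetspeak: true,    // decodes '5h1t' -> 'shit'
  languages: ['english'],
});

result.containsProfanity;   // expected: true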

Performance Benchmarks

Tested on Node.js 20, M1 MacBook Pro, single-threaded:

| Operation | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Simple check | 21M ops/sec | 890K ops/sec | 1.2M ops/sec | 650K ops/sec |
| With leetspeak | 8.5M ops/sec | N/A | N/A | N/A |
| Multi-language (3) | 18M ops/sec | N/A | 400K ops/sec | N/A |
| Unicode normalization | 15M ops/sec | N/A | N/A | N/A |
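
Throughput depends heavily on hardware, Node version, and which options are enabled. If you want to sanity-check the "Simple check" row on your own machine, a rough micro-benchmark along these lines (a sketch, not the harness used for the table above) is enough:

import { checkProfanity } from 'glin-profanity';

const iterations = 1_000_000;
const sample = 'this sentence is perfectly clean';

// Warm up once so dictionary loading is not counted.
checkProfanity(sample, { languages: ['english'] });

const start = process.hrtime.bigint();
for (let i = 0; i < iterations; i++) {
  checkProfanity(sample, { languages: ['english'] });
}
const elapsedNs = Number(process.hrtime.bigint() - start);

console.log(`${Math.round((iterations / elapsedNs) * 1e9)} ops/sec`);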

Feature Comparison

| Feature | Glin Profanity | bad-words | leo-profanity | obscenity |
|---|---|---|---|---|
| Leetspeak detection (f4ck, sh1t) | Yes | No | No | Partial |
| Unicode homoglyph detection | Yes | No | No | No |
| ML toxicity detection | Yes (TensorFlow.js) | No | No | No |
| Multi-language support | 23 languages | English only | 14 languages | English only |
| Result caching (LRU) | Yes | No | No | No |
| Severity levels | Yes | No | No | No |
| React hook | Yes | No | No | No |
| Python package | Yes | No | No | No |
| TypeScript types | Full | Partial | Partial | Full |
| Bundle size (minified) | 12KB + dictionaries | 8KB | 15KB | 6KB |
| Active maintenance | Yes | Limited | Limited | Limited |
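
The LRU cache in the table is built into the library itself. Purely to illustrate why it matters for chat-style workloads (where the same messages repeat), here is a user-side equivalent built on nothing but the documented checkProfanity call; the cache size and eviction policy below are illustrative, not the library's internals:

import { checkProfanity } from 'glin-profanity';

const cache = new Map<string, ReturnType<typeof checkProfanity>>();
const MAX_ENTRIES = 1000;

function cachedCheck(text: string) {
  const hit = cache.get(text);
  if (hit !== undefined) {
    // Refresh recency: re-insert so this key becomes the newest entry.
    cache.delete(text);
    cache.set(text, hit);
    return hit;
  }
  if (cache.size >= MAX_ENTRIES) {
    // Evict the least recently used entry (first key in insertion order).
    cache.delete(cache.keys().next().value as string);
  }
  const result = checkProfanity(text, { detectLeetspeak: true });
  cache.set(text, result);
  return result;
}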

Installation

JavaScript/TypeScript

npm install glin-profanity

Python

pip install glin-profanity

Quick Start

JavaScript

import { checkProfanity, Filter } from 'glin-profanity';

// Simple check
const result = checkProfanity("This is f4ck1ng bad", {
  detectLeetspeak: true,
  languages: ['english']
});

result.containsProfanity  // true
result.profaneWords       // ['fucking']

// With replacement
const filter = new Filter({
  replaceWith: '***',
  detectLeetspeak: true
});
filter.checkProfanity("sh1t happens").processedText  // "*** happens"

Python

from glin_profanity import Filter

filter = Filter({"languages": ["english"], "replace_with": "***"})

filter.is_profane("damn this")           # True
filter.check_profanity("damn this")      # Full result object

React

import { useProfanityChecker } from 'glin-profanity';

function ChatInput() {
  const { result, checkText } = useProfanityChecker({
    detectLeetspeak: true
  });

  return (
    <>
      <input onChange={(e) => checkText(e.target.value)} />
      {result?.containsProfanity && <span>Clean up your language</span>}
    </>
  );
}

Architecture

flowchart LR
    subgraph Input
        A[Raw Text]
    end

    subgraph Processing
        B[Unicode Normalizer]
        C[Leetspeak Decoder]
        D[Word Tokenizer]
    end

    subgraph Detection
        E[Dictionary Matcher]
        F[Fuzzy Matcher]
        G[ML Toxicity Model]
    end

    subgraph Output
        H[Result Object]
    end

    A --> B --> C --> D
    D --> E --> H
    D --> F --> H
    D -.->|Optional| G -.-> H
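
This README only shows individual fields of the result object (containsProfanity, profaneWords, processedText); the interface below is an assumed composite of those fields for orientation, so treat the API Reference as the authoritative type definition:

// Assumed shape, pieced together from the examples in this README.
// The real exported type may differ; see the API Reference.
interface ProfanityCheckResult {
  containsProfanity: boolean;  // true when any dictionary/leetspeak/homoglyph match is found
  profaneWords: string[];      // matched words after normalization and decoding
  processedText?: string;      // input with matches replaced, when replaceWith is configured
}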

Detection Capabilities

Leetspeak Detection

const filter = new Filter({
  detectLeetspeak: true,
  leetspeakLevel: 'aggressive'  // basic | moderate | aggressive
});

filter.isProfane('f4ck');     // true
filter.isProfane('5h1t');     // true
filter.isProfane('@$$');      // true
filter.isProfane('ph.u" "ck'); // true (aggressive mode)

Unicode Homoglyph Detection

const filter = new Filter({ normalizeUnicode: true });

filter.isProfane('fսck');   // true (Armenian 'ս' → 'u')
filter.isProfane('shіt');   // true (Cyrillic 'і' → 'i')
filter.isProfane('ƒuck');   // true (Latin 'ƒ' → 'f')

ML-Powered Detection

import { loadToxicityModel, checkToxicity } from 'glin-profanity/ml';

await loadToxicityModel({ threshold: 0.9 });

const result = await checkToxicity("You're the worst player ever");
// { toxic: true, categories: { toxicity: 0.92, insult: 0.87, ... } }
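
Because the ML pass is optional (the dotted edge in the architecture diagram) and slower than dictionary matching, one reasonable pattern is to run checkProfanity on every message and only call the toxicity model on messages the dictionary lets through. The sketch below combines the documented calls; the 0.9 threshold simply reuses the value from the example above:

import { checkProfanity } from 'glin-profanity';
import { loadToxicityModel, checkToxicity } from 'glin-profanity/ml';

await loadToxicityModel({ threshold: 0.9 });

async function isAllowed(message: string): Promise<boolean> {
  // Fast path: dictionary + leetspeak + homoglyph matching.
  const lexical = checkProfanity(message, {
    detectLeetspeak: true,
    normalizeUnicode: true,
  });
  if (lexical.containsProfanity) return false;

  // Slow path: ML toxicity, only for messages the dictionary passed.
  const { toxic } = await checkToxicity(message);
  return !toxic;
}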

Supported Languages

23 languages with curated dictionaries:

Arabic Chinese Czech Danish
Dutch English Esperanto Finnish
French German Hindi Hungarian
Italian Japanese Korean Norwegian
Persian Polish Portuguese Russian
Spanish Swedish Thai Turkish
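
Dictionaries are selected per check through the languages option shown in Quick Start; passing several names scans all of them in one call. The lowercase identifiers below follow the 'english' spelling used earlier and are assumed for the other languages:

import { Filter } from 'glin-profanity';

const filter = new Filter({
  languages: ['english', 'spanish', 'german'],  // identifiers assumed lowercase, as with 'english'
  detectLeetspeak: true,
});

filter.checkProfanity('Scheiße happens').containsProfanity;  // expected: true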

Documentation

| Document | Description |
|---|---|
| Getting Started | Installation and basic usage |
| API Reference | Complete API documentation |
| Framework Examples | React, Vue, Angular, Express, Next.js |
| Advanced Features | Leetspeak, Unicode, ML, caching |
| ML Guide | TensorFlow.js integration |
| Changelog | Version history |

Local Testing Interface

Run the interactive playground locally to test profanity detection:

# Clone the repo
git clone https://github.com/GLINCKER/glin-profanity.git
cd glin-profanity/packages/js

# Install dependencies
npm install

# Start the local testing server
npm run dev:playground

Open http://localhost:4000 to access the testing interface with:

  • Real-time profanity detection
  • Toggle leetspeak, Unicode normalization, ML detection
  • Multi-language selection
  • Visual results with severity indicators

Use Cases

| Application | How Glin Profanity Helps |
|---|---|
| Chat platforms | Real-time message filtering with React hook |
| Gaming | Detect obfuscated profanity in player names/chat |
| Social media | Scale moderation with ML-powered detection |
| Education | Maintain safe learning environments |
| Enterprise | Filter internal communications |
| AI/ML pipelines | Clean training data before model ingestion |
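
As a concrete example of the last row, a training corpus can be scrubbed before model ingestion with nothing beyond the documented Filter and replaceWith options; this is a sketch of one possible cleaning pass, not a prescribed pipeline:

import { Filter } from 'glin-profanity';

const filter = new Filter({ replaceWith: '***', detectLeetspeak: true });

// Mask profanity in place; clean rows pass through untouched.
function cleanCorpus(rows: string[]) {
  return rows.map((row) => {
    const result = filter.checkProfanity(row);
    return result.containsProfanity ? result.processedText : row;
  });
}

cleanCorpus(['hello world', 'sh1t happens']);  // ['hello world', '*** happens']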

License

MIT License - free for personal and commercial use.

Enterprise licensing with SLA and support available from GLINCKER.


Contributing

See CONTRIBUTING.md for guidelines. We welcome:

  • Bug reports and fixes
  • New language dictionaries
  • Performance improvements
  • Documentation updates

Star History

Star History Chart

Live Demo · NPM · PyPI · GitHub


Star on GitHub



Built by GLINCKER · Part of the GLINR ecosystem
