breaktoprotect/auto-dedact

auto-dedact

A self-learning sensitive data detection and redaction engine combining deterministic rules (regex) with LLM-based evaluation and feedback loops.

The name Auto-Dedact intentionally carries a dual meaning:

  • Automated Detect & Redact: the system automatically detects sensitive data and redacts it using deterministic rules and learned patterns.
  • A deliberate sound-pun on autodidact: the engine teaches itself through LLM-driven feedback loops, validation, and rule refinement.
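
To make the "deterministic rules" half of that description concrete, here is a minimal sketch of regex-based detect-and-redact. It is not taken from the auto-dedact codebase: the `RULES` table, the `redact` function, and both patterns are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical rule set: label -> compiled pattern.
# These patterns are illustrative and are NOT the project's actual rules.
RULES = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "CARD_NUMBER": re.compile(r"\b(?:\d{4}[ -]?){3}\d{4}\b"),
}


def redact(text: str, rules: dict[str, re.Pattern] = RULES) -> tuple[str, list[str]]:
    """Replace every rule match with a [LABEL] placeholder.

    Returns the redacted text and the labels of the rules that fired.
    """
    fired = []
    for label, pattern in rules.items():
        # subn returns the new string and the number of substitutions made.
        text, count = pattern.subn(f"[{label}]", text)
        if count:
            fired.append(label)
    return text, fired


redacted, fired = redact("Mail alice@example.com, card 4111 1111 1111 1111.")
print(redacted)  # Mail [EMAIL], card [CARD_NUMBER].
print(fired)     # ['EMAIL', 'CARD_NUMBER']
```

In a design like this, the learned part of the system would presumably add or refine entries in the rule table over time, while the redaction pass itself stays deterministic.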

Early Development Screenshots

These screenshots reflect early development runs executed locally. Output is intentionally raw to demonstrate decision logic and failure handling.

Self-learning redaction run (GPT-5.2)

[Screenshot: GPT-5.2 early run]

Self-learning redaction run (GPT-4o-mini)

[Screenshot: GPT-4o-mini early run]

Failure → retry → success (Bank Account Number)

[Screenshot: Retry loop]
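
A failure → retry → success cycle of this kind could be sketched roughly as below. This is an assumption about the general shape of such a loop, not the project's implementation: `redact_with_retry`, `fake_propose`, and `fake_is_clean` are hypothetical names, and the two stub functions stand in for what would presumably be LLM calls.

```python
import re
from typing import Callable, Optional

# Hypothetical interfaces; both would be backed by an LLM in a real system.
Proposer = Callable[[str, int], str]   # (text, attempt number) -> regex source
Evaluator = Callable[[str], bool]      # True if the redacted text looks clean


def redact_with_retry(
    text: str,
    propose: Proposer,
    is_clean: Evaluator,
    label: str = "BANK_ACCOUNT",
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Try proposed patterns until the evaluator accepts the redaction."""
    for attempt in range(1, max_attempts + 1):
        pattern = propose(text, attempt)
        try:
            candidate = re.sub(pattern, f"[{label}]", text)
        except re.error:
            continue  # an invalid regex counts as a failed attempt
        if is_clean(candidate):
            return candidate, attempt
    return None, max_attempts  # every attempt failed


def fake_propose(text: str, attempt: int) -> str:
    # Stub: the first proposal is too strict, the second allows dashes.
    return r"\b\d{12}\b" if attempt == 1 else r"\b\d{3}-\d{6}-\d\b"


def fake_is_clean(candidate: str) -> bool:
    # Stub judgment: no run of six or more digits may remain.
    return re.search(r"\d{6,}", candidate) is None


result, attempts = redact_with_retry(
    "Acct 123-456789-0 closed", fake_propose, fake_is_clean
)
print(result, attempts)  # Acct [BANK_ACCOUNT] closed 2
```

Here the first proposed pattern leaves the account number in place, the evaluator rejects the result, and the second proposal succeeds.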

Comparison between 4 models

[Screenshot: Basic comparison between 4 models]
