Open-source PII Detection for Retrieval Systems.
Scan, redact, and manage PII in your documents before they get uploaded to a Retrieval Augmented Generation (RAG) system.
DataFog works by scanning files for PII and redacting it before they are uploaded to a RAG system.
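The scan-then-redact idea can be sketched in a few lines of plain Python. This is only an illustration of the general technique using the standard-library `re` module. It is not DataFog's implementation or API, and the names `PII_PATTERNS`, `scan`, and `redact` are hypothetical. The three patterns are assumptions chosen for the example and cover far fewer cases than real PII detection needs.

```python
import re

# Hypothetical, illustrative patterns only (NOT DataFog's detection rules).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "US_PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scan(text):
    """Return (label, matched_text) pairs for every pattern hit."""
    findings = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            findings.append((label, match.group()))
    return findings


def redact(text):
    """Replace every pattern hit with a bracketed label such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


if __name__ == "__main__":
    sample = "Contact jane.doe@example.com or 555-123-4567. SSN 123-45-6789."
    print(scan(sample))
    print(redact(sample))
```

In a pipeline like the one described above, the redacted text, rather than the original, would be what gets handed to the upload step.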
DataFog can be installed via pip:
pip install datafog # python client
- Clone repo
- Run 'poetry install' to install dependencies (working inside 'poetry shell' is recommended so the dependencies stay isolated in the project's virtual environment)
- Justfile commands:
  - just format: apply formatting.
  - just lint: check formatting and style.
  - just tag: tag your project on git.
  - just upload: publish to PyPI.
To run the datafog unit tests, check out this repository and run:
tox
This software is published under the MIT license.