Fast, multilingual PII detection. 50+ languages, single model, no GPU required.
Live Demo · Benchmarks · API Docs · Models · Blog
| | PII Engineer | Presidio | spaCy | AWS Comprehend |
|---|---|---|---|---|
| F1 (multilingual) | 0.86 | 0.44 | 0.64 | 0.52 |
| F1 (English) | 0.88 | 0.80 | 0.83 | 0.82 |
| Languages | 50+ | ~10 locales | 1 per model | 12 |
| Latency (p50) | 180ms | 80ms (w/ NER) | 120ms | 200ms |
| GPU required | No | No | Optional | N/A |
| Self-hosted | Yes | Yes | Yes | No |
| Cost (1M req/mo) | $42 | $42 | $42 | ~$1,000 |
- Multilingual – a single model handles 50+ languages including CJK, Southeast Asian, South Asian, and European languages
- High accuracy – 0.88 F1 on English and 0.86 F1 multilingual (see table above), outperforming regex-based tools on non-English text
- Fast – ~180ms p50 on CPU (INT8 quantized ONNX inference)
- Zero-shot labels – detect custom entity types without retraining
- Self-hosted – runs on a $42/mo VPS, no external API calls, your data never leaves your server
- Single binary – Rust binary with embedded static assets, no Python runtime or dependency hell
- Auto-redaction – returns both detected entities and redacted text in one call
- 9 PII types – person names, phone numbers, government IDs, addresses, dates of birth, emails, passports, license plates, bank accounts
```bash
cargo build --release --package pii-engineer-server
cargo run --release --package pii-engineer-server
# Models auto-download from HuggingFace on first run
# API ready at http://localhost:8000
```

```bash
docker build -t pii-engineer .
docker run -p 8000:8000 -v ./models:/app/models pii-engineer
```

```bash
curl -X POST http://localhost:8000/api/detect \
  -H "Content-Type: application/json" \
  -d '{"text": "John Doe, NRIC S9012345B, born 12 March 1985"}'
```

Response:
```json
{
  "entities": [
    { "type": "person_name", "value": "John Doe", "score": 0.99 },
    { "type": "government_id", "value": "S9012345B", "score": 0.99 },
    { "type": "date_of_birth", "value": "12 March 1985", "score": 0.97 }
  ],
  "redacted": "[PERSON_NAME], NRIC [GOVERNMENT_ID], born [DATE_OF_BIRTH]"
}
```

```python
import requests

response = requests.post("http://localhost:8000/api/detect", json={
    "text": "Ahmad bin Abdullah, +60 12-345 6789, IC 901201-14-5678"
})
data = response.json()
print(data["redacted"])
# [PERSON_NAME], [PHONE_NUMBER], IC [GOVERNMENT_ID]
```

```javascript
const res = await fetch("http://localhost:8000/api/detect", {
  method: "POST",
  headers: { "Content-Type": "application/json" },
  body: JSON.stringify({
    text: "Nguyễn Văn A, CCCD 079201012345, sinh ngày 15/03/1990"
  }),
});
const { entities, redacted } = await res.json();
console.log(redacted);
// [PERSON_NAME], CCCD [GOVERNMENT_ID], sinh ngày [DATE_OF_BIRTH]
```

```bash
curl -X POST http://localhost:8000/api/detect \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Call me at 9123 4567 or email john@acme.com",
    "labels": ["phone_number", "email_address"]
  }'
```

| Type | Examples |
|---|---|
| `person_name` | Sarah Lim, Ahmad bin Abdullah |
| `phone_number` | +65 9123 4567, 0812-3456-7890 |
| `government_id` | S9012345B (NRIC), 3201010512890001 (NIK), Aadhaar |
| `street_address` | 42 Orchard Road #08-12, Jl. Sudirman No. 1 |
| `date_of_birth` | 12 March 1985, 1990-05-15 |
| `email_address` | john@example.com |
| `passport_number` | E12345678 |
| `license_plate` | SBA1234A, B 1234 CD |
| `bank_account_number` | 1234-5678-9012 |
Primary (highest accuracy): English, Malay, Tamil, Chinese, Indonesian, Vietnamese
Secondary: Thai, Hindi, Bengali, Korean, Japanese, German, French, Spanish, Portuguese, Russian, Arabic, Turkish, Polish, Dutch, Italian, Swedish, and 35+ more
The model handles multilingual text natively – mixed-language documents (e.g., English + Chinese + Malay in one paragraph) work without language selection.
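For example, posting a single paragraph that mixes English and Malay (made-up contact details) returns entities for both languages without any language parameter; this sketch reuses the `/api/detect` endpoint from the quickstart:

```python
import requests

# Illustrative mixed-language input: English and Malay in one string.
text = (
    "Please courier the contract to Sarah Lim, 42 Orchard Road #08-12. "
    "Hubungi Ahmad bin Abdullah di +60 12-345 6789 jika ada soalan."
)

resp = requests.post("http://localhost:8000/api/detect", json={"text": text}, timeout=5)
resp.raise_for_status()

for entity in resp.json()["entities"]:
    print(entity["type"], "->", entity["value"])
```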
- PDPA / GDPR compliance – scan documents, databases, and logs for personal data before audits
- LLM guardrails – redact PII before sending user input to GPT/Claude/Gemini (see the sketch after this list)
- Data pipelines – clean PII from ETL outputs, data warehouse columns, Kafka streams
- Chat moderation – detect PII in real time in Slack, support tickets, or chat apps
- Code review – catch hardcoded PII in test fixtures, config files, and documentation
- Document redaction – auto-redact contracts, resumes, medical records before sharing
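For the LLM-guardrail case, one minimal pattern is to redact first and only send the redacted text upstream. This is a sketch, with a stand-in `call_llm` for whichever GPT/Claude/Gemini client you actually use:

```python
import requests

PII_ENGINEER_URL = "http://localhost:8000/api/detect"

def call_llm(prompt: str) -> str:
    # Stand-in for your actual GPT/Claude/Gemini client call.
    raise NotImplementedError

def redact(text: str) -> str:
    """Return the redacted form of `text` from the local detector."""
    resp = requests.post(PII_ENGINEER_URL, json={"text": text}, timeout=5)
    resp.raise_for_status()
    return resp.json()["redacted"]

def guarded_completion(user_input: str) -> str:
    # Only the redacted text ever leaves your infrastructure.
    return call_llm(redact(user_input))

# redact("John Doe, NRIC S9012345B, born 12 March 1985")
# -> "[PERSON_NAME], NRIC [GOVERNMENT_ID], born [DATE_OF_BIRTH]"
```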
| Field | Type | Default | Description |
|---|---|---|---|
| `text` | string | required | Input text (max 50,000 chars) |
| `labels` | string[] | all 9 types | PII types to detect |
| `boost` | string[] | `[]` | Labels to boost with description matching |
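For example, to restrict detection to two types and boost one of them (an illustrative request; the response follows the schema shown below):

```python
import requests

payload = {
    "text": "John Doe lives at 123 Main St",
    "labels": ["person_name", "street_address"],  # restrict detection to these types
    "boost": ["street_address"],                  # boost this label via description matching
}

resp = requests.post("http://localhost:8000/api/detect", json=payload, timeout=5)
resp.raise_for_status()
print(resp.json()["redacted"])  # "[PERSON_NAME] lives at [STREET_ADDRESS]"
```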
Response:
```json
{
  "entities": [
    { "type": "person_name", "value": "John Doe", "start": 0, "end": 8, "score": 0.99, "needs_review": false }
  ],
  "redacted": "[PERSON_NAME] lives at [STREET_ADDRESS]",
  "original": "John Doe lives at 123 Main St"
}
```

Health check response:

```json
{ "status": "ok", "version": "1.0.0", "gliner_loaded": true, "chinese_loaded": true }
```

```
Request → Language detection → GLiNER2 NER + (Chinese NER if CJK)
                                   ↓
                Post-processing pipeline (8 stages)
  reclassify → validate → filter → normalize → email/IP detect → threshold → dedup → merge
                                   ↓
                Response (entities + redacted text)
```
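The language-routing step above amounts to checking the input for CJK characters; a rough illustration (not the actual Rust code in `lang.rs`):

```python
import re

# Han, Hiragana/Katakana, and Hangul ranges (simplified for illustration).
CJK_RE = re.compile(r"[\u4e00-\u9fff\u3040-\u30ff\uac00-\ud7af]")

def route(text: str) -> list[str]:
    """Always run the multilingual GLiNER2 model; add the Chinese NER model
    when CJK characters are present."""
    models = ["gliner2"]
    if CJK_RE.search(text):
        models.append("chinese_ner")
    return models

print(route("John Doe, NRIC S9012345B"))  # ['gliner2']
print(route("John Doe 住在新加坡"))         # ['gliner2', 'chinese_ner']
```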
Model: Fine-tuned GLiNER2 (mDeBERTa-v3-base, 280M params) split into 5 ONNX models. INT8 quantized encoder for CPU inference.
Stack: Rust + Axum + ONNX Runtime + HuggingFace Tokenizers + mimalloc
How it works:
- Text and entity labels are encoded together by the transformer encoder
- Span representation layer scores all possible token spans (up to 8 tokens wide)
- Classifier determines which spans match which PII labels
- 8-stage post-processing pipeline validates, deduplicates, and merges results
- Regex-based detection supplements NER for emails and IP addresses
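As a rough mental model of the span-based scoring (a toy stand-in for the real encoder, span-representation layer, and classifier, not the shipped model code), the sketch below enumerates candidate spans up to 8 tokens wide and keeps the best-scoring label per span:

```python
from typing import Callable

MAX_SPAN_WIDTH = 8  # spans up to 8 tokens wide, per the description above

def enumerate_spans(num_tokens: int, max_width: int = MAX_SPAN_WIDTH):
    """Yield (start, end) token indices for every candidate span."""
    for start in range(num_tokens):
        for end in range(start + 1, min(start + max_width, num_tokens) + 1):
            yield start, end

def detect(tokens: list[str],
           labels: list[str],
           score_span: Callable[[int, int, str], float],
           threshold: float = 0.5):
    """Score every (span, label) pair and keep the best label per span if it
    clears the threshold. `score_span` stands in for the real model."""
    results = []
    for start, end in enumerate_spans(len(tokens)):
        best_label, best_score = max(
            ((label, score_span(start, end, label)) for label in labels),
            key=lambda pair: pair[1],
        )
        if best_score >= threshold:
            results.append({"start": start, "end": end,
                            "type": best_label, "score": best_score})
    return results
```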
| Variable | Default | Description |
|---|---|---|
| `PORT` | `8000` | Server port |
| `GLINER_MODELS` | `models/PII-Engineer-Multi-NER-v2.1` | GLiNER model path |
| `CHINESE_NER_MODEL` | `models/PII-Engineer-Chinese-NER-v1.0` | Chinese NER model path |
| `ORT_DYLIB_PATH` | auto-detect | Path to libonnxruntime.so / .dylib |
| `ORT_INTRA_THREADS` | `4` | ONNX Runtime intra-op threads |
| `ORT_INTER_THREADS` | `1` | ONNX Runtime inter-op threads |
| `PII_ENGINEER_RATE_LIMIT_RPM` | `120` | Max requests per minute per IP |
| Setup | Latency (p50) | Throughput |
|---|---|---|
| MacBook M-series (FP32) | ~150ms | ~6 req/s |
| 4-vCPU AMD (INT8) | ~250ms | ~4 req/s |
| 8-vCPU AMD (INT8) | ~180ms | ~5 req/s |
Memory usage: ~800MB (model weights loaded in RAM).
Tips:
- Set `ORT_INTRA_THREADS` equal to your vCPU count
- INT8 encoder gives ~40% speedup with <0.5% accuracy loss
- First request after idle is slower – the server runs periodic warmup to mitigate this
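To check p50 latency on your own hardware, a small client-side sketch (assumes the server is running locally on the default port, and uses the sample text from the quickstart):

```python
import statistics
import time

import requests

URL = "http://localhost:8000/api/detect"
PAYLOAD = {"text": "John Doe, NRIC S9012345B, born 12 March 1985"}

def measure(n: int = 50) -> None:
    requests.post(URL, json=PAYLOAD, timeout=30)  # warm up first
    latencies_ms = []
    for _ in range(n):
        start = time.perf_counter()
        requests.post(URL, json=PAYLOAD, timeout=30)
        latencies_ms.append((time.perf_counter() - start) * 1000)
    print(f"p50: {statistics.median(latencies_ms):.0f} ms")

if __name__ == "__main__":
    measure()
```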
```bash
cargo build --workspace
cargo test --workspace
cargo clippy --workspace
cargo run --release -p pii-engineer-server
```

```
crates/
├── pii-engineer-core/        # NER engine, pipeline, model loading
│   └── src/
│       ├── gliner/           # GLiNER2 ONNX inference (v1, v2-compat, v2-full)
│       ├── pipeline.rs       # 8-stage post-processing
│       ├── labels.rs         # PII label definitions and canonicalization
│       └── lang.rs           # Language detection (CJK)
├── pii-engineer-server/      # HTTP server (Axum)
│   └── src/
│       ├── routes.rs         # API endpoints
│       ├── state.rs          # App state, model loading
│       └── middleware.rs     # Rate limiting, error handling
static/                       # Embedded frontend (rust-embed)
models/                       # ONNX models (auto-downloaded)
```
See CONTRIBUTING.md for guidelines. We especially welcome:
- Validation rules for country-specific ID formats
- Test cases for underrepresented languages
- Performance optimizations
See NOTICE for upstream attributions.