Advanced PII Detection, GDPR Compliance, and EU AI Act Monitoring with MCP Integration
Clyra is a privacy and compliance platform that combines NER-based PII detection with regulatory compliance tooling. Its frontend is built with React and Vite through Lovable.dev, and the platform provides real-time PII detection, anonymization, and compliance monitoring for GDPR and EU AI Act requirements.
- Advanced PII Detection: Multiple NER models with confidence scoring
- GDPR & AI Act Compliance: Real-time monitoring and risk assessment
- Speech-to-Text Integration: OpenAI Whisper for audio transcription
- MCP Server: Model Context Protocol integration for AI assistants
- Compliance Dashboard: Real-time monitoring and analytics
- Modern Frontend: Built with React, TypeScript, and Tailwind CSS
- FastAPI Backend: High-performance PII detection API
- Vector Database: Weaviate integration for data storage
┌─────────────────────────────────────────────────────────────┐
│                       Clyra Ecosystem                       │
├──────────────────────────────┬──────────────────────────────┤
│ Frontend (React/Vite)        │ Backend (FastAPI)            │
│ ├─ Compliance Dashboard      │ ├─ PII Detection API         │
│ ├─ Real-time Monitoring      │ ├─ Hugging Face Integration  │
│ ├─ Speech-to-Text UI         │ ├─ Multiple NER Models       │
│ └─ Admin Interface           │ └─ Anonymization Tools       │
├──────────────────────────────┼──────────────────────────────┤
│ MCP Server (Cloudflare)      │ Vector DB (Weaviate)         │
│ ├─ GitHub OAuth              │ ├─ User Login Tracking       │
│ ├─ PII Detection Tools       │ ├─ Compliance Data           │
│ ├─ Compliance Prompts        │ └─ Audit Logs                │
│ └─ AI Act Monitoring         │                              │
└──────────────────────────────┴──────────────────────────────┘
- Node.js 18+ and npm
- Python 3.8+ (for PII detection API)
- GitHub OAuth App (for MCP authentication)
- Hugging Face API Token (for PII detection)
- OpenAI API Key (for speech-to-text)
git clone <your-repo-url>
cd clyra
npm install

# Copy the comprehensive environment template
cp env.example .env
# Edit .env with your actual values
nano .env

Required Environment Variables:
# GitHub OAuth (Required)
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret
# Hugging Face API (Required for PII detection)
HF_API_TOKEN=your_huggingface_token
# OpenAI API (Required for speech-to-text)
OPENAI_API_KEY=your_openai_api_key

Terminal 1 - PII Detection API:
# Activate Python virtual environment
source presidio_env/bin/activate
# Start FastAPI server
cd streamlit
uvicorn hf_api_fastapi:app --reload --port 8000

Terminal 2 - Frontend:
cd frontend
npm run dev

Terminal 3 - MCP Server (Optional):
npm run dev

- Frontend: http://localhost:5173
- PII API: http://localhost:8000
- API Docs: http://localhost:8000/docs
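
Before starting the three terminals, it can help to confirm that the required environment variables listed earlier are actually set. The following is a minimal sketch, not part of the Clyra codebase: the helper name `missing_env_vars` is hypothetical, and it only inspects a mapping such as `os.environ`, so it assumes the values from `.env` have already been exported into the shell.

```python
import os

# Variable names are copied from the environment template shown earlier.
REQUIRED_VARS = (
    "GITHUB_CLIENT_ID",
    "GITHUB_CLIENT_SECRET",
    "HF_API_TOKEN",
    "OPENAI_API_KEY",
)


def missing_env_vars(env=None):
    """Return the required variables that are unset or blank in `env`.

    `env` defaults to os.environ; this hypothetical helper does not
    read the .env file itself.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
example_env = {"GITHUB_CLIENT_ID": "abc", "HF_API_TOKEN": ""}
print(missing_env_vars(example_env))
# ['GITHUB_CLIENT_SECRET', 'HF_API_TOKEN', 'OPENAI_API_KEY']
```

Calling `missing_env_vars()` with no argument checks the current process environment in the same way.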
clyra/
├── frontend/                       # React/Vite Frontend (Lovable.dev)
│   ├── src/
│   │   ├── components/             # UI Components
│   │   │   ├── Dashboard.tsx       # Compliance Dashboard
│   │   │   ├── AIInputField.tsx    # Speech-to-Text Input
│   │   │   └── ui/                 # shadcn/ui Components
│   │   ├── pages/                  # Application Pages
│   │   ├── lib/                    # Utilities & API Clients
│   │   └── App.tsx                 # Main Application
│   └── package.json                # Frontend Dependencies
├── src/                            # MCP Server (Cloudflare Worker)
│   ├── index.ts                    # Main MCP Server
│   ├── pii-client.ts               # PII API Client
│   ├── weaviate.ts                 # Vector DB Integration
│   └── auth/                       # GitHub OAuth Handlers
├── streamlit/                      # FastAPI Backend
│   ├── hf_api_fastapi.py           # PII Detection API
│   ├── README_FASTAPI.md           # API Documentation
│   └── test_fastapi.py             # API Tests
├── presidio_env/                   # Python Virtual Environment
├── env.example                     # Environment Template
├── wrangler.jsonc                  # Cloudflare Worker Config
├── mcp-inspector.json              # MCP Inspector Config
└── README.md                       # This File
Frontend Technology Stack:
- React 18 with TypeScript
- Vite for fast development
- Tailwind CSS for styling
- shadcn/ui for components
- React Query for server-state management and data fetching
- React Router for navigation
Key Features:
- Compliance Dashboard: Real-time monitoring of GDPR and AI Act compliance
- Speech-to-Text: OpenAI Whisper integration for audio input
- PII Detection: Real-time sensitive data detection
- Admin Interface: Management and configuration tools
- Responsive Design: Mobile-first approach
Built with Lovable.dev:
- Modern, accessible UI components
- Optimized performance
- Type-safe development
- Hot reloading and instant preview
Backend Technology Stack:
- FastAPI for high-performance API
- Hugging Face Models for PII detection
- Pydantic for data validation
- Uvicorn for ASGI server
Available Models:
- obi/deid_roberta_i2b2: Medical de-identification (Recommended)
- StanfordAIMI/stanford-deidentifier-base: Stanford de-identifier
- dslim/bert-base-NER: General NER
- Jean-Baptiste/roberta-large-ner-english: RoBERTa Large NER
- dslim/bert-large-NER: BERT Large NER
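
Because these models report a confidence score per entity, low-confidence detections can be dropped before any anonymization step. The sketch below assumes entities shaped like the aggregated output of a Hugging Face token-classification pipeline (`entity_group`, `score`, `word`, `start`, `end`); whether the Clyra API returns exactly this shape is an assumption, and both the function name `filter_by_confidence` and the 0.85 threshold are illustrative.

```python
def filter_by_confidence(entities, threshold=0.85):
    """Keep only entities whose score is at or above `threshold`.

    The 0.85 default is an arbitrary illustrative value, not a Clyra setting.
    """
    return [entity for entity in entities if entity["score"] >= threshold]


# Hand-written sample data in the assumed entity shape.
sample = [
    {"entity_group": "PER", "score": 0.99, "word": "Ada Lovelace", "start": 0, "end": 12},
    {"entity_group": "LOC", "score": 0.41, "word": "May", "start": 25, "end": 28},
]

kept = filter_by_confidence(sample)
print([entity["word"] for entity in kept])  # ['Ada Lovelace']
```

A lower threshold trades fewer missed entities for more false positives, which may be the safer direction when the goal is to avoid leaking PII.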
API Endpoints:
- GET /health: Health check and available models
- GET /models: List available NER models
- POST /detect: Detect PII entities
- POST /anonymize: Anonymize PII with multiple operations
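
As a rough illustration of calling the endpoints above from Python, the sketch below builds a `POST /detect` request with the standard library only. The JSON field names `text` and `model` are assumptions, not the documented request schema; the interactive docs at http://localhost:8000/docs are the place to confirm the real one. The function names are hypothetical as well.

```python
import json
import urllib.request

API_BASE = "http://localhost:8000"  # local PII API address from the setup section


def build_detect_request(text, model="obi/deid_roberta_i2b2"):
    """Build (but do not send) a POST /detect request.

    The "text" and "model" field names are assumed for illustration.
    """
    payload = json.dumps({"text": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_BASE}/detect",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def detect(text, model="obi/deid_roberta_i2b2", timeout=30):
    """Send the request to a running API and return the decoded JSON body."""
    request = build_detect_request(text, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_detect_request("My name is Jane Doe")
print(request.get_method(), request.full_url)  # POST http://localhost:8000/detect
```

Keeping request construction separate from the network call makes the payload easy to inspect without a running server.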
Anonymization Operations:
- Redact: Remove PII completely
- Replace: Replace with entity tags (<PERSON>, <EMAIL>)
- Mask: Replace with asterisks (********)
- Highlight: Mark PII for review
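
To make the four operations concrete, here is a minimal sketch of how each could be applied to detected spans. It is not the Clyra implementation: the entity shape (`start`, `end`, `label`), the length-preserving mask, and the square-bracket highlight marker are all assumptions chosen for illustration.

```python
def anonymize(text, entities, operation="replace"):
    """Apply one anonymization operation to every detected span.

    `entities` is a list of dicts with "start", "end", and "label" keys;
    this shape is assumed here, not taken from the API's response format.
    """
    renderers = {
        "redact": lambda span, label: "",
        "replace": lambda span, label: f"<{label}>",
        "mask": lambda span, label: "*" * len(span),  # length-preserving (assumption)
        "highlight": lambda span, label: f"[{span}]",  # bracket marker (assumption)
    }
    if operation not in renderers:
        raise ValueError(f"unknown operation: {operation}")
    render = renderers[operation]
    # Work from right to left so earlier offsets stay valid after each edit.
    for entity in sorted(entities, key=lambda e: e["start"], reverse=True):
        span = text[entity["start"]:entity["end"]]
        text = text[:entity["start"]] + render(span, entity["label"]) + text[entity["end"]:]
    return text


text = "Contact Jane Doe at jane@example.com"
entities = [
    {"start": 8, "end": 16, "label": "PERSON"},
    {"start": 20, "end": 36, "label": "EMAIL"},
]
print(anonymize(text, entities, "replace"))  # Contact <PERSON> at <EMAIL>
```

Processing spans from the end of the string backwards avoids recomputing offsets after each substitution changes the text length.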
MCP Server Technology Stack:
- Cloudflare Workers for serverless deployment
- TypeScript for type safety
- Hono for lightweight web framework
- GitHub OAuth for authentication
MCP Tools:
- hello: Basic greeting tool
- pii_detect: Detect PII entities
- pii_anonymize_redact: Remove PII completely
- pii_anonymize_replace: Replace with placeholders
- pii_anonymize_mask: Mask with asterisks
- pii_anonymize_highlight: Highlight PII
- pii_models: List available models
- pii_health: Check API health
MCP Prompts:
- pii-privacy-assistant: Privacy-focused AI assistant
- pii-compliance-checker: GDPR/AI Act compliance expert
- pii-safe-sharing: Safe data sharing guidance
- ai-act-gdpr-lawyer: Legal compliance expert
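
For readers unfamiliar with how an MCP client invokes one of the tools listed above, the sketch below builds the JSON-RPC 2.0 message for a `tools/call` request, which is the method the Model Context Protocol defines for tool invocation. The argument name `text` is an assumption about the `pii_detect` tool's input schema, and `tool_call_request` is a hypothetical helper. In practice an MCP client library would also handle the initialization handshake and transport.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call_request(tool_name, arguments):
    """Return an MCP `tools/call` request as a JSON-RPC 2.0 message dict."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# "text" is an assumed argument name for the pii_detect tool.
message = tool_call_request("pii_detect", {"text": "Call me at 555-0100"})
print(json.dumps(message, indent=2))
```

The MCP Inspector can be used to see the tools' actual input schemas before relying on any argument names.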
MCP Inspector Features:
- Interactive Testing: Test MCP tools and prompts
- Debug Interface: Monitor MCP protocol messages
- GitHub OAuth: Secure authentication flow
- Real-time Logs: Debug server responses
Usage:
npm run inspect

Vector Database (Weaviate) Features:
- User Tracking: Store GitHub login information
- Compliance Data: Track compliance events
- Audit Logs: Maintain detailed logs
- Vector Search: Semantic search capabilities
Schema:
class GitHubUser {
  username: string
  lastLoginAt: date
}

Speech-to-Text Features:
- Real-time Transcription: Convert audio to text
- Multiple Formats: Support for various audio formats
- High Accuracy: OpenAI Whisper model
- Seamless Integration: Built into input components
Implementation:
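// Assumes `apiKey` is defined in the enclosing scope. Avoid exposing a real key in browser code.
const transcribeAudio = async (audioBlob: Blob): Promise<string> => {
  const formData = new FormData();
  formData.append("file", audioBlob, "audio.webm");
  formData.append("model", "whisper-1");
  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: formData,
  });
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }
  // response.json() returns a Promise, so it must be awaited before reading .text
  const data = await response.json();
  return data.text;
};

GDPR Principles:
- Data Minimization: Collect only necessary data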
- Purpose Limitation: Use data for stated purposes only
- Storage Limitation: Automatic data retention policies
- Integrity & Confidentiality: Protect against unauthorized access

EU AI Act Requirements:
- Risk Assessment: Identify high-risk AI systems
- Documentation: Maintain detailed system records
- Transparency: Clear AI system information
- Human Oversight: Ensure human control over AI systems

Compliance Dashboard Metrics:
- Compliance Score: Overall compliance percentage
- Issue Tracking: GDPR and AI Act violations
- Trend Analysis: Compliance trends over time
- Alert System: Immediate violation notifications
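
One way to think about an overall compliance percentage and per-regulation issue tracking is as an aggregate over individual pass/fail checks. The sketch below is an illustration under that assumption, not a description of how the Clyra dashboard computes its score; the `Check` type, the weights, and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Check:
    """A single compliance check result (hypothetical data model)."""
    name: str
    regulation: str  # for example "GDPR" or "AI Act"
    passed: bool
    weight: float = 1.0


def compliance_score(checks):
    """Weighted percentage of passed checks, from 0.0 to 100.0."""
    total = sum(check.weight for check in checks)
    if total <= 0:
        return 0.0  # no checks recorded, so no score can be claimed
    passed = sum(check.weight for check in checks if check.passed)
    return round(100.0 * passed / total, 1)


def open_issues(checks):
    """Group the names of failed checks by regulation."""
    issues = {}
    for check in checks:
        if not check.passed:
            issues.setdefault(check.regulation, []).append(check.name)
    return issues


checks = [
    Check("Data minimization", "GDPR", passed=True, weight=2.0),
    Check("Storage limitation", "GDPR", passed=False),
    Check("Human oversight", "AI Act", passed=True),
]
print(compliance_score(checks))  # 75.0
print(open_issues(checks))       # {'GDPR': ['Storage limitation']}
```

Weights let a failed high-impact check lower the score more than a minor one; equal weights reduce this to a plain pass rate.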

Frontend Deployment (Lovable.dev):
- Push changes to your repository
- Lovable.dev deploys the frontend automatically
- Access the app via your custom domain

# Using Docker
docker build -t clyra-api .
docker run -p 8000:8000 -e HF_API_TOKEN=your_token clyra-api
# Using Cloud Run
gcloud run deploy clyra-api --source .

# Deploy to Cloudflare Workers
npm run deploy

# Start all services
npm run dev:all
# Start individual services
npm run dev:frontend
npm run dev:backend
npm run dev:mcp

# Frontend tests (run from the repository root; the subshell keeps the working directory unchanged)
(cd frontend && npm test)
# Backend tests
(cd streamlit && python test_fastapi.py)
# MCP Inspector
npm run inspect

See env.example for comprehensive environment configuration.
- API Documentation: streamlit/README_FASTAPI.md
- MCP Inspector: MCP-INSPECTOR.md
- Frontend Components: frontend/src/components/
- Environment Setup: env.example
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests if applicable
- Submit a pull request
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: GitHub Issues
- Documentation: See individual component READMEs
- API Testing: Use the interactive docs at /docs
- Enhanced compliance reporting
- Multi-language PII detection
- Advanced anonymization techniques
- Real-time collaboration features
- Mobile application
- Enterprise integrations
Clyra - Protecting Privacy in the AI Era
Built with ❤️ using modern web technologies and powered by Lovable.dev