niranjanxprt/Clyra

🔒 Clyra - AI Privacy & Compliance Platform

Advanced PII Detection, GDPR Compliance, and EU AI Act Monitoring with MCP Integration

Clyra is a privacy and compliance platform that pairs NER-based PII detection with regulatory compliance tooling. Built with React, TypeScript, and FastAPI and powered by lovable.dev, it provides real-time PII detection, anonymization, and compliance monitoring for GDPR and EU AI Act requirements.

🌟 Key Features

  • πŸ” Advanced PII Detection: Multiple NER models with confidence scoring
  • πŸ›‘οΈ GDPR & AI Act Compliance: Real-time monitoring and risk assessment
  • 🎀 Speech-to-Text Integration: OpenAI Whisper for audio transcription
  • πŸ”— MCP Server: Model Context Protocol integration for AI assistants
  • πŸ“Š Compliance Dashboard: Real-time monitoring and analytics
  • 🌐 Modern Frontend: Built with React, TypeScript, and Tailwind CSS
  • ⚑ FastAPI Backend: High-performance PII detection API
  • πŸ—„οΈ Vector Database: Weaviate integration for data storage

🏗️ Architecture Overview

┌─────────────────────────────────────────────────────────────┐
│                       Clyra Ecosystem                       │
├─────────────────────────────────────────────────────────────┤
│  Frontend (React/Vite)       │  Backend (FastAPI)           │
│  ├─ Compliance Dashboard     │  ├─ PII Detection API        │
│  ├─ Real-time Monitoring     │  ├─ Hugging Face Integration │
│  ├─ Speech-to-Text UI        │  ├─ Multiple NER Models      │
│  └─ Admin Interface          │  └─ Anonymization Tools      │
├─────────────────────────────────────────────────────────────┤
│  MCP Server (Cloudflare)     │  Vector DB (Weaviate)        │
│  ├─ GitHub OAuth             │  ├─ User Login Tracking      │
│  ├─ PII Detection Tools      │  ├─ Compliance Data          │
│  ├─ Compliance Prompts       │  └─ Audit Logs               │
│  └─ AI Act Monitoring        │                              │
└─────────────────────────────────────────────────────────────┘

🚀 Quick Start

Prerequisites

  • Node.js 18+ and npm
  • Python 3.8+ (for PII detection API)
  • GitHub OAuth App (for MCP authentication)
  • Hugging Face API Token (for PII detection)
  • OpenAI API Key (for speech-to-text)

1. Clone and Install

git clone <your-repo-url>
cd clyra
npm install

2. Environment Setup

# Copy the comprehensive environment template
cp env.example .env

# Edit .env with your actual values
nano .env

Required Environment Variables:

# GitHub OAuth (Required)
GITHUB_CLIENT_ID=your_github_client_id
GITHUB_CLIENT_SECRET=your_github_client_secret

# Hugging Face API (Required for PII detection)
HF_API_TOKEN=your_huggingface_token

# OpenAI API (Required for speech-to-text)
OPENAI_API_KEY=your_openai_api_key
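
A missing variable from the list above tends to surface later as an opaque authentication error, so a startup check can help. The following is a minimal sketch; `findMissingEnvVars` is a hypothetical helper, not part of the Clyra codebase, and only the variable names come from this README.

```typescript
// Names taken from the "Required Environment Variables" list above.
const REQUIRED_ENV_VARS = [
  "GITHUB_CLIENT_ID",
  "GITHUB_CLIENT_SECRET",
  "HF_API_TOKEN",
  "OPENAI_API_KEY",
] as const;

// Hypothetical helper: returns the names of required variables that are
// missing or blank in the given environment map, in list order.
function findMissingEnvVars(
  env: Record<string, string | undefined>,
): string[] {
  return REQUIRED_ENV_VARS.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === "";
  });
}
```

In a Node.js process this could be called as `findMissingEnvVars(process.env)` before starting any service, failing fast if the returned array is not empty.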

3. Start the Services

Terminal 1 - PII Detection API:

# Activate Python virtual environment
source presidio_env/bin/activate

# Start FastAPI server
cd streamlit
uvicorn hf_api_fastapi:app --reload --port 8000

Terminal 2 - Frontend:

cd frontend
npm run dev

Terminal 3 - MCP Server (Optional):

npm run dev

4. Access the Application

📁 Project Structure

clyra/
├── 📁 frontend/                 # React/Vite Frontend (Lovable.dev)
│   ├── 📁 src/
│   │   ├── 📁 components/       # UI Components
│   │   │   ├── Dashboard.tsx    # Compliance Dashboard
│   │   │   ├── AIInputField.tsx # Speech-to-Text Input
│   │   │   └── ui/              # shadcn/ui Components
│   │   ├── 📁 pages/            # Application Pages
│   │   ├── 📁 lib/              # Utilities & API Clients
│   │   └── App.tsx              # Main Application
│   └── package.json             # Frontend Dependencies
├── 📁 src/                      # MCP Server (Cloudflare Worker)
│   ├── index.ts                 # Main MCP Server
│   ├── pii-client.ts            # PII API Client
│   ├── weaviate.ts              # Vector DB Integration
│   └── auth/                    # GitHub OAuth Handlers
├── 📁 streamlit/                # FastAPI Backend
│   ├── hf_api_fastapi.py        # PII Detection API
│   ├── README_FASTAPI.md        # API Documentation
│   └── test_fastapi.py          # API Tests
├── 📁 presidio_env/             # Python Virtual Environment
├── 📄 env.example               # Environment Template
├── 📄 wrangler.jsonc            # Cloudflare Worker Config
├── 📄 mcp-inspector.json        # MCP Inspector Config
└── 📄 README.md                 # This File

🔧 Component Details

1. 🎨 Frontend (Powered by Lovable.dev)

Technology Stack:

  • React 18 with TypeScript
  • Vite for fast development
  • Tailwind CSS for styling
  • shadcn/ui for components
  • React Query for server-state management and data fetching
  • React Router for navigation

Key Features:

  • Compliance Dashboard: Real-time monitoring of GDPR and AI Act compliance
  • Speech-to-Text: OpenAI Whisper integration for audio input
  • PII Detection: Real-time sensitive data detection
  • Admin Interface: Management and configuration tools
  • Responsive Design: Mobile-first approach

Built with Lovable.dev:

  • Modern, accessible UI components
  • Optimized performance
  • Type-safe development
  • Hot reloading and instant preview

2. ⚡ Backend (FastAPI)

Technology Stack:

  • FastAPI for high-performance API
  • Hugging Face Models for PII detection
  • Pydantic for data validation
  • Uvicorn for ASGI server

Available Models:

  • obi/deid_roberta_i2b2 - Medical de-identification (Recommended)
  • StanfordAIMI/stanford-deidentifier-base - Stanford de-identifier
  • dslim/bert-base-NER - General NER
  • Jean-Baptiste/roberta-large-ner-english - RoBERTa Large NER
  • dslim/bert-large-NER - BERT Large NER
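
Each of these models assigns a confidence score to every entity it finds, and low-scoring hits are a common source of false positives. The sketch below filters entities by a threshold. The `DetectedEntity` shape is modelled on Hugging Face token-classification output with grouped entities; whether Clyra's API returns exactly these fields is an assumption, and `filterByConfidence` with its 0.8 default is hypothetical.

```typescript
// Shape modelled on Hugging Face token-classification output (grouped
// entities). That the Clyra API returns these exact fields is an assumption.
interface DetectedEntity {
  entity_group: string; // label such as "PER" or "LOC"
  score: number; // model confidence between 0 and 1
  word: string; // matched text
  start: number; // character offset, inclusive
  end: number; // character offset, exclusive
}

// Hypothetical helper: keep only entities at or above a confidence threshold.
function filterByConfidence(
  entities: DetectedEntity[],
  minScore = 0.8,
): DetectedEntity[] {
  return entities.filter((entity) => entity.score >= minScore);
}
```

A lower threshold catches more PII at the cost of more false positives; for compliance use cases, erring on the low side is often the safer choice.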

API Endpoints:

  • GET /health - Health check and available models
  • GET /models - List available NER models
  • POST /detect - Detect PII entities
  • POST /anonymize - Anonymize PII with multiple operations

Anonymization Operations:

  • Redact: Remove PII completely
  • Replace: Replace with entity tags (<PERSON>, <EMAIL>)
  • Mask: Replace with asterisks (********)
  • Highlight: Mark PII for review
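
The four operations differ only in what replaces each detected span. The sketch below applies one operation to a list of character spans. It is illustrative, not the backend's implementation: spans are assumed not to overlap, and the `[[...]]` highlight markers are an arbitrary choice.

```typescript
type Operation = "redact" | "replace" | "mask" | "highlight";

interface EntitySpan {
  label: string; // e.g. "PERSON"
  start: number; // character offset, inclusive
  end: number; // character offset, exclusive
}

// Returns the text that should stand in for one detected span.
function replacementFor(op: Operation, label: string, original: string): string {
  switch (op) {
    case "redact":
      return ""; // remove the PII completely
    case "replace":
      return `<${label}>`; // entity tag such as <PERSON>
    case "mask":
      return "*".repeat(original.length); // same length, all asterisks
    case "highlight":
    default:
      return `[[${original}]]`; // marker style is an assumption
  }
}

// Applies one operation to every span. Spans are assumed not to overlap.
function anonymize(text: string, spans: EntitySpan[], op: Operation): string {
  // Work right to left so earlier offsets stay valid after each edit.
  const ordered = [...spans].sort((a, b) => b.start - a.start);
  let result = text;
  for (const span of ordered) {
    const original = result.slice(span.start, span.end);
    result =
      result.slice(0, span.start) +
      replacementFor(op, span.label, original) +
      result.slice(span.end);
  }
  return result;
}
```

For the input "Email Jane at jane@example.com" with a PERSON span at 6 to 10 and an EMAIL span at 14 to 30, the replace operation yields "Email <PERSON> at <EMAIL>".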

3. 🔗 MCP Server (Model Context Protocol)

Technology Stack:

  • Cloudflare Workers for serverless deployment
  • TypeScript for type safety
  • Hono for lightweight web framework
  • GitHub OAuth for authentication

MCP Tools:

  • hello - Basic greeting tool
  • pii_detect - Detect PII entities
  • pii_anonymize_redact - Remove PII completely
  • pii_anonymize_replace - Replace with placeholders
  • pii_anonymize_mask - Mask with asterisks
  • pii_anonymize_highlight - Highlight PII
  • pii_models - List available models
  • pii_health - Check API health
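
The tool names line up closely with the backend endpoints, so one plausible design is a thin routing layer between the two. The sketch below shows such a mapping. It is an assumption about how the pieces could fit together, not a description of src/index.ts: `routeTool` is hypothetical, and sending the operation as a separate value is a guess.

```typescript
// Hypothetical mapping from MCP tool names to backend calls. Paths come from
// the API Endpoints list; how the operation is transmitted is an assumption.
interface BackendCall {
  method: "GET" | "POST";
  path: string;
  operation?: string;
}

const FIXED_ROUTES = new Map<string, BackendCall>([
  ["pii_detect", { method: "POST", path: "/detect" }],
  ["pii_models", { method: "GET", path: "/models" }],
  ["pii_health", { method: "GET", path: "/health" }],
]);

// Returns the backend call for a tool, or undefined for tools such as
// "hello" that need no backend request.
function routeTool(toolName: string): BackendCall | undefined {
  const prefix = "pii_anonymize_";
  if (toolName.startsWith(prefix)) {
    // pii_anonymize_mask -> POST /anonymize with operation "mask"
    return {
      method: "POST",
      path: "/anonymize",
      operation: toolName.slice(prefix.length),
    };
  }
  return FIXED_ROUTES.get(toolName);
}
```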

MCP Prompts:

  • pii-privacy-assistant - Privacy-focused AI assistant
  • pii-compliance-checker - GDPR/AI Act compliance expert
  • pii-safe-sharing - Safe data sharing guidance
  • ai-act-gdpr-lawyer - Legal compliance expert

4. 🔍 MCP Inspector

Features:

  • Interactive Testing: Test MCP tools and prompts
  • Debug Interface: Monitor MCP protocol messages
  • GitHub OAuth: Secure authentication flow
  • Real-time Logs: Debug server responses

Usage:

npm run inspect

5. 🗄️ Weaviate Integration

Features:

  • User Tracking: Store GitHub login information
  • Compliance Data: Track compliance events
  • Audit Logs: Maintain detailed logs
  • Vector Search: Semantic search capabilities

Schema:

class GitHubUser {
  username: string
  lastLoginAt: date
}
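
The schema above is shorthand. In Weaviate's own schema format, a class is a JSON object with a name and a list of typed properties, which might look like the following sketch. The `text` and `date` data types are Weaviate's; the property descriptions are illustrative additions, and the exact definition used in src/weaviate.ts may differ.

```typescript
// Minimal typing for a Weaviate class definition (only the fields used here).
interface WeaviateProperty {
  name: string;
  dataType: string[];
  description?: string;
}

interface WeaviateClass {
  class: string;
  properties: WeaviateProperty[];
}

// Sketch of the GitHubUser class in Weaviate's schema format.
// Descriptions are illustrative and not taken from the Clyra source.
const gitHubUserClass: WeaviateClass = {
  class: "GitHubUser",
  properties: [
    { name: "username", dataType: ["text"], description: "GitHub login name" },
    {
      name: "lastLoginAt",
      dataType: ["date"],
      description: "Most recent login as an RFC 3339 timestamp",
    },
  ],
};
```

An object of this shape can be serialized with `JSON.stringify` and sent to a Weaviate instance's schema endpoint to create the class.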

6. 🎤 Speech-to-Text (OpenAI Whisper)

Features:

  • Real-time Transcription: Convert audio to text
  • Multiple Formats: Support for various audio formats
  • High Accuracy: OpenAI Whisper model
  • Seamless Integration: Built into input components

Implementation:

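// apiKey is assumed to be defined in the enclosing scope.
const transcribeAudio = async (audioBlob: Blob): Promise<string> => {
  const formData = new FormData();
  formData.append("file", audioBlob, "audio.webm");
  formData.append("model", "whisper-1");

  const response = await fetch("https://api.openai.com/v1/audio/transcriptions", {
    method: "POST",
    headers: { Authorization: `Bearer ${apiKey}` },
    body: formData,
  });

  // Surface HTTP errors instead of trying to parse an error body as a transcript.
  if (!response.ok) {
    throw new Error(`Transcription failed with status ${response.status}`);
  }

  // response.json() returns a Promise, so await it before reading .text.
  const data = (await response.json()) as { text: string };
  return data.text;
};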

🛡️ Compliance Features

GDPR Compliance

  • Data Minimization: Collect only necessary data
  • Purpose Limitation: Use data for stated purposes only
  • Storage Limitation: Automatic data retention policies
  • Integrity & Confidentiality: Protect against unauthorized access

EU AI Act Compliance

  • Risk Assessment: Identify high-risk AI systems
  • Documentation: Maintain detailed system records
  • Transparency: Clear AI system information
  • Human Oversight: Ensure human control over AI systems

Real-time Monitoring

  • Compliance Score: Overall compliance percentage
  • Issue Tracking: GDPR and AI Act violations
  • Trend Analysis: Compliance trends over time
  • Alert System: Immediate violation notifications
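
The README does not say how the compliance score is calculated. One simple reading of "overall compliance percentage" is the share of checks that pass, sketched below. The formula, the `ComplianceCheck` shape, and the treatment of an empty list are all assumptions.

```typescript
interface ComplianceCheck {
  id: string;
  framework: "GDPR" | "AI_ACT";
  passed: boolean;
}

// Assumed formula: passed checks divided by total checks, as a percentage
// rounded to one decimal place. An empty list is treated as fully compliant.
function complianceScore(checks: ComplianceCheck[]): number {
  if (checks.length === 0) return 100;
  const passed = checks.filter((check) => check.passed).length;
  return Math.round((passed / checks.length) * 1000) / 10;
}

// Failed checks correspond to the issues shown under "Issue Tracking".
function openIssues(checks: ComplianceCheck[]): ComplianceCheck[] {
  return checks.filter((check) => !check.passed);
}
```

A weighted variant, where high-risk checks count for more, would fit the AI Act's risk-based approach better but needs weights that this README does not define.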

🚀 Deployment

Frontend Deployment (Lovable.dev)

  1. Push changes to your repository
  2. Lovable.dev automatically deploys
  3. Access via your custom domain

Backend Deployment

# Using Docker
docker build -t clyra-api .
docker run -p 8000:8000 -e HF_API_TOKEN=your_token clyra-api

# Using Cloud Run
gcloud run deploy clyra-api --source .

MCP Server Deployment

# Deploy to Cloudflare Workers
npm run deploy

🔧 Development

Local Development

# Start all services
npm run dev:all

# Start individual services
npm run dev:frontend
npm run dev:backend
npm run dev:mcp

Testing

# Frontend tests
cd frontend && npm test

# Backend tests
cd streamlit && python test_fastapi.py

# MCP Inspector
npm run inspect

Environment Variables

See env.example for comprehensive environment configuration.

📚 Documentation

  • API Documentation: streamlit/README_FASTAPI.md
  • MCP Inspector: MCP-INSPECTOR.md
  • Frontend Components: frontend/src/components/
  • Environment Setup: env.example

🀝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

🆘 Support

  • Issues: GitHub Issues
  • Documentation: See individual component READMEs
  • API Testing: Use the interactive docs at /docs

🎯 Roadmap

  • Enhanced compliance reporting
  • Multi-language PII detection
  • Advanced anonymization techniques
  • Real-time collaboration features
  • Mobile application
  • Enterprise integrations

🔒 Clyra - Protecting Privacy in the AI Era

Built with ❤️ using modern web technologies and powered by lovable.dev

About

Big Berlin Hack -Libra Path


Contributors 5