Skip to content

makr-code/VCC-UDS3

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

44 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

uds3_logo # UDS3 - Unified Database Strategy v3.0

Enterprise-ready multi-database distribution system with PKI-integrated security

UDS3 is a state-of-the-art multi-database framework for administrative and legal documents β€” an AI LLM RAG framework with full SAGA support, GDPR compliance, Search API and comprehensive secu[...]


πŸ›οΈ The VCC Ecosystem

Vision: Digital transformation of public administration

The VCC (Veritas-Covina-Clara) ecosystem is a self-optimizing AI system for digitalizing public administration with a focus on digital sovereignty, legal certainty and continuous [...].

VCC = Three symbiotic AI components:

  • Veritas: AI legal advisory system (Human-in-the-Loop)
  • Covina: Automated knowledge updating (Knowledge Update)
  • Clara: Continuous model training (Continuous Learning)

Core components

πŸ” VCC PKI (Public Key Infrastructure)

Repository: VCC-PKI

  • Function: Enterprise-grade certificate management and mTLS communication
  • Features:
    • Root CA and Intermediate CA management
    • Automatic certificate issuance for services
    • Certificate Revocation Lists (CRL)
    • Web GUI and CLI for administration
  • Integration: All VCC services use PKI certificates for secure communication
  • Status: Production-ready βœ…

πŸ‘€ User Service (.NET/C#)

Repository: VCC-User

  • Function: Central user management and authentication
  • Features:
    • Keycloak integration for SSO
    • Active Directory connection
    • Role and permission management
    • JWT-based authentication
  • Integration: Authenticates access to all VCC services
  • Status: Production-ready βœ…

πŸ—„οΈ UDS3 (Unified Database Strategy)

Repository: VCC-UDS3 (this project)

  • Function: Multi-database backend for structured and unstructured data
  • Responsibilities in the ecosystem:
    • Data persistence: Central storage layer for all VCC applications
    • Polyglot persistence: Optimal database choice per use case
      • Neo4j: Legal hierarchies, reference structures, process graphs
      • ChromaDB: Semantic search across legal documents
      • PostgreSQL: Structured metadata, audit logs
      • CouchDB: Binary attachments, original documents
    • Search API: High-performance search with hybrid retrieval (vector + graph + relational)
    • SAGA transactions: Distributed transaction safety across multiple databases
    • GDPR compliance: Automatic data classification and retention periods
    • Security layer: Row-level security and RBAC for all data access
  • Consumers: VERITAS, Clara, Covina (see below)
  • Status: Production-ready βœ…

βš–οΈ VERITAS (Administrative Law Information & Text Analysis AI System)

Repository: VCC-Veritas

  • Function: AI-powered legal advisory system for administrative law
  • Features:
    • RAG (Retrieval-Augmented Generation) over laws, regulations, court decisions
    • Natural language Q&A for public administration staff
    • Source references with paragraph citations
  • UDS3 usage:
    • Neo4j: Legal hierarchies (e.g., Building Code β†’ State Building Regulations β†’ Municipal statutes)
    • ChromaDB: Semantic similarity search
    • PostgreSQL: Metadata filtering (scope, date)
  • Status: Prototype ⚠️

πŸ“„ Clara (Document Processing & Classification)

Repository: VCC-Clara

  • Function: Automatic document processing and classification
  • Features:
    • OCR for scanned documents
    • Automatic classification (building application, notice, objection, etc.)
    • Metadata extraction (date, case number, parties)
    • Workflow routing based on document type
  • UDS3 usage:
    • PostgreSQL: Document metadata and workflow status
    • CouchDB: Original documents and OCR results
    • Neo4j: Document relationships (response to an application, etc.)
  • Status: Prototype ⚠️

πŸ”„ Covina (Process Mining & Orchestration)

Repository: VCC-Covina

  • Function: Administrative process analysis and optimization
  • Features:
    • Process mining over historical cases
    • Bottleneck detection in approval procedures
    • BPMN import and export
    • Workflow orchestration
  • UDS3 usage:
    • Neo4j: Process graphs and flow models
    • PostgreSQL: Event logs and performance metrics
  • Status: Prototype ⚠️

System architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€οΏ½οΏ½[...]
β”‚                  VCC: Veritas-Covina-Clara                      β”‚
β”‚              (Sovereign Administration AI System)               β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€οΏ½οΏ½[...]
β”‚                                                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”   β”‚
β”‚  β”‚ VERITAS  β”‚   β”‚  Clara   β”‚   β”‚ Covina   β”‚   β”‚  User    β”‚   β”‚
β”‚  β”‚   AI     β”‚   β”‚  Learn   β”‚   β”‚Knowledge β”‚   β”‚  Mgmt    β”‚   β”‚
β”‚  β”‚ Legal Q&Aβ”‚   β”‚  Loop    β”‚   β”‚  Update  β”‚   β”‚  Auth    β”‚   β”‚
β”‚  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜   β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜   β”‚
β”‚       β”‚              β”‚              β”‚              β”‚          β”‚
β”‚       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”΄β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜          β”‚
β”‚                           β”‚                                    β”‚
β”‚              β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”                       β”‚
β”‚              β”‚    UDS3 Backend         β”‚                       β”‚
β”‚              β”‚  (Multi-DB Strategy)    β”‚                       β”‚
β”‚              β””β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”˜                       β”‚
β”‚                β”‚     β”‚     β”‚     β”‚                             β”‚
β”‚         β”Œβ”€β”€β”€β”€β”€β”€β–Όβ” β”Œβ”€β”€β–Όβ”€β”€β” β”Œβ–Όβ”€β”€β”€β” β”Œβ–Όβ”€β”€β”€β”€β”                     β”‚
β”‚         β”‚ Neo4j β”‚ β”‚Chromaβ”‚β”‚PgSQLβ”‚β”‚Couchβ”‚                     β”‚
β”‚         β”‚ (VPB) β”‚ β”‚(Vect)β”‚β”‚(Txn)β”‚β”‚(Bin)β”‚                     β”‚
β”‚         β””β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜                     β”‚
β”‚                                                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚           VCC PKI (Certificate Authority)                β”‚ β”‚
β”‚  β”‚    - Root CA  - Service Certs  - mTLS  - CRL            β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β”‚                                                                 β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚  β”‚         Keycloak SSO + Active Directory Stub             β”‚ β”‚
β”‚  β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
└─────────────────────────────────────────────────────────────────��[...]

UDS3 as the heart of the ecosystem

Why multi-database?

Administrative data is heterogeneous:

  • Structured: Citizen data, case numbers β†’ PostgreSQL
  • Graph-based: Legal hierarchies, process flows β†’ Neo4j
  • Semantic: Full-text legal documents β†’ ChromaDB (vector search)
  • Binary: Scans, PDFs, signatures β†’ CouchDB (file storage)

UDS3 unifies these storage types under one API and enables:

  • Transactional safety across database boundaries (SAGA pattern)
  • Optimal performance through specialization (right tool for the job)
  • GDPR compliance via a central compliance layer
  • Security through PKI integration and row-level security

Deployment model

Containerized with Docker Compose:

services:
  uds3-backend:     # UDS3 API Gateway
  neo4j:            # Graph Database
  chromadb:         # Vector Database
  postgresql:       # Relational Database
  couchdb:          # Document Database
  
  veritas:          # AI Legal Assistant
  clara:            # Document Processor
  covina:           # Process Mining
  
  user-service:     # User Management
  keycloak:         # SSO Provider
  pki-manager:      # Certificate Authority

Production environment (planned):

  • Kubernetes with Helm charts
  • Automatic scaling for UDS3 and AI services
  • High availability: PostgreSQL (Patroni), Neo4j (Cluster)
  • Monitoring: Prometheus + Grafana

πŸš€ Quick Start

Installation

pip install -e .

Basic Usage

from uds3 import get_optimized_unified_strategy

# Initialize strategy
strategy = get_optimized_unified_strategy()

# Create document
doc = strategy.saga_crud.create_document(
    content="Example document",
    metadata={"type": "regulation"}
)

# Search documents (NEW in v1.4.0 ⭐)
results = await strategy.search_api.hybrid_search(
    query="Photovoltaics requirements",
    top_k=10
)

✨ Features

� Security Layer (NEW in v1.4.0 ⭐)

Enterprise-grade security with PKI integration and least-privilege access control:

from security import User, UserRole, UDS3SecurityManager
from database.secure_api import SecureDatabaseAPI

# Initialize security with PKI
security = UDS3SecurityManager(
    pki_ca_cert_path="/path/to/ca.pem",
    enable_pki_auth=True
)

# Wrap database API with security
secure_api = SecureDatabaseAPI(database_api, security)

# All operations require authenticated user
user = User("alice", "alice", "alice@vcc.local", UserRole.USER)
doc_id = secure_api.create(user, {"title": "My Document"})
docs = secure_api.read(user, {})  # Only sees own documents

Security features:

  • βœ… Row-Level Security (RLS): Users can only access their own data
  • βœ… Role-Based Access Control (RBAC): 5 roles, 15 granular permissions
  • βœ… PKI Certificate Authentication: Integration with VCC PKI system
  • βœ… Comprehensive Audit Logging: All operations tracked
  • βœ… API Rate Limiting: DoS protection and fair resource allocation
  • βœ… Zero-Trust Architecture: Every request authenticated and authorized

See Security Documentation for complete details.

οΏ½πŸ” Search API (NEW in v1.4.0)

High-level search interface across vector, graph and relational backends:

from uds3 import get_optimized_unified_strategy
from uds3.search import SearchQuery

strategy = get_optimized_unified_strategy()

# Vector Search (Semantic Similarity)
results = await strategy.search_api.vector_search(embedding, top_k=10)

# Graph Search (Relationships)
results = await strategy.search_api.graph_search("Photovoltaics", top_k=10)

# Hybrid Search (Best of Both Worlds)
query = SearchQuery(
    query_text="What does Β§ 58 LBO BW regulate?",
    top_k=10,
    search_types=["vector", "graph"],
    weights={"vector": 0.5, "graph": 0.5}
)
results = await strategy.search_api.hybrid_search(query)

Benefits:

  • βœ… Unified API: One interface for all search types
  • βœ… Type Safety: Dataclasses for queries and results
  • βœ… Error Handling: Automatic retry logic and graceful degradation
  • βœ… Lazy Loading: Efficient resource management
  • βœ… Production Ready: 100% test coverage

See UDS3 Search API Production Guide for details.

πŸ“Š Multi-Database Support

  • Vector DB (ChromaDB): Semantic search, content embeddings
  • Graph DB (Neo4j): Relationships, hierarchies, network analysis
  • Relational DB (PostgreSQL): ACID transactions, fast filtering
  • File Storage (CouchDB): Binary assets, original files

πŸ”„ SAGA Pattern

Distributed transactions with automatic compensation:

from uds3.saga import SagaDatabaseCRUD

saga_crud = strategy.saga_crud

# Transactional document creation
doc = saga_crud.create_document(
    content="...",
    metadata={...}
)

πŸ”’ GDPR Compliance

Built-in data protection:

from uds3.dsgvo import DSGVOOperationType, PIIType

# Track PII processing
strategy.dsgvo_core.track_processing(
    operation=DSGVOOperationType.READ,
    pii_type=PIIType.NAME,
    subject_id="user_123"
)

# Anonymize after retention
strategy.dsgvo_core.anonymize_expired_data()

πŸ—‘οΈ Advanced CRUD Operations

  • Soft delete: Recoverable deletion with archive
  • Hard delete: Permanent removal with cascade
  • Restore: Recover soft-deleted documents
  • Archive: Long-term storage with retention policies

πŸ“š Documentation

πŸ”„ Migration from v1.3.x

Search API (v1.4.0)

Old way (deprecated):

from uds3.uds3_search_api import UDS3SearchAPI
from uds3.uds3_core import get_optimized_unified_strategy

strategy = get_optimized_unified_strategy()
search_api = UDS3SearchAPI(strategy)
results = await search_api.hybrid_search(query)

New way (recommended ⭐):

from uds3 import get_optimized_unified_strategy

strategy = get_optimized_unified_strategy()
results = await strategy.search_api.hybrid_search(query)

Benefits:

  • -50% imports (2 β†’ 1)
  • -33% code (3 LOC β†’ 2 LOC)
  • +100% discoverability (IDE autocomplete)

The old import path (uds3.uds3_search_api) still works with a deprecation warning and was removed in v1.5.0 (~3 months).

πŸ§ͺ Testing

# Run all tests
pytest tests/ -v

# Run with coverage
pytest tests/ --cov=uds3 --cov-report=html

# Run specific test suite
pytest tests/test_search_api.py -v

πŸ“Š Backends status

  • Neo4j: 1930 documents, PRODUCTION-READY βœ…
  • ChromaDB: Remote API, PRODUCTION-READY βœ…
  • PostgreSQL: Active (metadata storage) βœ…
  • CouchDB: Active (file storage) βœ…

🎯 Roadmap

See ROADMAP.md for detailed development planning (v1.6.0 - v3.0.0).

v1.5.0 (Current - October 2025)

  • βœ… All backends production-ready (ChromaDB, Neo4j, PostgreSQL, CouchDB)
  • βœ… Removed deprecated uds3.uds3_search_api import
  • βœ… Documentation improvements and status updates
  • βœ… VCC ecosystem documentation (Veritas-Covina-Clara)
  • πŸ”„ PostgreSQL execute_sql() API (in progress)
  • πŸ”„ Enhanced search filters (planned)
  • πŸ”„ Advanced reranking algorithms (planned)

Future versions

Next milestones:

  • v1.6.0 (Q1 2026): RAG pipeline maturity (Hybrid Search, RRF, Cross-Encoder Re-Ranking)
  • v2.0.0 (Q2-Q3 2026): Clara continuous learning (PEFT/LoRA), VPB integration
  • v2.5.0 (Q4 2026): Governance & compliance (EU AI Act, formal data governance)
  • v3.0.0 (2027+): Production readiness (security audits, K8s, high availability)

Detailed feature planning, technical requirements and implementation steps are in ROADMAP.md.


πŸ›οΈ UDS3 in the VCC ecosystem: Strategic role

The VCC triangle: Sovereign administration AI

UDS3 is not just a database abstraction β€” it is the fundamental backbone of the VCC (Veritas-Covina-Clara) ecosystem, a strategic AI system for the digital sovereignty of [...].

Strategic positioning:

  • Political anchoring: Part of "Digital Program 2025" Brandenburg (DABB)
  • Primary goal: Digital sovereignty (avoid vendor lock-in)
  • Architecture principle: On-premise, open-source, zero-trust
  • Purpose: Personnel augmentation (not replacement) in face of skill shortages (93.9% vacancy gap)

The three symbiotic cycles (powered by UDS3)

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚            VCC: Self-optimizing ecosystem                   β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚                                                            β”‚
β”‚  β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”      β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”       β”‚
β”‚  β”‚ VERITAS  │─────▢│  Clara   │◀────│ Covina   β”‚       β”‚
β”‚  β”‚ (Human-  β”‚      β”‚ (Learningβ”‚      β”‚(Knowledgeβ”‚       β”‚
β”‚  β”‚   Loop)  β”‚      β”‚   Loop)  β”‚      β”‚  Update) β”‚       β”‚
β”‚  β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜      β””β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”˜       β”‚
β”‚       β”‚                 β”‚                  β”‚             β”‚
β”‚       β”‚    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β–Όβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
β”‚       β”‚    β”‚         UDS3 Backend                        β”‚
β”‚       β”‚    β”‚   (Unified Database Strategy)               β”‚
β”‚       β”‚    └─┬─────┬──────┬──────┬────┐                 β”‚
β”‚       β”‚      β”‚     β”‚      β”‚      β”‚    β”‚                 β”‚
β”‚       β””β”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”€β”€β”Όβ”€β”€β”€β”€β”˜                 β”‚
β”‚              β”‚     β”‚      β”‚      β”‚                       β”‚
β”‚       β”Œβ”€β”€β”€β”€β”€β”€β–Όβ” β”Œβ”€β”€β–Όβ”€β”€β”€β” β”Œβ–Όβ”€β”€β”€β”€β” β”Œβ–Όβ”€β”€β”€β”€β”               β”‚
β”‚       β”‚ Neo4j β”‚ β”‚Chromaβ”‚ β”‚PgSQLβ”‚ β”‚Couchβ”‚               β”‚
β”‚       β”‚ (VPB) β”‚ β”‚(Vect)β”‚ β”‚(Txn)β”‚ β”‚(Bin)β”‚               β”‚
β”‚       β””β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”˜               β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

1. Veritas (User interaction cycle)

Function: AI-supported legal advisory system for administrative experts

UDS3 role:

  • Neo4j (VPB): Traversal of legal hierarchies (e.g., Building Code β†’ State Building Regulations β†’ Municipal statutes)
  • ChromaDB: Semantic search over laws, regulations, decisions
  • PostgreSQL: Metadata filtering (scope, effective date, jurisdiction)
  • Graph-RAG: Multi-hop reasoning over connected legal structures

Output: Not only "what" (content) but also "where" (position in the process), "who" (actors), "why" (legal basis)

Critical feature: Human-in-the-loop (legally & ethically mandatory)

  • Capture feedback for Clara training
  • Protection against "automation bias"
  • Ensuring accountability

2. Covina (Knowledge update cycle)

Function: Automated ingestion pipeline to combat knowledge obsolescence

UDS3 role:

  • Worker-based pipeline: Asynchronous processing with saga compensation
  • PostgreSQL: Transactional safety for metadata updates
  • ChromaDB: Continuous updating of the vector database
  • Neo4j (VPB): Incremental expansion of the process knowledge graph
  • CouchDB: Versioning of original documents (legal certainty)

GDPR integration:

  • Automatic pseudonymization of personal data
  • Data-minimizing views (Art. 25 GDPR)
  • Retention management

Risk mitigation:

  • Bias amplification: Historical admin data can perpetuate systemic biases
  • Solution: Quality control before ingesting into the knowledge base

3. Clara (Continuous learning cycle)

Function: The system's "self-improving brain"

Technology:

  • Parameter-Efficient Fine-Tuning (PEFT): LoRA/QLoRA adapters
  • Advantage: Extremely resource-efficient (small adapter modules instead of full training)
  • Input: User feedback collected by Veritas
  • Validation: Golden dataset (curated reference dataset)

UDS3 role:

  • PostgreSQL: Storage of feedback data and training metrics
  • CouchDB: Versioning of trained model adapters
  • SAGA log: Complete audit trail of all model updates (legally admissible)

Critical security challenge:

  • Cascading integrity compromise: False information in the knowledge base β†’ feedback β†’ permanently embedded in the model
  • Backdoor injection: Malicious LoRA adapter with hidden trigger
  • Solution: Just-in-time integrity verification (digital signature immediately before GPU loading)

The Administrative Process Backbone (VPB): The core

Problem: Historically grown "island solutions" fragment process data

Solution: VPB on a graph database (Neo4j)

  • Consolidation: Heterogeneous source systems β†’ unified, connected structure
  • Native: Administrative processes are networks (actors, entities, relationships)
  • Process intelligence: AI understands not only content but position, context, legal basis

UDS3 architecture principle: Polyglot persistence

Each storage system for its strength ("Right Tool for the Job"):

Database Specialization VCC use case
Neo4j Graph traversal VPB: process graphs, legal hierarchies, multi-hop reasoning
ChromaDB Vector similarity Semantic search, RAG, content embeddings
PostgreSQL ACID transactions Structured metadata, audit logs, JSONB for semi-structured data
CouchDB Offline-first, versioning Binary attachments, original documents, legal certainty

Transactional integrity: SAGA pattern

Distributed transactions across database boundaries:

  • Orchestration: Central orchestrator drives the process
  • Saga log: Complete, legally admissible audit trail (GDPR Art. 5 (2))
  • Compensation: Automatic rollback actions on errors
  • Legal certainty: Traceability of governmental action

πŸ”’ Security-by-design: Zero-trust architecture

UDS3 implements "never trust, always verify" at every layer:

1. Identity as the new perimeter

Hybrid IAM strategy:

  • On-prem AD: Source of truth for users
  • Keycloak: Modern federation (OIDC/OAuth 2.0)
  • Kerberos SSO: Seamless sign-on in the domain network
  • JWT propagation: Signed "digital passport" across microservices

2. PKI as the trust anchor

Sovereign public key infrastructure:

  • X.509 certificates: For every machine identity
  • Mutual TLS (mTLS): Mutual authentication of all services
  • Two-level PKI: Offline root CA + online intermediate CA
  • HSM-backed: Hardware Security Module for key ceremonies

3. Software supply chain integrity

Manifest principle:

  • Central manifest file: Hashes of all code files
  • Digital signature: Cryptographic integrity proof
  • Clara special: Just-in-time verification of dynamic LoRA adapters before GPU load

4. Immutable audit logs

Qualified electronic timestamps (QET):

  • eIDAS-compliant: EU-legally binding integrity proof
  • Periodic hashing: Saga log β†’ certified timestamp authority
  • Tamper detection: Any subsequent modification detectable

5. EU compliance: AI Act & GDPR

EU AI Act (high-risk AI):

  • βœ… Robustness via SAGA compensation
  • βœ… Transparency via complete logging
  • βœ… Human oversight (Human-in-the-Loop)
  • βœ… Bias monitoring and mitigation

GDPR principles:

  • βœ… Accountability (Art. 5 (2)): Saga log
  • βœ… Privacy-by-design (Art. 25): Pseudonymization, data minimization
  • ⚠️ Eventual consistency: Tension with data accuracy (Art. 5 (1)(d))

πŸ“Š Current maturity gap: VCC vs. hyperscalers

Status: Stable functional prototype (October 14, 2025)

Component VCC (UDS3) AWS/Azure/GCP Gap
Retrieval Pure vector search Native hybrid search (keyword+vector) High
Result fusion Score normalization Reciprocal Rank Fusion (RRF) Medium
Re-ranking Generic LLM Specialized cross-encoder (managed) High
Multi-hop Basic graph traversal Optimized knowledge graph queries Medium
Monitoring Basic logging Prometheus/Grafana, distributed tracing High
High availability Single-instance Auto-scaling, managed services Critical

Strategic choice required:

  1. Sovereign in-house development: Full control, high effort
  2. Hybrid approach: Integrate hyperscaler services (loss of sovereignty)
  3. Phased migration: Incremental improvement with clear roadmap

UDS3 philosophy: Option 1 + 3 (preserve sovereignty, systematically catch up)


🚨 AI-specific threat matrix

Threat Mechanism Impact UDS3 mitigation
Backdoor adapter Poisoned LoRA adapter with trigger Controlled malicious behavior Just-in-time digital signature verification
Cascading corruption False information β†’ feedback β†’ model lock-in Systemic knowledge distortion Golden dataset validation, human review
Adapter swap Malware replaces legitimate adapter after boot Bypass initial checks Runtime verification before each load
Data poisoning Manipulated training data in Covina Bias amplification, hallucinations Quality control before knowledge ingestion

πŸŽ“ Change management & ethics

Human-in-the-loop imperative:

  • Technical: Feedback integration for Clara training
  • Legal: Accountability (no autopilot)
  • Ethical: AI as a tool for augmentation, not replacement

Risks:

  • Automation bias: Uncritical trust in AI outputs
  • Diffusion of responsibility: Blurring of duties
  • Bias perpetuation: Historical prejudices in training data

Solutions:

  • Multistage employee qualification
  • AI ethics board with formal mandate
  • Continuous bias audits

πŸ“ˆ Measurable success: KPIs & benchmarks

Technical KPIs:

  • retrieval_latency_ms < 100ms (95th percentile)
  • saga_completion_rate > 99.9%
  • model_accuracy > 90% on Golden Dataset
  • feedback_loop_latency < 24h (Veritas β†’ Clara)

Operational KPIs:

  • Recovery Time Objective (RTO) < 4h
  • Recovery Point Objective (RPO) < 15min
  • Service availability > 99.5%

Business KPIs:

  • Time saved per legal request: 30-50%
  • Reduction in legal errors: 40-60%
  • Employee satisfaction: >4.0/5.0

πŸ”¬ Next steps to production readiness

Detailed development planning in ROADMAP.md

3-phase plan:

  • Phase 1 (Q4 2025 - Q1 2026): Validation

    • Independent security audit, performance benchmarks, architecture review
  • Phase 2 (Q2 2026): Hardening

    • Just-in-time adapter verification, data governance, red-team program
  • Phase 3 (Q3-Q4 2026): Production rollout

    • Kubernetes deployment, high availability, monitoring (Prometheus/Grafana)

🌟 Strategic value of UDS3

Sovereignty:

  • On-premise, open-source, vendor-lock-in free
  • Full data control (GDPR, secrecy protection)
  • Independence from US hyperscalers

Innovation:

  • Graph-RAG for administrative processes
  • Self-learning architecture (Clara)
  • Process-native AI (VPB integration)

Legal certainty:

  • eIDAS-compliant audit trail
  • Human-in-the-Loop mandatory
  • Traceability of government actions

Economics:

  • Personnel augmentation for skill shortages (93.9% vacancy gap)
  • 30-50% time saved per legal request
  • Scalable at the state level

  • βœ… Search API integrated into core
  • βœ… Property-based access (strategy.search_api)
  • βœ… Backward-compatible migration path
  • βœ… 100% test coverage
  • βœ… Security Layer: PKI-integrated RBAC/RLS with audit logging
  • βœ… Secure Database API: Row-level security for all database operations
  • βœ… Zero-Trust Architecture: Certificate-based authentication

v2.0.0 (Future)

  • Complete RAG framework
  • Reranking API
  • Generation API
  • Evaluation API

πŸ“ Changelog

v1.5.0 (2025-10-24) πŸš€ PRODUCTION RELEASE

All backends production-ready:

  • βœ… ChromaDB: Remote API fully operational (removed fallback mode)
  • βœ… Neo4j: 1930+ documents validated
  • βœ… PostgreSQL: Active metadata storage
  • βœ… CouchDB: Active file storage

Breaking changes:

  • ⚠️ Removed: Deprecated uds3.uds3_search_api module
    • Migration: Use strategy.search_api property instead
    • Deprecation period: 3 months (announced in v1.4.0)

Documentation:

  • πŸ“„ Updated backend status across all documentation
  • πŸ“„ Removed obsolete "fallback mode" warnings
  • πŸ“„ Updated roadmap and version information

v1.4.0 (2025-10-24) πŸ”’ SECURITY RELEASE

Security features (NEW ⭐):

  • ✨ Row-Level Security (RLS): Automatic data ownership filtering
  • ✨ RBAC system: 5 roles (SYSTEM, ADMIN, SERVICE, USER, READONLY) with 15 granular permissions
  • ✨ PKI Authentication: Certificate-based authentication with VCC PKI integration
  • ✨ Audit logging: Complete audit trail for all database operations
  • ✨ Rate limiting: DoS protection with per-role quotas
  • ✨ Secure Database API: Security wrapper for all database backends
  • ✨ Zero-Trust Architecture: Every request authenticated and authorized

Search features:

  • ✨ Search API property: Direct access via strategy.search_api (lazy-loaded)
  • ✨ Improved DX: -50% imports, +100% discoverability
  • ✨ Type safety: Enhanced dataclasses for SearchQuery and SearchResult

Migration:

  • βœ… Backward compatible: Old import path still works with deprecation warning
  • ⏱️ Deprecation period: 3 months (removed in v1.5.0)
  • πŸ“š Migration guide: See README and docs/UDS3_SEARCH_API_INTEGRATION_DECISION.md

Testing:

  • βœ… 100% test coverage for Search API
  • βœ… 3/3 security test suites passed
  • βœ… 3/3 integration test suites passed
  • βœ… Production validation with 1930 Neo4j documents

Documentation:

  • πŸ“„ New: docs/SECURITY.md (680 LOC) - Complete security architecture
  • πŸ“„ New: UDS3_SEARCH_API_PRODUCTION_GUIDE.md (1950 LOC)
  • πŸ“„ New: UDS3_SEARCH_API_INTEGRATION_DECISION.md (2000 LOC)
  • πŸ“„ Updated: README.md with Security and Search API examples

🀝 Contributing

Contributions welcome! Please read CONTRIBUTING.md first.

Government & public sector partners:
Organizations working in government or public administration are especially encouraged to contribute improvements back to the project. See our Government Partnership Commons Clause for deta[...]

πŸ“„ License

MIT License with Government Partnership Commons Clause

This project is licensed under the permissive MIT License, with a non-binding request for government and public sector users to share improvements back to the community.

  • βœ… Free to use commercially and privately
  • βœ… No legal obligation to share modifications
  • 🀝 Encouraged to contribute improvements, especially for public sector use cases

See LICENSE file for complete details.

Why this license? UDS3 is designed for government partnerships and public administration. Shared improvements strengthen security and reduce duplicated efforts across agencies, while respecting the[...]

πŸ‘₯ Contributors

See CONTRIBUTORS.md for organizations and individuals who have contributed to UDS3.

πŸ”— Related projects

  • VERITAS: Administrative law Q&A system using UDS3
  • Clara: Document processing pipeline with UDS3
  • Covina: Process mining with UDS3 backend

Developed by Martin KrΓΌger (ma.krueger@outlook.com)
Made with ❀️ for Government & Public Sector

About

Uniform database strategies v3 - Central data management and API integration for all VCC services

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

No packages published

Languages