claude governance control plane

real-time ai safety governance • anthropic rsp implementation • enterprise controls


🎯 what this is

the claude governance control plane (cgcp) turns anthropic's responsible scaling policy from a written policy into production-ready controls. it monitors every claude interaction in real time, enforces tier-based policies, and generates compliance evidence automatically.

✨ key capabilities

real-time risk detection - identifies cbrn, self-harm, jailbreak, and exploitation risks in <100ms
tier-based enforcement - different cbrn risk thresholds for general (0.15), enterprise (0.18), and research sandbox (0.25) users
asl-3 monitoring - triggers at biological (20%), cyber (50%), and deception (50%) capability thresholds
human-in-the-loop - escalation workflow with 24-hour sla for high-risk events
compliance automation - generates iso 42001, nist ai rmf, and eu ai act evidence in under 5 minutes
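
as a rough illustration of the asl-3 monitoring capability, the check below compares capability evaluation scores against the three thresholds listed above. the dictionary layout, the function name, and the use of `>=` as the comparison are assumptions made for this sketch; they are not taken from the cgcp codebase.

```python
# hypothetical sketch: names and data shapes are assumptions, not cgcp's actual api
ASL3_THRESHOLDS = {
    "biological": 0.20,
    "cyber": 0.50,
    "deception": 0.50,
}


def asl3_triggers(eval_scores):
    """Return the sorted capability areas whose score meets or exceeds its threshold."""
    return sorted(
        area
        for area, threshold in ASL3_THRESHOLDS.items()
        if eval_scores.get(area, 0.0) >= threshold
    )


example = {"biological": 0.22, "cyber": 0.31, "deception": 0.50}
print(asl3_triggers(example))  # ['biological', 'deception']
```

areas missing from the input are treated as 0.0 here, which is a design choice of the sketch; a stricter version might reject incomplete evaluations instead.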

📈 production metrics

metric	before cgcp	with cgcp	improvement
incident response	24+ hours	<1 hour	96% faster
compliance reporting	2-4 weeks	<5 minutes	99.9% faster
policy consistency	manual	automated	100% coverage
risk monitoring	quarterly	real-time	continuous
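
the percentages in the improvement column can be reproduced from the before/after figures, assuming "faster" means the percentage reduction in elapsed time and taking the boundary values (24 hours, 1 hour, 2 weeks, 5 minutes) as inputs. this is a worked check of the arithmetic, not a measurement.

```python
def percent_reduction(before, after):
    """Percentage by which `after` is smaller than `before`."""
    return (before - after) / before * 100


# incident response: 24 hours down to 1 hour
incident = percent_reduction(24, 1)

# compliance reporting: 2 weeks (expressed in minutes) down to 5 minutes
two_weeks_minutes = 14 * 24 * 60
reporting = percent_reduction(two_weeks_minutes, 5)

print(f"incident response: {incident:.1f}% less time")       # 95.8%
print(f"compliance reporting: {reporting:.2f}% less time")   # 99.98%
```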

🚀 quick start

prerequisites

# required
python 3.8+
4gb ram
git

# optional (for production)
docker
kubernetes

one-command installation

# clone and deploy
git clone https://github.com/dipampaul17/cgcp.git
cd cgcp
python deploy.py local

30-second demo

# run the complete demo with synthetic data
python demo/run_complete_demo.py

this will:

start all services automatically
generate realistic enterprise scenarios
demonstrate risk detection and policy enforcement
show compliance reporting
keep services running for exploration

access the system

📊 dashboard → http://localhost:8501
🔧 api → http://localhost:8000
📚 api docs → http://localhost:8000/docs

🏗️ architecture

system design

graph TB
    subgraph "claude interactions"
        A1[web interface]
        A2[api calls]
        A3[aws bedrock]
        A4[applications]
    end
    
    subgraph "ingestion layer"
        B[ingestion api<br/>fastapi • port 8000]
    end
    
    subgraph "processing layer"
        C1[risk detection engine<br/>• cbrn detector<br/>• self-harm detector<br/>• jailbreak detector<br/>• exploitation detector]
        C2[policy engine<br/>• tier logic<br/>• asl triggers<br/>• enforcement<br/>• escalation]
    end
    
    subgraph "data layer"
        D[duckdb<br/>fast • embedded • sql]
    end
    
    subgraph "output layer"
        E1[dashboard<br/>streamlit • 8501]
        E2[compliance api<br/>iso/nist/eu exports]
    end
    
    A1 & A2 & A3 & A4 --> B
    B --> C1 & C2
    C1 & C2 --> D
    D --> E1 & E2

risk detection flow

flowchart LR
    subgraph "event input"
        E[claude event<br/>prompt + completion]
    end
    
    subgraph "risk analysis"
        R1[cbrn tagger<br/>biological/chemical/nuclear]
        R2[self-harm tagger<br/>mental health risks]
        R3[jailbreak tagger<br/>safety bypasses]
        R4[exploitation tagger<br/>malicious use]
    end
    
    subgraph "scoring"
        S[risk scores<br/>0.0 - 1.0 confidence]
    end
    
    subgraph "policy decision"
        P1{tier<br/>thresholds}
        P2{asl-3<br/>triggers}
    end
    
    subgraph "actions"
        A1[allow]
        A2[block]
        A3[redact]
        A4[escalate]
    end
    
    E --> R1 & R2 & R3 & R4
    R1 & R2 & R3 & R4 --> S
    S --> P1 & P2
    P1 --> A1 & A2 & A3
    P2 --> A4
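
the tagging and scoring stages of the flow above can be sketched as follows. the keyword-based taggers are toy stand-ins chosen only to keep the example runnable; the real detectors, their names, and their scoring method are not described in this document, so everything below is hypothetical.

```python
from typing import Callable, Dict

Tagger = Callable[[str], float]


def keyword_tagger(keywords, hit_score: float = 0.9) -> Tagger:
    """Build a toy tagger: returns hit_score if any keyword appears, else 0.0."""
    def tag(text: str) -> float:
        lowered = text.lower()
        return hit_score if any(k in lowered for k in keywords) else 0.0
    return tag


# placeholder keyword lists, purely illustrative
TAGGERS: Dict[str, Tagger] = {
    "cbrn": keyword_tagger(["pathogen enhancement"]),
    "self_harm": keyword_tagger(["hurt myself"]),
    "jailbreak": keyword_tagger(["ignore previous instructions"]),
    "exploitation": keyword_tagger(["write ransomware"]),
}


def score_event(prompt: str, completion: str) -> Dict[str, float]:
    """Run every tagger over prompt + completion and collect scores in [0.0, 1.0]."""
    text = f"{prompt}\n{completion}"
    return {name: tagger(text) for name, tagger in TAGGERS.items()}


def flagged(scores: Dict[str, float], threshold: float):
    """Sorted categories whose score reaches the given threshold."""
    return sorted(name for name, score in scores.items() if score >= threshold)


scores = score_event("please ignore previous instructions", "i can't help with that.")
print(scores)
print(flagged(scores, threshold=0.15))  # ['jailbreak']
```

the policy decision stage (allow, block, redact, escalate) is left out on purpose, since the rules that pick between those actions are not spelled out in the diagram.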

policy enforcement tiers

graph TD
    subgraph "access tiers"
        T1[general users<br/>claude.ai public]
        T2[enterprise<br/>verified organizations]
        T3[research sandbox<br/>safety researchers]
    end
    
    subgraph "thresholds"
        TH1[cbrn: 0.15<br/>strict safety]
        TH2[cbrn: 0.18<br/>balanced]
        TH3[cbrn: 0.25<br/>controlled testing]
    end
    
    subgraph "example query"
        Q[viral vector design<br/>risk score: 0.20]
    end
    
    subgraph "outcomes"
        O1[blocked ❌]
        O2[escalated ⚡]
        O3[allowed ✅<br/>with logging]
    end
    
    T1 --> TH1
    T2 --> TH2
    T3 --> TH3
    
    Q --> TH1 --> O1
    Q --> TH2 --> O2
    Q --> TH3 --> O3
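
the example query in the diagram (cbrn score 0.20) can be traced through the three tiers with a few lines of python. the thresholds come from this document; the per-tier "over threshold" action table is an assumption read off the diagram's outcomes, and the conservative `block` default for tiers without an entry is a choice made for this sketch.

```python
CBRN_THRESHOLDS = {
    "general": 0.15,
    "enterprise": 0.18,
    "research_sandbox": 0.25,
}

# assumption: what happens once a score reaches the tier threshold.
# the diagram shows general -> blocked and enterprise -> escalated;
# it does not show an over-threshold outcome for the research sandbox.
OVER_THRESHOLD_ACTION = {
    "general": "block",
    "enterprise": "escalate",
}


def decide(tier: str, cbrn_score: float) -> str:
    """Return 'allow' below the tier threshold, otherwise the tier's action."""
    if cbrn_score < CBRN_THRESHOLDS[tier]:
        return "allow"
    return OVER_THRESHOLD_ACTION.get(tier, "block")


for tier in CBRN_THRESHOLDS:
    print(f"{tier}: {decide(tier, 0.20)}")
# general: block
# enterprise: escalate
# research_sandbox: allow
```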

🚢 deployment

local development

# quick start
python deploy.py local

# manual setup
python -m venv venv
source venv/bin/activate  # windows: venv\Scripts\activate
pip install -r requirements.txt
./start.sh

docker

# automated deployment
python deploy.py docker

# or build manually
docker build -t cgcp:latest .
docker run -p 8000:8000 -p 8501:8501 cgcp:latest

docker compose (with monitoring)

python deploy.py docker-compose

# includes:
# - cgcp application
# - prometheus metrics
# - grafana dashboards

kubernetes

# automated deployment
python deploy.py kubernetes

# verify deployment
kubectl get pods -l app=cgcp

cloud platforms

# aws ecs
python deploy.py aws

# google cloud run
python deploy.py gcp

# azure container instances
python deploy.py azure

see DEPLOYMENT_GUIDE.md for detailed production setup.

🎮 live demo

run production scenarios

python demo/production_demo.py

this demonstrates:

enterprise baseline - 500+ normal business queries
capability evaluation - quarterly asl-3 threshold testing
incident response - biotech escalation, security research, jailbreaks
tier enforcement - same query, different responses by access level
compliance export - automated iso 42001 evidence generation

demo flow visualization

sequenceDiagram
    participant U as user
    participant D as demo script
    participant A as api
    participant DB as database
    participant UI as dashboard
    
    U->>D: run demo
    D->>A: generate baseline traffic
    A->>DB: store 500 events
    A-->>UI: update metrics
    
    D->>A: capability evaluation
    Note over A: test asl-3 thresholds
    A->>DB: log triggers
    A-->>UI: show alerts
    
    D->>A: incident scenarios
    A->>DB: escalate events
    A-->>UI: review queue
    
    D->>A: tier comparison
    Note over A: same query, 3 tiers
    A-->>U: show differential response
    
    D->>A: compliance export
    A->>DB: aggregate evidence
    A-->>U: iso 42001 report

synthetic data generation

# generate realistic test data
python data/synthetic_generator.py

# creates 2000+ events including:
# - pharma research patterns
# - ai safety evaluations
# - financial services queries
# - jailbreak attempts

📡 api reference

event ingestion

POST /ingest
{
    "events": [{
        "event_id": "uuid",
        "timestamp": "2024-01-15T10:00:00Z",
        "user_id": "user_12345",
        "org_id": "acme_pharma",
        "surface": "api",
        "tier": "enterprise",
        "prompt": "user input text",
        "completion": "claude response",
        "model_version": "claude-3-sonnet"
    }]
}

# response
{
    "processed": 1,
    "actions": {
        "allow": 1,
        "block": 0,
        "redact": 0,
        "escalate": 0
    },
    "asl_triggers": 0
}
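
a minimal client for the ingestion endpoint might look like the sketch below, using only the standard library. the field names follow the request body shown above and the url follows the quick start section; the helper names `build_event` and `send_events` are hypothetical, and the sketch does not handle authentication (it assumes a local instance without `ENABLE_AUTH`).

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

API_URL = "http://localhost:8000/ingest"  # local api address from the quick start


def build_event(prompt, completion, *, user_id, org_id, tier,
                surface="api", model_version="claude-3-sonnet"):
    """Assemble one event dict with the fields shown in the request example."""
    return {
        "event_id": str(uuid.uuid4()),
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "user_id": user_id,
        "org_id": org_id,
        "surface": surface,
        "tier": tier,
        "prompt": prompt,
        "completion": completion,
        "model_version": model_version,
    }


def send_events(events, url=API_URL, timeout=10.0):
    """POST a batch of events and return the decoded json response."""
    body = json.dumps({"events": events}).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


event = build_event(
    "user input text", "claude response",
    user_id="user_12345", org_id="acme_pharma", tier="enterprise",
)
# with the services running locally:
# result = send_events([event])
# print(result["actions"])
```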

real-time metrics

GET /metrics

{
    "total_events": 125000,
    "events_by_surface": {"api": 100000, "claude_web": 25000},
    "events_by_tier": {"general": 50000, "enterprise": 70000},
    "risk_detections": {"cbrn": 150, "self_harm": 89},
    "actions_taken": {"allow": 124500, "block": 300},
    "asl_triggers": 12
}
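
one way to turn this response into rates is sketched below. the function name is hypothetical. the sample response appears to be abbreviated (the listed actions do not sum to `total_events`), so the sketch divides by `total_events` rather than by the sum of the listed actions.

```python
def action_rates(metrics):
    """Share of all events that ended in each listed action."""
    total = metrics["total_events"]
    if total == 0:
        return {}
    return {action: count / total for action, count in metrics["actions_taken"].items()}


sample = {
    "total_events": 125000,
    "actions_taken": {"allow": 124500, "block": 300},
}
rates = action_rates(sample)
print(f"block rate: {rates['block']:.2%}")  # block rate: 0.24%
```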

compliance export

GET /export/iso-evidence?days=30

{
    "report_date": "2024-01-15T10:00:00Z",
    "period_days": 30,
    "summary": {
        "total_events": 500000,
        "blocked_events": 1250,
        "asl_triggers": 45,
        "compliance_rate": "99.75%"
    },
    "controls": [{
        "control_id": "iso_9.2.1",
        "control_name": "user access management",
        "evidence_count": 125000
    }]
}
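
the sample numbers are consistent with `compliance_rate` being the share of events that were not blocked. that definition is an assumption inferred from the example, not something this document states, so treat the check below as a sanity test on an exported report rather than the official formula.

```python
def compliance_rate(total_events: int, blocked_events: int) -> str:
    """Percentage of events that were not blocked, formatted like the report."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    return f"{(1 - blocked_events / total_events) * 100:.2f}%"


print(compliance_rate(500000, 1250))  # 99.75%
```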

📊 dashboard features

operations view

real-time processing metrics
risk category distribution
tier usage patterns
response time monitoring

policy enforcement

current thresholds by tier
escalated events queue
decision audit trail
sla tracking

analytics

time series trends
risk pattern analysis
organization insights
model comparison

compliance

automated reporting
framework mapping
audit downloads
control effectiveness

⚙️ configuration

environment variables

# api settings
API_HOST=0.0.0.0
API_PORT=8000
API_WORKERS=4

# database
DATABASE_PATH=/data/governance.db
DATABASE_BACKUP_ENABLED=true

# monitoring
METRICS_ENABLED=true
PROMETHEUS_PORT=9090

# security (replace the placeholder secret with a long random value before enabling auth)
ENABLE_AUTH=true
JWT_SECRET_KEY=your-secret-key
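
a small loader for these variables could look like the following. the variable names come from the list above; the `Settings` class, the defaults (which mirror the example values), and the refusal to start with the placeholder secret are assumptions of this sketch, not documented cgcp behavior.

```python
import os
from dataclasses import dataclass


def _as_bool(value: str) -> bool:
    """Interpret common truthy strings; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class Settings:
    api_host: str
    api_port: int
    api_workers: int
    database_path: str
    metrics_enabled: bool
    enable_auth: bool
    jwt_secret_key: str


def load_settings(env=None) -> Settings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    settings = Settings(
        api_host=env.get("API_HOST", "0.0.0.0"),
        api_port=int(env.get("API_PORT", "8000")),
        api_workers=int(env.get("API_WORKERS", "4")),
        database_path=env.get("DATABASE_PATH", "/data/governance.db"),
        metrics_enabled=_as_bool(env.get("METRICS_ENABLED", "true")),
        enable_auth=_as_bool(env.get("ENABLE_AUTH", "true")),
        jwt_secret_key=env.get("JWT_SECRET_KEY", ""),
    )
    # refuse to run with auth enabled but no real secret configured
    if settings.enable_auth and settings.jwt_secret_key in ("", "your-secret-key"):
        raise ValueError("set JWT_SECRET_KEY to a real secret when ENABLE_AUTH=true")
    return settings


local = load_settings({"API_PORT": "9000", "ENABLE_AUTH": "false"})
print(local.api_port, local.enable_auth)  # 9000 False
```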

policy configuration

edit policy/policy_map.yaml:

risk_thresholds:
  cbrn:
    general: 0.15
    enterprise: 0.18
    research_sandbox: 0.25
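
before deploying an edited policy file, it may help to sanity-check the thresholds. the sketch below works on the plain dict a yaml parser would return for the fragment above (for example pyyaml's `yaml.safe_load`, which is not imported here). the rule that thresholds should not decrease from general to research_sandbox is an assumption inferred from the listed values.

```python
TIER_ORDER = ["general", "enterprise", "research_sandbox"]


def validate_thresholds(risk_thresholds):
    """Return a list of problems; an empty list means the mapping looks sane."""
    problems = []
    for category, tiers in risk_thresholds.items():
        for tier in TIER_ORDER:
            if tier not in tiers:
                problems.append(f"{category}: missing tier '{tier}'")
            elif not 0.0 <= tiers[tier] <= 1.0:
                problems.append(f"{category}.{tier}: {tiers[tier]} is outside 0.0-1.0")
        values = [tiers[t] for t in TIER_ORDER if t in tiers]
        if values != sorted(values):
            problems.append(
                f"{category}: thresholds decrease between general and research_sandbox"
            )
    return problems


policy = {"cbrn": {"general": 0.15, "enterprise": 0.18, "research_sandbox": 0.25}}
print(validate_thresholds(policy))  # []
```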

monitoring setup

grafana dashboards in monitoring/dashboards/:

system overview
risk detection rates
policy enforcement
resource usage

🧪 testing

system verification

python verify_system.py

# tests:
# ✓ api connectivity
# ✓ risk detection accuracy  
# ✓ policy enforcement logic
# ✓ asl trigger thresholds
# ✓ compliance generation

load testing

# generate test load
python data/synthetic_generator.py
python demo/ingest_data.py

# verify performance
# - <100ms response time
# - 1000+ events/second
# - accurate risk scoring
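
the two numeric targets above can be checked from recorded timings with a helper like this one. how cgcp itself measures response time is not stated here, so reading "<100ms" as a 95th-percentile latency (nearest-rank method) is an assumption, as are the function names.

```python
def summarize(latencies_s, wall_clock_s):
    """Compute p95 latency (ms) and throughput from per-event latencies in seconds."""
    if not latencies_s or wall_clock_s <= 0:
        raise ValueError("need at least one latency and a positive wall-clock time")
    ordered = sorted(latencies_s)
    # nearest-rank p95 using integer arithmetic: rank = ceil(0.95 * n)
    rank = (95 * len(ordered) + 99) // 100
    return {
        "p95_ms": ordered[rank - 1] * 1000,
        "events_per_second": len(ordered) / wall_clock_s,
    }


def meets_targets(summary, max_p95_ms=100.0, min_events_per_second=1000.0):
    """True when both targets from the load-testing section are met."""
    return (
        summary["p95_ms"] < max_p95_ms
        and summary["events_per_second"] >= min_events_per_second
    )


# 100 events: 95 fast, 5 slow, processed concurrently in 0.05 s of wall-clock time
summary = summarize([0.01] * 95 + [0.2] * 5, wall_clock_s=0.05)
print(summary, meets_targets(summary))
```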

🔧 troubleshooting

common issues

port conflicts

# free up ports (force-kills whatever is listening on 8000 and 8501)
lsof -ti:8000 | xargs kill -9
lsof -ti:8501 | xargs kill -9
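
since `kill -9` terminates whatever process holds the port, it can be worth checking first whether the ports are actually occupied. the helper below is a generic standard-library check, not part of cgcp; the function name is hypothetical.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts tcp connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception
        return sock.connect_ex((host, port)) == 0


for port in (8000, 8501):
    print(port, "in use" if port_in_use(port) else "free")
```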

database locked

# reset database (stop all cgcp services first; deleting the wal discards writes not yet checkpointed)
rm -f governance.db.wal
python demo/reset_database.py

slow performance

enable database indexing
increase worker processes
add redis caching

debug mode

export LOG_LEVEL=DEBUG
python -m uvicorn backend.app:app --log-level debug

🤝 contributing

we welcome contributions to make ai governance better:

fork the repository
create a feature branch
commit with clear messages
test thoroughly
submit a pull request

focus areas:

risk detection improvements
dashboard visualizations
compliance frameworks
performance optimization

📄 license

mit license - see LICENSE

🙏 acknowledgments

built to operationalize anthropic's responsible scaling policy into verifiable enterprise controls. special thanks to the ai safety community for guidance on risk thresholds and evaluation methodologies.

making ai safety operational, one policy at a time

