feat: Implement HTTP/2 optimized ArangoDB client with neural process isolation #51

@r3d91ll

Overview

Implement a custom HTTP/2 Python client for ArangoDB that bypasses python-arango limitations and targets a 2-5x latency improvement over the current PHP subprocess bridge.

Background

As documented in PRD: ArangoDB Optimized Connection Architecture (/docs/prd/arango_optimized_connection_prd.md), we need to optimize database connections for our neural memory architecture where ArangoDB serves as the hippocampus for local AI models.

Current Performance

  • PHP subprocess bridge: ~100ms per operation
  • Target: 20-50ms (2-5x improvement)
  • Ultimate goal: ~5ms with C++/CUDA (future)

Design Principles

Neural Process Isolation - No authentication, just like the hippocampus:

  • Inference Mode (wake state): Read-only access via RO proxy
  • Consolidation Mode (sleep state): Write access via RW proxy
  • Unix sockets only: Never exposed to network
  • Zero auth overhead: Process isolation via proxies

Architecture

┌─────────────┐      ┌──────────┐      ┌──────────┐
│  Inference  │─────▶│ RO Proxy │─────▶│          │
│  (reads)    │      │(allowlist)│      │ ArangoDB │
└─────────────┘      └──────────┘      │          │
                                        │ (single  │
┌─────────────┐      ┌──────────┐      │  socket) │
│Consolidation│─────▶│ RW Proxy │─────▶│          │
│  (writes)   │      │(allowlist)│      │          │
└─────────────┘      └──────────┘      └──────────┘
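
The allowlist check each proxy performs can be sketched as follows. This is a minimal illustration in Python (the proxies themselves are planned in Go), and the specific method/path pairs and the names `RO_ALLOWLIST`, `RW_ALLOWLIST`, and `is_allowed` are assumptions, not the allowlists from the PRD.

```python
from typing import FrozenSet, Tuple

Rule = Tuple[str, str]  # (HTTP method, path prefix)

# Hypothetical allowlists; the real entries would come from the PRD.
RO_ALLOWLIST: FrozenSet[Rule] = frozenset({
    ("GET", "/_api/document/"),
    # AQL queries go through the cursor API. An AQL query can still modify
    # data, so a real read-only proxy would likely need to inspect the query.
    ("POST", "/_api/cursor"),
})

RW_ALLOWLIST: FrozenSet[Rule] = RO_ALLOWLIST | frozenset({
    ("POST", "/_api/document/"),
    ("POST", "/_api/import"),
})


def is_allowed(method: str, path: str, allowlist: FrozenSet[Rule]) -> bool:
    """Return True if the (method, path) pair matches any allowlist rule."""
    method = method.upper()
    return any(method == m and path.startswith(prefix) for m, prefix in allowlist)
```

Anything not matched is rejected by default, which is the property that Unix socket permissions alone cannot provide.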

Implementation Tasks

Phase 1: Proof of Concept & Benchmark (Days 1-3)

  • Create minimal HTTP/2 client with Unix socket support
  • Implement basic operations (get, insert, query)
  • Verify HTTP/2 negotiation via response.http_version
  • Benchmark against PHP baseline with split metrics
  • GO/NO-GO DECISION POINT (must achieve 2x improvement)
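
A benchmark helper for Phase 1 might look like the sketch below. It assumes "split metrics" means reporting latency statistics separately per operation; the names `measure` and `speedup` are hypothetical, and the operation being timed is passed in as a plain callable so the same harness can wrap both the PHP bridge and the new client.

```python
import statistics
import time
from typing import Callable, Dict


def measure(op: Callable[[], object], runs: int = 50, warmup: int = 5) -> Dict[str, float]:
    """Time `op` repeatedly and return latency statistics in milliseconds."""
    for _ in range(warmup):  # warm up connections and caches before timing
        op()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        op()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {
        "min_ms": samples[0],
        "median_ms": statistics.median(samples),
        "p95_ms": samples[p95_index],
    }


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """Ratio of baseline latency to candidate latency (2.0 means twice as fast)."""
    return baseline_ms / candidate_ms
```

Calling `measure` once per operation (fetch, insert, query) and per backend gives the numbers needed for the go/no-go comparison.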

Phase 2: Full Implementation (Days 4-7) - ONLY IF Phase 1 Passes

  • Implement RO proxy with allowlist enforcement
  • Implement RW proxy with controlled write access
  • Add connection pooling and keep-alive
  • Implement streaming cursor support
  • Add NDJSON bulk import with Content-Length header
  • Optimize batch sizing based on benchmarks
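
The NDJSON bulk import and batch sizing tasks can be illustrated with a small sketch. The helper names `chunked` and `build_ndjson_request` are hypothetical, and the `Content-Type` value is an assumption; the part taken from this issue is that the body is newline-delimited JSON and that `Content-Length` is set explicitly.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def chunked(docs: List[dict], size: int) -> Iterator[List[dict]]:
    """Yield successive batches of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def build_ndjson_request(docs: Iterable[dict]) -> Tuple[bytes, Dict[str, str]]:
    """Serialize docs as one JSON object per line and compute Content-Length."""
    body = "".join(
        json.dumps(doc, separators=(",", ":")) + "\n" for doc in docs
    ).encode("utf-8")
    headers = {
        "Content-Type": "application/x-ndjson",  # assumed value
        "Content-Length": str(len(body)),        # length in bytes, not characters
    }
    return body, headers
```

Computing the length from the encoded bytes matters when documents contain non-ASCII text, where the character count and byte count differ.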

Phase 3: Integration (Days 8-10) - ONLY IF Phase 2 Successful

  • Update DatabaseFactory to use new client
  • Modify workflows to use optimized batching
  • Run end-to-end performance tests
  • Document configuration and tuning

Critical Requirements

From expert review validation:

  1. HTTPX configuration: http2=True MUST be set when the httpx Client is constructed; it cannot be enabled per request
  2. Content-Length header REQUIRED for NDJSON (ArangoDB requirement)
  3. Proxy-based enforcement - Unix permissions alone don't restrict HTTP verbs
  4. Socket modes: 0660 for proxy sockets (not 0644)
  5. No authentication - Process isolation provides security
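
Requirement 4 can be sketched as below, in Python for illustration only since the proxies are planned in Go. The function name `bind_proxy_socket` and the socket file name are hypothetical. Mode 0660 gives read/write access to the owner and group and nothing to other users.

```python
import os
import socket
import stat
import tempfile


def bind_proxy_socket(path: str, mode: int = 0o660) -> socket.socket:
    """Bind a listening Unix socket at `path` and restrict its permissions."""
    if os.path.exists(path):
        os.unlink(path)  # remove a stale socket left by a previous run
    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)
    os.chmod(path, mode)  # owner and group may connect; others may not
    server.listen()
    return server


# Usage: bind in a temporary directory and read back the resulting mode.
with tempfile.TemporaryDirectory() as tmp:
    sock_path = os.path.join(tmp, "ro_proxy.sock")
    server = bind_proxy_socket(sock_path)
    observed_mode = stat.S_IMODE(os.stat(sock_path).st_mode)
    server.close()
```

There is a short window between `bind` and `chmod` in which the socket has its default mode; setting a restrictive umask before binding is one way to close it.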

Success Metrics

Go/No-Go Criteria (Phase 1)

  • Single document fetch: <50ms (2x improvement)
  • Bulk insert (1000 docs): <400ms (2x improvement)
  • Query execution: <200ms (2x improvement)
  • Stretch goal: Any operation reaching ~20ms range
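
The go/no-go check can be expressed as a small function over the measured latencies. The thresholds are the ones listed above; the dictionary keys and the name `go_no_go` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Phase 1 thresholds in milliseconds, from the criteria above.
THRESHOLDS_MS: Dict[str, float] = {
    "single_fetch": 50.0,
    "bulk_insert_1000": 400.0,
    "query": 200.0,
}


def go_no_go(measured_ms: Dict[str, float]) -> Tuple[bool, Dict[str, Optional[float]]]:
    """Return (passed, failures); a missing or too-slow metric counts as a failure."""
    failures: Dict[str, Optional[float]] = {}
    for name, limit in THRESHOLDS_MS.items():
        value = measured_ms.get(name)
        if value is None or value >= limit:  # criteria are strict "<" limits
            failures[name] = value
    return (not failures, failures)
```

For example, `go_no_go({"single_fetch": 42.0, "bulk_insert_1000": 450.0, "query": 150.0})` reports a no-go with `bulk_insert_1000` as the failing metric.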

Files to Create

  1. /core/database/arango/optimized_client.py - Custom HTTP/2 client
  2. /core/database/arango/proxies/ro_proxy.go - Read-only proxy
  3. /core/database/arango/proxies/rw_proxy.go - Read-write proxy
  4. /tests/benchmarks/arango_connection_test.py - Benchmark suite

Fallback Plan

If the benchmarks don't meet the 2x improvement target, we continue using the existing PHP bridge, which already works reliably.

References

  • Full PRD: /docs/prd/arango_optimized_connection_prd.md
  • Conveyance Framework: /CLAUDE.md
  • Expert review validated approach as "sign-off ready"

Priority: High
Effort: 10 days (with go/no-go gates)

Labels: documentation, enhancement