feat: GoZen v3.0.0 - Context Compression & Middleware Pipeline #11
Open
…ing (v2.2.0)
This release adds comprehensive observability and smart routing capabilities:
- Usage Tracking: Record API usage with cost calculation based on model pricing
- Budget Control: Set daily/weekly/monthly limits with warn/downgrade/block actions
- Provider Health: Monitor provider health with success rate and latency metrics
- Smart Load Balancing: Support failover, round-robin, least-latency, least-cost strategies
- Session Insights: Track per-session usage with turn-by-turn details
- Webhook Notifications: Send alerts for budget warnings, provider status, failovers
- Web UI: New Usage tab with cost summary, budget status, and provider health
New files:
- internal/proxy/usage.go, budget.go, healthcheck.go, loadbalancer.go, metrics.go
- internal/notify/webhook.go
- internal/web/api_usage.go, api_health.go, api_sessions.go, api_webhooks.go, api_pricing.go
Config version: 7 → 8
SQLite schema version: 2 → 3
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
… Mistral, Qwen models
Expand default model pricing to cover common programming models:
- OpenAI: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo, o1, o1-mini, o3-mini
- DeepSeek: deepseek-chat, deepseek-coder, deepseek-reasoner
- MiniMax: abab6.5s/6.5/6.5t/5.5-chat
- GLM (Zhipu): glm-4-plus/0520/air/airx/long/flash/flashx, codegeex-4
- Google Gemini: gemini-2.0-flash, gemini-1.5-pro/flash
- Mistral: mistral-large/small, codestral, ministral, pixtral
- Qwen (Alibaba): qwen-max/plus/turbo/long, qwen-coder-plus/turbo
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
[BETA] Context Compression:
- Add CompressionConfig for transparent context compression
- Implement ContextCompressor with token estimation and summarization
- Add compression Web API endpoints
[BETA] Middleware Pipeline:
- Add pluggable middleware architecture with Middleware interface
- Implement Pipeline executor with priority-based ordering
- Add Registry for middleware lifecycle management
- Add PluginLoader for local (.so) and remote plugin support
Built-in Middleware:
- context-injection: Auto-inject .cursorrules, CLAUDE.md
- request-logger: Log all requests and responses
- session-memory: Cross-session intelligence (v3.1 feature)
- orchestration: Multi-model orchestration - voting, chain, review (v3.2 feature)
Web API:
- GET/PUT /api/v1/compression - Compression config
- GET /api/v1/compression/stats - Compression statistics
- GET/PUT /api/v1/middleware - Middleware config
- POST /api/v1/middleware/{name}/enable|disable
- POST /api/v1/middleware/reload
Documentation:
- Add middleware development guide for third-party developers
All features are disabled by default and marked as BETA.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Observatory: session monitoring, stuck detection, idle timeout
- Guardrails: spending caps, rate limiting, sensitive operation detection
- Coordinator: file locking, change awareness, context warnings
- TaskQueue: priority-based task management with retry support
- Runtime: autonomous agent execution with planning/execution/validation phases
- Web API endpoints for all agent components
All features are BETA and disabled by default.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for proxy package (metrics, usage, budget, healthcheck, loadbalancer, session, compression, logger)
- Add tests for web package (API v2 endpoints, server helpers)
- Add tests for config and daemon packages
- Achieve 82% coverage for proxy package (target: ≥80%)
- Achieve 80.1% coverage for web package (target: ≥80%)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/session.go: Fix race condition in GetSessionUsage, GetSessionInsight, and GetContextWarning by holding lock during sync.Map access
- proxy/healthcheck.go: Fix double-close panic in Stop() by tracking stopped state
- agent/runtime.go: Fix ignored rand.Read error, add lock for task.Plan assignment
- agent/observatory.go: Fix data race by reading config.StuckThreshold under lock
- config/migrate.go: Clean up incomplete file on copy failure
- web/auth.go: Add graceful shutdown for sessionCleanupLoop, fix rand.Read error
- web/server.go: Add sync.RWMutex for syncMgr access
- web/api_sync.go: Use lock when accessing/modifying syncMgr
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- config/config.go: Add nil check in ScenarioRoute.UnmarshalJSON to prevent panic when providers array contains null elements
- config/config.go: Add nil checks in ProviderNames and ModelForProvider methods
- proxy/logdb.go: Handle stmt.Exec errors in flushBatch, rollback on failure
- web/auth.go: Fix IP spoofing in clientIP by properly parsing X-Forwarded-For header (extract first IP from comma-separated list)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
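The X-Forwarded-For handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code in web/auth.go: `firstForwardedIP` is a hypothetical helper, and the fallback to `RemoteAddr` is an assumption about how the function behaves when the header is missing or invalid.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"strings"
)

// firstForwardedIP returns the first entry of a comma-separated
// X-Forwarded-For value if it parses as an IP, or "" otherwise.
// Hypothetical helper for illustration only.
func firstForwardedIP(header string) string {
	first, _, _ := strings.Cut(header, ",")
	first = strings.TrimSpace(first)
	if net.ParseIP(first) == nil {
		return ""
	}
	return first
}

// clientIP prefers the forwarded address and falls back to the
// connection's remote address (assumed behavior).
func clientIP(r *http.Request) string {
	if ip := firstForwardedIP(r.Header.Get("X-Forwarded-For")); ip != "" {
		return ip
	}
	host, _, err := net.SplitHostPort(r.RemoteAddr)
	if err != nil {
		return r.RemoteAddr
	}
	return host
}

func main() {
	r, _ := http.NewRequest("GET", "http://localhost/", nil)
	r.RemoteAddr = "10.0.0.5:41234"
	r.Header.Set("X-Forwarded-For", "203.0.113.7, 198.51.100.2")
	fmt.Println(clientIP(r))
}
```

Validating the first entry with `net.ParseIP` rejects arbitrary strings, though the header is only trustworthy when the server sits behind a proxy that sets it.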
- agent/taskqueue.go: Handle rand.Read error with timestamp fallback
- daemon/server.go: Fix randomID to properly check os.Open and Read errors
- notify/webhook.go: Handle json.Marshal errors in format functions
- proxy/logdb.go: Add explicit error ignoring with comments for best-effort operations (os.Chmod, os.Remove, setSchemaVersion)
- update/check.go: Add explicit error ignoring for cache operations
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- cmd/web.go: ignore exec.Command().Start() errors for browser open
- internal/daemon/daemon.go: ignore os.Remove errors in cleanup functions
- internal/middleware/loader.go: ignore os.Remove errors in cache operations
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/server.go: safe type assertion for message role
- proxy/session.go: use fmt.Sprintf for duration formatting (fixes overflow)
- web/server.go: explicitly ignore JSON encode errors (best-effort)
- middleware/loader.go: ensure temp file closed via defer
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/logdb.go: explicitly ignore tx.Rollback/Commit errors (best-effort)
- daemon/server.go: call pullCancel() immediately after Pull() returns
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- proxy/profile_proxy.go: ignore JSON encode error in writeError
- daemon/api.go: close request body after JSON decode
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for sessionCleanupLoop, StopCleanup, clientIP
- Add tests for HandleFunc, SetSyncManager
- Web coverage: 79.7% -> 80.7%
- Add disclaimer to usage page: data is for reference only
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add tests for Shutdown resource cleanup (syncCancel, pushTimer, watcher)
- Add tests for session cleanup logic (stale session removal)
- Add tests for initSync cancellation of existing sync
- Add tests for DaemonSysProcAttr, IsDaemonRunning, StopDaemonProcess
- Add test for startProxy
- Update CI coverage requirement: daemon 40% -> 50%
These tests specifically target memory leak prevention by verifying:
- Context cancellation on shutdown
- Timer cleanup
- Goroutine termination paths
- Stale session cleanup
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Update version from 4.0.0 to 3.0.0
- Consolidate all features (v2.2-v4.0) into single v3.0 release
- Create unified release plan document (.dev/v3.0-release-plan.md)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Website updates:
- Upgrade docs version from 2.1 to 3.0
- Add Japanese (ja) and Korean (ko) locale support
- Add v3.0 feature documentation:
  - Usage Tracking & Budget Control
  - Health Monitoring
  - Load Balancing
  - Webhooks
  - Context Compression
  - Middleware Pipeline
  - Agent Infrastructure
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add TDD requirement for new feature development
- Add formal release checklist:
  1. Bug check
  2. Version number verification
  3. Website documentation review
  4. README files update
- Add v3.0.0 to version history
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add v3.0 new features section covering usage tracking, budget control, provider health monitoring, smart load balancing, webhooks, context compression, middleware pipeline, and agent infrastructure. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Replace vanilla JS frontend with modern React stack:
- React 18 + TypeScript + Vite build system
- shadcn/ui components (Radix UI + Tailwind CSS)
- TanStack Query for server state, Zustand for UI state
- React Router v6 for navigation
- react-i18next with 6 languages (en, zh-CN, zh-TW, es, ja, ko)
- Dark/light/system theme support
- Type-safe API client with React Query hooks
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Implement Bot Agent system (Phase 1-4):
- Add bot gateway with IPC communication via Unix socket
- Support 5 chat platforms: Telegram, Discord, Slack, Lark, FB Messenger
- Natural language intent parsing for commands
- Process registry with auto-generated unique names
- Session management and approval workflow
- Bot configuration in zen.json with platform-specific settings
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add GET/PUT /api/v1/bot API endpoints with token masking
- Create Bot config page with 5 tabs: General, Platforms, Interaction, Aliases, Notifications
- Support 5 chat platforms: Telegram, Discord, Slack, Lark, Facebook Messenger
- Add Collapsible UI component for platform config sections
- Add i18n translations for all 6 languages (en, zh-CN, zh-TW, es, ja, ko)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add comprehensive unit and integration tests for the bot package:
- gateway_test.go: Start/Stop, handleConnection, IPC message handling
- handlers_test.go: intent processing, message handling, approvals
- client_test.go: client initialization and error cases
- nlu_test.go: NLU parser for various intents and languages
- registry_test.go: process registry operations
- session_test.go: session management
- protocol_test.go: IPC protocol types
- adapters/adapter_test.go: adapter config helpers
Use short socket paths (/tmp/zen-test-*.sock) for macOS compatibility with Unix socket 104-byte path limit.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
The custom UnmarshalJSON was missing Sync, Pricing, Budgets, Webhooks, HealthCheck, Compression, Middleware, Agent, and Bot fields, causing them to be nil after JSON parsing. Also adds comprehensive tests for the bot API endpoints, bringing internal/web coverage from 73.8% to 81.2%. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add Bot Gateway section to all README files (EN, zh-CN, zh-TW, es)
- Create comprehensive bot.md documentation for website with:
  - Platform setup guides (Telegram, Discord, Slack, Lark, FB Messenger)
  - Bot commands and natural language support
  - Configuration examples
  - Security best practices
- Update sidebars to include bot documentation
- Bump version to 3.0.0-alpha.3
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
87f657b to 02d2e09
…oval When IsDaemonRunning() checked if the daemon was listening on the expected port, it would remove the PID file if the port check failed (e.g., timeout). This made it impossible to stop the daemon later, causing upgrade and restart commands to fail silently while the old daemon kept running. Now IsDaemonRunning() returns the PID even when port check fails (as long as the process is alive), and StopDaemonProcess() will attempt to stop any alive process found in the PID file. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add comprehensive integration tests for the daemon module covering:
- Daemon start: PID file creation, port listening, status API
- Daemon stop: process termination, PID file removal, port release
- Daemon restart: old process cleanup, PID file update
- Upgrade scenario: stopping daemon even when port check fails
- Stale PID file handling
- Graceful shutdown with active requests
These tests run against the actual binary and verify real-world behavior that unit tests cannot catch (like the PID file removal bug fixed in the previous commit).
Run with: go test -tags=integration ./test/integration/...
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
When the daemon is running but the PID file is missing (orphaned state), IsDaemonRunning now checks if the proxy port is in use and returns -1 as PID to indicate unknown. This prevents false "not running" reports and failed restart attempts due to port conflicts. Also cleans up PID file when daemon fails to start (e.g., port in use). Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Cover all real-world daemon management scenarios:
- Clean state (no daemon, no PID file)
- Normal running daemon with valid PID file
- Stale PID file (process dead)
- Orphaned daemon (port in use, no PID file) - critical upgrade scenario
- Stale PID file + port taken by different process
- Process alive but not listening (startup phase)
- Stop normal daemon
- Stop when not running
- Stop orphaned daemon (unknown PID)
- Upgrade with running daemon
- Upgrade with lost PID file
- Rapid start attempts
- Port released but PID file remains
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Previously, upgrade would stop the daemon and tell users it would restart on next use. This caused issues for users with active sessions as the proxy would disappear mid-work.
Now:
- upgrade: properly restarts daemon after installing new version
- upgrade: starts daemon even if it wasn't running (for convenience)
- migrate: starts daemon after migration completes
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add middleware types to api.ts
- Add middlewareApi with get/update/enable/disable/reload methods
- Add use-middleware.ts hooks for React Query integration
- Create middleware page with global toggle and per-middleware controls
- Add middleware nav item to sidebar
- Add i18n translations for all 6 locales (en, zh-CN, zh-TW, ja, ko, es)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Remove ANTHROPIC_API_KEY, OPENAI_API_KEY, and model-related hints since these are configured via provider settings. Only show tool-related environment variables like CLAUDE_CODE_MAX_TOKENS, BASH_MAX_TIMEOUT_MS, etc. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Add install plugin dialog with local file and remote URL options
- Add remove button for non-builtin middlewares
- Display plugin path/url in middleware list
- Update MiddlewareEntry type with path and url fields
- Add i18n translations for all 6 locales
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Backend:
- Add POST /api/v1/middleware/upload endpoint
- Upload .so files to ~/.zen/plugins/ directory
- Generate checksum-based filename for cache busting
- Limit upload size to 50MB
Frontend:
- Replace local path input with file upload
- Auto-fill plugin name from filename
- Show selected file info (name, size)
- Update i18n translations for all 6 locales
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
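A checksum-based filename of the kind this commit mentions could look like the sketch below. The naming scheme (`<name>-<hash prefix>.so`), the choice of SHA-256, and the 12-character prefix are assumptions for illustration; the commit does not state which hash or format the upload handler uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// pluginFileName derives a filename from the plugin name and a prefix of the
// SHA-256 digest of its content, so a changed upload gets a new name.
// Hypothetical helper; the real scheme may differ.
func pluginFileName(name string, content []byte) string {
	sum := sha256.Sum256(content)
	return fmt.Sprintf("%s-%s.so", name, hex.EncodeToString(sum[:])[:12])
}

func main() {
	fmt.Println(pluginFileName("my-plugin", []byte("plugin bytes")))
}
```

Identical content always maps to the same name, while any change in the uploaded bytes yields a different one, which is what makes the name usable for cache busting.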
- Add OpenAI → Anthropic SSE stream transformation for Claude Code/OpenCode to use OpenAI-style providers
- Add Anthropic → OpenAI SSE stream transformation for Codex to use Anthropic-style providers
- Fix Web UI provider creation API format (fields not saving correctly)
- Add comprehensive stream transformation tests
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
- Make AuthManager.StopCleanup() idempotent using sync.Once to prevent panic on repeated calls
- Add deep copy methods (Clone) for ProviderConfig and ProfileConfig
- Return deep copies from Store.GetProvider, GetProfileConfig, ProviderMap, GetProjectBinding, GetAllProjectBindings to prevent mutable reference leaks
- Fix TestGetProvider to expect unmasked token (matches dc1c25e behavior)
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
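The sync.Once pattern named in this commit can be illustrated with a small stand-in type. `cleaner` and its fields are hypothetical and are not the actual AuthManager; the point is only that wrapping `close` in `Once.Do` makes a second call a no-op instead of a panic.

```go
package main

import (
	"fmt"
	"sync"
)

// cleaner is a hypothetical stand-in for a type that owns a background
// cleanup loop signalled through a stop channel.
type cleaner struct {
	stop     chan struct{}
	stopOnce sync.Once
}

func newCleaner() *cleaner {
	return &cleaner{stop: make(chan struct{})}
}

// StopCleanup closes the stop channel at most once, so repeated calls are safe.
func (c *cleaner) StopCleanup() {
	c.stopOnce.Do(func() { close(c.stop) })
}

func main() {
	c := newCleaner()
	c.StopCleanup()
	c.StopCleanup() // a bare close(c.stop) here would panic
	_, open := <-c.stop
	fmt.Println("channel open:", open)
}
```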
Socket path is an internal implementation detail that should not be exposed to users. The default value is sufficient for all use cases. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add comprehensive tests for authApi, sessionsApi, webhooksApi, middlewareApi, budgetApi, providerHealthApi, and extended coverage for syncApi, usageApi, and settingsApi. Add corresponding MSW mock handlers. Coverage improved from 68.83% to 83.11% statements. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Add tests for TransformPath, convertInputToMessages, convertContent, extractTextFromContent, mapRole, Responses API transformation, sanitizePluginName, middleware upload, and usage API time range/groupBy params. Transform: 74.6% → 97.8%, Web: 77.4% → 81.6%. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Bot tokens are local config values like provider tokens — masking them in the web UI caused a bug where masked values were saved back to config. Remove masking from the API and switch frontend inputs to type="text". Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
StopDaemonProcess now tries POST /api/v1/daemon/shutdown first, which works even when the PID file is missing (orphaned daemon). Falls back to SIGTERM only if the HTTP approach fails. Fixes "PID is unknown" error on daemon restart. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Previously the bot gateway was only started once at daemon startup. Now onConfigReload calls reinitBot() to stop and restart the gateway when bot config changes, avoiding the need for a manual daemon restart. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Each bot tab was only sending its own section (e.g. { platforms: ... }),
causing the backend merge to overwrite other sections with zero values.
All tabs now send the complete config object.
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Use a Select with common timezones instead of free-text input. Defaults to the user's local timezone and shows UTC offset labels (e.g. Asia/Shanghai (UTC+8)). Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Remove explicit handlers for status/list/logs/errors queries - these now go through handleChat with enriched process context in the system prompt. Only action commands (bind, control, send, persona, forget, approve) keep explicit regex handlers.
Key changes:
- BuildSystemPrompt now accepts []*ProcessInfo with full state (status, path, uptime, current task) instead of just process names
- Remove Polish() function - LLM generates natural responses directly
- Remove ParseNaturalLanguage() and extractTarget() from NLU
- Remove sendProcessList() and handleStatusQuery() handlers
- Simplify processIntent() switch - query intents fall through to chat
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
…dpoint Rewrite BuildSystemPrompt to clearly define bot identity as "Zen" with explicit instructions about GoZen-managed sessions vs OS processes. Memory/persona is now appended rather than replacing base identity. Remove regex-based IntentQueryList - task/status queries now handled by LLM through IntentChat fallback for better multi-language support. Add configurable model field to BotConfig and LLMClient. Add SSE streaming chat endpoint and CLI test harness. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Mark the bot gateway as beta across the codebase: config struct comments, route comments, web UI badge, and amber beta notice card. Add betaNotice i18n translations for all 6 locales. Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
… sidebar beta badges
- Add Discord/Lark/Slack adapters with real message handling
- Add bot memory (persona/instructions) with file persistence
- Add bot-proxy bridge for session visibility in system prompt
- Add ChatTab with SSE streaming and sessionStorage persistence
- Add sidebar beta badges for Bot and Middleware nav items
- Register CLI sessions with daemon for bot awareness
- Improve sync settings UI with timezone select dropdown
Co-Authored-By: Claude (claude-opus-4-6) <noreply@anthropic.com>
Overview
This PR implements GoZen v3.0.0 with two major BETA features:
- Context Compression
- Middleware Pipeline
All features are disabled by default and marked as BETA.
Features
Context Compression (BETA)
Transparent context compression that intercepts large conversation histories, summarizes them with a cheap model, and forwards compressed requests upstream.
Middleware Pipeline (BETA)
Transform GoZen into a programmable AI API gateway with a pluggable middleware chain.
Architecture:
- Middleware interface for custom middleware development
- Pipeline executor with priority-based ordering
- Registry for middleware lifecycle management
- PluginLoader for local (.so) and remote plugin support
Built-in Middleware:
- context-injection
- session-memory
- request-logger
- orchestration
Third-Party Plugin Support
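To give a feel for what a third-party middleware and the priority-ordered pipeline might look like, here is a minimal sketch. The method set of `Middleware`, the `Request` type, and the rule that lower priority values run first are assumptions; the PR description names the components but not their signatures.

```go
package main

import (
	"fmt"
	"sort"
)

// Request is a simplified stand-in for the request passed through the chain.
type Request struct {
	Body string
}

// Middleware is a hypothetical interface; GoZen's actual one may differ.
type Middleware interface {
	Name() string
	Priority() int // lower values run first (assumption)
	Process(req *Request) error
}

// Pipeline runs registered middleware in priority order.
type Pipeline struct {
	mws []Middleware
}

// Use registers a middleware and keeps the chain sorted by priority.
func (p *Pipeline) Use(m Middleware) {
	p.mws = append(p.mws, m)
	sort.SliceStable(p.mws, func(i, j int) bool {
		return p.mws[i].Priority() < p.mws[j].Priority()
	})
}

// Run executes the chain and stops at the first error.
func (p *Pipeline) Run(req *Request) error {
	for _, m := range p.mws {
		if err := m.Process(req); err != nil {
			return fmt.Errorf("%s: %w", m.Name(), err)
		}
	}
	return nil
}

// tagger is a toy middleware that appends its name to the request body.
type tagger struct {
	name string
	prio int
}

func (t tagger) Name() string  { return t.name }
func (t tagger) Priority() int { return t.prio }
func (t tagger) Process(r *Request) error {
	r.Body += "[" + t.name + "]"
	return nil
}

func main() {
	p := &Pipeline{}
	p.Use(tagger{name: "request-logger", prio: 20})
	p.Use(tagger{name: "context-injection", prio: 10})
	req := &Request{}
	_ = p.Run(req)
	fmt.Println(req.Body) // prints "[context-injection][request-logger]"
}
```

Sorting at registration time keeps `Run` a plain loop, and the stable sort preserves registration order among middleware that share a priority.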
Configuration
{
  "compression": {
    "enabled": false,
    "threshold_tokens": 50000,
    "target_tokens": 20000,
    "summary_model": "claude-3-haiku-20240307",
    "preserve_recent": 4
  },
  "middleware": {
    "enabled": false,
    "middlewares": [
      { "name": "context-injection", "enabled": true, "source": "builtin" }
    ]
  }
}
Web API
Compression
- GET/PUT /api/v1/compression - Configuration
- GET /api/v1/compression/stats - Statistics
Middleware
- GET/PUT /api/v1/middleware - Configuration
- GET /api/v1/middleware/{name} - Details
- POST /api/v1/middleware/{name}/enable - Enable
- POST /api/v1/middleware/{name}/disable - Disable
- POST /api/v1/middleware/reload - Reload all
Files Changed
New Files
- internal/proxy/compression.go - Context compressor
- internal/middleware/*.go - Middleware package
- internal/web/api_compression.go - Compression API
- internal/web/api_middleware.go - Middleware API
- docs/middleware-development.md - Development guide
Modified Files
- internal/config/config.go - New config types
- internal/config/store.go - New getters/setters
- internal/proxy/server.go - Integration
- internal/daemon/server.go - Initialization
- cmd/root.go - Version bump to 3.0.0
Testing
All existing tests pass. New tests added for:
go test ./...
Breaking Changes
None. All features are opt-in and disabled by default.