Add Langfuse deployment for kind (Phase 1) #30
- Created e2e/scripts/deploy-langfuse-kind.sh for automated deployment
- Added comprehensive documentation in docs/deployment/langfuse-helm-poc.md
- Added Makefile target: deploy-langfuse-kind
- Follows project conventions from existing e2e scripts
- Uses official Langfuse Helm chart (v1.5.9) with minimal customization
- Supports automatic secret generation and validation
- Includes troubleshooting guide and cleanup instructions
- Created e2e/scripts/cleanup-langfuse.sh following cleanup.sh conventions
- Deletes Langfuse namespace
- Removes langfuse.local from /etc/hosts (with backup)
- Cleans up .env.langfuse credentials file
- Supports --delete-cluster flag to also remove kind cluster
- Follows project emoji/status message style
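The hosts-file cleanup described above can be sketched as a small helper. This is a minimal illustration, not the contents of cleanup-langfuse.sh: the function name `remove_host_entry` and its optional file argument are assumptions made so the logic can be tried on a scratch file instead of /etc/hosts.

```bash
#!/usr/bin/env bash
# Hypothetical helper: delete every line that mentions a hostname from a
# hosts-style file, leaving a <file>.bak backup next to it.
remove_host_entry() {
  host="$1"
  hosts_file="${2:-/etc/hosts}"
  # Escape dots so "langfuse.local" cannot also match "langfuseXlocal".
  pattern=$(printf '%s' "$host" | sed 's/\./\\./g')
  # "-i.bak" (no space) is accepted by both GNU sed and BSD/macOS sed.
  sed -i.bak "/${pattern}/d" "$hosts_file"
}
```

Against the real /etc/hosts the `sed` call needs root privileges, which is why the script under review runs it through `sudo`.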
- Move container engine detection before kind cluster check
- Set KIND_EXPERIMENTAL_PROVIDER before running kind commands
- Ensures Podman users can check for existing clusters correctly
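The ordering fix above might look roughly like the following. It is a sketch under assumptions: the function names are hypothetical, and preferring Docker over Podman when both are installed is a guess rather than something the commit states. `CONTAINER_ENGINE` and `KIND_EXPERIMENTAL_PROVIDER` are the variables named in this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: honour CONTAINER_ENGINE if the caller set it,
# otherwise probe for docker first and podman second (assumed order).
detect_container_engine() {
  if [ -n "${CONTAINER_ENGINE:-}" ]; then
    printf '%s\n' "$CONTAINER_ENGINE"
  elif command -v docker >/dev/null 2>&1; then
    printf 'docker\n'
  elif command -v podman >/dev/null 2>&1; then
    printf 'podman\n'
  else
    return 1
  fi
}

# kind uses Podman only when KIND_EXPERIMENTAL_PROVIDER=podman is exported,
# so this has to run before the first `kind get clusters` call.
configure_kind_provider() {
  if [ "$1" = "podman" ]; then
    export KIND_EXPERIMENTAL_PROVIDER=podman
  fi
}
```

A script would call `configure_kind_provider "$(detect_container_engine)"` near the top, ahead of any check for an existing cluster.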
- Use langfuse.nextauth.secret.value instead of langfuse.nextauth.secret
- Use langfuse.salt.value instead of langfuse.salt
- Fix password generation to use openssl instead of /dev/urandom
- Prevents hanging on password generation and Helm template errors
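One way to generate secrets with openssl, as the commit describes, is sketched below. The helper name `generate_secret` and the 32-byte default are assumptions; the script's actual command is not shown in this conversation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print N random bytes as 2*N hex characters.
# `openssl rand` reads a fixed number of bytes and exits, so no pipeline
# is left reading from /dev/urandom indefinitely.
generate_secret() {
  openssl rand -hex "${1:-32}"
}
```

The result could then be passed to Helm with the corrected value paths, for example `--set langfuse.salt.value="$(generate_secret)"`.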
- Set clickhouse.replicaCount=1 (was 3 by default)
- Disable pod anti-affinity for ClickHouse, PostgreSQL, Redis, ZooKeeper
- Prevents pods from being stuck in Pending state on single-node clusters
- Uses podAntiAffinityPreset=none for all StatefulSets
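Because the same override is repeated for several subcharts, the flags can be generated rather than typed out. The sketch below is illustrative only: `anti_affinity_flags` is a hypothetical helper, and the component prefixes passed to it must be checked against the chart's values file, since the exact value paths are not listed in this conversation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: emit one "--set <prefix>.podAntiAffinityPreset=none"
# pair per component prefix given on the command line.
anti_affinity_flags() {
  for component in "$@"; do
    printf '%s %s.podAntiAffinityPreset=none\n' --set "$component"
  done
}
```

Used unquoted so the shell splits the output into arguments, for example `helm upgrade --install langfuse langfuse/langfuse $(anti_affinity_flags clickhouse)`, where `clickhouse` stands in for whatever prefix the chart actually uses.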
After thorough investigation of the langfuse-k8s Helm chart and its Bitnami dependencies, determined that:
- Headless services (clusterIP: None) correctly omit sessionAffinity
- Regular services only include sessionAffinity when explicitly configured
- Issue is in upstream Bitnami charts, not langfuse-k8s repository
- No PR needed for langfuse-k8s

Documented three options if SessionAffinity warnings occur:
1. Override values at deployment time
2. Report to Bitnami upstream charts
3. Verify warnings are actually occurring

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
- ClickHouse: 3Gi → 512Mi requests, 1Gi limits
- ZooKeeper: Reduce to 1 replica (was 3), 256Mi requests, 512Mi limits
- Fixes 'Insufficient memory' scheduling errors on kind nodes
- Total memory footprint now fits within kind node capacity (~2Gi)
- Change zookeeper.replicaCount to zookeeper.replicas
- Bitnami ZooKeeper chart uses 'replicas' not 'replicaCount'
- Will now correctly deploy 1 ZooKeeper pod instead of 3
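A misnamed Helm value is silently ignored, so it can help to confirm the replica count on the cluster after deploying. The check below is a sketch: `statefulset_has_replicas` is a hypothetical helper, and the StatefulSet name has to be looked up with `kubectl get statefulset -n langfuse` because it is not given in this conversation.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeed only when the StatefulSet's spec.replicas
# equals the expected count; fail if kubectl itself fails.
statefulset_has_replicas() {
  namespace="$1"
  name="$2"
  expected="$3"
  actual=$(kubectl get statefulset "$name" -n "$namespace" \
    -o jsonpath='{.spec.replicas}') || return 1
  [ "$actual" = "$expected" ]
}
```

A call such as `statefulset_has_replicas langfuse <zookeeper-statefulset> 1` returns non-zero when the override did not take effect.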
The Helm chart expects langfuse.ingress.* not just ingress.*
This was preventing the Ingress resource from being created.

Fixed value paths:
- ingress.enabled -> langfuse.ingress.enabled
- ingress.className -> langfuse.ingress.className
- ingress.hosts -> langfuse.ingress.hosts
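A quick way to catch this class of mistake before deploying is to render the chart and look for an Ingress document. The filter below is a minimal sketch; `renders_ingress` is a hypothetical name, and it only inspects top-level `kind:` lines in the rendered YAML.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read rendered manifests on stdin and succeed if at
# least one document declares "kind: Ingress" (IngressClass does not count).
renders_ingress() {
  grep -Eq '^kind:[[:space:]]*Ingress[[:space:]]*$'
}
```

For example, `helm template langfuse langfuse/langfuse --set langfuse.ingress.enabled=true | renders_ingress` should succeed with the corrected path, while the same command with `ingress.enabled=true` would be expected to fail.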
- Deploy script now shows correct URL based on container engine
- Documentation updated to reflect port 8080 for Podman
- Docker uses standard port 80
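The engine-dependent URL can be captured in one place so the deploy script, docs, and smoke tests agree. This is a sketch with a hypothetical function name; the host and ports are the ones stated in this PR.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print the URL at which Langfuse is reachable for a
# given container engine (Podman maps to 8080, everything else to port 80).
langfuse_url() {
  case "$1" in
    podman) printf 'http://langfuse.local:8080\n' ;;
    *)      printf 'http://langfuse.local\n' ;;
  esac
}
```

A caller would use it as `URL=$(langfuse_url "$CONTAINER_ENGINE")`.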
This agent provides expertise in LangFuse deployment on ROSA:
- Installation and configuration
- Upgrading and troubleshooting
- Performance optimization
- Architecture design for scale
- Integration with OpenShift features
Pull Request Review: Langfuse Deployment for kind (Phase 1)
Summary
This PR adds a comprehensive Langfuse observability platform deployment for local kind clusters. The implementation is well-structured, properly documented, and follows established patterns from the existing e2e infrastructure. Overall, this is a solid foundation for Phase 1 with some recommendations for improvement.
✅ Strengths
1. Excellent Documentation
2. Consistent with Project Patterns
3. Security Best Practices
4. Operational Excellence
5. Thoughtful Agent Design
🔧 Recommendations
Priority 1: Critical Issues
1. Missing .gitignore Entry
Issue:
Fix: Verify .env.langfuse
Location:
2. Unquoted Variable in sed Command
Issue: Line 47 in
# Current (line 47)
sudo sed -i.bak '/langfuse.local/d' /etc/hosts
# Better
sudo sed -i.bak '/langfuse\.local/d' /etc/hosts
Location:
Rationale: Escape the dot so it matches a literal dot rather than any character.
3. StatefulSet Wait Condition Fragility
Issue: Lines 130-137 in
# Current (lines 132-136)
kubectl wait --namespace langfuse \
--for=jsonpath='{.status.readyReplicas}'=1 \
--timeout=300s \
statefulset/$statefulset &>/dev/null || true
# More robust
kubectl wait --namespace langfuse \
--for=jsonpath='{.status.readyReplicas}'=1 \
--timeout=300s \
statefulset/$statefulset 2>/dev/null || echo " ⚠️ Warning: $statefulset may still be starting"
Location:
Rationale: Better error visibility when pods don't reach ready state.
Priority 2: Enhancements
4. Resource Limits for Local Testing
Observation: The script configures significant resources:
Suggestion: Document total resource requirements in the script header:
# Resource Requirements:
# CPU: ~9 cores
# Memory: ~19.5GB RAM
# Disk: ~50GB
# For smaller environments, consider reducing replica counts
Location:
5. Helm Chart Version Pinning
Issue: Line 80 uses
# Current (line 80)
helm upgrade --install langfuse langfuse/langfuse \
# Better (with version pin)
LANGFUSE_CHART_VERSION="1.5.9" # Or make configurable
helm upgrade --install langfuse langfuse/langfuse \
--version "$LANGFUSE_CHART_VERSION" \
Location:
Rationale: Reproducibility and avoiding unexpected breaking changes from chart updates.
6. Error Handling for Helm Failures
Issue: Lines 80-111 helm install has
Suggestion: Add explicit error handling:
if ! helm upgrade --install langfuse langfuse/langfuse \
# ... all the flags ...
--wait \
--timeout=10m; then
echo "❌ Helm installation failed. Check logs:"
echo " kubectl logs -n langfuse -l app.kubernetes.io/name=langfuse --tail=100"
exit 1
fi
Location:
7. Documentation: OpenShift Route Clarification
Issue: Documentation mentions OpenShift but only kind deployment is implemented.
Suggestion: In
## OpenShift Deployment (Phase 2 - Not Yet Implemented)
OpenShift deployment script (`deploy-langfuse-openshift.sh`) is planned for a future PR with:
- Security Context Constraints (SCC) configuration
- OpenShift Route support
- ...
**Status**: Phase 2 work - not included in this PR.
Location:
Priority 3: Nice-to-Haves
8. Add Validation for Required Ports
Suggestion: Check if ports 80/8080 are available before deployment:
# Add after line 36 in deploy-langfuse-kind.sh
echo ""
echo "Checking port availability..."
if [ "$CONTAINER_ENGINE" = "podman" ]; then
PORT=8080
else
PORT=80
fi
if lsof -i:$PORT >/dev/null 2>&1; then
echo " ⚠️ Warning: Port $PORT is already in use"
echo " Langfuse may not be accessible at expected URL"
fi
9. Add Smoke Test Target
Suggestion: Add a
test-langfuse: ## Test Langfuse deployment
	@cd e2e && ./scripts/test-langfuse.sh
With a simple test script:
#!/bin/bash
# e2e/scripts/test-langfuse.sh
set -euo pipefail
echo "Testing Langfuse deployment..."
# Check pods
kubectl get pods -n langfuse
# Test HTTP endpoint
URL="http://langfuse.local:8080" # Adjust for Docker
if curl -s -o /dev/null -w "%{http_code}" "$URL" | grep -q "200\|30[0-9]"; then
echo "✅ Langfuse is accessible at $URL"
else
echo "❌ Langfuse is not responding at $URL"
exit 1
fi
10. Agent Definition: Missing Examples
Observation: The
Suggestion: Add a note in the agent definition about kind deployment:
## Local Development
For local kind cluster deployments, use `make deploy-langfuse-kind` instead of this agent.
This agent is specialized for production ROSA deployments.
Location:
🔍 Code Quality Assessment
Bash Scripting
Documentation
Security
Testing
🎯 Verdict
Recommendation: ✅ Approve with minor changes
This is a well-implemented Phase 1 foundation. The critical issues are minor and easily addressed. The code quality is high, follows project conventions, and the documentation is excellent.
Before Merge (Required):
Post-Merge (Recommended):
📊 Alignment with CLAUDE.md
✅ Follows Project Standards:
✅ Security Practices:
Comprehensive planning document covering:
- Phase 1 completion status and deployment details
- Integration points (runner, backend, operator)
- SDK usage patterns for Python and Go
- Configuration strategies (3 options)
- Implementation plan (6 steps)
- Success criteria and testing strategy
PR Review: Add Langfuse deployment for kind (Phase 1)
Summary
This PR adds Langfuse (LLM observability platform) deployment capability to local kind clusters as Phase 1 of LLM observability integration. The implementation is well-structured, follows project conventions, and includes comprehensive documentation.
✅ Strengths
1. Excellent Documentation
2. Script Quality
3. Security Practices
4. Integration with Existing Infrastructure
5. Langfuse ROSA Expert Agent
Add automatic S3 credential patching to resolve trace storage issue where langfuse-web and langfuse-worker deployments were missing proper S3 secret references, preventing traces from being persisted to storage.

Changes:
- Patch langfuse-web deployment with S3 credentials from langfuse-s3 secret
- Patch langfuse-worker deployment with S3 credentials from langfuse-s3 secret
- Add automatic rollout wait after patching
- Document S3 credential fix in troubleshooting section
- Set replica counts to 1 for kind deployments (already in script)

This ensures traces are successfully uploaded to S3 (web) and downloaded for processing (worker), making them visible in the UI.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
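The patch-then-wait sequence described in this commit could be approximated as follows. This is a sketch, not the script's actual patch: `inject_secret_env` is a hypothetical helper, and it assumes the keys inside the langfuse-s3 secret are already named like the environment variables Langfuse expects, which this conversation does not confirm. If they are not, an explicit `kubectl patch` with per-variable `secretKeyRef` entries would be needed instead.

```bash
#!/usr/bin/env bash
# Hypothetical helper: expose every key of a Secret as an environment
# variable on a Deployment, then wait for the resulting rollout.
inject_secret_env() {
  namespace="$1"
  deployment="$2"
  secret="$3"
  # `kubectl set env --from=secret/<name>` adds secretKeyRef-backed env vars.
  kubectl set env "deployment/$deployment" -n "$namespace" \
    --from="secret/$secret" || return 1
  # Block until the patched pods are rolled out, or fail after 5 minutes.
  kubectl rollout status "deployment/$deployment" -n "$namespace" \
    --timeout=300s
}
```

The deploy script would call it once per workload, for example `inject_secret_env langfuse langfuse-web langfuse-s3` followed by the same call for `langfuse-worker`.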
S3 Credential Fix Applied
Added automatic S3 credential configuration to resolve the trace storage issue discovered during testing.
Problem
The upstream Langfuse Helm chart deploys
Result: Traces appear to send successfully (200 OK) but never show up in the UI.
Solution
The deployment script now automatically patches both deployments after Helm installation to reference credentials from the
Changes in this commit
Validation
Tested with Python Langfuse SDK client - traces now successfully persist to S3 and appear in UI within ~5 seconds.
# Verify fix works
curl -s -u "pk-lf-xxx:sk-lf-xxx" http://localhost:3000/api/public/traces | jq '.meta.totalItems'
# Returns: 3 (traces successfully stored)
Phase 2 Changes:
- Focus on Claude Code Runner instrumentation only (removed Backend/Operator)
- Simplified to single global configuration (ConfigMap + Secret)
- Reduced metrics to MVP essentials (token usage, success/failure, basic latency)
- Streamlined testing to integration tests only
- Updated success criteria to match simplified scope
- Document reduced from 389 to 317 lines (19% reduction)

Phase 3 Ideas (New Document):
- Extracted advanced features to langfuse-phase3-ideas.md
- Includes: Backend/Operator instrumentation, multi-tenancy, feedback loops
- Includes: Prompt management, cost alerts, ROSA deployment
- Provides clear roadmap for post-Phase 2 enhancements

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Code Review - Langfuse Phase 1 Deployment
Overall Assessment
✅ Excellent work! This PR delivers a well-structured, production-ready foundation for Langfuse observability. The implementation follows infrastructure-as-code best practices with thoughtful documentation and clear separation of concerns.
Strengths
1. Deployment Script Quality ⭐
The deploy-langfuse-kind.sh script demonstrates excellent DevOps practices:
Highlight: The S3 credential fix (lines 141-254) shows deep understanding of the Helm chart limitations and proactive problem-solving.
2. Documentation Excellence 📚
The langfuse-helm-poc.md is comprehensive and well-organized:
3. Agent Definition 🤖
The langfuse-rosa-expert agent is well-designed with comprehensive competency mapping and clear operational methodology.
4. Security Practices 🔒
Issues & Concerns
🔴 Critical: Missing .gitignore Entry
File: e2e/.gitignore or root .gitignore
The script generates e2e/.env.langfuse with sensitive credentials, but I don't see this file added to .gitignore.
Required action: Add .env.langfuse to .gitignore
Risk: Without this, developers could accidentally commit sensitive credentials.
🟡 Medium Issues
1. Helm Timeout Configuration (e2e/scripts/deploy-langfuse-kind.sh:111)
2. Cleanup Script Host File Management (e2e/scripts/cleanup-langfuse.sh:43-48)
3. Resource Allocation Hardcoded (e2e/scripts/deploy-langfuse-kind.sh:92-109)
🟢 Minor Enhancements
Performance Considerations
✅ Good: Resource-constrained optimizations for kind (ClickHouse replica: 1, ZooKeeper: 1, anti-affinity disabled)
Code Quality
✅ Excellent
Recommendations Summary
Must Fix Before Merge 🔴
Should Fix Before Merge 🟡
Nice to Have 🟢
Conclusion
This is high-quality infrastructure code that follows best practices. The only blocking issue is the missing .gitignore entry.
Approval: ✅ Approve after adding .gitignore entry
Next Steps for Phase 2
Happy to review Phase 2 PR when ready! 🚀
Created comprehensive, executable implementation guide for instrumenting the Claude Code Runner's _run_claude_agent_sdk() method with Langfuse.

Prompt includes:
- Step-by-step implementation with exact file paths and line numbers
- Complete code examples (not just diffs) for copy-paste implementation
- Dependency setup (langfuse>=2.53.3 in pyproject.toml)
- Operator configuration for env var injection
- Kubernetes resource creation (ConfigMap + Secret)
- Comprehensive testing strategy with validation commands
- Troubleshooting guide for common issues
- Success criteria and rollback plan

Target: Single codepath instrumentation (wrapper.py:152-469)
Expected outcome: Session-level traces with token usage, cost, latency

Optimized for:
- AI agent consumption (Claude Code can execute directly)
- Human implementation (clear step-by-step guide)
- Production deployment (includes graceful degradation)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Pull Request Review: Langfuse Deployment for kind (Phase 1)
Overall Assessment
✅ APPROVED - This is a well-structured Phase 1 implementation with excellent documentation, secure credential handling, and thoughtful script design. The code is production-ready with only minor suggestions for enhancement.
Strengths
1. Excellent Script Design ⭐
2. Security Best Practices ⭐
3. S3 Credential Fix ⭐
The automatic patching of langfuse-web and langfuse-worker deployments (lines 141-254) is excellent:
4. Documentation Excellence ⭐
5. Cleanup Script ⭐
Suggestions for Improvement
1. Gitignore Specificity (Low Priority)
While
# Environments
.env
.env.uat
+.env.langfuse
.env.test
Location:
2. Shellcheck Validation (Enhancement)
Consider adding shellcheck validation to CI pipeline for bash scripts:
# .github/workflows/shellcheck.yml
- name: Run shellcheck
run: |
shellcheck e2e/scripts/*.sh
This would catch potential issues early, similar to how golangci-lint works for Go code.
3. Script Error Messages (Minor)
In
echo "❌ Kind cluster 'vteam-e2e' not found"
-echo " Run './scripts/setup-kind.sh' first"
+echo " Run 'cd e2e && ./scripts/setup-kind.sh' first"
Makes it easier for users unfamiliar with the repo structure.
4. Documentation Cross-References (Enhancement)
In
### Step 2: Deploy Langfuse
+**Using Makefile** (from project root):
+```bash
+make deploy-langfuse-kind
+```
+
+**Direct script** (from e2e directory):
```bash
./scripts/deploy-langfuse-kind.sh
```
Consider:
description: ... (Tools: Read, Write, Edit, Bash, WebSearch, WebFetch)
This follows the pattern from other agents in CLAUDE.md.
Security Review ✅
Code Quality ✅
Bash Scripts
Makefile
Documentation
Testing Recommendations
Pre-Merge Testing
# Test deployment
make deploy-langfuse-kind
# Verify all pods running
kubectl get pods -n langfuse
# Test accessibility
curl -I http://langfuse.local:8080 # Podman
# or
curl -I http://langfuse.local # Docker
# Test cleanup
cd e2e && ./scripts/cleanup-langfuse.sh
# Test cleanup with cluster deletion
cd e2e && ./scripts/cleanup-langfuse.sh --delete-cluster
Future Integration Tests (Phase 2)
Consider adding to e2e test suite:
Architecture Alignment ✅
This PR follows project standards from CLAUDE.md:
Performance Considerations
Resource Limits (deploy-langfuse-kind.sh:92-109)
The script sets conservative limits appropriate for kind:
These are suitable for local development. For production ROSA deployments, the langfuse-rosa-expert agent should recommend higher limits based on workload.
Single Replicas
The script deploys single replicas for all components (
Phase 2 Readiness ✅
This PR sets excellent groundwork for Phase 2:
Phase 2 can now focus purely on instrumentation without deployment concerns.
Final Verdict
APPROVE AND MERGE ✅
This PR demonstrates:
The minor suggestions above are enhancements, not blockers. This is production-ready code that follows all project standards.
Estimated Merge Risk: Low
Next Steps Post-Merge
Great work on this POC! 🎉
Changed from full method replacement (318 lines) to surgical insertions at 4 targeted points (~81 new lines, 25% method growth).

Key improvements:
- Lower risk: Only modifies specific insertion points, preserves 75% of existing code
- Easier review: Focused diffs show exact instrumentation locations
- Better maintainability: Future SDK changes less likely to conflict
- Same capability: Captures all usage data, costs, and metadata

Implementation changes:
- Insertion Point 1 (line 308): Initialize Langfuse trace (22 lines)
- Insertion Point 2 (lines 410-413): Wrap queries with generation spans (12 lines)
- Insertion Point 3 (lines 385-402): Capture usage from ResultMessage (25 lines)
- Insertion Point 4 (line 456): Finalize trace and flush (22 lines)

Benefits:
✅ 75% of method stays untouched (vs 0% with full replacement)
✅ Git diffs show clear insertion points (vs entire method rewrite)
✅ Merge-friendly for future upstream changes
✅ Isolated testing (verify instrumentation independently)

Document structure:
- Added "Why surgical approach?" section with comparison table
- Each insertion point shows exact line numbers and surrounding context
- Clear before/after code snippets for accurate placement
- Summary showing total line changes and rationale

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Code Review: Langfuse Deployment for kind (Phase 1)
Thank you for this well-structured PR! This is a solid foundation for adding LLM observability to the platform.
✅ Strengths
Excellent Script Quality
Smart Engineering Decisions
Comprehensive Documentation
🔍 Issues Found
🟡 Medium Priority
1. Credentials File Permissions (deploy-langfuse-kind.sh:273-282)
2. S3 Patch Race Condition (deploy-langfuse-kind.sh:230-249)
3. Hardcoded StatefulSet Names (deploy-langfuse-kind.sh:130-137)
🟢 Low Priority
🔒 Security Considerations
✅ Good Practices:
📋 For Phase 2/Production:
📊 Testing & Validation
Suggested additions:
🎯 Alignment with CLAUDE.md Standards
✅ Follows Guidelines:
📋 Minor Gaps:
🔢 Metrics
🏆 Overall Assessment
Rating: Approve with Minor Suggestions ⭐⭐⭐⭐½
This is high-quality infrastructure code with excellent documentation. Scripts are well-written, deployment approach is sound, and Phase 2/3 planning shows strategic thinking.
Recommendation: Merge after addressing credential file permissions (chmod 600). Other suggestions can be addressed in follow-up PRs.
Great work on this foundational piece!
Pre-Merge Checklist
References:
This implements complete LLM observability for the Claude Code Runner using the surgical instrumentation approach (4 targeted insertion points vs full method replacement).

## Changes Summary

### 1. Runner Dependencies (pyproject.toml)
- ✨ Add langfuse 3.9.1 (latest, Nov 6 2025)
- ⬆️ Update anthropic to 0.72.0 (from 0.68.0)
- ⬆️ Update claude-agent-sdk to 0.1.6 (from 0.1.4)
- All dependencies Python 3.13 compatible

### 2. Runner Instrumentation (wrapper.py)
**Import Changes:**
- Add Langfuse SDK imports (using 3.x API)
- Note: Langfuse 3.x changed API - no longer uses langfuse.decorators

**__init__ Changes (lines 38-51):**
- Initialize Langfuse client with env-based config
- Graceful degradation if LANGFUSE_ENABLED=false or keys missing
- Single client instance reused for all traces in session

**_run_claude_agent_sdk() Instrumentation (4 insertion points):**

Insertion Point 1 (lines 332-352): Session-level trace initialization
- Creates trace with session metadata (namespace, project, model, workspace)
- Links to Kubernetes session ID for cross-component correlation
- Initialize generation_span variable for per-query tracking

Insertion Point 2 (lines 455-472): Per-query generation spans
- Wraps each Claude query with generation span
- Captures prompt input and model name
- Uses nonlocal to update parent scope variable

Insertion Point 3 (lines 430-473): Usage data capture from ResultMessage
- Extracts token counts (input/output/total) from SDK result
- Records cost_usd, duration_ms, duration_api_ms
- Ends generation span and clears for next query

Insertion Point 4 (lines 540-561): Trace finalization and flush
- Updates trace with final session outcome (success, turns)
- Aggregates total cost and duration
- CRITICAL flush() call ensures data sent before pod exit

**Total Modification**: ~81 new lines across 4 insertions (~13.5% of method)

### 3. Operator Configuration (sessions.go)
**EnvFrom Changes (lines 575-609):**
- Add langfuse-keys Secret injection (Optional: true)
- Add langfuse-config ConfigMap injection (Optional: true)
- Maintain existing runnerSecretsName logic
- Optional flag ensures pods start even without Langfuse

### 4. Kubernetes Manifests (langfuse/langfuse-config.yaml)
**New ConfigMap:**
- LANGFUSE_HOST: cluster-internal URL (langfuse-web.langfuse.svc.cluster.local:3000)
- LANGFUSE_ENABLED: "true" (feature flag)

**New Secret:**
- LANGFUSE_PUBLIC_KEY: pk-lf-REPLACE-ME (placeholder)
- LANGFUSE_SECRET_KEY: sk-lf-REPLACE-ME (placeholder)

### 5. Documentation (langfuse-phase2-implementation-prompt.md)
- Complete step-by-step implementation guide (787 lines)
- Troubleshooting procedures
- Testing validation steps

## Breaking Changes
⚠️ **Langfuse 3.x API Migration**

The implementation uses Langfuse 3.9.1 which has breaking changes from 2.x:
- OLD: `from langfuse.decorators import langfuse_context, observe`
- NEW: `from langfuse import Langfuse, observe`
- `langfuse_context` no longer exists in 3.x

## Validation Completed
✅ Local Testing (Python 3.13 venv):
- Dependencies install successfully
- Langfuse 3.9.1 imports correctly
- API compatibility verified

⏭️ Cluster Testing (requires deployment):
- Kubernetes manifests apply correctly
- Traces appear in Langfuse UI
- Token usage data captured
- Cost tracking operational
- Interactive mode works

## Deployment Instructions
1. **Update Langfuse Secret** (before deploying runner):
```bash
# Get keys from Langfuse UI → Settings → API Keys
kubectl edit secret langfuse-keys -n ambient-code
# Replace pk-lf-REPLACE-ME and sk-lf-REPLACE-ME
```
2. **Deploy Manifests**:
```bash
kubectl apply -f components/manifests/langfuse/langfuse-config.yaml
```
3. **Rebuild Runner Image**:
```bash
cd components/runners/claude-code-runner
make build CONTAINER_ENGINE=podman
```
4. **Test AgenticSession**: Create session and check logs for "Langfuse client initialized" message

## Next Steps (Phase 3)
Phase 3 enhancements documented in `langfuse-phase3-ideas.md`:
- Backend API instrumentation (Go)
- Operator instrumentation (Go)
- Multi-tenant project isolation
- Advanced metrics (prompt analysis, feedback loops)
- ROSA production deployment

## Related
- Phase 1 PR: #30 (Langfuse deployment and S3 fixes)
- Context Doc: docs/deployment/langfuse-phase2-context.md (reference only)
- Phase 3 Ideas: docs/deployment/langfuse-phase3-ideas.md (future work)

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Adds scripted Langfuse deployment to local kind clusters using upstream Helm chart.
Phase 1 scope:
Tested:
Podman on macOS, single-node kind cluster
Phase 2 (future PR):
Instrument platform with Langfuse for LLM observability
Quick start: