update langfuse deployment, configure logging #463

sallyom · 2025-12-15T18:47:27Z

The Helm values file that configures log retention isn't working. This PR adds a post-install script that configures Langfuse logging, so the langfuse-clickhouse PVC doesn't fill with logs we don't need or want to retain.

sallyom · 2025-12-15T20:35:06Z

Tested this, now in runner logs:

2025-12-15 20:29:44,091 - root - INFO - Langfuse: Privacy masking ENABLED - user messages and responses will be redacted
2025-12-15 20:29:44,216 - root - INFO - Langfuse: Model 'claude-sonnet-4-5@20250929' added to session metadata and tags
2025-12-15 20:29:44,216 - root - INFO - Langfuse: Session tracking enabled (session_id=agentic-session-1765826551, user_id=somalley, model=claude-sonnet-4-5@20250929)

and, in Langfuse UI

Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>

Implements privacy-preserving Langfuse tracing that redacts user messages and assistant responses while preserving usage metrics for cost tracking.

Privacy Controls:
- Default: MASK messages (privacy-first)
- Set LANGFUSE_MASK_MESSAGES=false to disable (dev/testing only)

What Gets Logged (with masking enabled):
✅ Token counts (input, output, cache read/creation)
✅ Cost calculations (USD per session)
✅ Model names, turn counts, session metadata
✅ Tool names and execution status
❌ User prompts → [REDACTED FOR PRIVACY]
❌ Assistant responses → [REDACTED FOR PRIVACY]
❌ Long tool outputs → [REDACTED FOR PRIVACY]

This addresses the requirement to track usage metrics without logging potentially sensitive user data.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: sallyom <somalley@redhat.com>
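For readers who want to picture the opt-out behavior described in this commit message, here is a minimal sketch. It is not the code in observability.py: the function names `masking_enabled` and `build_trace_payload` and the payload keys are hypothetical, and treating only the literal value `false` (case-insensitive) as an opt-out is an assumption based on the wording above.

```python
import os

REDACTED = "[REDACTED FOR PRIVACY]"


def masking_enabled(env=None):
    """Return True unless LANGFUSE_MASK_MESSAGES is explicitly set to 'false'."""
    env = os.environ if env is None else env
    raw = env.get("LANGFUSE_MASK_MESSAGES", "true")
    # Assumption: only an explicit 'false' opts out; any other value keeps masking on.
    return raw.strip().lower() != "false"


def build_trace_payload(prompt, response, usage, env=None):
    """Redact message text when masking is on; always keep the usage metrics."""
    mask = masking_enabled(env)
    return {
        "input": REDACTED if mask else prompt,
        "output": REDACTED if mask else response,
        "usage": dict(usage),  # token counts and cost are never masked
    }
```

With an empty environment the payload carries the redaction marker for both messages and the untouched usage dictionary, which matches the "privacy-first" default described above.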

codecov · 2025-12-16T01:48:33Z

Codecov Report

❌ Patch coverage is 92.30769% with 2 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
...onents/runners/claude-code-runner/observability.py	92.30%	2 Missing ⚠️



github-actions · 2025-12-16T02:26:00Z

Claude Code Review

Summary

This PR adds privacy-first message masking to Langfuse observability and optimizes ClickHouse deployment to prevent disk space issues. The implementation is well-designed and production-ready with comprehensive test coverage and excellent documentation.

Overall Assessment: ✅ APPROVE with minor suggestions

The privacy masking implementation follows security best practices with secure-by-default behavior, and the ClickHouse optimization addresses a real operational issue. The code quality is excellent with thorough testing and documentation.

Issues by Severity

🟢 No Blocker or Critical Issues Found

This PR is ready to merge. All patterns follow established project standards.

🟡 Major Issues

None - code quality is excellent.

🔵 Minor Issues / Suggestions

1. Shell Script Robustness (Minor)

File: e2e/scripts/configure-clickhouse-ttl.sh

Issue: Line 112 uses grep -v to suppress errors, which could hide real failures

Suggestion: Consider more explicit error handling to distinguish between expected "table does not exist" errors and real failures

Impact: Low - current approach works but could be more robust
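One way to read the suggestion above is to classify each command result explicitly instead of filtering its output. The sketch below shows that idea in Python rather than in the shell script itself. The function name `classify_ttl_result` is hypothetical, and the assumption that ClickHouse reports a missing table with error code 60 (UNKNOWN_TABLE) should be checked against the server version in use.

```python
import re

# Assumption: a missing table surfaces as "Code: 60" / UNKNOWN_TABLE in stderr.
_MISSING_TABLE = re.compile(r"Code:\s*60\b|UNKNOWN_TABLE")


def classify_ttl_result(returncode, stderr):
    """Return 'ok', 'skipped' (table absent), or 'failed' (anything else)."""
    if returncode == 0:
        return "ok"
    if _MISSING_TABLE.search(stderr):
        # Expected when an optional log table has not been created yet.
        return "skipped"
    # Everything else (network, auth, syntax) is a real failure and should surface.
    return "failed"
```

A caller could log "skipped" results at info level and exit non-zero on any "failed" result, so real errors are no longer hidden.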

2. Documentation Clarity (Very Minor)

File: CLAUDE.md lines 360-369

Observation: The privacy controls section states "Masking is ENABLED BY DEFAULT" but then shows how to set LANGFUSE_MASK_MESSAGES=true

Suggestion: Consider rewording to make it clearer that setting the variable explicitly is optional

Impact: Very Low - documentation is already clear

Positive Highlights

🌟 Excellent Privacy-First Design

Secure by default: Masking enabled without requiring configuration
Clear opt-out: Explicit LANGFUSE_MASK_MESSAGES=false required to disable
Comprehensive masking: Handles strings, dicts, lists, and nested structures
Preserves observability: All usage metrics, costs, and metadata retained
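As an illustration of what masking over strings, dicts, lists, and nested structures can look like, here is a minimal recursive sketch. It is not the function from observability.py: `mask_value` and the `PRESERVED_KEYS` set are hypothetical, and this simplified version redacts every string it reaches, ignoring any length threshold.

```python
REDACTED = "[REDACTED FOR PRIVACY]"

# Hypothetical key names for fields that carry metrics rather than message text.
PRESERVED_KEYS = frozenset({"usage", "model", "cost", "tool_name", "status"})


def mask_value(value):
    """Return a copy of value with string content redacted, recursing into containers."""
    if isinstance(value, str):
        return REDACTED
    if isinstance(value, dict):
        # Preserved keys pass through untouched so metrics stay usable.
        return {
            key: (item if key in PRESERVED_KEYS else mask_value(item))
            for key, item in value.items()
        }
    if isinstance(value, (list, tuple)):
        return [mask_value(item) for item in value]
    return value  # numbers, booleans, and None carry no message text
```

The function builds new containers instead of mutating its input, so the original data stays available to the caller.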

🧪 Outstanding Test Coverage

268 lines of test code for privacy masking alone
9 comprehensive test cases covering all edge cases
Standalone executable: Tests can run independently
Clear assertions: Every test validates both masking AND metric preservation

📚 Production-Ready Documentation

User-facing docs: Clear CLAUDE.md section with examples
Operator guide: Comprehensive README-langfuse.md with troubleshooting
Inline comments: Excellent code documentation in observability.py
Security warnings: Clear warnings about disabling masking in production

🛡️ Follows Security Standards

No token logging: Consistent with project security patterns
Input validation: Proper environment variable handling
Secure defaults: Privacy-first approach aligns with security-standards.md
Clear security boundaries: Distinction between dev/test vs. production

🔧 Operational Excellence

ClickHouse optimization: Addresses real disk space issues
TTL automation: 7-day retention prevents runaway growth
Deployment scripts: Well-structured with platform auto-detection
Error handling: Comprehensive cleanup and error logging
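To make the 7-day retention concrete, the sketch below generates the kind of ClickHouse statements a TTL script could run. It is an illustration only: the table list and the use of the `event_date` column are assumptions about which system log tables are in play, and `ttl_statements` is a hypothetical helper, not part of configure-clickhouse-ttl.sh.

```python
# Assumption: these system log tables exist and expose an event_date column.
LOG_TABLES = ("system.query_log", "system.trace_log", "system.metric_log")


def ttl_statements(tables=LOG_TABLES, retention_days=7):
    """Build one ALTER TABLE ... MODIFY TTL statement per log table."""
    if retention_days < 1:
        raise ValueError("retention_days must be at least 1")
    return [
        f"ALTER TABLE {table} MODIFY TTL event_date + INTERVAL {retention_days} DAY"
        for table in tables
    ]
```

Each generated statement tells ClickHouse to drop rows older than the retention window, which is what keeps the PVC from growing without bound.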

✅ Code Quality

Type safety: Proper typing in masking function
Recursive handling: Correctly processes nested structures
Guard conditions: Prevents duplicate trace creation
Resource cleanup: Proper context manager usage and error handling

Recommendations

Priority 1: Merge as-is ✅

The PR is production-ready and addresses important privacy and operational concerns. All patterns follow established project standards.

Priority 2: Consider Minor Enhancements (Optional)

Add boundary test case for 50-character threshold
Enhance error handling in TTL script to better distinguish error types
Add explanatory comment for S3 credential patching workaround
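The first enhancement, a boundary test for the 50-character threshold, could look roughly like the sketch below. The helper `mask_long_text` is a hypothetical stand-in for the real masking function, and the rule that only strings strictly longer than 50 characters are redacted is an assumption; the real comparison may be inclusive.

```python
REDACTED = "[REDACTED FOR PRIVACY]"
THRESHOLD = 50  # assumption: strings longer than this are redacted


def mask_long_text(text, threshold=THRESHOLD):
    """Redact text only when it exceeds the threshold length."""
    return REDACTED if len(text) > threshold else text


def test_threshold_boundary():
    # One below, exactly at, and one above the threshold.
    assert mask_long_text("x" * 49) == "x" * 49
    assert mask_long_text("x" * 50) == "x" * 50
    assert mask_long_text("x" * 51) == REDACTED
```

Testing 49, 50, and 51 characters pins down whether the comparison is strict, which is the kind of off-by-one a boundary test is meant to catch.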

Priority 3: Post-Merge Monitoring

Monitor ClickHouse disk usage after TTL configuration
Verify Langfuse traces show [REDACTED FOR PRIVACY] in production
Track any S3-related errors after credential patching

Architecture Alignment

✅ Follows Backend/Operator Standards: Proper error handling, secure-by-default, no sensitive data in logs

✅ Follows Python Standards: Black formatting, clear function signatures, comprehensive docstrings

✅ Follows Security Standards: Privacy-first design, token/credential masking, secure defaults

✅ Follows Testing Standards: Added test to runner-tests.yml workflow, comprehensive coverage

Final Verdict

Status: ✅ APPROVED

This is exemplary work that demonstrates:

Deep understanding of privacy requirements
Production-ready operational thinking
Excellent testing practices
Clear, comprehensive documentation

The minor suggestions above are enhancements, not blockers. The PR is ready to merge and will provide immediate value by protecting user privacy and preventing operational issues.

Great work, @sallyom! 🎉


What Amber Loaded for Code Review

Amber automatically loaded these repository standards from the memory system:

CLAUDE.md - Master project instructions, development standards
backend-development.md - Go backend, K8s integration patterns
frontend-development.md - NextJS, Shadcn UI, React Query patterns
security-standards.md - Auth, RBAC, token handling
k8s-client-usage.md - User token vs service account patterns
error-handling.md - Consistent error patterns
react-query-usage.md - Data fetching patterns

Impact: This review used your repository's specific code quality standards, security patterns, and best practices from the memory system (PRs #359, #360) - not just generic code review guidelines.


sallyom force-pushed the update-langfuse-logging branch from 911e7d9 to 810a2a7 Compare December 15, 2025 20:24

ambient-code deleted a comment from github-actions bot Dec 15, 2025

sallyom and others added 2 commits December 15, 2025 20:25

update langfuse deployment, configure logging

9bc711b

Co-authored-by: Claude <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>

sallyom force-pushed the update-langfuse-logging branch from 810a2a7 to 8eb65e1 Compare December 16, 2025 01:34

sallyom force-pushed the update-langfuse-logging branch from 91c0a41 to 6dc6fb6 Compare December 16, 2025 02:02

fix(ci): Add test_privacy_masking.py to runner test workflow

411cf2b

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com> Signed-off-by: sallyom <somalley@redhat.com>

sallyom force-pushed the update-langfuse-logging branch from 6dc6fb6 to 411cf2b Compare December 16, 2025 02:23

bobbravo2 added the priority: medium label Dec 16, 2025

bobbravo2 added this to the v0.0.14 milestone Dec 16, 2025

Gkrumbach07 approved these changes Dec 17, 2025

View reviewed changes

Gkrumbach07 merged commit 149dcf6 into ambient-code:main Dec 17, 2025
16 checks passed
