feat: Service Health Diagnostics Enhancement #99

@jongio

Overview

Enhance service health status indicators with detailed diagnostic tooltips, providing users with actionable information about why services are unhealthy and what steps to take.

Problem

Users currently see services marked as "unhealthy" but have no insight into:

  • Why the service is unhealthy
  • What check was performed
  • What specific error occurred
  • What actions to take

This creates frustration and requires manual investigation through logs.

Solution

Add interactive diagnostic tooltips on health status badges that show:

Tooltip Content

  • Health check type and endpoint
  • HTTP status code and response time
  • Detailed error messages
  • Consecutive failure count
  • Service uptime and process info
  • Suggested troubleshooting actions
  • One-click copy diagnostics to clipboard

Example Tooltip

✗ Service Health: Unhealthy

Check: HTTP GET
Endpoint: http://localhost:8080/health
Status: 503 Service Unavailable
Response Time: 45ms
Consecutive Failures: 3

Error Details:
Database connection pool exhausted

Suggested Actions:
• Check service logs: azd app logs --service api
• Verify database is running
• Review connection pool settings

[Copy Diagnostics] [View Logs]

Implementation

Backend (Go)

  • Extend HealthCheckResult with errorDetails and consecutiveFailures fields
  • Add structured error messages with context
  • Track failure counts across health checks
  • Provide status-specific suggested actions
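
As a rough illustration of the backend bullets above, the sketch below models the enriched result and a per-service failure counter. The backend itself is Go; TypeScript is used here only to show the shape the dashboard might receive and the counting logic. Only errorDetails and consecutiveFailures are named in this issue. Every other field name, the RawCheck type, the FailureTracker class, and its reset-on-success rule are assumptions made for the example.

```typescript
// Hypothetical shape of an enriched health check result.
// Only errorDetails and consecutiveFailures are named in this issue;
// the remaining fields are illustrative assumptions.
interface HealthCheckResult {
  service: string;
  healthy: boolean;
  checkType: string; // e.g. "HTTP GET"
  endpoint?: string;
  statusCode?: number;
  responseTimeMs?: number;
  errorDetails?: string;
  consecutiveFailures: number;
}

// A single check before the failure streak has been attached (hypothetical).
type RawCheck = Omit<HealthCheckResult, "consecutiveFailures">;

// Tracks failure streaks per service across successive health checks.
// Assumption: one healthy check resets the streak to zero.
class FailureTracker {
  private streaks: { [service: string]: number | undefined } = {};

  record(check: RawCheck): HealthCheckResult {
    const previous = this.streaks[check.service] ?? 0;
    const next = check.healthy ? 0 : previous + 1;
    this.streaks[check.service] = next;
    return { ...check, consecutiveFailures: next };
  }
}
```

Keeping the counter outside the individual check keeps each check stateless, so the streak survives across polling intervals without the checker needing to know about history.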

Frontend (React)

  • Create HealthTooltip component using Radix UI
  • Build diagnostic report formatter
  • Implement copy-to-clipboard (markdown format)
  • Add action suggestions based on error type
  • Integrate with existing DualStatusBadge
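
The diagnostic report formatter mentioned above could look roughly like the following. This is a minimal sketch: the Diagnostics interface, its field names, and the formatDiagnosticReport function are hypothetical and modeled on the example tooltip in this issue, not taken from the actual health-diagnostics.ts.

```typescript
// Hypothetical input for the report formatter; field names are assumptions
// modeled on the example tooltip in this issue.
interface Diagnostics {
  service: string;
  status: string; // e.g. "Unhealthy"
  checkType: string; // e.g. "HTTP GET"
  endpoint: string;
  httpStatus: string; // e.g. "503 Service Unavailable"
  responseTimeMs: number;
  consecutiveFailures: number;
  errorDetails?: string;
  suggestedActions: string[];
}

// Builds a markdown report suitable for pasting into an issue or chat.
// Optional sections are omitted when there is nothing to show.
function formatDiagnosticReport(d: Diagnostics): string {
  const lines: string[] = [
    `## Service Health: ${d.status} (${d.service})`,
    "",
    `- **Check:** ${d.checkType}`,
    `- **Endpoint:** ${d.endpoint}`,
    `- **Status:** ${d.httpStatus}`,
    `- **Response Time:** ${d.responseTimeMs}ms`,
    `- **Consecutive Failures:** ${d.consecutiveFailures}`,
  ];
  if (d.errorDetails) {
    lines.push("", "### Error Details", "", d.errorDetails);
  }
  if (d.suggestedActions.length > 0) {
    lines.push("", "### Suggested Actions", "");
    for (const action of d.suggestedActions) {
      lines.push(`- ${action}`);
    }
  }
  return lines.join("\n");
}
```

In the browser, the resulting string can be handed to the standard `navigator.clipboard.writeText(report)` call for the one-click copy. Keeping the formatter a pure function of its input makes it straightforward to unit test without a DOM.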

Key Features

  • Visibility: Detailed diagnostic info on hover (400ms delay)
  • Clarity: Explain what check failed and why
  • Actionability: Suggest next steps with commands
  • Accessibility: Keyboard navigation, screen reader support, WCAG AA compliant
  • Copy Support: One-click copy formatted diagnostic report
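
To make the Actionability point concrete, here is one way status-specific suggestions might be derived. The log command follows the example tooltip in this issue; the suggestActions function and the per-status hints are illustrative assumptions, not the project's actual rules.

```typescript
// Returns suggested troubleshooting steps for a failed HTTP health check.
// The log command mirrors the example tooltip in this issue; the per-status
// hints below are illustrative assumptions.
function suggestActions(service: string, statusCode?: number): string[] {
  const actions = [`Check service logs: azd app logs --service ${service}`];
  if (statusCode === undefined) {
    // No HTTP response at all, e.g. connection refused or timed out.
    actions.push("Verify the service process is running and listening on its port");
  } else if (statusCode === 404) {
    actions.push("Verify the health endpoint path is configured correctly");
  } else if (statusCode === 401 || statusCode === 403) {
    actions.push("Check that the health endpoint does not require authentication");
  } else if (statusCode >= 500) {
    actions.push("Verify that dependencies such as the database are running");
    actions.push("Review connection pool settings");
  }
  return actions;
}
```

Always leading with the log command gives users a useful next step even for statuses the mapping does not recognize.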

Files Changed

New Files

  • cli/dashboard/src/components/HealthTooltip.tsx
  • cli/dashboard/src/components/HealthTooltipContent.tsx
  • cli/dashboard/src/lib/health-diagnostics.ts

Modified Files

  • cli/src/internal/healthcheck/types.go
  • cli/src/internal/healthcheck/checker.go
  • cli/src/internal/healthcheck/monitor.go
  • cli/dashboard/src/components/StatusIndicator.tsx

Success Criteria

  • At least 70% of users hover over status icons within their first session
  • At least 40% of users copy diagnostics when they encounter an unhealthy service
  • Troubleshooting time reduced by 30%
  • Unit tests with 80% coverage
  • E2E tests for all health statuses
  • WCAG AA accessibility compliance

Timeline

Estimated: 3-4 days for full implementation and testing

Related

  • Full spec: docs/specs/service-health-diagnostics/spec.md
  • Task breakdown: docs/specs/service-health-diagnostics/tasks.md

Metadata

Assignees

No one assigned

Labels

enhancement (New feature or request)
