[Feature Request] Counter Metrics to detect leader and follower check failures. #12711

Open
@gargharsh3134

Description

Is your feature request related to a problem? Please describe

With the introduction of the Request Tracing Framework (RTF) using OpenTelemetry (OTel), metrics (histograms and counters) can now be published and used to track failures.

This issue tracks the instrumentation needed to introduce the following two counter metrics, which identify node drops and health check failures for both leader and follower nodes:

  1. Leader Check Failures -> Health check failures for the ClusterManager node (leader), performed by follower nodes.
  2. Follower Check Failures -> Health check failures for follower nodes, performed by the ClusterManager node (leader).

Describe the solution you'd like

OTel Counter Metrics: Support for Counter-type metrics, added as part of #10241, can be used to publish these metrics.
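
As a rough illustration of the proposal, the sketch below shows two counters that are each incremented once per failed health check. It is self-contained and does not use the actual telemetry classes from #10241: the `Counter` interface, `InMemoryCounter`, `HealthCheckMetrics`, and both metric names are hypothetical stand-ins chosen for this example.

```java
import java.util.concurrent.atomic.DoubleAdder;

// Hypothetical stand-in for a telemetry counter; NOT the real OpenSearch interface.
interface Counter {
    void add(double value);
}

// In-memory implementation so the sketch runs without a telemetry backend.
final class InMemoryCounter implements Counter {
    private final DoubleAdder total = new DoubleAdder();

    @Override
    public void add(double value) {
        total.add(value);
    }

    double value() {
        return total.sum();
    }
}

// Groups the two counters proposed in this issue. Metric names are illustrative only.
final class HealthCheckMetrics {
    static final String LEADER_CHECK_FAILURES = "leader.check.failure.count";
    static final String FOLLOWER_CHECK_FAILURES = "follower.check.failure.count";

    private final Counter leaderCheckFailures;
    private final Counter followerCheckFailures;

    HealthCheckMetrics(Counter leaderCheckFailures, Counter followerCheckFailures) {
        this.leaderCheckFailures = leaderCheckFailures;
        this.followerCheckFailures = followerCheckFailures;
    }

    // Called by a follower when its health check of the leader fails.
    void onLeaderCheckFailure() {
        leaderCheckFailures.add(1.0);
    }

    // Called by the leader when its health check of a follower fails.
    void onFollowerCheckFailure() {
        followerCheckFailures.add(1.0);
    }
}

final class HealthCheckMetricsDemo {
    public static void main(String[] args) {
        InMemoryCounter leader = new InMemoryCounter();
        InMemoryCounter follower = new InMemoryCounter();
        HealthCheckMetrics metrics = new HealthCheckMetrics(leader, follower);

        // Simulate one failed leader check and one failed follower check.
        metrics.onLeaderCheckFailure();
        metrics.onFollowerCheckFailure();

        System.out.println(HealthCheckMetrics.LEADER_CHECK_FAILURES + " = " + leader.value());
        System.out.println(HealthCheckMetrics.FOLLOWER_CHECK_FAILURES + " = " + follower.value());
    }
}
```

In a real change, the in-memory counter would presumably be replaced by counters obtained from the metrics framework, with the increment calls placed in the failure paths of the leader and follower checks.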

Related component

Cluster Manager

Describe alternatives you've considered

No response

Additional context

No response

Metadata

Labels

Cluster Manager, enhancement (Enhancement or improvement to existing feature or request)

Projects

Status: Now (This Quarter)
