Metrics for Segment Replication were recently added to the index, node, and cluster level APIs. They include max bytes behind, max replication lag, and total bytes behind at each of these levels.
These metrics are computed by the primary shard for each replication group within its ReplicationTracker. They were intended to be used to apply backpressure when the primary identifies that its replicas are falling behind. Because they are computed from the primary's perspective, they are not representative of their labels when rolled up. For example, at a node level the bytes behind metrics are actually the max/total bytes that the primaries on that node are ahead of their replicas, which are distributed across the cluster. This is not the correct metric for identifying lagging nodes, and it is misleading.
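To make the mislabeling concrete, here is a small sketch with made-up numbers. The node names, the tuple layout, and the `total_bytes_by` helper are all hypothetical and are not taken from the OpenSearch code; the point is only to show how the same per-replica values land on different nodes depending on which side does the roll-up.

```python
from collections import defaultdict

# Hypothetical data: (node hosting the primary, node hosting the replica,
# bytes that replica is behind its primary).
replica_lag = [
    ("node-A", "node-B", 500),  # replica on node-B is lagging
    ("node-A", "node-C", 0),
    ("node-B", "node-C", 0),
]

def total_bytes_by(entries, key_index):
    """Sum the bytes-behind values, grouped by the node at key_index."""
    totals = defaultdict(int)
    for entry in entries:
        totals[entry[key_index]] += entry[2]
    return dict(totals)

# Roll-up from the primary's perspective: the lag is attributed to the
# node hosting the primary, even though that node is the one ahead.
by_primary_node = total_bytes_by(replica_lag, 0)

# Roll-up from the replica's perspective: the lag is attributed to the
# node that is actually behind.
by_replica_node = total_bytes_by(replica_lag, 1)
```

In this example the primary-side roll-up reports 500 bytes against node-A, while the replica-side roll-up reports the same 500 bytes against node-B, which is the node that is actually lagging.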
I propose we rename these metric labels appropriately and add new metrics for bytes behind that are computed from the replica's perspective. We can compute them as follows:
Store the checkpoints received from the primary on each replica
Start a timer for each checkpoint
Clear the timers once replicas complete a sync
Compute replication stats per replica with two fields, bytes behind and lag - bytes behind can be computed from the metadata sent in the latest received checkpoint, while the lag is the time elapsed since the earliest received checkpoint whose timer is still running.
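The steps above can be sketched roughly as follows. This is a minimal illustration and not the actual implementation: the `ReplicaLagTracker` class, its method names, and the assumption that each checkpoint arrives with an already-computed `bytes_behind` value (derived from its metadata) are all hypothetical.

```python
import time

class ReplicaLagTracker:
    """Hypothetical replica-side tracker for received checkpoints."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._synced_version = 0
        # checkpoint version -> (time received, bytes behind per its metadata)
        self._pending = {}

    def on_checkpoint_received(self, version, bytes_behind):
        # Ignore checkpoints the replica has already synced to.
        if version <= self._synced_version:
            return
        # Start the timer only once per checkpoint version.
        self._pending.setdefault(version, (self._clock(), bytes_behind))

    def on_sync_completed(self, version):
        # Clear timers for every checkpoint covered by this sync.
        self._synced_version = max(self._synced_version, version)
        self._pending = {
            v: entry for v, entry in self._pending.items()
            if v > self._synced_version
        }

    def stats(self):
        """Return (bytes_behind, lag_seconds) from the replica's perspective."""
        if not self._pending:
            return (0, 0.0)
        # Bytes behind comes from the latest received checkpoint.
        latest = max(self._pending)
        bytes_behind = self._pending[latest][1]
        # Lag is the running time of the earliest checkpoint still pending.
        earliest_received = min(t for t, _ in self._pending.values())
        return (bytes_behind, self._clock() - earliest_received)
```

The clock is injectable so that the timers can be driven deterministically in tests; a real implementation would presumably hook into the replica's existing checkpoint handling instead.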
Doing this gives us two sets of metrics: one computed from the replica's perspective according to its latest received checkpoint, which does not account for the time taken to publish checkpoints, and another computed from the primary's perspective according to its latest refreshed checkpoint, which does account for publish time.
mch2 changed the title from "[BUG] Segment Replication aggregate metrics are misleading at a node level." to "[BUG] Segment Replication aggregate metrics are misleading at a node level" on Sep 19, 2023