[Feature Request] HA Tracker support for multiple prometheus replicas in the same batch #6256
Is your feature request related to a problem? Please describe.
The HA Tracker mechanism Cortex provides assumes a Prometheus-only remote write, where each batch comes from a single replica. In our case, datapoints from multiple producers are mixed in one remote-written batch: some datapoints are sourced from a Prometheus HA pair while others are sourced from other systems.
The distributor's HA Tracker implementation looks for the Prometheus replica label only in the first datapoint of the batch and assumes that all other datapoints in that batch come from the same Prometheus replica.
Thus, we have 3 scenarios, depending on the FIRST datapoint of the batch (see the sketch after this list):
1. If it does not have the cluster and replica labels, the whole batch is pushed as-is.
2. If it has both labels:
2.1. If its replica is the same as the elected leader replica stored in the KV store, the batch will be pushed.
2.2. If its replica is NOT the same as the elected leader replica stored in the KV store, the batch will not be pushed.
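For reference, the current behavior can be sketched roughly as below. This is a simplified illustration, not the actual Cortex code: the `TimeSeries` type, `push`, and `checkReplicaLeader` are hypothetical stand-ins, and the label names assume the HA Tracker defaults; only `findHALabels` mirrors the distributor method named in this issue.

```go
package hatracker

// TimeSeries is a hypothetical stand-in for the remote-write series type.
type TimeSeries struct {
	Labels map[string]string
}

// findHALabels mirrors the method named in this issue: it extracts the HA
// cluster and replica labels (default label names assumed here).
func findHALabels(labels map[string]string) (cluster, replica string) {
	return labels["cluster"], labels["__replica__"]
}

// Stand-ins for the real push path and the KV-store leader lookup.
func push(batch []TimeSeries) error                            { return nil }
func checkReplicaLeader(cluster, replica string) (bool, error) { return true, nil }

// pushWithHATracker sketches the behavior described above: only the FIRST
// datapoint of the batch decides the fate of the entire batch.
func pushWithHATracker(batch []TimeSeries) error {
	if len(batch) == 0 {
		return nil
	}
	cluster, replica := findHALabels(batch[0].Labels)
	if cluster == "" || replica == "" {
		// Scenario 1: no HA labels on the first datapoint, push everything.
		return push(batch)
	}
	isLeader, err := checkReplicaLeader(cluster, replica) // KV store lookup
	if err != nil {
		return err
	}
	if !isLeader {
		// Scenario 2.2: non-leader replica, the whole batch is dropped,
		// including datapoints that came from other producers.
		return nil
	}
	// Scenario 2.1: leader replica, the whole batch is pushed.
	return push(batch)
}
```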
Describe the solution you'd like
Maybe apply the same HA Tracker mechanism after the batch is split into smaller batches, one per (cluster, replica) pair. Instead of calling the findHALabels method on the first datapoint only, if the HA Tracker is enabled, add a method that splits the batch this way, then iterate through the sub-batches and discard only the smaller batches coming from a replica that is not the elected leader in the KV store.
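A minimal sketch of this idea, reusing the hypothetical stand-ins from the sketch above (this is an assumed shape for the change, not an actual patch):

```go
// haPair keys a sub-batch by its HA cluster and replica labels.
type haPair struct {
	cluster, replica string
}

// pushWithPerPairHATracker splits the batch per (cluster, replica) pair and
// applies the HA Tracker decision to each sub-batch instead of to the whole
// request, so only series from non-leader replicas are discarded.
func pushWithPerPairHATracker(batch []TimeSeries) error {
	groups := make(map[haPair][]TimeSeries)
	for _, ts := range batch {
		cluster, replica := findHALabels(ts.Labels)
		key := haPair{cluster, replica}
		groups[key] = append(groups[key], ts)
	}

	var accepted []TimeSeries
	for pair, series := range groups {
		if pair.cluster == "" || pair.replica == "" {
			// Series without HA labels are always kept.
			accepted = append(accepted, series...)
			continue
		}
		isLeader, err := checkReplicaLeader(pair.cluster, pair.replica)
		if err != nil {
			return err
		}
		if isLeader {
			accepted = append(accepted, series...)
		}
		// Sub-batches from non-leader replicas are dropped; everything
		// else in the request still gets pushed.
	}
	return push(accepted)
}
```

Note that this still performs one KV lookup per distinct (cluster, replica) pair in the request, which relates to the CPU/memory tradeoff discussed in the comments below.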
Describe alternatives you've considered
It is possible, in our services' implementation, to segregate datapoints and create HA-pair-specific batches: dedicated batches for datapoints not sourced from HA pairs, and one batch per HA pair.
The following diagram represents this solution on our side, where T1 and T2 represent cluster labels, a and b replica labels (as in the official HA Tracker docs), and s1 and s2 different samples for each of these series.
Comments

@eduardscaueru This will likely increase distributor CPU needs, as the whole batch needs to be read.

Hello @friedrichg. Thank you for your response. Yes, I would like to contribute and implement this feature.

Hi @friedrichg, I implemented this feature in two ways.
The PR 6279, where the valid HA pairs are stored, will probably increase both the distributor's memory and CPU usage (as you suggested). On the other hand, the PR 6278, where the KV store is called every time a timeseries has both cluster and replica labels, will probably consume more CPU. Waiting for a review. Thank you!

Kindly asking if you have any updates regarding the PRs. Please let me know if there is anything to do from my side. Thanks!

@eduardscaueru