Add datacentre discovery #789

XAMPPRocky · 2023-09-15T08:19:06Z

To measure latency between a proxy and a datacentre more accurately, the proxy needs to know which datacentres are available to measure.

As a solution, we were thinking of adding a new xDS resource type called Datacentre or similar, which would contain the IP address and the QCMP port. The proxy can then use that address for QCMP latency measurements.

For relay deployments it would send all agents that are connected to it; for single cluster control plane deployments, it would return its own IP and QCMP port.
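A rough sketch of how the proposal above could look, purely for illustration. The `Datacentre` struct, its field names, the `ControlPlane` enum, and the `datacentres` method are all hypothetical and are not Quilkin's actual types or xDS schema; the sketch only shows the two behaviours described: a relay returns every connected agent, and a single cluster control plane returns itself.

```rust
use std::net::IpAddr;

/// Hypothetical shape of the proposed resource: the address and QCMP port
/// a proxy would use for latency measurements. Field names are assumptions.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Datacentre {
    pub address: IpAddr,
    pub qcmp_port: u16,
}

/// Hypothetical model of the two deployment modes mentioned in the proposal.
pub enum ControlPlane {
    /// A relay that knows about every agent connected to it.
    Relay { agents: Vec<Datacentre> },
    /// A single cluster control plane that only knows about itself.
    SingleCluster { own: Datacentre },
}

impl ControlPlane {
    /// Returns the datacentres that would be sent to a proxy.
    pub fn datacentres(&self) -> Vec<Datacentre> {
        match self {
            // Relay: send all connected agents.
            ControlPlane::Relay { agents } => agents.clone(),
            // Single cluster: send our own IP and QCMP port.
            ControlPlane::SingleCluster { own } => vec![own.clone()],
        }
    }
}
```

Modelling the deployment mode as an enum keeps the proxy side identical in both cases: it always receives a list of datacentres and measures each one.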


markmandel · 2023-09-15T17:49:49Z

> For relay deployments it would send all agents that are connected to it; for single cluster control plane deployments, it would return its own IP and QCMP port.

It could also be an optional element: if it's not there, then the proxy won't check latency or keep a metric of it (single cluster, and also if people just don't care 😃, say for example if people have separate installs in the same Cloud Region/data centre).

XAMPPRocky · 2023-09-18T08:16:21Z

I'm not sure I see the value of it being optional. Even if you're hosting in the same datacentre, understanding the latency between hops is important, as latency isn't only dictated by distance. If there is an intra-datacentre issue causing latency spikes (as opposed to inter-datacentre), then this would provide that information, whereas if it's optional then you would be in the dark.

If you don't want that metric, it's easier to just add a filter to your grafana_agent to remove it. Having this information is important for Quilkin to be able to build a network topology on top of this, so that we can accurately assign players to the cluster that is closest to their proxy.
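As a minimal illustration of the "closest cluster" idea, assuming a proxy has already collected one round-trip time per datacentre, selection could be as simple as taking the minimum. The `closest` function and its input shape are hypothetical; a real topology would likely smooth over many samples rather than use a single measurement.

```rust
use std::net::SocketAddr;
use std::time::Duration;

/// Hypothetical helper: given (datacentre address, measured round-trip time)
/// pairs, return the address with the lowest latency, or `None` if there
/// are no measurements.
pub fn closest(measurements: &[(SocketAddr, Duration)]) -> Option<SocketAddr> {
    measurements
        .iter()
        .min_by_key(|(_, rtt)| *rtt)
        .map(|(addr, _)| *addr)
}
```

Note that this picks by measured latency rather than geographic distance, which matches the point above that latency isn't only dictated by distance.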

markmandel · 2023-09-18T21:21:22Z

> I'm not sure I see the value of it being optional. Even if you're hosting in the same datacentre, understanding the latency between hops is important, as latency isn't only dictated by distance. If there is an intra-datacentre issue causing latency spikes (as opposed to inter-datacentre), then this would provide that information, whereas if it's optional then you would be in the dark.

That is true. But I also wonder if some people won't want the extra traffic (even though it's minimal).

I tend to err on the side of flexibility. Not a super strong opinion, but just something to consider.

XAMPPRocky · 2023-09-19T15:32:36Z

I can understand that. I'm always wary of adding something as an option without a compelling reason to do so, as it adds another variation to test and adds cognitive overhead (you have to know that the feature exists, and how to turn it on).

If someone provides a good reason, or we find it adds too much overhead, we should provide a way to disable it. Without that, though, I think it should be included without an option, as it gives you more insight, and having this work done for you makes Quilkin a more compelling product for operators.

XAMPPRocky added the kind/feature New feature or request label Sep 15, 2023
