
Implement conflict checking for SessionAffinity #1632

Merged

Conversation

tpantelis
Contributor

The MCS spec's conflict resolution policy states that a "conflict will be resolved by assigning precedence based on each ServiceExport's creationTimestamp, from oldest to newest". We don't have a central MCS controller with access to all clusters' ServiceExports, but we can store each cluster's ServiceExport creationTimestamp as an annotation on the aggregated ServiceImport and use those annotations to resolve conflicts. The SessionAffinity and SessionAffinityConfig fields on the aggregated ServiceImport will be set by the cluster with the oldest timestamp. The other clusters will set a ServiceExportConflict condition if their corresponding local service's fields do not match those on the aggregated ServiceImport. If a local service is updated in any cluster, each cluster re-evaluates the updated aggregated ServiceImport and either clears or sets the conflict condition. Also, if the service from the cluster holding precedence is un-exported, the cluster with the next-oldest timestamp will set the fields.
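
A rough sketch of the precedence check described above, assuming for illustration that the per-cluster annotations use keys of the form `timestamp.submariner.io/<cluster>` with the creationTimestamp encoded as Unix nanoseconds (the actual key format and encoding in the PR may differ):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical annotation key prefix; the prefix and value encoding
// used by the PR may differ.
const timestampAnnotationPrefix = "timestamp.submariner.io/"

// findClusterWithOldestTimestamp scans the aggregated ServiceImport's
// annotations and returns the cluster whose ServiceExport has the oldest
// creationTimestamp. That cluster's SessionAffinity settings take precedence.
// This is a sketch, not necessarily the PR's implementation.
func findClusterWithOldestTimestamp(annotations map[string]string) string {
	oldestCluster := ""
	oldest := int64(0)

	for key, value := range annotations {
		if !strings.HasPrefix(key, timestampAnnotationPrefix) {
			continue
		}

		ts, err := strconv.ParseInt(value, 10, 64)
		if err != nil {
			continue // ignore malformed values
		}

		if oldestCluster == "" || ts < oldest {
			oldest = ts
			oldestCluster = strings.TrimPrefix(key, timestampAnnotationPrefix)
		}
	}

	return oldestCluster
}

func main() {
	annotations := map[string]string{
		"timestamp.submariner.io/cluster1": "1693900000000000000",
		"timestamp.submariner.io/cluster2": "1693800000000000000", // oldest
		"timestamp.submariner.io/cluster3": "1694000000000000000",
	}

	fmt.Println(findClusterWithOldestTimestamp(annotations)) // prints "cluster2"
}
```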

@submariner-bot
Contributor

🤖 Created branch: z_pr1632/tpantelis/session_affinity_conflict
🚀 Full E2E won't run until the "ready-to-test" label is applied. I will add it automatically once the PR has 2 approvals, or you can add it manually.

Contributor

@vthapar vthapar left a comment

Tested and reviewed code locally. Other than thoughts on how to improve logic for oldest export, looks good.

@@ -442,3 +564,48 @@ func (c *ServiceImportController) localServiceImportLister(transform func(si *mc

return retList
}

func findClusterWithOldestTimestamp(from map[string]string) string {
Contributor

I'm wondering if we can improve this logic. In a scale setup there might be many clusters exporting a service, and looping through all of them may cause issues.

How about we also add an annotation like timestamp.submariner.io/oldest and use that to check whether we're older or not? Only when the oldest one is un-exported would we need to cycle through all of them to find the new oldest.

Also, this may have some issues when we're using a VIP. If two different clusters export the service at the same time, the VIP will end up being allocated by whichever cluster was first to successfully create the aggregated SI. It may happen that its timestamp is newer than that of the one that failed to create it, because that depends on the timing of when the create attempt is made on the broker.

Contributor Author

> I'm wondering if we can improve this logic. In a scale setup there might be many clusters exporting a service, and looping through all of them may cause issues.

I think the time would be negligible with a single loop, even with thousands of entries. If it were a nested loop, then we could have an issue. With a single loop, it would take millions of iterations to start to become significant.

> How about we also add an annotation like timestamp.submariner.io/oldest and use that to check whether we're older or not? Only when the oldest one is un-exported would we need to cycle through all of them to find the new oldest.

Yeah, we could do that, i.e. also remove the annotation if the oldest cluster is un-exported to trigger the other clusters to re-evaluate (a rough sketch is below). But it adds more complexity that I'm not sure we really need. I can try it out...

> Also, this may have some issues when we're using a VIP. If two different clusters export the service at the same time, the VIP will end up being allocated by whichever cluster was first to successfully create the aggregated SI. It may happen that its timestamp is newer than that of the one that failed to create it, because that depends on the timing of when the create attempt is made on the broker.

I wasn't planning on using the oldest timestamp for the VIP because we won't ever re-allocate it once set. The first cluster to successfully create the aggregated SI sets the VIP.
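
The cached-oldest optimization suggested above isn't part of this PR, but building on the findClusterWithOldestTimestamp sketch earlier, it could look roughly like this (the oldest-cluster annotation key and the helper name are made up for illustration):

```go
// Hypothetical annotation caching which cluster currently holds precedence;
// this is only the suggested optimization, not something the PR adds.
const oldestClusterAnnotation = "timestamp.submariner.io/oldest"

// clusterWithPrecedence consults the cached "oldest" annotation first and
// falls back to a full scan only when the cached cluster is no longer
// exporting, i.e. its per-cluster timestamp annotation has been removed.
func clusterWithPrecedence(annotations map[string]string) string {
	if cached, ok := annotations[oldestClusterAnnotation]; ok && cached != "" {
		if _, stillExported := annotations[timestampAnnotationPrefix+cached]; stillExported {
			return cached
		}
	}

	// Cached entry missing or stale: do the O(n) scan from the earlier sketch.
	return findClusterWithOldestTimestamp(annotations)
}
```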

Contributor Author

I did some testing of findClusterWithOldestTimestamp with a varying number of entries in the annotations map:

  • 1,000 entries took 38.793 µs
  • 10,000 took 244.597 µs
  • 1,000,000 took 46.399 ms
  • 10,000,000 took 711.979 ms

I also tried calling findClusterWithOldestTimestamp in a nested loop: calling it 1,000 times with 1,000 map entries took 21.924 ms (simulating 1,000 services on 1,000 clusters). With 10,000 and 10,000 it took 2.16 s, which is still pretty negligible.

So I think we should be fine the way it is.
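
For reference, a micro-benchmark along these lines could be written with Go's built-in benchmarking support. This is only a sketch assuming the findClusterWithOldestTimestamp signature shown in the diff and a hypothetical package name, not the code actually used to produce the numbers above:

```go
package controller // hypothetical package name

import (
	"fmt"
	"testing"
)

// BenchmarkFindClusterWithOldestTimestamp times a single scan over
// annotation maps of a few fixed sizes.
// Run with: go test -bench=FindClusterWithOldestTimestamp
func BenchmarkFindClusterWithOldestTimestamp(b *testing.B) {
	for _, size := range []int{1_000, 10_000, 100_000} {
		// Build an annotations map with `size` per-cluster timestamp entries.
		annotations := make(map[string]string, size)
		for i := 0; i < size; i++ {
			key := fmt.Sprintf("timestamp.submariner.io/cluster%d", i)
			annotations[key] = fmt.Sprintf("%d", 1_700_000_000_000_000_000+int64(i))
		}

		b.Run(fmt.Sprintf("entries-%d", size), func(b *testing.B) {
			for i := 0; i < b.N; i++ {
				findClusterWithOldestTimestamp(annotations)
			}
		})
	}
}
```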

Member

Would it be worth adding those benchmarks as a test?

Contributor Author

What would we actually verify as a test?

This is really just normal looping over an array or map (with some fairly simple string manipulation/parsing) that I'm sure we do similarly in other places throughout the code base. These operations are fast on modern computers at the scales we'd be dealing with (in the thousands), so I don't think we need to be concerned. It's only when you get into the millions that it might become a concern.

The MCS spec's conflict resolution policy states that a "conflict will
be resolved by assigning precedence based on each ServiceExport's
creationTimestamp, from oldest to newest". We don't have a central
MCS controller with access to all clusters' ServiceExports but we
can store each cluster's ServiceExport creationTimestamp as
annotations in the aggregated ServiceImport and use them to
resolve conflicts. The SessionAffinity and SessionAffinityConfig
fields on the aggregated ServiceImport will be set by the cluster
with the oldest timestamp. The other clusters will set a
ServiceExportConflict condition if their corresponding local
service's fields do not match those on the aggregated ServiceImport.
If a local service is updated in any cluster, each cluster
re-evaluates the updated aggregated ServiceImport and either clears
or sets the conflict condition. Also if the service from the
precedent cluster is unexported, the next precedent cluster will
set the fields.

Signed-off-by: Tom Pantelis <tompantelis@gmail.com>
@tpantelis
Contributor Author

@aswinsuryan

@submariner-bot submariner-bot added the ready-to-test label Sep 5, 2024
@aswinsuryan aswinsuryan enabled auto-merge (rebase) September 5, 2024 14:07
@aswinsuryan aswinsuryan merged commit 687c11c into submariner-io:devel Sep 5, 2024
26 checks passed
@submariner-bot
Contributor

🤖 Closed branches: [z_pr1632/tpantelis/session_affinity_conflict]

@dfarrell07
Member

The release notes are here: submariner-io/submariner-website#1166

Related: #1621

@tpantelis tpantelis deleted the session_affinity_conflict branch September 19, 2024 11:38