Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add session affinity support to load balancer #374

Open
iffyio opened this issue Aug 25, 2021 · 10 comments
Open

Add session affinity support to load balancer #374

iffyio opened this issue Aug 25, 2021 · 10 comments
Labels
kind/feature New feature or request

Comments

@iffyio
Copy link
Collaborator

iffyio commented Aug 25, 2021

We currently have a naive load balancer that doesn't account for any packet information like the source address (each packet is routed to a distinct server according to the lb policy) - this is usually the unwanted behavior, rather we should send subsequent packets from the same client to the same server.

Impl wise, we can introduce a lookup state to the load balancer so that if there exists a client->server pair in the map there is no need to consult the lb policy impl (we know where to route the packet)

@iffyio iffyio added the kind/feature New feature or request label Aug 25, 2021
@gilesheron
Copy link
Contributor

PR #381 achieves this using a source hash.

But yes, there are cases where we'd want to track session state :)

@iffyio
Copy link
Collaborator Author

iffyio commented Aug 31, 2021

Ah, not exactly if I understood the PR correctly. The goal with this was to remember what endpoint we chose for a client so e.g if the order of endpoints in the list changes (the order isn't currently guaranteed at all atm), #381 hashing won't handle that properly. then this isn't specific to hashing, the other policies should do the same e.g a random policy should select an endpoint at random but only do that once for a client

@gilesheron
Copy link
Contributor

Well yes, if you want to survive changes in the endpoint lists then you'll need something more complex than a simple hash.

Consistent hashing will solve that without requiring session state. See: https://en.wikipedia.org/wiki/Consistent_hashing

@markmandel
Copy link
Member

We happy to close this issue now(just merged #381), or do we think we'll be looking at specific other types of hashing?

@iffyio
Copy link
Collaborator Author

iffyio commented Sep 8, 2021

We can't close this yet #381 doesn't resolve it, the idea with this is that if the lb selects an upstream A for a client, all packets from that client should go to A at least until A goes away, regardless of what policy is being used.
And the current hashing in #381 doesn't do this either, e.g endpoints list [A, B, C] and [B,C,A] are equivalent but the current implementation will pick a different upstream for a client if the order changes since we're blindly using offsets to index into the list.

@gilesheron
Copy link
Contributor

to fix that we'd have to sort the endpoints before picking one. That way the hash would always land on the same endpoint.

Am happy to make the change.

Beyond that you could implement a consistent hash - as I understand the usual algorithm you map each endpoint onto multiple positions on a "wheel" of hash values. When hashing you pick the "next" endpoint on the wheel. If an endpoint dies you remove all its entries from the wheel so that those flows get remapped across the remaining endpoints, but flows currently mapped to other endpoints don't get remapped.

@iffyio
Copy link
Collaborator Author

iffyio commented Sep 8, 2021

Ah, so to clarify this ticket isn't specific to hashing in any way, its for all lb policies we'll have. It'll be wasteful to sort the list on each packet in this case and we still have the issue of when the list changes, sessions get broken. Consistent hashing won't solve this either unfortunately, that only minimises the number of sessions that break (some will still break).

Thankfully though since we have all the information we need in memory - there's isn't really any reason why adding a new endpoint should break any ongoing sessions
e.g if we have a client X talking to endpoint C in endpoint list [A, B, C], its session shouldn't break if the list changes to [B, C, D].
We can achieve this with a map to track the fact that X is currently talking to C and that should be enough, the endpoint_chooser policy impls we have won't even know about it (they're only consulted when there's no entry in the map for a client) something like:

fn read(&self, mut ctx: ReadContext) -> Option<ReadResponse> {
  if let Some(upstream) = self.state.get(ctx.from) {
    // this keep isn't by index we need to extend ctx.endpoints to be smarter
    ctx.endpoints.keep(upstream); // If we already have a session use that
  } else {
    // otherwise select an upstream for this client
    let upstream = self.endpoint_chooser
      .choose_endpoints(&mut ctx.endpoints, ctx.from);
    ctx.endpoints.keep(upstream); // and remember it for next time.
    self.state.insert(ctx.from, upstream);
  }
  Some(ctx.into())
}

@gilesheron
Copy link
Contributor

ah cool - if you already have session state that's great. So there's no need for the hashed implementation if you have this since clients will be pinned to their existing endpoints.

that also neatly solves my next thing. I've implemented a LB mode where we call out to an external API using gRPC. Currently I call the API once per packet, and I was about to add a user cache there. But it'd be easier just to use your solution, I think?

@iffyio
Copy link
Collaborator Author

iffyio commented Sep 9, 2021

Yeah I think that would be easier with the new lb policy since it won't need to worry include its own cache.

@gilesheron
Copy link
Contributor

yeah - for a demo I may just hack my own cache in and then remove it. I've got the API stuff working now - it's pretty cute. We can route traffic based on the source IP/port (so e.g. you can direct customers from ISP A to backend X and from ISP B to backend Y).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants