Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Establishing a CASE session tears down existing sessions for the peer #15986

Closed
bzbarsky-apple opened this issue Mar 8, 2022 · 5 comments
Closed
Assignees
Labels

Comments

@bzbarsky-apple
Copy link
Contributor

Problem

Consider the following scenario:

  1. Admin establishes a CASE session to a node so it can send it commands/etc.
  2. The admin announces itself as an OTA provider.
  3. The node sends a QueryImage, starts getting the image (over a different CASE session that it establishes to the admin, since that's what would happen with the way our code is written).
  4. In the admin's CASEServer, when the session from the node is established the session that was set up in step 1 gets torn down, I am pretty sure (see the ExpireAllPairings call in CASEServer::OnSessionEstablished).

Now either the admin has no CASE session... or it notices and establishes a new one, which tears down the session the node set up in step 3. Then the node notices and retries establishing a session to get the OTA image, etc, etc.

Proposed Solution

We need to do one of two things:

  1. Make it possible to only have one CASE session between two nodes (by allowing an OperationalDeviceProxy to reuse an already-existing session that we might not be the initiator for when it's told to connect).
  2. Assume that we will have bidirectional sessions and stop expiring sessions we're the initiator for in CASEServer::OnSessionEstablished.

I suspect this issue (if I am right about our expiring behavior) would make a lot of interactions not work well, but I am particularly worried about inability to complete an OTA update, due to failure to make progress on downloading the image, if the session keeps dying.

We need to check whether my understanding of the behavior is correct, of course.

@msandstedt @Carol @tcarmelveilleux

@mrjerryjohns
Copy link
Contributor

Honestly, our ownership model for CASESession objects doesn't line up with the ability for a session established in one direction (e.g one setup by CASEServer) to be re-used by one in the other direction (e.g CASEClient).

My recommendation would be to make CASEServer and CASEClient just really only be responsible for the state machine management for setting up sessions, but not actually maintain ownership of the underlying session state itself. We should push that lower into SessionManager, specifically, as a pool of CASESession objects.

@mrjerryjohns mrjerryjohns added the p1 priority 1 work label Mar 24, 2022
@bzbarsky-apple
Copy link
Contributor Author

My recommendation would be to make CASEServer and CASEClient just really only be responsible for the state machine management for setting up sessions, but not actually maintain ownership of the underlying session state itself. We should push that lower into SessionManager, specifically, as a pool of CASESession objects.

That's already what we have. CASEServer and CASEClient just run the state machine and do not actually own the sessions. SessionManager does that.

@mrjerryjohns
Copy link
Contributor

mrjerryjohns commented Mar 24, 2022

We can preserve a lot of the existing APIs if we:

  1. Have CASEClient consult with the SessionManager (it has a reference to that passed into its constructor) if a session already exists. If it does, it will call (synchronously?) the OnCASEConnected callback passed into its constructor.
  2. Add a OnSessionUpdated to SessionReleaseDelegate (and perhaps rename that to SessionChangeDelegate) to permit an atomic swap of a stale session with a new one. This avoids causing issues with tearing down active objects that have a SessionHolder because their session went away, and instead, just transparently swaps out their sessions. ExchangeDelegate will still close/abort itself upon a OnSessionUpdated (per the rules of the spec), but entities like ReadHandler (subscriptions) and OperationalDeviceProxy can continue to exist and function.

We'd need to be careful about a change like this, since we'd need to ensure we swap out all old sessions before calling ExpireAllPairings on the old session.

@mrjerryjohns
Copy link
Contributor

mrjerryjohns commented Mar 24, 2022

That's already what we have. CASEServer and CASEClient just run the state machine and do not actually own the sessions. SessionManager does that.

Roger that. I mis understood the intention of CASESession (sounds like it just encapsulates the state needed for session setup).

@mrjerryjohns
Copy link
Contributor

Part 1 of this issue is complete. Will file a follow-up issue if Part 2 needs to be done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants