-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shut down subscription clients when DeviceController shuts down. #22323
Shut down subscription clients when DeviceController shuts down. #22323
Conversation
And just to address #20085 (comment) : If we're shutting down the DeviceController for a fabric, nixing all the sessions for that fabric, etc, not calling Close on the subscription clients is a bit odd. It also leads to races: clients waiting on the server will get closed (because their session goes away), but clients not thus waiting would just stick around... |
The Darwin CI failure here is real, though the problem pre-existing: we used to leak the ReadClient and the callback waiting on it. Now we're freeing the ReadClient, but not the callback (because we are using the ReadInteraction SubscribeAttribute API, which never calls OnDone on us, and hence we never know whether we can free the callback bits. LSAN used to not detect the leak because we leaked objects that referred to each other, and LSAN does not detect that. But now we free one of them, not the other, and the leak gets noticed. |
Filed #22333 on the leak issue. Will just add a suppression for now, since it's not being made worse by this PR. |
PR #22323: Size comparison from bd32efa to afeb46c Increases (42 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, nrfconnect, psoc6, qpg, telink)
Decreases (6 builds for cc13x2_26x2)
Full report (45 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
|
V1 accepted: fixes potential crash. |
PR #22323: Size comparison from bd32efa to e39968d Increases (17 builds for bl602, cc13x2_26x2, k32w, linux, nrfconnect, qpg, telink)
Decreases (6 builds for cc13x2_26x2)
Full report (20 builds for bl602, cc13x2_26x2, k32w, linux, mbed, nrfconnect, qpg, telink)
|
PR #22323: Size comparison from bd32efa to cb7ac0c Increases (42 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, nrfconnect, psoc6, qpg, telink)
Decreases (6 builds for cc13x2_26x2)
Full report (45 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
|
Other clients get shut down due to their sessions going away, but subscription clients can be in a state where they are not waiting on a response, and those should get shut down too. Also: 1) Fixes ReadClient::Close to release its exchange, so shutting down a subscription while it's waiting for a response from the server actually shuts it down, instead of delivering OnDone and then getting a message on the exchange and possibly sending more notifications after OnDone. 2) Fixes potential use-after-free in the ShutdownSubscriptions functions. Fixes project-chip#22319
…ning subscriptions.
cb7ac0c
to
10fb972
Compare
PR #22323: Size comparison from 0f0e1e9 to 10fb972 Increases (42 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, nrfconnect, psoc6, qpg, telink)
Decreases (7 builds for cc13x2_26x2, esp32)
Full report (45 builds for bl602, cc13x2_26x2, cyw30739, efr32, esp32, k32w, linux, mbed, nrfconnect, psoc6, qpg, telink)
|
This leak got fixed in project-chip#22323
This leak got fixed in #22323
…ject-chip#22323) * Shut down subscription clients when DeviceController shuts down. Other clients get shut down due to their sessions going away, but subscription clients can be in a state where they are not waiting on a response, and those should get shut down too. Also: 1) Fixes ReadClient::Close to release its exchange, so shutting down a subscription while it's waiting for a response from the server actually shuts it down, instead of delivering OnDone and then getting a message on the exchange and possibly sending more notifications after OnDone. 2) Fixes potential use-after-free in the ShutdownSubscriptions functions. Fixes project-chip#22319 * Suppress now-visible LSan leak and have IM engine shut down any remaining subscriptions. * Fix lifetime management in subscribeAttributeWithEndpointId. * Address review comment: reduce duplication in read client iteration methods.
This leak got fixed in project-chip#22323
Other clients get shut down due to their sessions going away, but subscription
clients can be in a state where they are not waiting on a response, and those
should get shut down too.
Also:
Fixes ReadClient::Close to release its exchange, so shutting down a
subscription while it's waiting for a response from the server actually shuts it
down, instead of delivering OnDone and then getting a message on the exchange
and possibly sending more notifications after OnDone.
Fixes potential use-after-free in the ShutdownSubscriptions functions.
Fixes #22319
Problem
Crashes possible due to subscription clients outliving shutdown.
Change overview
Make sure we call OnDone on all relevant subscriptions when shutting down a DeviceController.
Testing
Verified that if I create a subscription in chip-tool or darwin-framework-tool and then shut down the relevant controller I see
~ReadClient
getting called.