[Darwin] MTRBaseSubscriptionCallback needs to have consistent handling of callbacks #22936

jtung-apple · 2022-09-28T17:43:26Z

Reproduction steps

Copying from Issue #22935:
1. ReadClient starts tearing down, for example during `OnLivenessTimeoutCallback` calling `Close`
2. While that is happening, Darwin framework calls read with a `MTRClusterStateCacheContainer` parameter with one of the `MTRBaseCluster` objects
3. The `MTRBaseCluster` does a dispatch_sync call to the Matter queue
4. ReadClient `Close` proceeds, and calls `OnError` in the MTRBaseSubscriptionCallback object, which calls `ReportError`, which eventually does an `dispatch_async` to call the OnDone handler (this cleans up the cache container\'s pointer to the c++ object), which is scheduled after BaseCluster\'s read.
5. ReadClient then calls MTRBaseSubscriptionCallback\'s OnDone, which deletes itself and returns
6. The BaseCluster then executes the read with the cache container, which still has a pointer to the now defunct c++ cache, leading to a crash

Platform

darwin

Platform Version(s)

No response

Type

Manually tested with SDK

(Optional) If manually tested please explain why this is only manually tested

As mentioned in Issue #22935 this was caught from crash tracking.

Anything else?

Copying from
The fix for issue #22935 was to require (and document in header) SubscriptionCallback's OnDone be called inline. But that makes MTRBaseSubscriptionCallback inconsistent in the queues used to call the handlers. Existing logic would work fine because there's only simple cleanup, but this may lead to future issues if more logic is added in OnDone, because other handlers may be called after OnDone handler is called.

One option is to change MTRBaseSubscriptionCallback to always call everything inline, and leave the "call client callback on queue" as exercise in MTRBaseDevice / MTRDevice objects.

jtung-apple · 2022-09-28T18:32:23Z

On second thought, I think this can be better fixed by having MTRBaseSubscriptionCallback's ReportError mark mHaveQueuedDeletion when there is an async onDoneHandler to call. This way the object gets to live long enough to service the last read from the cache before OnDone is finally processed afterwards. This would also keep queue usage consistent without having to make a separate fix.

jtung-apple · 2022-09-30T17:31:01Z

#22978 will fix the original issue properly, and so this issue is no longer needed.

jtung-apple added the darwin label Sep 28, 2022

jtung-apple self-assigned this Sep 28, 2022

jtung-apple closed this as completed Sep 30, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Darwin] MTRBaseSubscriptionCallback needs to have consistent handling of callbacks #22936

[Darwin] MTRBaseSubscriptionCallback needs to have consistent handling of callbacks #22936

jtung-apple commented Sep 28, 2022

jtung-apple commented Sep 28, 2022

jtung-apple commented Sep 30, 2022

[Darwin] MTRBaseSubscriptionCallback needs to have consistent handling of callbacks #22936

[Darwin] MTRBaseSubscriptionCallback needs to have consistent handling of callbacks #22936

Comments

jtung-apple commented Sep 28, 2022

Reproduction steps

Platform

Platform Version(s)

Type

(Optional) If manually tested please explain why this is only manually tested

Anything else?

jtung-apple commented Sep 28, 2022

jtung-apple commented Sep 30, 2022