Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiple case about ExchangeContext leaks #19359

Closed
kghost opened this issue Jun 9, 2022 · 7 comments
Closed

Multiple case about ExchangeContext leaks #19359

kghost opened this issue Jun 9, 2022 · 7 comments
Assignees
Labels
stale Stale issue or PR

Comments

@kghost
Copy link
Contributor

kghost commented Jun 9, 2022

Problem

  • When OnSessionReleased is called while the exchange is in state of WillSendMessage, the OnExchangeClosing is triggered, but the ref count is not decreased. Will cause that exchange leak.
  • In ExchangeContext::HandleResponseTimeout if the application doesn't close the exchange in OnResponseTimeout callback. it leaks.
  • In ExchangeContext::SendMessage, if mResponseTimeout is set to 0, it leaks.

Proposed Solution

To prevent possible problem like this one, and make exchange context more robust

  • Implement a state machine to manage ref counters
  • Make Close/Abort re-entrant
@kghost kghost self-assigned this Jun 9, 2022
@kghost kghost changed the title ExchangeContext may leak in ExchangeContext::OnSessionReleased Multiple case about ExchangeContext leaks Jun 9, 2022
@bzbarsky-apple
Copy link
Contributor

bzbarsky-apple commented Jun 9, 2022

When OnSessionReleased is called while the exchange is in state of WillSendMessage, the OnExchangeClosing is triggered, but the ref count is not decreased. Will cause that exchange leak.

If he consumer is waiting to send on the exchange, the consumer has to release that ref. The exchange cannot release it, because that will leave a dangling pointer.

In ExchangeContext::HandleResponseTimeout if the application doesn't close the exchange in OnResponseTimeout callback. it leaks.

This is not true. The exchange closes itself after calling OnResponseTimeout (by calling MessageHandled()), unless the callee takes action to keep it open, and this is documented in the delegate API.

In ExchangeContext::SendMessage, if mResponseTimeout is set to 0, it leaks.

In what sense? What leaks, exactly? If mResponseTimeout is 0, that means "stay open indefinitely waiting for a response" and the responsibility for closing the exchange is then on the consumer.

Implement a state machine to manage ref counters

This is generally a good idea, but we should not be taking that on for 1.0 at this point.

Make Close/Abort re-entrant

Not sure we should do that, but worth thinking about whether that makes things more or less complex.

@bzbarsky-apple
Copy link
Contributor

And in particular: if the exchange is waiting for a response, the refcount is held by the network stack. The exchange can release that refcount itself. If the exchange is waiting for a send, the refcount is held by the consumer that will do the send. The exchange has no way to release that refcount itself without leaving dangling pointers.

bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Jun 9, 2022
It was not clear who was responsible for releasing various refcounts.
Improve the code and documentation to make it clearer what's going on,
and to avoid the two calls into DoClose in the "waiting for response"
case.

Addresses a problem that was mentioned in
project-chip#19359
@bzbarsky-apple
Copy link
Contributor

Created #19383 to handle the clarity issue.

bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Jun 9, 2022
It was not clear who was responsible for releasing various refcounts.
Improve the code and documentation to make it clearer what's going on,
and to avoid the two calls into DoClose in the "waiting for response"
case.

Addresses a problem that was mentioned in
project-chip#19359
bzbarsky-apple added a commit to bzbarsky-apple/connectedhomeip that referenced this issue Jun 9, 2022
It was not clear who was responsible for releasing various refcounts.
Improve the code and documentation to make it clearer what's going on,
and to avoid the two calls into DoClose in the "waiting for response"
case.

Addresses a problem that was mentioned in
project-chip#19359
@kghost
Copy link
Contributor Author

kghost commented Jun 9, 2022

If mResponseTimeout is 0, that means "stay open indefinitely waiting for a response" and the responsibility for closing the exchange is then on the consumer.

if the exchange is waiting for a response, the refcount is held by the network stack.

These 2 statement just conflict.

@bzbarsky-apple
Copy link
Contributor

OK, good point. If the exchange is waiting for a response and a response timeout was requested, the network stack will handle closing the exchange on response timeout, unless the consumer takes action to prevent it. The consumer will be notified when the timeout happens, so it can drop its ref to the exchange.

If the exchange is waiting on a response with no timeout allowed, the consumer needs to handle closing the exchange and dropping the reference to it, generally. In the "session released" case we treat things as a timeout, looks like.

I agree this is all a bit more complicated than one would prefer.

bzbarsky-apple added a commit that referenced this issue Jun 10, 2022
* Clarify the logic in ExchangeContext::OnSessionReleased.

It was not clear who was responsible for releasing various refcounts.
Improve the code and documentation to make it clearer what's going on,
and to avoid the two calls into DoClose in the "waiting for response"
case.

Addresses a problem that was mentioned in
#19359

* Clarify comments
@stale
Copy link

stale bot commented Dec 10, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale Stale issue or PR label Dec 10, 2022
@bzbarsky-apple bzbarsky-apple removed the stale Stale issue or PR label Dec 17, 2022
@stale
Copy link

stale bot commented Jun 25, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale Stale issue or PR label Jun 25, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
stale Stale issue or PR
Projects
None yet
Development

No branches or pull requests

2 participants