Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix BlueZ wait condition disconnect race #1399

Merged
merged 3 commits into from
Aug 28, 2023

Conversation

dlech
Copy link
Collaborator

@dlech dlech commented Aug 25, 2023

Alternative to #1329 and #1377.

@bdraco could you please take a look?

The issue addressed by #1377 seems more widespread, so I took it a bit farther to make sure it is fixed in all public functions (first commit).

Instead of synthesizing a "Connected" property change as proposed in #1329, here I have opted to just add a 3rd race condition on "InterfacesRemoved" when waiting for services to be resolved (second commit).

@dlech dlech force-pushed the bluez-wait-condition-disconnect-race branch from 87460b9 to 147b4e3 Compare August 25, 2023 19:10
@dlech
Copy link
Collaborator Author

dlech commented Aug 25, 2023

Updated:

  • added helper functions and additional checks to handle lack of eager evaluation of _wait_* coros in first commit
  • added checking of done tasks for exceptions in second commit
  • added 3rd commit to handle potential issues with condition callbacks

@dlech
Copy link
Collaborator Author

dlech commented Aug 25, 2023

@bdraco let me know if you want to test before merging

@bdraco
Copy link
Contributor

bdraco commented Aug 25, 2023

Will give it a shot shortly

This makes sure all methods of the BlueZManager are checking
that the adapter or device exists in BlueZ before continuing with
the method.

Documentation is updated to indicate which functions have this check
and which do not.
…es_discovery()

This adds a 3rd race condition to _wait_for_services_discovery() to
handle the case where an interface is removed from BlueZ without
changing the "Connected" property. This can happen, e.g. when we
hit "retry due to le-connection-abort-by-local".
This adds a device path key to the condition callbacks. This will avoid
calling callbacks for other devices that are not the subject of the
current "PropertiesChanged" signal.
@dlech dlech force-pushed the bluez-wait-condition-disconnect-race branch from 147b4e3 to edfbd93 Compare August 25, 2023 19:26
Copy link
Contributor

@bdraco bdraco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tested, could not replicate race after this PR.

Note: it took days of testing on the other one to be sure so I can't be 100% but this seems good.

@bdraco
Copy link
Contributor

bdraco commented Aug 26, 2023

Ran overnight frequent polling. Didn't notice the safety timeout ever kicking in anymore. Could be just luck but I think its solved

@bdraco
Copy link
Contributor

bdraco commented Aug 27, 2023

Running for almost 48 hours now and no more safety timeouts. I think this is solved 👍

@bdraco
Copy link
Contributor

bdraco commented Aug 27, 2023

Thanks @dlech

@dlech
Copy link
Collaborator Author

dlech commented Aug 28, 2023

Thanks for testing!

@dlech dlech merged commit cd0fda5 into develop Aug 28, 2023
16 checks passed
@dlech dlech deleted the bluez-wait-condition-disconnect-race branch August 28, 2023 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants