Skip to content

LPN establishment of Friendship never completes if there is no response to the initial Friend Poll #16678

@richardbarlow

Description

@richardbarlow

Describe the bug
Over two Friend establishment attempts, if the Friend fails to receive the initial Friend Poll both times or the node wishing to establish friendship fails to receive the initial Friend Update both times the LPN code will stop trying to establish Friendship forever (it never sends any more Friend Requests and waits indefinitely for a response to a Friend Poll).

After a node has sent a Friend Request and received a Friend Offer it sends an initial Friend Poll to the node that offered Friendship to indicate its acceptance of the offer. The friend must respond to the initial Friend Poll with a Friend Update, as specified in the standard: The Friend node shall respond with a Friend Update message after a minimum of ReceiveDelay milliseconds and before a maximum of the sum of ReceiveDelay and ReceiveWindow milliseconds, from the reception of the Friend Poll message from the Low Power node. If the Friend Update is not received within the maximum time limit the node should resume looking for a friend by sending out a Friend Request.

To Reproduce
At the moment the following scenario fails (L = node wanting to be an LPN; F = friend):

  1. L sends Friend Request
  2. F sends Friend Offer
  3. L sends Friend Poll
  4. F fails to receive Friend Poll from L
  5. L times out waiting for initial Friend Update and reports No acceptable Friend Offers received
  6. L sends Friend Request
  7. F sends Friend Offer
  8. L does not send a Friend Poll
  9. L never times out and therefore never sends another Friend Request

The issue appears to be that state is left hanging around after step 5. The lpn.sent_req field is still set to 1 (TRANS_CTL_OP_FRIEND_POLL), even though the timeout has been hit and no response to the Friend Poll should be expected. On the subsequent round of Request/Offer/Poll the Friend Poll is not sent due to the lpn.sent_req field still being set to 1:

if (lpn->sent_req) {
if (lpn->sent_req != TRANS_CTL_OP_FRIEND_POLL) {
lpn->pending_poll = 1U;
}
return 0;
}

This can be seen in the last line of the log shown below, where send_req == 1, which prevents a Friend Poll being sent.

Based on the wording of the debug messages in update_timeout() (specifically Retrying first Friend Poll and Timed out waiting for first Friend Update) it looks like it is/was expected that this function would handle this scenario. However, all of the code in the else branch will only be executed if lpn.established is false and update_timeout() is only ever called when this is true (if LPN_ESTABLISHMENT is not enabled, which it is not in my case).

Expected behavior
I expect the node to continue sending Friend Requests until either it successfully establishes Friendship or the LPN feature is disabled (via bt_mesh_lpn_set()).

Impact
I am not able to deliver a reliable low-power product to a client. For now I will have to workaround it by implementing my own timeout that observes changes in the friendship state (via the lpn callback) and if it detects this bug, cycles the LPN feature off/on.

Screenshots or console output

[00:01:46.498,596] <dbg> bt_mesh_lpn.lpn_timeout: state: enabled
[00:01:46.498,596] <dbg> bt_mesh_lpn.send_friend_req: 
[00:01:46.521,606] <dbg> bt_mesh_lpn.lpn_set_state: enabled -> wait offer
[00:01:46.618,621] <inf> kernel: adv stack :	unused 304	usage 464 / 768 (60 %)
[00:01:46.892,883] <dbg> bt_mesh_settings.schedule_store: Waiting 120 seconds
[00:01:46.892,913] <dbg> bt_mesh_lpn.bt_mesh_lpn_friend_offer: recv_win 255 queue_size 8 sub_list_size 3 rssi -48 counter 0
[00:01:46.895,294] <dbg> bt_mesh_lpn.send_friend_poll: lpn->sent_req 0x00
[00:01:46.911,560] <dbg> bt_mesh_lpn.req_sent: req 0x01 duration 90 err 0 state wait offer
[00:01:47.007,812] <inf> kernel: adv stack :	unused 304	usage 464 / 768 (60 %)
[00:01:47.237,335] <inf> kernel: prio recv thread stack :	unused 280	usage 168 / 448 (37 %)
[00:01:47.237,640] <inf> kernel: recv thread stack :	unused 1064	usage 984 / 2048 (48 %)
[00:01:47.366,790] <dbg> bt_mesh_lpn.lpn_timeout: state: wait offer
[00:01:47.366,790] <wrn> bt_mesh_lpn: No acceptable Friend Offers received
[00:01:47.366,790] <dbg> bt_mesh_lpn.lpn_set_state: wait offer -> enabled
[00:01:50.182,067] <inf> kernel: adv stack :	unused 304	usage 464 / 768 (60 %)
[00:01:52.257,690] <inf> kernel: prio recv thread stack :	unused 280	usage 168 / 448 (37 %)
[00:01:52.257,995] <inf> kernel: recv thread stack :	unused 1064	usage 984 / 2048 (48 %)
[00:01:55.360,168] <dbg> bt_mesh_lpn.lpn_timeout: state: enabled
[00:01:55.360,168] <dbg> bt_mesh_lpn.send_friend_req: 
[00:01:55.371,612] <dbg> bt_mesh_lpn.lpn_set_state: enabled -> wait offer
[00:01:55.470,214] <inf> kernel: adv stack :	unused 304	usage 464 / 768 (60 %)
[00:01:55.827,148] <dbg> bt_mesh_settings.schedule_store: Waiting 120 seconds
[00:01:55.827,178] <dbg> bt_mesh_lpn.bt_mesh_lpn_friend_offer: recv_win 255 queue_size 8 sub_list_size 3 rssi -48 counter 1
[00:01:55.829,559] <dbg> bt_mesh_lpn.send_friend_poll: lpn->sent_req 0x01

Environment (please complete the following information):

  • OS: Linux (Fedora 29 Workstation)
  • Toolchain gcc-arm-none-eabi-6-2017-q2-update
  • v1.14.0 (with no patches)

Additional context
Relevant bits of config:

CONFIG_BT_MESH=y
CONFIG_BT_MESH_RELAY=n
CONFIG_BT_MESH_LOW_POWER=y

CONFIG_BT_MESH_LPN_ESTABLISHMENT=n
CONFIG_BT_MESH_LPN_AUTO=n

Metadata

Metadata

Assignees

Labels

area: BluetoothbugThe issue is a bug, or the PR is fixing a bugpriority: lowLow impact/importance bug

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions