Add logging to help diagnose missing to-device messages#19801
Conversation
In particular, include the "last_token" that we used to calculate whether there were any updates. Also a bit of general cleanup.
fd2b20c to
d55d564
Compare
Co-authored-by: Eric Eastwood <madlittlemods@gmail.com>
Both branches of our `if` do a db query for all the user's devices, though the spell it slightly differently. Let's pull it out, as it makes the code shorter and clearer.
fbe061e to
e922532
Compare
|
@MadLittleMods thanks for the review. I've made the changes you requested, but in looking at it I realised I wanted to add another couple of commits. Would you mind taking another look? |
| msg["content"].get(EventContentFields.TO_DEVICE_MSGID), | ||
| ) | ||
| message_json = json_encoder.encode(msg) | ||
| msg = messages_by_device[device_id] |
There was a problem hiding this comment.
As far as I can tell, in this else-branch, we're working with to-device messages going to specific devices defined by devices.
messages_by_device[device_id] can probably be None (if we're going to loop over known_devices here) and we should do a safer lookup.
And to confirm this suspicion, it looks like that's what the Python tests are failing on as well:
❌ https://github.com/element-hq/synapse/actions/runs/26568079004/job/78268176493?pr=19801
File "/opt/hostedtoolcache/Python/3.10.20/x64/lib/python3.10/site-packages/synapse/storage/databases/main/deviceinbox.py", line 945, in _add_messages_to_local_device_inbox_txn
msg = messages_by_device[device_id]
builtins.KeyError: 'XYDQYXDPTF'
tests.rest.client.test_devices.DehydratedDeviceTestCase.test_dehydrate_msc3814
There was a problem hiding this comment.
ugh, right; I now understand why we had two separate queries before: I'd failed to realise that, in the non-wildcard case, we were only looking up device_ids for which we had a to-device message.
I'm going to revert this commit.
| messages_json_for_user = {} | ||
| devices = list(messages_by_device.keys()) | ||
| if not devices: | ||
| continue |
There was a problem hiding this comment.
Log about dropping to-device messages?
How does this case even happen? A to-device message should be expected to go somewhere. Should this be an assert?
There was a problem hiding this comment.
I'm just moving this logic from below.
If devices is empty, then it means there are no to-device messages for this user (we just have an empty dict for them). Looks like one reason it can happen if somebody hits /sendToDevice with a request body like:
{
"messages": {
"@alice:example.com": {}
}
}I don't think it's worth logging, and it certainly shouldn't be an assert (at least, unless we change the higher-level methods.
I'll add a quick comment.
| # A map from user id, to device id, to a pair of (serialized message, msgid). | ||
| local_by_user_then_device: dict[str, dict[str, tuple[str, str]]] = {} | ||
| for user_id, messages_by_device in messages_by_user_then_device.items(): | ||
| messages_json_for_user = {} |
There was a problem hiding this comment.
Comment and type to document what this map holds
| if issue9533_logger.isEnabledFor(logging.DEBUG): | ||
| # Log any messags we are dropping | ||
| unmapped_devices = ( | ||
| messages_by_device.keys() - messages_json_for_user.keys() |
There was a problem hiding this comment.
I think this logic will mess up for * and will say we're always dropping that one when we're actually handling it.
There was a problem hiding this comment.
we don't hit this branch for *.
Co-authored-by: Eric Eastwood <erice@element.io>
This reverts commit e922532. Turns out the two separate queries where there for a good reason.
Some attempts to debug #19795.