Speed up push actions/unread counting #13846

Open
matrixbot opened this issue Dec 20, 2023 · 0 comments


matrixbot commented Dec 20, 2023

This issue has been migrated from #13846.


I've been spending a bunch of time looking into push action processing & badge counting, and I think there'd be a real benefit to separating out push actions from summaries (notification/unread/highlight counts). The mechanism that rolls event push actions up into summaries introduces a lot of code and complexity, and the push actions table conflates push notifications and push counters. Suggestions:

Push actions

Firstly, push actions stay as is in event_push_actions: rows get deleted on receipt and are read/deleted by the pusher instances, so there is no need for any background work. The table is no longer used for any badge counting.

Badge counts

For badge counts we add a new table, event_push_counts, that looks roughly like (pseudo-SQL):

CREATE TABLE event_push_counts (
    user_id text,
    room_id text,
    thread_id text,
    event_stream_ordering bigint,
    notifs bigint,
    unreads bigint,
    highlights bigint
);
-- One row per user/room/thread *and* event stream ordering
ALTER TABLE event_push_counts ADD CONSTRAINT uniq UNIQUE (user_id, room_id, thread_id, event_stream_ordering);

The key here is that rows are unique not just per user/room/thread but also per event stream ordering. This means that as events come in, new rows simply get appended according to the push actions per user, and no existing row is ever updated. This avoids contention during the critical event insertion path.
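To illustrate the append-only write path, here is a minimal sketch using Python's built-in sqlite3 module. The helper name record_event_counts and the shape of its per_user argument are hypothetical, made up for this example, and are not existing Synapse code.

```python
import sqlite3

# Schema from the proposal, with the uniqueness constraint inlined.
SCHEMA = """
CREATE TABLE event_push_counts (
    user_id text,
    room_id text,
    thread_id text,
    event_stream_ordering bigint,
    notifs bigint,
    unreads bigint,
    highlights bigint,
    UNIQUE (user_id, room_id, thread_id, event_stream_ordering)
)
"""


def record_event_counts(conn, room_id, thread_id, stream_ordering, per_user):
    """Append one row per user for a newly persisted event.

    per_user is a hypothetical mapping: user_id -> (notifs, unreads, highlights).
    Only INSERTs are issued, so no existing row is touched.
    """
    conn.executemany(
        "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (user_id, room_id, thread_id, stream_ordering, n, u, h)
            for user_id, (n, u, h) in per_user.items()
        ],
    )


# Demo with made-up data: two events arrive in the same room/thread.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_event_counts(conn, "!abc:matrix.org", "main", 10, {"@blah:matrix.org": (1, 1, 0)})
record_event_counts(conn, "!abc:matrix.org", "main", 11, {"@blah:matrix.org": (1, 1, 1)})
```

Each event adds a fresh row keyed by its own stream ordering, so concurrent event persisters never compete for the same row.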

This makes counting a user's total unreads very simple. Instead of the current loop & count per room, simply:

-- Counting all events for push badges
SELECT SUM(unreads)
FROM event_push_counts
WHERE user_id = '@blah:matrix.org';

-- Counting unread rooms for push badges
SELECT COUNT(DISTINCT room_id)
FROM event_push_counts
WHERE user_id = '@blah:matrix.org'
AND unreads > 0;

-- Separated room/thread counts for sync responses
SELECT SUM(unreads), SUM(notifs), SUM(highlights), room_id, thread_id
FROM event_push_counts
WHERE user_id = '@blah:matrix.org'
AND room_id IN ('!abc:matrix.org', '!def:beeper.com')
GROUP BY room_id, thread_id;

The same applies to counting unreads for rooms in sync responses.
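As a rough sketch of how those queries might be wrapped from Python, assuming the table above and using sqlite3 only so the example runs standalone. The function names total_unreads and room_thread_counts are hypothetical, not Synapse APIs.

```python
import sqlite3


def total_unreads(conn, user_id):
    """Total unread events for a user; COALESCE covers the no-rows case."""
    return conn.execute(
        "SELECT COALESCE(SUM(unreads), 0) FROM event_push_counts WHERE user_id = ?",
        (user_id,),
    ).fetchone()[0]


def room_thread_counts(conn, user_id, room_ids):
    """Map (room_id, thread_id) -> (unreads, notifs, highlights)."""
    if not room_ids:
        return {}
    placeholders = ", ".join("?" for _ in room_ids)
    rows = conn.execute(
        "SELECT room_id, thread_id, SUM(unreads), SUM(notifs), SUM(highlights) "
        "FROM event_push_counts "
        f"WHERE user_id = ? AND room_id IN ({placeholders}) "
        "GROUP BY room_id, thread_id",
        (user_id, *room_ids),
    ).fetchall()
    return {(room, thread): (u, n, h) for room, thread, u, n, h in rows}


# Tiny in-memory demo with made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts (user_id text, room_id text, thread_id text, "
    "event_stream_ordering bigint, notifs bigint, unreads bigint, highlights bigint)"
)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!def:beeper.com", "main", 12, 0, 1, 0),
    ],
)
```

Note that SUM over zero rows yields NULL, so a user with nothing unread needs the COALESCE (or an equivalent check) to get 0 back.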

It is still possible to summarise these by merging rows into a higher stream ordering. Like the current system, this doesn't account for receipts not at the latest stream ordering, but the summarisation could be delayed to provide a window of support for this if desired. The table is leaner than the push actions table, so this shouldn't be such an issue (but it is still important to do to keep the table fast).
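One way the merge could look, as a sketch only: collapse every row at or below a cutoff stream ordering into a single row per user/room/thread, stored at the highest stream ordering that was merged. The function name summarise_up_to and the choice of cutoff are assumptions for illustration.

```python
import sqlite3


def summarise_up_to(conn, cutoff):
    """Merge rows with event_stream_ordering <= cutoff into one row per
    user/room/thread, keyed at the highest merged stream ordering."""
    with conn:  # commits on success, rolls back on error
        merged = conn.execute(
            "SELECT user_id, room_id, thread_id, MAX(event_stream_ordering), "
            "SUM(notifs), SUM(unreads), SUM(highlights) "
            "FROM event_push_counts WHERE event_stream_ordering <= ? "
            "GROUP BY user_id, room_id, thread_id",
            (cutoff,),
        ).fetchall()
        conn.execute(
            "DELETE FROM event_push_counts WHERE event_stream_ordering <= ?",
            (cutoff,),
        )
        conn.executemany(
            "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)", merged
        )


# Demo with made-up rows: orderings 10 and 11 are merged, 20 is left alone.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts (user_id text, room_id text, thread_id text, "
    "event_stream_ordering bigint, notifs bigint, unreads bigint, highlights bigint, "
    "UNIQUE (user_id, room_id, thread_id, event_stream_ordering))"
)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 20, 1, 1, 0),
    ],
)
summarise_up_to(conn, 15)
```

The totals are unchanged by the merge, but a receipt that lands between the merged orderings can no longer be resolved exactly, which is why delaying the cutoff gives a window of support.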

Finally, rows could be cleaned out either on receipt or by a background job that processes receipts. If there was a sufficient (24h?) delay before any summarisation phase, deleting on receipt shouldn't result in much contention on the table.
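A minimal sketch of the delete-on-receipt variant, assuming a receipt is scoped to a single thread and that its stream ordering is already known. The function name clear_on_receipt is hypothetical, and unthreaded receipts are not handled here.

```python
import sqlite3


def clear_on_receipt(conn, user_id, room_id, thread_id, receipt_stream_ordering):
    """Delete count rows at or before the receipt; returns rows removed."""
    with conn:  # commits on success, rolls back on error
        cur = conn.execute(
            "DELETE FROM event_push_counts "
            "WHERE user_id = ? AND room_id = ? AND thread_id = ? "
            "AND event_stream_ordering <= ?",
            (user_id, room_id, thread_id, receipt_stream_ordering),
        )
    return cur.rowcount


# Demo with made-up rows: a receipt at ordering 11 clears two of three rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE event_push_counts (user_id text, room_id text, thread_id text, "
    "event_stream_ordering bigint, notifs bigint, unreads bigint, highlights bigint)"
)
conn.executemany(
    "INSERT INTO event_push_counts VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("@blah:matrix.org", "!abc:matrix.org", "main", 10, 1, 1, 0),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 11, 1, 1, 1),
        ("@blah:matrix.org", "!abc:matrix.org", "main", 20, 1, 1, 0),
    ],
)
removed = clear_on_receipt(conn, "@blah:matrix.org", "!abc:matrix.org", "main", 11)
```

Because the delete only touches rows at or below the receipt, it does not overlap with rows being appended for newer events.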


(If this seems sensible, I can invest time to implement over the next few weeks)

matrixbot changed the title from "Dummy issue" to "Speed up push actions/unread counting" on Dec 21, 2023
matrixbot reopened this on Dec 21, 2023