This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Full-text-search in Synapse 1.71.0rc1 may be prohibitively expensive (DB IO) #14354

Closed
DMRobertson opened this issue Nov 2, 2022 · 4 comments
Labels
A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db A-Message-Search Searching messages A-Performance Performance, both client-facing and admin-facing O-Occasional Affects or can be seen by some users regularly or most users rarely S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues.

Comments

@DMRobertson
Contributor

From Matrix.org's monitoring:

image

In the "Toast table blocks read from disk/buffer" and "Toast index block read from buffer" graphs, there are intermittent blue spikes corresponding to the event_search table.

Correlated: peaks in "total txn time" due to search_rooms

image

And federation send PDU lag:

image

and event send time:
image

(The graphs above show only the main process.)

The event persister also saw pain at similar times:

image

It's a little tricky to interpret these, because the m.org database was updated this morning (2nd Nov 9.30 UTC) and there was some expensive background processing by postgres afterwards. (All times UTC in the graphs.)

Our suspicion is that the changes in #11635 are to blame. I reverted it, along with #13410 and #14311, on the hotfixes branch (37307a5) and deployed to matrix.org. We haven't seen the event persisters flare up since then... but it's not completely clear that the changes mentioned were the cause.

I think we should (regrettably) back out the changes on the release branch too, before a final 1.71.0 release.

@DMRobertson DMRobertson added A-Performance Performance, both client-facing and admin-facing S-Major Major functionality / product severely impaired, no satisfactory workaround. T-Defect Bugs, crashes, hangs, security vulnerabilities, or other reported issues. X-Release-Blocker Must be resolved before making a release X-Regression Something broke which worked on a previous release A-Message-Search Searching messages A-Database DB stuff like queries, migrations, new/remove columns, indexes, unexpected entries in the db O-Occasional Affects or can be seen by some users regularly or most users rarely labels Nov 2, 2022
@DMRobertson
Contributor Author

I should have added: in the slow query logs, we saw inserts into event search tables that took >=10 seconds.

E.g. 2022-11-02 00:40:57.355 UTC [matrix event_persister1] LOG:  duration: 10687.079 ms  statement: 
                    INSERT INTO event_search
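
For anyone wanting to pull similar entries out of their own PostgreSQL logs, here is a minimal sketch in Python. It assumes the server logs slow statements in the `duration: ... ms  statement: ...` form shown above (as produced when `log_min_duration_statement` is set) and that continuation lines of a statement start with whitespace. The function name `slow_statements` and the 10-second default threshold are illustrative choices, not anything Synapse ships. The part of each line before `LOG:` depends on `log_line_prefix`, so the sketch does not try to parse it.

```python
import re
from typing import Iterable, List, Tuple

# Matches the "duration: <ms> ms  statement: <sql>" part of a slow-query log
# entry. Everything before "LOG:" depends on log_line_prefix and is ignored.
DURATION_RE = re.compile(
    r"LOG:\s+duration:\s+(?P<ms>\d+(?:\.\d+)?)\s+ms\s+statement:\s*(?P<sql>.*)"
)


def slow_statements(
    lines: Iterable[str], threshold_ms: float = 10_000.0
) -> List[Tuple[float, str]]:
    """Return (duration_ms, statement) pairs at or above threshold_ms.

    Hypothetical helper: assumes continuation lines of a multi-line
    statement begin with whitespace (a tab or spaces).
    """
    entries = []
    current = None  # [duration_ms, list of SQL fragments] for the open entry
    for raw in lines:
        line = raw.rstrip("\n")
        match = DURATION_RE.search(line)
        if match:
            if current is not None:
                entries.append(current)
            current = [float(match.group("ms")), [match.group("sql").strip()]]
        elif current is not None and line[:1].isspace():
            # Indented line: part of the statement started on an earlier line.
            current[1].append(line.strip())
        else:
            # Unrelated log line (or blank line): close any open entry.
            if current is not None:
                entries.append(current)
                current = None
    if current is not None:
        entries.append(current)
    return [
        (ms, " ".join(part for part in parts if part))
        for ms, parts in entries
        if ms >= threshold_ms
    ]


if __name__ == "__main__":
    sample = [
        # First entry is the excerpt quoted in this issue.
        "2022-11-02 00:40:57.355 UTC [matrix event_persister1] LOG:  duration: 10687.079 ms  statement: ",
        "                    INSERT INTO event_search",
        # Second entry is made up, to show a fast query being filtered out.
        "2022-11-02 00:41:00.000 UTC [matrix main] LOG:  duration: 12.5 ms  statement: SELECT 1",
    ]
    for ms, sql in slow_statements(sample):
        print(f"{ms:.0f} ms: {sql}")
```

The statement text in the log excerpt above is truncated, so the sketch can only report the first fragment of the `INSERT`; with full logs it would join all continuation lines into one string.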

@H-Shay
Contributor

H-Shay commented Nov 2, 2022

Are the inserts slow due to poor indexing, or what else would cause such a slow insert?

@DMRobertson DMRobertson removed X-Release-Blocker Must be resolved before making a release X-Regression Something broke which worked on a previous release labels Nov 3, 2022
@DMRobertson
Contributor Author

Consensus: go ahead with this but keep an eye out for performance.

@DMRobertson DMRobertson closed this as not planned Won't fix, can't repro, duplicate, stale Nov 3, 2022
@DMRobertson
Contributor Author

#14402 should fix some pain caused by the SELECTs used in event searches, which might explain some of what we saw here.
