Skip to content
This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Re-apply making thread_id required in push rules tables #15359

Closed
reivilibre opened this issue Mar 31, 2023 · 3 comments · Fixed by #15437 or #15597
Closed

Re-apply making thread_id required in push rules tables #15359

reivilibre opened this issue Mar 31, 2023 · 3 comments · Fixed by #15437 or #15597
Assignees
Labels
A-Threads Threaded messages O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.

Comments

@reivilibre
Copy link
Contributor

reivilibre commented Mar 31, 2023

The migration synapse/storage/schema/main/delta/74/03thread_notifications_not_null.sql.postgres from #15350 took too long and caused a small outage on Matrix.org.

I've reverted it for now

We believe the statement doing this is one or both of the ALTER TABLE statements.

I hoped #15357 might fix it, but the Postgres docs don't mention it as a valid technique to improve performance and my testing shows the addition of the constraint takes the same amount of time without regard for the presence of the index, definitely not negligible.

The way forward is probably to add a NOT VALID constraint, which is enforced immediately for new rows, and then do a VALIDATE CONSTRAINT in the background. It's a shame we didn't know about this earlier since this could have been started ages ago, but ah well — as they say, the next best time is now.

Scanning a large table to verify a new foreign key or check constraint can take a long time, and other updates to the table are locked out until the ALTER TABLE ADD CONSTRAINT command is committed. The main purpose of the NOT VALID constraint option is to reduce the impact of adding a constraint on concurrent updates. With NOT VALID, the ADD CONSTRAINT command does not scan the table and can be committed immediately. After that, a VALIDATE CONSTRAINT command can be issued to verify that existing rows satisfy the constraint. The validation step does not need to lock out concurrent updates, since it knows that other transactions will be enforcing the constraint for rows that they insert or update; only pre-existing rows need to be checked. Hence, validation acquires only a SHARE UPDATE EXCLUSIVE lock on the table being altered. (If the constraint is a foreign key then a ROW SHARE lock is also required on the table referenced by the constraint.) In addition to improving concurrency, it can be useful to use NOT VALID and VALIDATE CONSTRAINT in cases where the table is known to contain pre-existing violations. Once the constraint is in place, no new violations can be inserted, and the existing problems can be corrected at leisure until VALIDATE CONSTRAINT finally succeeds.

postgresql.org/docs/15/sql-altertable.html

reivilibre added a commit that referenced this issue Mar 31, 2023
…ons_staging,summary} (#15350)"

This reverts commit 2a234b7.

See #15359 for context.
@reivilibre reivilibre changed the title Reversion of #15350 (adding thread_id NOT NULL constraints) #15350 had to be reverted (adding thread_id NOT NULL constraints) Mar 31, 2023
@clokep clokep added S-Minor Blocks non-critical functionality, workarounds exist. T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks. A-Threads Threaded messages O-Uncommon Most users are unlikely to come across this or unexpected workflow labels Apr 3, 2023
@clokep clokep changed the title #15350 had to be reverted (adding thread_id NOT NULL constraints) Re-apply making thread_id required in push rules tables Apr 3, 2023
@clokep
Copy link
Member

clokep commented Apr 14, 2023

Thanks @reivilibre for providing that bit, I think in the future it might make sense then to add the constraint when we add the column (but as not-valid), then do the backfilling, etc. and at the end of that we could do validate the constraint?

Removing any backwards compatibility code could be tricky in that case, I guess?

@clokep
Copy link
Member

clokep commented May 12, 2023

#15580 reverted this again.

@clokep clokep reopened this May 12, 2023
@clokep
Copy link
Member

clokep commented May 12, 2023

From discussion with @erikjohnston this was causing a deadlock, and we think the following two changes would fix it:

  1. Move the modifications for each table to a separate file. (To isolate each table modification into a separate transaction.)
  2. Move the index deletions to background updates.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A-Threads Threaded messages O-Uncommon Most users are unlikely to come across this or unexpected workflow S-Minor Blocks non-critical functionality, workarounds exist. T-Task Refactoring, removal, replacement, enabling or disabling functionality, other engineering tasks.
Projects
None yet
2 participants