[bugfix] Restore constraints before snapshot conflict updates #806
Open
blurskye wants to merge 4 commits into
Summary
This fixes a snapshot restore ordering issue when using a Postgres target with on_conflict_action: "update".
In that mode, the snapshot data path writes rows with INSERT ... ON CONFLICT (...) DO UPDATE. For that SQL to work, Postgres needs the matching primary key, unique constraint, or unique index to already exist on the target table.
The schema snapshot currently restores tables first, loads data, and only restores indexes/constraints after the data snapshot finishes. That ordering is good for bulk loading, but it breaks the non-bulk conflict-update path because the target table exists without the constraint needed by the ON CONFLICT clause.
For example, an INSERT ... ON CONFLICT (...) DO UPDATE against a table that was restored without its primary key is rejected by Postgres, which reports that there is no unique or exclusion constraint matching the ON CONFLICT specification. The same insert works once the primary key exists.
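The shape of the failure can be reproduced without a Postgres server. The sketch below is an illustration only: it uses SQLite through Python's standard library as a stand-in, because SQLite's upsert syntax is modeled on Postgres and it likewise rejects a conflict target that has no matching unique constraint. The error text differs from the Postgres one, the `users` table and `users_id_key` index are hypothetical names, and a unique index stands in for the primary key because SQLite cannot add a primary key to an existing table.

```python
import sqlite3

# Hypothetical upsert, shaped like the statement the conflict-update path issues.
UPSERT = (
    "INSERT INTO users (id, name) VALUES (?, ?) "
    "ON CONFLICT (id) DO UPDATE SET name = excluded.name"
)


def demo():
    conn = sqlite3.connect(":memory:")
    # The table exists, but nothing unique covers "id" yet (constraints not restored).
    conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")

    try:
        conn.execute(UPSERT, (1, "alice"))
        error = None
    except sqlite3.OperationalError as exc:
        # No unique constraint matches the conflict target, so the upsert is rejected.
        error = str(exc)

    # Restore a usable conflict target, then retry the same statement.
    conn.execute("CREATE UNIQUE INDEX users_id_key ON users (id)")
    conn.execute(UPSERT, (1, "alice"))   # plain insert
    conn.execute(UPSERT, (1, "alicia"))  # conflict on id, row is updated

    rows = conn.execute("SELECT id, name FROM users").fetchall()
    conn.close()
    return error, rows


if __name__ == "__main__":
    error, rows = demo()
    print("before unique index:", error)
    print("after unique index:", rows)
```

The first attempt fails before any row is written, which is why the data snapshot cannot make progress until a conflict target exists on the table.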
Fix
When the snapshot is using the regular Postgres batch writer with on_conflict_action: "update", restore only the constraints/indexes that can be used as conflict targets (primary keys, unique constraints, and unique indexes) before the data snapshot runs.
Everything else keeps the existing restore order. Foreign keys, regular indexes, triggers, and other constraints still restore after data, so the snapshot does not lose the current bulk-load behavior or start enforcing relationship constraints before all rows are present.
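As a rough sketch of the split described here, the schema objects can be partitioned into those restored before the data snapshot and those restored after it. This is not the code from the pull request: `SchemaObject`, `CONFLICT_TARGET_KINDS`, `split_restore_order`, and the kind labels are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple

# Assumed labels for the object kinds that ON CONFLICT can use as a target.
CONFLICT_TARGET_KINDS = {"primary_key", "unique_constraint", "unique_index"}


@dataclass(frozen=True)
class SchemaObject:
    name: str
    kind: str  # e.g. "primary_key", "foreign_key", "index", "trigger"


def split_restore_order(
    objects: Iterable[SchemaObject],
) -> Tuple[List[SchemaObject], List[SchemaObject]]:
    """Return (before_data, after_data), preserving the input order in each list."""
    before_data: List[SchemaObject] = []
    after_data: List[SchemaObject] = []
    for obj in objects:
        if obj.kind in CONFLICT_TARGET_KINDS:
            before_data.append(obj)
        else:
            after_data.append(obj)
    return before_data, after_data


if __name__ == "__main__":
    schema = [
        SchemaObject("users_pkey", "primary_key"),
        SchemaObject("orders_user_id_fkey", "foreign_key"),
        SchemaObject("users_email_key", "unique_constraint"),
        SchemaObject("orders_created_at_idx", "index"),
    ]
    before, after = split_restore_order(schema)
    print([o.name for o in before])
    print([o.name for o in after])
```

Keeping foreign keys in the second list matches the reasoning above: they would otherwise be enforced while referenced rows may still be missing.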
For the other paths, including bulk ingest and on_conflict_action: "nothing", the existing order is unchanged.
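The condition that selects the new ordering can be sketched as a small planning function. The function name, the `bulk_ingest` flag, and the step labels below are assumptions made for illustration, not identifiers from the codebase.

```python
from typing import List


def restore_plan(bulk_ingest: bool, on_conflict_action: str) -> List[str]:
    """Return the ordered restore steps for a snapshot (hypothetical step labels)."""
    # Only the regular batch writer with "update" needs conflict targets up front.
    early_targets = (not bulk_ingest) and on_conflict_action == "update"

    steps = ["tables"]
    if early_targets:
        steps.append("conflict_targets")
    steps.append("data")
    # Without the early step, all indexes and constraints are restored here, as before.
    steps.append("remaining_indexes_and_constraints" if early_targets
                 else "indexes_and_constraints")
    return steps


if __name__ == "__main__":
    print(restore_plan(bulk_ingest=False, on_conflict_action="update"))
    print(restore_plan(bulk_ingest=False, on_conflict_action="nothing"))
    print(restore_plan(bulk_ingest=True, on_conflict_action="update"))
```

Only the first call yields the extra step; the other two keep the tables, data, indexes/constraints order.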