Skip to content

Comments

Respect updates to st_table during replay#3937

Merged
gefjon merged 5 commits intomasterfrom
phoebe/replay-st-table-update
Dec 30, 2025
Merged

Respect updates to st_table during replay#3937
gefjon merged 5 commits intomasterfrom
phoebe/replay-st-table-update

Conversation

@gefjon
Copy link
Contributor

@gefjon gefjon commented Dec 30, 2025

Description of Changes

Prior to this commit, we had special handling for deletes from st_table during replay, but we did not pair them with inserts to form updates, instead only treating the delete as a dropped table.

With this commit, we record inserts to st_table which will form update pairs during replay, and handle them appropriately, updating the table's schema and not dropping the table.

There is a tricky case where a table exists but is empty, and then within a single transaction:

  • The table undergoes a migration s.t. its table_access or primary_key changes.
  • At least one row is inserted into the table.

In this case, the in-memory table structure will not exist at the point of the st_table insert, but will be created before the corresponding st_table delete, meaning there will be two conflicting st_table rows resident. To handle this case, CommittedState tracks a side table, replay_table_updated, which stores a RowPointer to the correct most-recent st_table row for the migrating table.

I've also renamed the one previously-extant replay-only side table, table_dropped, to include the replay_ prefix, which IMO improves clarity. And I've made it so replay_table_dropped is cleared at the end of each transaction, as the previous behavior of continuing to ignore a table that should be unreachable masked errors which would have been helpful when debugging this issue.

This PR also includes an extended error message when encountering a unique constraint violation while replaying, which I found helpful while debugging.

API and ABI breaking changes

N/a

Expected complexity level and risk

2 - replay is complicated and scary, but this PR isn't gonna make things more broken than they already were.

Testing

  • Manually replayed a commitlog which included a migration that altered a table's table_access, which was previously broken but now replays successfully.

when seeing a unique constraint violation after replay
Prior to this commit, we had special handling for deletes from `st_table` during replay,
but we did not pair them with inserts to form updates,
instead only treating the delete as a dropped table.

With this commit, we record inserts to `st_table` which will form update pairs during replay,
and handle them appropriately, updating the table's schema and not dropping the table.

There is a tricky case where a table exists but is empty, and then within a single transaction:
- The table undergoes a migration s.t. its `table_access` or `primary_key` changes.
- At least one row is inserted into the table.

In this case, the in-memory table structure will not exist at the point of the `st_table` insert,
but will be created before the corresponding `st_table` delete,
meaning there will be two conflicting `st_table` rows resident.
To handle this case, `CommittedState` tracks a side table, `replay_table_updated`,
which stores a `RowPointer` to the correct most-recent `st_table` row for the migrating table.

I've also renamed the one previously-extant replay-only side table, `table_dropped`,
to include the `replay_` prefix, which IMO improves clarity.
And I've made it so `replay_table_dropped` is cleared at the end of each transaction,
as the previous behavior of continuing to ignore a table that should be unreachable
masked errors which would have been helpful when debugging this issue.
Printing the full table contents was helpful while debugging,
but is too much to see in logs normally.
@gefjon gefjon requested a review from coolreader18 December 30, 2025 17:25
Good catch, Tyler!
Copy link
Contributor

@cloutiertyler cloutiertyler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This now LGTM

@gefjon gefjon enabled auto-merge December 30, 2025 20:05
@gefjon gefjon added this pull request to the merge queue Dec 30, 2025
Merged via the queue into master with commit 41eec04 Dec 30, 2025
27 checks passed
jdetter pushed a commit that referenced this pull request Dec 30, 2025
# Description of Changes

Prior to this commit, we had special handling for deletes from
`st_table` during replay, but we did not pair them with inserts to form
updates, instead only treating the delete as a dropped table.

With this commit, we record inserts to `st_table` which will form update
pairs during replay, and handle them appropriately, updating the table's
schema and not dropping the table.

There is a tricky case where a table exists but is empty, and then
within a single transaction:
- The table undergoes a migration s.t. its `table_access` or
`primary_key` changes.
- At least one row is inserted into the table.

In this case, the in-memory table structure will not exist at the point
of the `st_table` insert, but will be created before the corresponding
`st_table` delete, meaning there will be two conflicting `st_table` rows
resident. To handle this case, `CommittedState` tracks a side table,
`replay_table_updated`, which stores a `RowPointer` to the correct
most-recent `st_table` row for the migrating table.

I've also renamed the one previously-extant replay-only side table,
`table_dropped`, to include the `replay_` prefix, which IMO improves
clarity. And I've made it so `replay_table_dropped` is cleared at the
end of each transaction, as the previous behavior of continuing to
ignore a table that should be unreachable masked errors which would have
been helpful when debugging this issue.

This PR also includes an extended error message when encountering a
unique constraint violation while replaying, which I found helpful while
debugging.

# API and ABI breaking changes

N/a

# Expected complexity level and risk

2 - replay is complicated and scary, but this PR isn't gonna make things
*more* broken than they already were.

# Testing

- [x] Manually replayed a commitlog which included a migration that
altered a table's `table_access`, which was previously broken but now
replays successfully.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants