The following sequence describes how a not null constraint on a YSQL table can be violated:
1. PG executes an alter table statement, which is translated into a number of updates to PG sys catalog tables. Because alter table is a DDL statement, a separate DDL transaction manages the entire flow of the statement, including all updates to the relevant PG sys catalog tables.
2. After the PG sys catalog updates succeed, PG executes an additional YB alter table operation (YBCExecAlterTable). At this point, all the PG sys catalog updates reside in the intents db and can be rolled back if the DDL transaction is later aborted.
3. An AlterTable RPC is sent from PG to the TServer that PG is bound to (its local TServer). For AlterTable, the TServer forwards the request to the Master.
4. The Master writes the new table schema metadata into the YB sys catalog table, which is distinct from the PG sys catalog tables. Both are Raft-replicated, but the YB sys catalog table does not have an intents db, so the alter table operation is written directly into the regular db of the YB sys catalog table to indicate that the alter table is ongoing. The Master then sends one AlterSchema RPC to each of the tablet replicas hosting the table being altered, because each tablet hosting table T also stores T's schema metadata, which is now stale and needs to be updated.
5. The local TServer keeps polling the Master via the IsAlterTableDone RPC to check whether the alter table is done. The Master waits for a response to each AlterSchema RPC. On each TServer-to-Master heartbeat, the TServer reports its schema version to the Master. When the Master finds that all tablets have the latest table schema version, it marks the AlterTable as done by writing the "finalized" alter table operation directly into the regular db of the YB sys catalog table. This has two effects: (1) the next IsAlterTableDone RPC gets a positive answer, and the local TServer replies to PG that its AlterTable RPC has succeeded. (2) The next GetTableSchema RPC returns the new schema version, 1.
6. PG invalidates its old YB table entry and reloads T to pick up the new schema. It also issues a request to increment the master catalog version in the table pg_yb_catalog_version. This increment is covered by the same DDL transaction.
7. PG commits the DDL transaction, which covers the entire alter table flow described above.
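To make the visibility difference between the two catalogs concrete, here is a minimal Python sketch of the seven steps. It is a toy model, not YugabyteDB code: the class names, the dictionary keys, the old schema version 0, and the catalog version values 1 and 2 are all assumptions chosen for illustration. The only behavior it models is that PG sys catalog writes stay in intents until commit, while YB sys catalog writes are visible immediately.

```python
from dataclasses import dataclass, field


@dataclass
class PgSysCatalog:
    """Toy PG sys catalog: writes stay in intents until the DDL transaction commits."""
    committed: dict = field(default_factory=dict)
    intents: dict = field(default_factory=dict)

    def write(self, key, value):
        self.intents[key] = value  # provisional, can still be rolled back

    def commit(self):
        self.committed.update(self.intents)
        self.intents.clear()

    def read(self, key):
        return self.committed.get(key)  # other sessions see committed data only


@dataclass
class YbSysCatalog:
    """Toy YB sys catalog: no intents db, so every write is visible at once."""
    regular: dict = field(default_factory=dict)

    def write(self, key, value):
        self.regular[key] = value

    def read(self, key):
        return self.regular.get(key)


def run_alter_table():
    """Walk the seven steps; record what another session could read after each."""
    # Hypothetical starting values: no constraint, catalog version 1, schema version 0.
    pg = PgSysCatalog(committed={"not_null": False, "catalog_version": 1})
    yb = YbSysCatalog(regular={"schema_version": 0})
    observed = []

    def snapshot(step):
        observed.append((step, pg.read("not_null"),
                         pg.read("catalog_version"), yb.read("schema_version")))

    pg.write("not_null", True)          # step 1: PG sys catalog updates (intents)
    snapshot(1)
    snapshot(2)                         # step 2: YBCExecAlterTable begins
    snapshot(3)                         # step 3: AlterTable RPC forwarded to Master
    yb.write("alter_state", "ongoing")  # step 4: written straight to regular db
    snapshot(4)
    yb.write("schema_version", 1)       # step 5: Master finalizes the alter
    yb.write("alter_state", "done")
    snapshot(5)
    pg.write("catalog_version", 2)      # step 6: increment, still only in intents
    snapshot(6)
    pg.commit()                         # step 7: DDL transaction commits
    snapshot(7)
    return observed


if __name__ == "__main__":
    for step, not_null, catalog_version, schema_version in run_alter_table():
        print(step, not_null, catalog_version, schema_version)
```

In this model, a concurrent session after steps 5 and 6 reads the new YB schema version together with the old PG metadata. The two views agree again only after step 7.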
In case of a race between two sessions:
insert into T values (1)
and
alter table T add column v1 int not null
They cannot both succeed because of the not null constraint.
If the insert reads the PG sys catalog tables before step 7, it reads the old PG schema metadata, which does not have the not null constraint. However, if the insert reads the YB sys catalog after step 5, it reads the new YB table schema metadata.
Suppose the insert fails with a "schema version mismatch" error. This does not mean that step 7 is done. It only means that the TServer replica the insert reached already has the new schema. When the insert is retried, we may still be in the window between step 5 and step 7. In that case the insert can read the old PG schema metadata, if the master catalog version increment of step 6 has not happened yet, together with the new YB schema metadata, because step 5 is done. As a result, PG does not see the not null constraint in the old PG schema, the new YB schema metadata no longer triggers the "schema version mismatch" error, and the insert succeeds on the retry. The end result is that both the insert and the alter succeed, violating the not null constraint.
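The retry scenario can be sketched as a small Python simulation. This is a toy model under stated assumptions, not the actual YSQL code path: `Cluster`, `Session`, both exception classes, and the refresh-on-mismatch retry loop are hypothetical names and simplifications introduced here. The sketch assumes the retry refreshes only the cached YB schema version, while PG keeps its old catalog view.

```python
from dataclasses import dataclass, field


class SchemaVersionMismatch(Exception):
    """Toy tablet error: the request carried a stale YB schema version."""


class NotNullViolation(Exception):
    """Toy PG error: the constraint is visible and the new column has no value."""


@dataclass
class Cluster:
    pg_not_null_visible: bool = False  # True only once step 7 (commit) is done
    yb_schema_version: int = 0         # becomes 1 once step 5 is done
    rows: list = field(default_factory=list)


@dataclass
class Session:
    cached_yb_version: int = 0  # YB schema version this session last loaded

    def insert(self, cluster, k, v1=None):
        # PG-side check uses whatever PG schema metadata is visible right now.
        if cluster.pg_not_null_visible and v1 is None:
            raise NotNullViolation("v1 must not be null")
        # Tablet-side check compares schema versions only.
        if self.cached_yb_version != cluster.yb_schema_version:
            raise SchemaVersionMismatch("schema version mismatch")
        cluster.rows.append((k, v1))

    def insert_with_retry(self, cluster, k):
        try:
            self.insert(cluster, k)
        except SchemaVersionMismatch:
            # Assumption: the retry reloads the YB schema but not the PG catalog.
            self.cached_yb_version = cluster.yb_schema_version
            self.insert(cluster, k)


def simulate_race():
    """Insert runs in the window after step 5 and before step 7."""
    cluster = Cluster(yb_schema_version=1)
    session = Session(cached_yb_version=0)
    session.insert_with_retry(cluster, 1)  # first try mismatches, retry succeeds
    cluster.pg_not_null_visible = True     # step 7: alter table commits
    return [row for row in cluster.rows if row[1] is None]


def simulate_after_commit():
    """Control case: the insert starts only after step 7."""
    cluster = Cluster(pg_not_null_visible=True, yb_schema_version=1)
    session = Session(cached_yb_version=0)
    try:
        session.insert_with_retry(cluster, 1)
    except NotNullViolation:
        return "rejected"
    return "inserted"


if __name__ == "__main__":
    print(simulate_race())
    print(simulate_after_commit())
```

In the race case the table ends up holding a row whose new column is null even though the alter committed. In the control case the insert is rejected, because the constraint is already visible to PG.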
Jira Link: DB-1254