Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl:support the definition of null change to not null using alter table #7771

Merged
merged 42 commits into from
Oct 17, 2018

Conversation

ciscoxll
Copy link
Contributor

@ciscoxll ciscoxll commented Sep 23, 2018

What problem does this PR solve?

Before this PR , we don't support modifying the type definitions of 'null ' to 'not null '.

Fix issue #5035.

What is changed and how it works?

Refer to https://dev.mysql.com/doc/refman/5.7/en/alter-table.html.

  • Before the job is executed.
    Check if the field to be modified has a null value, cannot be modified if a null value exists.
  • In the job execution.
    1. Query whether the field to be modified has a null value.
    2. Then introduce another flag to prevent users from inserting or updating null values.
    3. Modify the type defined Flag to NotNullFlag.
    4. Query field definition whether there is a null value inserted after modification.
    5. If there is a null value inserted , it cannot be modified and needs to be rollback.

Check List

Tests

  • Unit test

ddl/column.go Outdated
tblInfo.Columns[oldCol.Offset].Flag = newCol.Flag
ver, err = updateVersionAndTableInfo(t, job, tblInfo, true)
// Wait for two leases to ensure that all nodes in the cluster are updated successfully.
ver, err = updateVersionAndTableInfo(t, job, tblInfo, true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why update it again? What if tidb panics after this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question needs to be discussed.

ddl/column.go Outdated
@@ -379,6 +422,20 @@ func doModifyColumn(t *meta.Meta, job *model.Job, newCol *model.ColumnInfo, oldN
return ver, nil
}

// CheckForNullValue check for null values of the field.
func CheckForNullValue(ctx sessionctx.Context, schema, table, oldCol, newCol model.CIStr) error {
sql := fmt.Sprintf(`select * from %s.%s where %s is null; `, schema.L, table.L, oldCol.L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is better to add quote around table/schema name, in case there are names like a b.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like `a`

Copy link
Contributor

@winkyao winkyao Sep 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use select count(*) from %s.%s where %s is null; instead, if the null values count is very large, this will oom.

ddl/column.go Outdated
@@ -379,6 +422,20 @@ func doModifyColumn(t *meta.Meta, job *model.Job, newCol *model.ColumnInfo, oldN
return ver, nil
}

// CheckForNullValue check for null values of the field.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CheckForNullValue ensures there are no null values in the column of this table.

@winkyao
Copy link
Contributor

winkyao commented Sep 26, 2018

any update?

ddl/column.go Outdated
// Modify the type defined Flag to NotNullFlag.
tblInfo.Columns[oldCol.Offset].Flag = newCol.Flag
ver, err = updateVersionAndTableInfo(t, job, tblInfo, true)
// Wait for two leases to ensure that all nodes in the cluster are updated successfully.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function can not do it.
This transaction is not committed, so no new schema needs to be updated by other nodes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zimulala I am doing this for the purpose, modify the field definition to NotNullFlag to prevent insertion of null values.

@@ -267,7 +272,7 @@ func onSetDefaultValue(t *meta.Meta, job *model.Job) (ver int64, _ error) {
return updateColumn(t, job, newCol, &newCol.Name)
}

func onModifyColumn(t *meta.Meta, job *model.Job) (ver int64, _ error) {
func (w *worker) onModifyColumn(t *meta.Meta, job *model.Job) (ver int64, _ error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we need to change this to member function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@winkyao I need to get sessionctx from context resource pool.

ddl/column.go Outdated
tblInfo, err := getTableInfo(t, job, job.SchemaID)
if err != nil {
return ver, errors.Trace(err)
}

oldCol := model.FindColumnInfo(tblInfo.Columns, oldName.L)
if job.IsRollingback() {
// field flag reset to null.
tblInfo.Columns[oldCol.Offset].Flag = oldCol.Flag &^ mysql.NotNullFlag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can you confirm the oldCol definitely set to not null flag? What if the job is not alter column from not null to null?

ddl/column.go Outdated
// Wait for two leases to ensure that all nodes in the cluster are updated successfully.
ver, err = updateVersionAndTableInfo(t, job, tblInfo, true)
if err != nil {
job.State = model.JobStateRollingback
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If "err != nil" is true, I don't think we need this state.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zimulala Here need to rollback the second time you enter the function.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ciscoxll This comment is for your old commit.

ddl/ddl_api.go Outdated
if !mysql.HasNotNullFlag(col.Flag) && mysql.HasNotNullFlag(newCol.Flag) {
return nil, errUnsupportedModifyColumn.GenWithStackByArgs("null to not null")
if err = CheckForNullValue(ctx, ident.Schema, ident.Name, col.Name, newCol.Name); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why check it here? We will check it two times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Job entering the ddl queue requires two checks on the job.

s.mustExec(c, "create table t4 (a int default '0', b varchar(10), d int not null default '0')")
s.tk.MustExec("insert into t4(a) values (null);")
sql = "alter table t4 change a a1 int not null"
s.testErrorCode(c, sql, tmysql.WarnDataTruncated)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the same with mysql? In my environment:

mysql> alter table t4 change a a1 int not null;
ERROR 1138 (22004): Invalid use of NULL value

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lamxTyler Done

mysql

mysql> create table test2 (c1 int, c2 int, c3 int default 1, index (c1));
Query OK, 0 rows affected (0.03 sec)

mysql> insert into test2(c2) values (null);
Query OK, 1 row affected (0.00 sec)
mysql> alter table test2 change c2 a bigint not null;
ERROR 1265 (01000): Data truncated for column 'a' at row 1

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your test also changes the column type, but if you use alter table test2 change c2 a int not null;, you can get a different result.

Copy link
Contributor

@winkyao winkyao Oct 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mysql> create table test2 (c1 int, c2 int, c3 int default 1, index (c1));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into test2(c2) values (null);
Query OK, 1 row affected (0.00 sec)

mysql> alter table test2 change c2 a bigint not null;
ERROR 1265 (01000): Data truncated for column 'a' at row 1
mysql> alter table test2 change c2 a int not null;
ERROR 1138 (22004): Invalid use of NULL value

FYI @ciscoxll

ddl/column.go Outdated
@@ -379,6 +431,20 @@ func doModifyColumn(t *meta.Meta, job *model.Job, newCol *model.ColumnInfo, oldN
return ver, nil
}

// CheckForNullValue ensure there are no null values of the column of this table.
func CheckForNullValue(ctx sessionctx.Context, schema, table, oldCol, newCol model.CIStr) error {
sql := fmt.Sprintf("select count(*) from `%s`.`%s` where `%s` is null;", schema.L, table.L, oldCol.L)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to know the number of rows? If not, use limit 1 is better.

Copy link
Contributor

@winkyao winkyao Sep 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add limit 1

ddl/column.go Outdated
defer w.sessPool.put(ctx)
err = CheckForNullValue(ctx, dbInfo.Name, tblInfo.Name, oldCol.Name, newCol.Name)
// If there is a null value inserted, it cannot be modified and needs to be rollback.
if ErrWarnDataTruncated.Equal(err) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if err != nil and err != ErrWarnDataTruncated

ddl/column.go Outdated
if job.IsRollingback() {
if IsNull2Notnull {
// field flag reset to null.
tblInfo.Columns[oldCol.Offset].Flag = oldCol.Flag &^ mysql.NotNullFlag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use another flag to indicate this column cannot insert null value. Otherwise, if there are some null value are inserted in the interval before the job is handled by the owner, decode a null value in column with mysql.NotNullFlag will return "Miss Column" error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@winkyao Done.

ddl/column.go Show resolved Hide resolved
ddl/column.go Outdated
if !mysql.HasNotNullFlag(oldCol.Flag) && mysql.HasNotNullFlag(newCol.Flag) {
err = CheckForNullValue(ctx, dbInfo.Name, tblInfo.Name, oldCol.Name, newCol.Name)
if err != nil {
job.State = model.JobStateCancelled
Copy link
Contributor

@crazycs520 crazycs520 Oct 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

set job.State = model.JobStateRollingback for rollback to clear PreventNullInsertFlag.
The reason is: If you insert null after CheckForNullValue but before set PreventNullInsertFlag, the ddl will failed and return in next CheckForNullValue, but have not clear the PreventNullInsertFlag.

ddl/column.go Outdated
}
} else {
// Modify the type defined Flag to NotNullFlag.
tblInfo.Columns[oldCol.Offset].Flag = newCol.Flag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think only modify the Flag to NotNullFlag and clear the PreventNullInsertFlag maybe better.

tblInfo.Columns[oldCol.Offset].Flag |= mysql.NotNullFlag
tblInfo.Columns[oldCol.Offset].Flag = oldCol.Flag &^ mysql.PreventNullInsertFlag

Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please resolve the data race in test case.

winkyao
winkyao previously approved these changes Oct 16, 2018
Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@winkyao winkyao dismissed their stale review October 16, 2018 03:45

reset approve

@ciscoxll ciscoxll added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Oct 16, 2018
@ciscoxll
Copy link
Contributor Author

/run-all-tests

@winkyao
Copy link
Contributor

winkyao commented Oct 16, 2018

@zimulala PTAL

ddl/column.go Outdated
}

if err != nil {
return ver, errors.Trace(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe here use this code is enough.

ddl/column.go Outdated
tblInfo.Columns[oldCol.Offset].Flag |= mysql.PreventNullInsertFlag
} else {
// Modify the type defined Flag to NotNullFlag.
tblInfo.Columns[oldCol.Offset].Flag |= mysql.NotNullFlag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this situation? Then we can reduce the waiting time( 2*lease).

ddl/column.go Outdated
return ver, nil
}
// Modify the type defined Flag to NotNullFlag.
tblInfo.Columns[oldCol.Offset].Flag |= mysql.NotNullFlag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we remove this code?
Because after this,we will assign newCol to the original column.

Copy link
Contributor

@zimulala zimulala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zimulala
Copy link
Contributor

/run-all-tests

@ciscoxll ciscoxll added status/LGT3 The PR has already had 3 LGTM. and removed status/LGT2 Indicates that a PR has LGTM 2. labels Oct 17, 2018
@ciscoxll ciscoxll merged commit e544882 into pingcap:master Oct 17, 2018
@ciscoxll ciscoxll deleted the modify-column branch October 17, 2018 09:13
iamzhoug37 pushed a commit to iamzhoug37/tidb that referenced this pull request Oct 25, 2018
bugfix fixed pingcap#7518

expression: MySQL compatible current_user function (pingcap#7801)

plan: propagate constant over outer join (pingcap#7794)

- extract `outerCol = const` from join conditions and filter conditions,
  substitute `outerCol` in join conditions with `const`;
- extract `outerCol = innerCol` from join conditions, derive new join
  conditions based on this column equal condition and `outerCol` related
  expressions in join conditions and filter conditions;

util/timeutil: fix data race caused by forgetting set stats lease to 0 (pingcap#7901)

stats: handle ddl event for partition table (pingcap#7903)

plan: implement Operand and Pattern of cascades planner. (pingcap#7910)

planner: not convert to TableDual if empty range is derived from deferred constants (pingcap#7808)

plan: move projEliminate behind aggEliminate (pingcap#7909)

admin: fix admin check table bug of byte compare (pingcap#7887)

* admin: remove reflect deepEqual

stats: fix panic caused by empty histogram (pingcap#7912)

plan: fix panic caused by empty schema of LogicalTableDual (pingcap#7906)

* fix drop view if exist error (pingcap#7833)

executor: refine `explain analyze` (pingcap#7888)

executor: add an variable to compatible with MySQL insert for OGG (pingcap#7863)

expression: maintain `DeferredExpr` in aggressive constant folding. (pingcap#7915)

stats: fix histogram boundaries overflow error (pingcap#7883)

ddl:support the definition of `null` change to `not null` using `alter table` (pingcap#7771)

* ddl:support the definition of null change to not null using alter table

ddl: add check when create table with foreign key. (pingcap#7885)

* ddl: add check when create table with foreign key

planner: eliminate if null on non null column (pingcap#7924)

executor: fix a bug in point get (pingcap#7934)

planner, executor: refine ColumnPrune for LogicalUnionAll (pingcap#7930)

executor: fix panic when limit is too large (pingcap#7936)

ddl: add TiDB version to metrics (pingcap#7902)

stats: limit the length of sample values (pingcap#7931)

vendor: update tipb (pingcap#7893)

planner: support the Group and GroupExpr for the cascades planner (pingcap#7917)

store/tikv: log more information when other err occurs (pingcap#7948)

types: fix date time parse (pingcap#7933)

ddl: just print error message when ddl job is normal to calcel, to eliminate noisy log (pingcap#7875)

stats: update delta info for partition table (pingcap#7947)

explaintest: add explain test for partition pruning (pingcap#7505)

util: move disjoint set to util package (pingcap#7950)

util: add PreAlloc4Row and Insert for Chunk and List (pingcap#7916)

executor: add the slow log for commit (pingcap#7951)

expression: add builtin json_keys (pingcap#7776)

privilege: add USAGE in `show grants` for mysql compatibility (pingcap#7955)

ddl: fix invailid ddl job panic (pingcap#7940)

*: move ast.NewValueExpr to standalone parser_driver package (pingcap#7952)

Make the ast package get rid of the dependency of types.Datum

server: allow cors http request (pingcap#7939)

*: move `Statement` and `RecordSet` from ast to sqlexec package (pingcap#7970)

pr suggestion update

executor/aggfuncs: split unit tests to corresponding file (pingcap#7993)

store/tikv: fix typo (pingcap#7990)

executor, planner: clone proj schema for different children in buildProj4Union (pingcap#7999)

executor: let information_schema be the first database in ShowDatabases (pingcap#7938)

stats: use local feedback for partition table (pingcap#7963)

executor: add unit test for aggfuncs (pingcap#7966)

server: add log for binary execute statement (pingcap#7987)

admin: refine admin check decoder (pingcap#7862)

executor: improve wide table insert & update performance (pingcap#7935)

ddl: fix reassigned partition id in `truncate table` does not take effect (pingcap#7919)

fix reassigned partition id in truncate table does not take effect

add changelog for 2.1.0 rc4 (pingcap#8020)

*: make parser package dependency as small as possible (pingcap#7989)

parser: support `:=` in the `set` syntax (pingcap#8018)

According to MySQL document, `set` use the = assignment operator,
but the := assignment operator is also permitted

stats: garbage collect stats for partition table (pingcap#7962)

docs: add the proposal for the column pool (pingcap#7988)

expression: refine built-in func truncate to support uint arg (pingcap#8000)

stats: support show stats for partition table (pingcap#8023)

stats: update error rate for partition table (pingcap#8022)

stats: fix estimation for out of range point queries (pingcap#8015)

*: move parser to a separate repository (pingcap#8036)

executor: fix wrong result when index join on union scan. (pingcap#8031)

Do not modify Plan of dataReaderBuilder directly, because it would
impact next batch of outer rows, as well as other concurrent inner
workers. Instead, build a local child builder to store the child plan.

planner: fix a panic of a cached prepared statement with IndexScan (pingcap#8017)

*: fix the issue of executing DDL after executing SQL failure in txn (pingcap#8044)

* ddl, executor: fix the issue of executing DDL after executing SQL failure in txn

add unit test

remove debug info

add like evaluator case sensitive test

ddl, domain: make schema correct after canceling jobs (pingcap#7997)

unit test fix

code format

proposal: maintaining histograms in plan. (pingcap#7605)

support _tidb_rowid for table scan range (pingcap#8047)

var rename fix
@sre-bot sre-bot added the contribution This PR is from a community contributor. label Dec 18, 2019
@you06 you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor. sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants