Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: fix bad column offsets. #3754

Merged
merged 37 commits into from
Jul 28, 2017
Merged

*: fix bad column offsets. #3754

merged 37 commits into from
Jul 28, 2017

Conversation

hicqu
Copy link
Contributor

@hicqu hicqu commented Jul 14, 2017

Sort Table.Cols() and Table.WritableCols() by model.ColumnInfo.Offset, so we can use the offset in any schema states correctly.

And, fix some other bugs.

@shenli
Copy link
Member

shenli commented Jul 14, 2017

@hicqu Please add a few description.

@shenli
Copy link
Member

shenli commented Jul 17, 2017

This PR needs three LGTMs.

Copy link
Contributor

@tiancaiamao tiancaiamao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add test for bug fix.

var sc = ctx.GetSessionVars().StmtCtx
var changed = false
var handleChanged = false
var onUpdate = make([]bool, len(touched))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just declare it var onUpdate []bool is enough, avoid the allocation here.

var onUpdate = make([]bool, len(touched))

// We can iterate on public columns not writable columns,
// because all writable columns are after public columns.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"because all writable columns are after public columns"
I think public columns are after writable columns ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

they are sorted by their Offset, and write-only columns' Offset must be larger than public columns.

}
if cmp != 0 {
changed = true
touched[i] = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe you misunderstand what touched means.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

touched means the update statement changes the record really. is that correct?

@@ -1090,9 +1090,7 @@ func (b *planBuilder) buildDataSource(tn *ast.TableName) LogicalPlan {
} else {
statisticTable = handle.GetTableStats(tn.TableInfo.ID)
}
if b.err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

because it's a redundant error check.

writable := make([]table.Index, 0, len(t.indices))
for _, index := range t.indices {
s := index.Meta().State
if s != model.StateNone && s != model.StateDeleteOnly && s != model.StateDeleteReorganization {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

model.StateDeleteReorganization state is writable ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to consider the state of StateNone. The same as line 145.

}
}

return t.publicColumns
return publicColumns[0 : maxOffset+1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some empty hole in the result columns ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ddl.adjustColumnOffset guarantees the offsets are continuous.

// 1. the column is included in primary key;
// 2. the column's default value is null, and the value equals to that;
// 3. the column is virtual generated.
func (t *Table) CanSkip(col *table.Column, value types.Datum) bool {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it necessary to make the method public?

@hicqu hicqu closed this Jul 18, 2017
@hicqu hicqu deleted the qupeng/column-offset branch July 18, 2017 06:14
@hicqu hicqu restored the qupeng/column-offset branch July 18, 2017 06:14
@hicqu hicqu reopened this Jul 18, 2017
@hicqu hicqu added the priority/P1 The issue has P1 priority. label Jul 18, 2017
@shenli
Copy link
Member

shenli commented Jul 18, 2017

@tiancaiamao @zimulala @hanfei1991 PTAL

func updateRecord(ctx context.Context, h int64, oldData, newData []types.Datum, touched []bool, t table.Table, onDup bool) (bool, error) {
var sc = ctx.GetSessionVars().StmtCtx
var changed, handleChanged = false, false
var onUpdateNoChange = make(map[int]bool)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add comment for this variable.

// If no assign list for this table, no need to update.
if !assignExists {
return false, nil
for i, col := range t.Cols() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any test case for this?

if err != nil {
return false, errors.Trace(err)
}
newData[i] = v
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we update the variable changed here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, but there is another bug: only if changed == true we can change those on-update fields.

rowCount = 0
}
txn := e.ctx.Txn()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only need to get a new txn after commit a batch of data. So it is better to move this back into the if statement.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I made a mistake.

@@ -974,15 +969,16 @@ func (e *InsertValues) adjustAutoIncrementDatum(row []types.Datum, i int, c *tab
// onDuplicateUpdate updates the duplicate row.
// TODO: Report rows affected and last insert id.
func (e *InsertExec) onDuplicateUpdate(row []types.Datum, h int64, cols []*expression.Assignment) error {
data, err := e.Table.Row(e.ctx, h)
// See http://dev.mysql.com/doc/refman/5.7/en/miscellaneous-functions.html#function_values
e.ctx.GetSessionVars().CurrInsertValues = row
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move it here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thinks it looks better. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: it's not always better, if error happen, and statement retry, CurrInsertValues should set to the real current insert value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I will restore that.

var columns []*table.Column
if b.inUpdateStmt {
columns = tbl.WritableCols()
} else {
columns = tbl.Cols()
}
p.Columns = make([]*model.ColumnInfo, 0, len(columns))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is performance improvement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's a little improvement, because we only need alloc space with length as len(columns), not len(tableInfo.Columns).

AddRecord(ctx context.Context, r []types.Datum) (recordID int64, err error)

// UpdateRecord updates a row in the table.
UpdateRecord(ctx context.Context, h int64, currData []types.Datum, newData []types.Datum, touched map[int]bool) error
// UpdateRecord updates a row which should contain only writable columns.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writable or public?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For AddRecord, it should be public. But for UpdateRecord, it should be write-only + public because we need fetch write-only fields back, and write them into tikv again.

func (t *Table) rebuildIndices(rm kv.RetrieverMutator, h int64, touched map[int]bool, oldData []types.Datum, newData []types.Datum) error {
for _, idx := range t.Indices() {
idxTouched := false
for _, idx := range t.WritableIndices() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bug fix?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously check the index is writable or not is done in buildIndexForRow. I moved that here for more easily to understand.

@@ -137,33 +159,35 @@ func (t *Table) Cols() []*table.Column {
if len(t.publicColumns) > 0 {
return t.publicColumns
}

t.publicColumns = make([]*table.Column, 0, len(t.Columns))
publicColumns := make([]*table.Column, len(t.Columns))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So line 159 will always be false?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just for the first time it will be false. When everytimes we load schemas from tikv, t.publicColumns = t.Cols() is called.


t.publicColumns = make([]*table.Column, 0, len(t.Columns))
publicColumns := make([]*table.Column, len(t.Columns))
maxOffset := -1
for _, col := range t.Columns {
if col.State == model.StatePublic {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The following code style is more like Go's style.

if col.State != model.StatePublix{
    continue
}

} else {
continue
}
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move line 92-103 behind line 118. Then you can move the judgment condition of changed.

}
newData[i] = v
} else {
continue
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This else is not needed.

// See https://dev.mysql.com/doc/refman/5.7/en/mysql-real-connect.html CLIENT_FOUND_ROWS
if ctx.GetSessionVars().ClientCapability&mysql.ClientFoundRows > 0 {
sc.AddAffectedRows(1)
}
return false, nil
}

var err error
if !newHandle.IsNull() {
// Fill values into on-update-now fields, only if they are really changed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any test case covers this?

Copy link
Contributor Author

@hicqu hicqu Jul 21, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test cases covering this are in tidb-test/mysqltest. If here the logic is wrong, this error will happen.

writable := make([]table.Index, 0, len(t.indices))
for _, index := range t.indices {
s := index.Meta().State
if s != model.StateNone && s != model.StateDeleteOnly && s != model.StateDeleteReorganization {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't need to consider the state of StateNone. The same as line 145.

var value types.Datum
if col.State == model.StateWriteOnly || col.State == model.StateWriteReorganization {
if col.State != model.StatePublic {
// if col is in write only or write reorganization state, we must add it with its default value.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/if/If.
This comment may need to be updated.

row = append(row, value)
}
if shouldWriteBinlog(ctx) && !t.canSkipUpdateBinlog(col, value) {
binlogColIDs = append(binlogColIDs, col.ID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the same colIDs?
I think that using the same colIDs and row can reduce memory and reduce encoding times.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because colIDs is for encoding values into row stored in Tikv, but binlogColIDs is used for binlog. They are differant. For example, if a column is PK handle, it won't be encoded into row for saving space, but it should appear in binlog.

@hicqu hicqu force-pushed the qupeng/column-offset branch from a486785 to 7665b13 Compare July 25, 2017 08:47
@zimulala
Copy link
Contributor

LGTM

@zimulala zimulala added the status/LGT1 Indicates that a PR has LGTM 1. label Jul 25, 2017
@hicqu hicqu force-pushed the qupeng/column-offset branch from 4443787 to 7665b13 Compare July 26, 2017 05:01
@hicqu
Copy link
Contributor Author

hicqu commented Jul 26, 2017

Now this PR has merged #3804 , and it can deal with the two state bug.

@zimulala
Copy link
Contributor

PTAL @tiancaiamao @shenli

@tiancaiamao
Copy link
Contributor

LGTM

@hicqu
Copy link
Contributor Author

hicqu commented Jul 27, 2017

@shenli , PTAL again, thanks.

@shenli
Copy link
Member

shenli commented Jul 28, 2017

LGTM

@zimulala zimulala added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Jul 28, 2017
@shenli shenli merged commit 597796e into master Jul 28, 2017
@shenli shenli deleted the qupeng/column-offset branch July 28, 2017 12:56
shenli added a commit that referenced this pull request Jul 28, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority/P1 The issue has P1 priority. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants