Skip to content
This repository has been archived by the owner on Dec 8, 2021. It is now read-only.

restore: check row value count to avoid unexpected encode result #528

Merged
merged 16 commits into from
Jan 6, 2021

Conversation

glorv
Copy link
Contributor

@glorv glorv commented Dec 18, 2020

What problem does this PR solve?

Check row field count before encodes, if row value count is bigger than table field count, directly return an error.

  • check column count in tidb encoder and return an error if column count doesn't match table column count
  • Be compatible with special field _tidb_rowid in getColumnNames and tidb encoder

What is changed and how it works?

Check List

Tests

  • Unit test
  • Integration test
  • Manual test (add detailed scripts or steps below)
  • No code

Side effects

Related changes

Release Note

  • Fix the bug that tidb backend will panics if source file columns are more than target table columns

@glorv glorv added the type/bug-fix Bug fix label Dec 18, 2020
@glorv
Copy link
Contributor Author

glorv commented Dec 18, 2020

/run-all-tests

1 similar comment
@glorv
Copy link
Contributor Author

glorv commented Dec 18, 2020

/run-all-tests

@lance6716
Copy link
Contributor

LGTM

@ti-srebot ti-srebot added the status/LGT1 One reviewer already commented LGTM (LGTM1) label Dec 21, 2020
Copy link
Collaborator

@kennytm kennytm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what bug does this PR fix?

lightning/restore/restore.go Outdated Show resolved Hide resolved
@glorv
Copy link
Contributor Author

glorv commented Dec 22, 2020

what bug does this PR fix?

We met a case last week that the source file columns are more than the table schema, and this caused tidb backend panic at

if err := enc.appendSQL(&encoded, &field, cols[enc.columnIdx[i]]); err != nil {
.

And for local/importer backend, whether it can result in correct result depends on the implementation of tidb table encoder, thus it is undefined behavior.

@glorv
Copy link
Contributor Author

glorv commented Dec 23, 2020

To close this issue: #518

Not really. #531 should fix one cause of the tikv timeout in checksum/analyze phase. There is no corresponding issue for this one. We found it in an oncall issue though the root cause of that issue is user misconfiged the table schema.

@kennytm
Copy link
Collaborator

kennytm commented Dec 23, 2020

We met a case last week that the source file columns are more than the table schema, and this caused tidb backend panic at

if err := enc.appendSQL(&encoded, &field, cols[enc.columnIdx[i]]); err != nil {
.

maybe easier to check len(enc.columnIdx) >= len(row) in this file and also enc.columnIdx[i] >= 0.

or maybe do it inside initializeColumns and throw the unknownCols error.

And for local/importer backend, whether it can result in correct result depends on the implementation of tidb table encoder

pretty sure they will just be ignored.


we also need to support "ignore_extra_columns" in Fast Bulk Import i.e. it is legal to have a CSV in the form

name,info,ignored,address
Foo,Foo2,X,1234567
Bar,Bar2,Y,987654321

with the column "ignored" skipped.

@glorv
Copy link
Contributor Author

glorv commented Dec 23, 2020

ok, we could delay resolve this issue to the implementation of "ignore_extra_columns" since this is an ergent thing.

@glorv
Copy link
Contributor Author

glorv commented Dec 23, 2020

With a CSV without header, or it's an invalid csv with the fields in some line more than in header, maybe we still should check it in the tidb encoder

@kennytm
Copy link
Collaborator

kennytm commented Dec 23, 2020

i mean we should still check it, but we should do it initializeColumns or the TiDB Encoder, not in the encodeLoop.

@glorv
Copy link
Contributor Author

glorv commented Dec 24, 2020

i mean we should still check it, but we should do it initializeColumns or the TiDB Encoder, not in the encodeLoop.

got. I'll move it into the tidb encoder since tableKvEncoder can properly handle it.

@glorv
Copy link
Contributor Author

glorv commented Dec 24, 2020

@kennytm I have moved the check into tidb encoder, and fix an extra bug related to _tidb_rowid, PTAL again

Copy link
Collaborator

@kennytm kennytm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest LGTM (and fix the test)

Comment on lines 245 to 247
} else {
return cols[index]
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
} else {
return cols[index]
}
return cols[index]

strange that it is not linted

// See: tests/generated_columns/data/gencol.various_types.0.sql this sql has no columns, so encodeLoop will fill the
// column permutation with default, thus enc.columnCnt > len(row).
if len(row) > enc.columnCnt {
log.L().Error("column count mismatch", zap.Ints("column_permutation", columnPermutation),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
log.L().Error("column count mismatch", zap.Ints("column_permutation", columnPermutation),
logger.Error("column count mismatch", zap.Ints("column_permutation", columnPermutation),

tests/tidb_rowid/run.sh Show resolved Hide resolved
@glorv
Copy link
Contributor Author

glorv commented Dec 25, 2020

@lance6716 PTAL again, since it has changed a log.

Copy link
Contributor

@lance6716 lance6716 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rest LGTM

lightning/backend/tidb.go Show resolved Hide resolved
@kennytm
Copy link
Collaborator

kennytm commented Dec 25, 2020

/run-all-tests

TiDB crashed?!

[2020-12-25T03:56:45.752Z] [mysql] 2020/12/25 11:56:45 packets.go:36: unexpected EOF

[2020-12-25T03:56:45.752Z] [mysql] 2020/12/25 11:56:45 connection.go:135: write tcp 127.0.0.1:27016->127.0.0.1:4000: write: broken pipe

[2020-12-25T03:56:48.274Z] [mysql] 2020/12/25 11:56:48 packets.go:122: closing bad idle connection: EOF

[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 connection.go:135: write tcp 127.0.0.1:27018->127.0.0.1:4000: write: broken pipe

[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 packets.go:122: closing bad idle connection: EOF

[2020-12-25T03:56:48.275Z] [mysql] 2020/12/25 11:56:48 connection.go:135: write tcp 127.0.0.1:27036->127.0.0.1:4000: write: broken pipe

[2020-12-25T03:56:48.275Z] Error: read checkpoint failed: begin transaction failed: dial tcp 127.0.0.1:4000: connect: connection refused

@glorv
Copy link
Contributor Author

glorv commented Dec 25, 2020

Still need to resolve the integration test tidb_rowid first.

@lance6716
Copy link
Contributor

(LGTM, bot has already count my lgtm)

@glorv
Copy link
Contributor Author

glorv commented Jan 6, 2021

@kennytm PTAL again

@kennytm
Copy link
Collaborator

kennytm commented Jan 6, 2021

/lgtm

@ti-srebot ti-srebot added status/LGT2 Two reviewers already commented LGTM, ready for merge (LGTM2) and removed status/LGT1 One reviewer already commented LGTM (LGTM1) labels Jan 6, 2021
@kennytm kennytm merged commit 1709cc7 into master Jan 6, 2021
@kennytm kennytm deleted the check-read-row branch January 6, 2021 08:09
glorv added a commit that referenced this pull request Jan 29, 2021
* check row value count to avoid unexpected encode result

* check the '_tidb_row_id' field

* resolve comments

* fix issue related to '_tidb_rowid' and move column count to tidb encoder

* add tidb_opt_write_row_id session var

* fix test

* resolve comments

* update tidb to apply tidb#22062

* fix test
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
status/LGT2 Two reviewers already commented LGTM, ready for merge (LGTM2) type/bug-fix Bug fix
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants