lightning parallel import single table with duplication data fail when duplicate-resolution = record #39476

Closed
seiya-annie opened this issue Nov 30, 2022 · 2 comments · Fixed by #39571
Labels
component/lightning This issue is related to Lightning of TiDB. severity/major type/bug The issue is confirmed as a bug.

Comments

@seiya-annie

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Launch two Lightning instances to run a parallel local-backend import into a single table whose source data contains duplicates, with duplicate-resolution = record
  2. Lightning should succeed and notify users that the data contains duplicates

2. What did you expect to see? (Required)

  1. Lightning import should succeed (crc32 check skipped, as the data might be inconsistent)
    1. The duplicate row count check should succeed

3. What did you see instead (Required)

[2022/11/29 18:43:32.126 +00:00] [ERROR] [main.go:103] ["tidb lightning encountered error stack info"] [error="failed to record conflict errors: [xeval:8221]invalid key - "7480000000000002b55f6980000000000000010419a6340000000000013138343939323234ff0000000000000000f7016c69505a00000000fb""] [errorVerbose="[xeval:8221]invalid key - "7480000000000002b55f6980000000000000010419a6340000000000013138343939323234ff0000000000000000f7016c69505a00000000fb"\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20220729040631-518f63d66278/normalize.go:155\ngithub.com/pingcap/tidb/tablecodec.DecodeRowKey\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/tablecodec/tablecodec.go:283\ngithub.com/pingcap/tidb/br/pkg/lightning/backend/kv.(*TableKVDecoder).DecodeHandleFromRowKey\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/kv/kv2sql.go:42\ngithub.com/pingcap/tidb/br/pkg/lightning/backend/local.(*DuplicateManager).RecordDataConflictError\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/duplicate.go:444\ngithub.com/pingcap/tidb/br/pkg/lightning/backend/local.(*DuplicateManager).processRemoteDupTaskOnce.func1.1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/duplicate.go:780\ngithub.com/pingcap/tidb/br/pkg/lightning/backend/local.(*DuplicateManager).processRemoteDupTaskOnce.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/lightning/backend/local/duplicate.go:788\ngithub.com/pingcap/tidb/br/pkg/utils.(*WorkerPool).Apply.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/br/br/pkg/utils/worker.go:58\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1594\nfailed to record conflict errors"]

4. What is your TiDB version? (Required)

[2022/11/29 18:41:59.964 +00:00] [INFO] [info.go:49] ["Welcome to TiDB-Lightning"] [release-version=v6.5.0-alpha] [git-hash=9689b4763d2705f1dc1308b4e4bec257e71d391a] [git-branch=heads/refs/tags/v6.5.0-alpha] [go-version=go1.19.3] [utc-build-time="2022-11-29 11:09:16"] [race-enabled=false]

@seiya-annie seiya-annie added type/bug The issue is confirmed as a bug. component/lightning This issue is related to Lightning of TiDB. labels Nov 30, 2022
@seiya-annie
Author

lightning.log

@dsdashun
Contributor

dsdashun commented Dec 2, 2022

Here's the root cause: the key is an index key, but the duplicate-detection logic treated it as a row (record) key. The relevant logic is:

if task.indexInfo == nil {
	err = m.RecordDataConflictError(ctx, stream)
} else {
	err = m.RecordIndexConflictError(ctx, stream, task.tableID, task.indexInfo)
}
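
The key in the error message can be checked by hand. TiDB encodes row keys as t{tableID}_r{handle} and index keys as t{tableID}_i{indexID} followed by the indexed values, and the reported key starts with 74 (the byte 't'), an 8-byte table ID, then 5f69 ("_i"). Below is a minimal Go sketch of that check. classifyKey, keyKind, and sampleKeyPrefix are hypothetical names used only for illustration; they are not part of Lightning or tablecodec.

```go
package main

import (
	"encoding/binary"
	"encoding/hex"
	"errors"
	"fmt"
)

// keyKind is a hypothetical label for the two record families in a table key.
type keyKind string

const (
	rowKey   keyKind = "row"
	indexKey keyKind = "index"
)

// sampleKeyPrefix is the first 11 bytes of the key from the error message.
const sampleKeyPrefix = "74" + // 't'
	"80000000000002b5" + // table ID, big-endian with the sign bit flipped
	"5f69" // "_i"

// classifyKey inspects the fixed prefix of a hex-encoded key:
// 't' + 8-byte table ID + "_r" (row) or "_i" (index).
func classifyKey(hexKey string) (int64, keyKind, error) {
	raw, err := hex.DecodeString(hexKey)
	if err != nil {
		return 0, "", err
	}
	if len(raw) < 11 || raw[0] != 't' {
		return 0, "", errors.New("not a table key")
	}
	// Undo the sign-bit flip that makes encoded integers sort bytewise.
	tableID := int64(binary.BigEndian.Uint64(raw[1:9]) ^ (1 << 63))
	switch string(raw[9:11]) {
	case "_r":
		return tableID, rowKey, nil
	case "_i":
		return tableID, indexKey, nil
	}
	return 0, "", errors.New("unknown record prefix")
}

func main() {
	tableID, kind, err := classifyKey(sampleKeyPrefix)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("table %d, %s key\n", tableID, kind)
}
```

A key with the "_i" prefix cannot be decoded as a row key, which is consistent with DecodeRowKey reporting "invalid key" in the stack trace.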

The task for this key should have carried an indexInfo and taken the else branch, but it actually took the RecordDataConflictError branch.
When the task structs are built in (*DuplicateManager) buildDupTasks(), the operation of appending a task is now abstracted into a single function:
putToTaskFunc := func(ranges []tidbkv.KeyRange) {

When appending an index duplicate-detection task, the function should set indexInfo, but the abstracted function does not:
tasks = append(tasks, dupTask{
	KeyRange: r,
	tableID:  tid,
})
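
One way the helper could carry the index information is sketched below in Go. This is an illustration under assumptions, not necessarily the change made in the linked fix: keyRange, indexInfo, indexRanges, and recorderFor are simplified stand-ins invented for the example, and only the dupTask field names follow the snippets above.

```go
package main

import "fmt"

// Simplified stand-ins for the real TiDB and Lightning types (assumptions).
type keyRange struct{ start, end string }
type indexInfo struct{ name string }

// dupTask mirrors the shape quoted above; a nil indexInfo means a row-data task.
type dupTask struct {
	keyRange
	tableID   int64
	indexInfo *indexInfo
}

// indexRanges pairs an index with the key ranges to scan for it (hypothetical).
type indexRanges struct {
	info   *indexInfo
	ranges []keyRange
}

// buildDupTasks passes the index into the shared helper explicitly,
// so an index task can never be appended without its indexInfo.
func buildDupTasks(tableID int64, rowRanges []keyRange, indexes []indexRanges) []dupTask {
	var tasks []dupTask
	putToTask := func(ranges []keyRange, idx *indexInfo) {
		for _, r := range ranges {
			tasks = append(tasks, dupTask{keyRange: r, tableID: tableID, indexInfo: idx})
		}
	}
	putToTask(rowRanges, nil) // row data: no index
	for _, ix := range indexes {
		putToTask(ix.ranges, ix.info) // index data: keep the index
	}
	return tasks
}

// recorderFor names the branch the dispatch quoted earlier would take.
func recorderFor(t dupTask) string {
	if t.indexInfo == nil {
		return "RecordDataConflictError"
	}
	return "RecordIndexConflictError"
}

func main() {
	tasks := buildDupTasks(693,
		[]keyRange{{"r0", "r9"}},
		[]indexRanges{{info: &indexInfo{name: "idx_a"}, ranges: []keyRange{{"i0", "i9"}}}},
	)
	for _, t := range tasks {
		fmt.Println(t.start, t.end, recorderFor(t))
	}
}
```

Making the index a required argument of the helper, rather than a field the caller sets afterwards, means the omission described above would show up as a compile error at the call site.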
