IMPORT INTO with global sort fails under fault injection, such as killing the DDL owner or the PD leader #48702

Open
Lily2025 opened this issue Nov 20, 2023 · 4 comments
Labels: component/ddl (This issue is related to DDL of TiDB.), severity/moderate, type/bug (The issue is confirmed as a bug.)

Comments

@Lily2025

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

1. Run IMPORT INTO.
2. Inject faults while IMPORT INTO is running, such as killing the DDL owner or the PD leader.

2. What did you expect to see? (Required)

IMPORT INTO should succeed.

3. What did you see instead (Required)

IMPORT INTO failed after faults were injected, such as killing the DDL owner or the PD leader.

the status of import job is not finished or running (now: 2023-11-18 20:15:22, jobId: 1, step: importing, status: failed)
operatorLogs:
[2023-11-18 19:14:51] ###### start import into
import into user_data1 from 's3://qe-testing/brie/lightning-mhy-100G-csv/sysbench.user_data1.000001*.csv?access-key=xxx&secret-access-key=xxx&endpoint=xxx&force-path-style=false&region=xxx&provider=xxx' WITH DETACHED,thread=8,skip_rows=1
[2023-11-18 19:14:51] ###### wait for import job to finish
[2023-11-18 20:15:22] ###### wait for import job to finish failed
select id, step, status from mysql.tidb_import_jobs where start_time >= '2023-11-18 19:14:51'
jobId: 1, step: importing, status: failed
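
The wait step in the operator logs above can be sketched roughly as follows. This is a minimal illustration, not the actual test harness: `wait_for_import_job` and `fetch_status` are hypothetical names, `fetch_status` stands in for running the `select id, step, status from mysql.tidb_import_jobs ...` query shown above, and the set of terminal status strings is an assumption that should be checked against your TiDB version.

```python
import time
from typing import Callable, Tuple

# Assumed status values; only "failed" and "importing" appear in the report above.
TERMINAL_OK = {"finished"}
TERMINAL_BAD = {"failed", "cancelled", "canceled"}


def wait_for_import_job(
    fetch_status: Callable[[], Tuple[str, str]],
    timeout_s: float,
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Tuple[bool, str, str]:
    """Poll a job until it reaches a terminal status or the timeout expires.

    fetch_status is a hypothetical callback returning (step, status), for
    example by querying mysql.tidb_import_jobs. Returns (ok, step, status),
    where ok is True only if the job finished successfully.
    """
    deadline = clock() + timeout_s
    while True:
        step, status = fetch_status()
        if status in TERMINAL_OK:
            return True, step, status
        if status in TERMINAL_BAD:
            return False, step, status
        if clock() >= deadline:
            # Timed out while the job was still pending or running.
            return False, step, status
        sleep(poll_interval_s)
```

Passing `sleep` and `clock` as parameters keeps the loop testable without real delays.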

4. What is your TiDB version? (Required)

git hash:23314d9d402e545dae9df57606f0ead3e67e0cd7

@Lily2025 Lily2025 added the type/bug The issue is confirmed as a bug. label Nov 20, 2023
@Lily2025
Author

/type bug
/severity major
/assign ywqzzy

@ywqzzy
Contributor

ywqzzy commented Nov 20, 2023

[2023/11/18 21:14:03.658 +08:00] [WARN] [s3.go:890] ["open new s3 reader failed"] [file=30001/30007/data/2a79c3c8-5334-4c1d-ad3c-57b855ae5590/19] [error="RequestCanceled: request context canceled\ncaused by: context canceled"]
[2023/11/18 21:14:03.658 +08:00] [WARN] [byte_reader.go:289] ["other error during read"] [error="context canceled"]
[2023/11/18 21:14:03.713 +08:00] [WARN] [s3.go:890] ["open new s3 reader failed"] [file=30001/30007/data/ca358151-5dd7-4008-a9d0-9b7569e38431/32] [error="RequestCanceled: request context canceled\ncaused by: context canceled"]
[2023/11/18 21:14:03.713 +08:00] [WARN] [byte_reader.go:289] ["other error during read"] [error="context canceled"]
[2023/11/18 21:14:03.739 +08:00] [WARN] [s3.go:890] ["open new s3 reader failed"] [file=30001/30007/data/2a79c3c8-5334-4c1d-ad3c-57b855ae5590/44] [error="RequestCanceled: request context canceled\ncaused by: context canceled"]
[2023/11/18 21:14:03.739 +08:00] [WARN] [byte_reader.go:289] ["other error during read"] [error="context canceled"]
[2023/11/18 21:14:03.741 +08:00] [WARN] [s3.go:890] ["open new s3 reader failed"] [file=30001/30007/data/66b58960-2b9e-4b57-bec1-ccf1cafa6225/32] [error="RequestCanceled: request context canceled\ncaused by: context canceled"]
[2023/11/18 21:14:03.741 +08:00] [WARN] [byte_reader.go:289] ["other error during read"] [error="context canceled"]
[2023/11/18 21:14:03.742 +08:00] [ERROR] [local.go:1704] ["do import meets error"] [error="SlowDown: A timeout exceeded while waiting to proceed with the request, please reduce your request rate\n\tstatus code: 503, request id: 1798B9C9CC6C5146, host id: "]
[2023/11/18 21:14:03.742 +08:00] [ERROR] [scheduler.go:357] ["run subtask failed"] [type=ImportInto] [task-id=30001] [step=write&ingest] [subtask-id=60002] [kv-group=data] [takeTime=22m40.922847228s] [error="SlowDown: A timeout exceeded while waiting to proceed with the request, please reduce your request rate\n\tstatus code: 503, request id: 1798B9C9CC6C5146, host id: "]
[2023/11/18 21:14:03.742 +08:00] [ERROR] [scheduler.go:500] [onError] [task-id=30001] [error="SlowDown: A timeout exceeded while waiting to proceed with the request, please reduce your request rate\n\tstatus code: 503, request id: 1798B9C9CC6C5146, host id: "] [stack="github.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*BaseScheduler).onError\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/scheduler.go:500\ngithub.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*BaseScheduler).runSubtask\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/scheduler.go:299\ngithub.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*BaseScheduler).run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/scheduler.go:279\ngithub.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*BaseScheduler).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/scheduler.go:138\ngithub.com/pingcap/tidb/pkg/disttask/importinto.(*importScheduler).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/importinto/scheduler.go:456\ngithub.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*Manager).onRunnableTask\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/manager.go:391\ngithub.com/pingcap/tidb/pkg/disttask/framework/scheduler.(*Manager).onRunnableTasks.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/disttask/framework/scheduler/manager.go:217\ngithub.com/pingcap/tidb/pkg/resourcemanager/pool/spool.(*Pool).run.func1\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/pkg/resourcemanager/pool/spool/spool.go:145"]
[2023/11/18 21:14:03.742 +08:00] [ERROR] [scheduler.go:506] ["scheduler met first error"] [task-id=30001] [error="SlowDown: A timeout exceeded while waiting to proceed with the request, please reduce your request rate\n\tstatus code: 503, request id: 1798B9C9CC6C5146, host id: "]
[2023/11/18 21:14:03.742 +08:00] [WARN] [scheduler.go:622] ["subtask canceled"] [task-id=30001] [error="SlowDown: A timeout exceeded while waiting to proceed with the request, please reduce your request rate\n\tstatus code: 503, request id: 1798B9C9CC6C5146, host id: "]
[2023/11/18 21:14:03.747 +08:00] [INFO] [scheduler.go:406] ["cleanup subtask env"] [type=ImportInto] [task-id=30001] [step=write&ingest]

With global sort, opening S3 readers failed many times, which then triggered the rate limit of the object store.
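
For illustration only, a common way to avoid hammering a rate-limited object store is exponential backoff with jitter on `SlowDown` responses. The sketch below is not TiDB's retry logic; `SlowDownError`, `call_with_backoff`, and all parameter defaults are hypothetical names and values chosen for the example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class SlowDownError(Exception):
    """Hypothetical stand-in for an object-store 503 SlowDown response."""


def call_with_backoff(
    op: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 0.1,
    max_delay_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    """Run op, retrying on SlowDownError with exponential backoff and jitter.

    The delay ceiling doubles on each attempt, capped at max_delay_s, and the
    actual delay is a random fraction of that ceiling so that concurrent
    readers do not retry in lockstep. The last failure is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return op()
        except SlowDownError:
            if attempt == max_attempts - 1:
                raise  # Retries exhausted: surface the error to the caller.
            ceiling = min(max_delay_s, base_delay_s * 2 ** attempt)
            sleep(rng() * ceiling)
    raise ValueError("max_attempts must be at least 1")
```

Once the attempts are exhausted the error still propagates, which matches the behavior seen in the logs above: the subtask fails after retrying.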

@D3Hunter
Contributor

D3Hunter commented Nov 20, 2023

The task failed because of a MinIO issue, and it failed only after retrying. This is expected behavior and shouldn't be treated as a bug.

So closing it.

@D3Hunter D3Hunter removed may-affects-5.3 This bug maybe affects 5.3.x versions. may-affects-5.4 This bug maybe affects 5.4.x versions. may-affects-6.1 may-affects-6.5 may-affects-7.1 may-affects-7.5 labels Nov 20, 2023
@D3Hunter D3Hunter reopened this Nov 27, 2023
@Lily2025
Author

/remove-severity major
/severity major moderate

@aytrack aytrack added the component/ddl This issue is related to DDL of TiDB. label Nov 27, 2023