DM 6.5: resource exhausted #10429

Open
xc1989xc opened this issue Jan 6, 2024 · 6 comments
Labels
area/dm (Issues or PRs related to DM.), severity/moderate, type/bug (The issue is confirmed as a bug.)

Comments

xc1989xc commented Jan 6, 2024

What did you do?

Replicating MySQL to TiDB. The source instance has 400+ schemas and 25,000+ tables.

What did you expect to see?

No response

What did you see instead?

Starting the task fails with this error:
ERROR] [subtask.go:218] ["fail to initialize subtask"] [subtask=aaa] [error="[code=42501:class=ha:scope=internal:level=high], Message: fail to initialize unit Sync of subtask aaa: fail to do etcd txn operation: txn commit failed, RawCause: rpc error: code = ResourceExhausted desc = trying to send message larger than max (2157479 vs. 2097152), Workaround: Please check dm-master's node status and the network between this node and dm-master"]
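
To make the two numbers in the error concrete, here is a minimal Go sketch that pulls the attempted size and the limit out of the message text and reports the overage. The helper name `parseSizes` and the regular expression are illustrative assumptions, not part of DM or gRPC; the only inputs taken from this issue are the message string and its two byte counts.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// sizeRe matches the "(<attempted> vs. <limit>)" pair in the gRPC error text.
var sizeRe = regexp.MustCompile(`\((\d+) vs\. (\d+)\)`)

// parseSizes (hypothetical helper) extracts the attempted message size and
// the configured limit, both in bytes, from a ResourceExhausted message.
func parseSizes(msg string) (attempted, limit int64, err error) {
	m := sizeRe.FindStringSubmatch(msg)
	if m == nil {
		return 0, 0, fmt.Errorf("no size pair found in %q", msg)
	}
	if attempted, err = strconv.ParseInt(m[1], 10, 64); err != nil {
		return 0, 0, err
	}
	if limit, err = strconv.ParseInt(m[2], 10, 64); err != nil {
		return 0, 0, err
	}
	return attempted, limit, nil
}

func main() {
	msg := "rpc error: code = ResourceExhausted desc = " +
		"trying to send message larger than max (2157479 vs. 2097152)"
	attempted, limit, err := parseSizes(msg)
	if err != nil {
		panic(err)
	}
	// 2097152 bytes is exactly 2 * 1024 * 1024, i.e. 2 MiB.
	fmt.Printf("limit: %d bytes (%d MiB)\n", limit, limit/(1024*1024))
	fmt.Printf("attempted: %d bytes, over by %d bytes\n", attempted, attempted-limit)
}
```

The request is only 60327 bytes over a 2 MiB send limit, which suggests the size of the single etcd transaction grows with the number of tables rather than pointing at a network or node problem as the Workaround text implies.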

Versions of the cluster

Release Version: v6.5.0
Git Commit Hash: 9e91cff
Git Branch: heads/refs/tags/v6.5.0
UTC Build Time: 2022-12-23 08:44:26
Go Version: go version go1.19.3 linux/amd64
Failpoint Build: false

current status of DM cluster (execute query-status <task-name> in dmctl)

"stage": "Paused",
"unit": "InvalidUnit",
"result": {
    "isCanceled": false,
    "errors": [
        {
            "ErrCode": 42501,
            "ErrClass": "ha",
            "ErrScope": "internal",
            "ErrLevel": "high",
            "Message": "fail to initialize unit Sync of subtask aaa: fail to do etcd txn operation: txn commit failed",
            "RawCause": "rpc error: code = ResourceExhausted desc = trying to send message larger than max (2157479 vs. 2097152)",
            "Workaround": "Please check dm-master's node status and the network between this node and dm-master"
        }
    ],
    "detail": null

xc1989xc added the area/dm and type/bug labels Jan 6, 2024
Contributor

GMHDBJD commented Jan 8, 2024

Can you provide the task configuration?

Author

xc1989xc commented Jan 8, 2024

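The configuration is definitely fine. If the configuration were wrong, the check would not report this error.
My requirement is simply to migrate the entire instance.
Because this is incremental replication, I only configured filters:
filters:                  # binlog event filter rule set for matching tables on the upstream database instance
  filter-rule-1:          # rule name
    schema-pattern: "*"
    events: ["all"]       # which event types to match
    action: Do            # whether to replicate (Do) or ignore (Ignore) binlog events that match the rule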

Author

Is anyone still maintaining this project?

Author

// Add the call option grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(recvSize))
grpc.Dial(host, grpc.WithInsecure(), grpc.WithDefaultCallOptions(grpc.MaxCallRecvMsgSize(recvSize)))

For the error on the server side:

// Same idea: set the send and receive size limits
var options = []grpc.ServerOption{
	grpc.MaxRecvMsgSize(recvSize),
	grpc.MaxSendMsgSize(sendSize),
}
s := grpc.NewServer(options...)

// Initialize the etcd client
func (ec *EtcdCliV3) Init(cfg *EtcdCliConf) (err error) {
	// Fall back to the default dial timeout when none is configured
	dialTimeout := cfg.DialTimeout
	if dialTimeout == 0 {
		dialTimeout = DEFAULT_DIAL_TIMOUT
	}
	etcdConfig := clientv3.Config{
		Endpoints:   cfg.Endpoints,
		DialTimeout: dialTimeout,
		Username:    cfg.Username,
		Password:    cfg.Password,
		DialOptions: []grpc.DialOption{grpc.WithBlock()},
		// Raise the client-side send limit to 4 MiB
		MaxCallSendMsgSize: 4 * 1024 * 1024,
	}
	if ec.client, err = clientv3.New(etcdConfig); err != nil {
		return fmt.Errorf("init etcd cli fail, err: %w", err)
	}
	return nil
}
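
One caveat with raising only `MaxCallSendMsgSize`: a request that passes the client-side send limit can still be rejected by the etcd server's own request size limit. The sketch below is a hedged, standard-library-only illustration of checking a request size against both limits. The helper `checkRequestSize` and the constant names are hypothetical. The 1.5 MiB server value is etcd's documented default for `--max-request-bytes`; whether the etcd embedded in dm-master keeps that default is an assumption this thread does not confirm.

```go
package main

import "fmt"

const (
	// The 2097152-byte limit reported in the error in this issue (2 MiB).
	defaultClientSendLimit = 2 * 1024 * 1024
	// etcd's documented default for --max-request-bytes (1.5 MiB).
	// Assumption: dm-master's embedded etcd may be configured differently.
	defaultServerRequestLimit = 1536 * 1024
)

// checkRequestSize (hypothetical helper) returns nil only when a request of
// the given size, in bytes, fits under both the client and server limits.
func checkRequestSize(size, clientLimit, serverLimit int) error {
	if size > clientLimit {
		return fmt.Errorf("request of %d bytes exceeds client send limit of %d bytes", size, clientLimit)
	}
	if size > serverLimit {
		return fmt.Errorf("request of %d bytes exceeds server request limit of %d bytes", size, serverLimit)
	}
	return nil
}

func main() {
	const size = 2157479 // attempted message size from the error in this issue

	// With default limits, the client rejects the request first.
	fmt.Println(checkRequestSize(size, defaultClientSendLimit, defaultServerRequestLimit))
	// Raising only the client limit to 4 MiB moves the failure to the server.
	fmt.Println(checkRequestSize(size, 4*1024*1024, defaultServerRequestLimit))
	// Raising both limits lets the request through.
	fmt.Println(checkRequestSize(size, 4*1024*1024, 4*1024*1024))
}
```

Under these assumptions, a complete fix needs either both limits raised together or the transaction split into smaller pieces.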

Contributor

GMHDBJD commented Jan 15, 2024

We will fix it in v8.0 and cherry-pick the fix to v6.5.8.


fubinzh commented Jan 22, 2024

/severity moderate
