Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dm: CompareGTID uses an extremely slow algorithm to check emptiness and slowed down replication speed #9676

Closed
kennytm opened this issue Sep 5, 2023 · 5 comments · Fixed by #9829 or #9944

Comments

@kennytm
Copy link
Contributor

kennytm commented Sep 5, 2023

What did you do?

  1. Set upstream to a MySQL cluster with 23 servers.
  2. Set the source with GTID enabled (the GTID set should be around 1 KiB long).

What did you expect to see?

Replication works normally and quickly.

What did you see instead?

CompareGTID shows up significantly in the worker CPU profile and blocking the main thread, significantly slowing down replication speed

1-fs8

Versions of the cluster

DM version (run dmctl -V or dm-worker -V or dm-master -V):

6.5.1

Upstream MySQL/MariaDB server version:

5.7.29-32-log

Downstream TiDB cluster version (execute SELECT tidb_version(); in a MySQL client):

6.5.3

How did you deploy DM: tiup or manually?

k8s (operator)

Other interesting information (system version, hardware config, etc):

>
>

current status of DM cluster (execute query-status <task-name> in dmctl)

(paste current status of DM cluster here)
@kennytm kennytm added type/bug The issue is confirmed as a bug. area/dm Issues or PRs related to DM. labels Sep 5, 2023
@knull-cn
Copy link
Contributor

knull-cn commented Sep 5, 2023

the MysqlGTIDSet.String() just to check the empty.
maybe we can check it at parseBinlogLocation: if data of gtid was not empty and parse was not error, it was not empty. otherwise , it was empty.

@kennytm
Copy link
Contributor Author

kennytm commented Sep 5, 2023

(Internal reference: https://internal.pingcap.net/jira/browse/ONCALL-4846)

The issue, as highlighted in the flamegraph, is that CompareGTID uses .String() to check if the GTIDSet is empty

func CompareGTID(gSet1, gSet2 gmysql.GTIDSet) (int, bool) {
gSetIsEmpty1 := gSet1 == nil || len(gSet1.String()) == 0
gSetIsEmpty2 := gSet2 == nil || len(gSet2.String()) == 0

so everytime we CompareGTID we need to reconstruct a 1 KiB string just to check if the length is zero. That's a total waste of resource. Granted, the GTIDSet interface does not have an IsEmpty() method. But it does have an Equal() interface and can be done like this instead:

var (
 	emptyMySQLGTIDSet, _   = gmysql.ParseMysqlGTIDSet("")
 	emptyMariaDBGTIDSet, _ = gmysql.ParseMariadbGTID("")
)

func CompareGTID(gSet1, gSet2 gmysql.GTIDSet) (int, bool) { 
 	gSetIsEmpty1 := gSet1 == nil || gSet1.Equal(emptyMySQLGTIDSet) || gSet1.Equal(emptyMariaDBGTIDSet)
 	gSetIsEmpty2 := gSet2 == nil || gSet2.Equal(emptyMySQLGTIDSet) || gSet2.Equal(emptyMariaDBGTIDSet)
)

ideally go-mysql should include an IsEmpty() method to prevent the combinatorical explosion but this alone can eliminate all those unnecessary .String() calls.

(This similar pattern also appears as location.GTIDSetStr() == "" in some other files)

@fubinzh
Copy link

fubinzh commented Sep 8, 2023

/severity major

@seiya-annie
Copy link

/found gs

@ti-chi-bot ti-chi-bot bot added the found/gs label Sep 20, 2023
@kennytm
Copy link
Contributor Author

kennytm commented Sep 28, 2023

Since all devs seems to be busy with other stuff I've asked @feran-morgan-pingcap to try to deliver a PR by Oct 7th.

cc @GMHDBJD

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment