Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem in loop replication detection #592

Closed
ghost opened this issue Feb 25, 2021 · 6 comments
Closed

Problem in loop replication detection #592

ghost opened this issue Feb 25, 2021 · 6 comments
Milestone

Comments

@ghost
Copy link

ghost commented Feb 25, 2021

Case

A <--dtle --> B

  1. Prepare data on both sides
create schema if not exists a592;
create table a592.a (id int primary key auto_increment);
insert into a592.a values (1);
  1. run the 2 jobs

  2. exec on side A

set gtid_next = '11111111-1111-1111-1111-111111111111:1';
update a592.a set id = 111 where id = 1;
set gtid_next = 'automatic';
insert into a592.a values (1);
  1. The TX is replicated to B (which is normal)

  2. The TX is replicated to A again (which it should not), since the OSID is not A's uuid.

   [ERROR] client.driver_mgr.dtle: error at exec: driver=dtle @module=dtle.applier.ApplyBinlogEvent err="Error 1062: Duplicate entry '111' for key 'PRIMARY'" gtid=b0f42144-06cd-11e9-93ca-02000aba3e28:74 job=aa2 timestamp=2021-02-25T17:56:21.693+0800

We should check whether a tx has been executed based on target gtid_executed instead of uuid.

@ghost
Copy link
Author

ghost commented Feb 25, 2021

  • 已知事务的gtid有server_id:gno两部分.
  • 当A端执行的TX, 其gtid.server_id不是A的server_id时, 循环识别机制将误判

@ghost ghost added this to the next milestone Mar 11, 2021
@ghost ghost mentioned this issue Apr 30, 2021
@ghost ghost modified the milestones: next, 4.22.05.0 Apr 28, 2022
@ghost ghost modified the milestones: 4.22.05.0, next Jun 6, 2022
@ghost ghost modified the milestones: next, 4.22.07.0 Jun 24, 2022
@ghost
Copy link
Author

ghost commented Jun 28, 2022

问题重述:

  • 一个TX gtid有server_uuid:gno两部分
  • dtle在目标端mysql gtid_executed表上记录TX的server_uuid:gno
  • TODO

@ghost
Copy link
Author

ghost commented Jun 28, 2022

方案1: gtid_executed记录TX uuid和src MySQL uuid(如若两者不一致).

  • 对于源端uuid会发生变化(集群切换、SIP飘移)的情况无效

方案2A: 遇到有osid的TX,执行前通过select @@global.gtid_executed查询

  • 性能开销?
  • AliRDS支持?
  • 已知DBLE不支持

方案2B: 使用MySQL session_track_gtids功能追踪dst mysql的gtid_executed

  • 性能开销?
  • AliRDS支持?
  • 目测DBLE不支持

方案3: 拓展dtle源端和目标端的功能,使其支持单job双向复制。则可以通过
slave thread得知mysql的gtid_executed

@ghost
Copy link
Author

ghost commented Jul 1, 2022

目前倾向于方案3。需作较大调整:

  1. 整合job配置:所有配置写在src端,由src发给dst端
  2. 调整Extractor/Applier, 可在task中重复启动
  3. 对于双向复制,在Extractor/Applier间共享一些信息

TODO 考虑双向执行DDL的问题。

ghost pushed a commit that referenced this issue Jul 18, 2022
ghost pushed a commit that referenced this issue Jul 19, 2022
@ghost ghost modified the milestones: 4.22.07.0, 4.22.08.0 Aug 22, 2022
@ghost
Copy link
Author

ghost commented Sep 9, 2022

方案4: 提供选项, 执行set gtid_next, 使源端目标端gtid一致.

这需要REPLICATION_APPLIER权限(8.0) 或者 SUPER 权限 (5.7)

当gtid_next包含于gtid_executed中时, MySQL会忽略该TX的执行.

@ghost
Copy link
Author

ghost commented Sep 13, 2022

问题4.1: set gtid_next vs DDL (implicit commit)

mysql> set gtid_next = 'acd7d195-06cd-11e9-928f-02000aba3e00:1';
Query OK, 0 rows affected (0.01 sec)

mysql> create table a.a1 (id int primary key);
Query OK, 0 rows affected (0.07 sec)

mysql> set gtid_next = 'acd7d195-06cd-11e9-928f-02000aba3e00:2';
Query OK, 0 rows affected (0.00 sec)

mysql> begin;
Query OK, 0 rows affected (0.01 sec)

mysql> create table a.a2 (id int primary key);
ERROR 1778 (HY000): Cannot execute statements with implicit commit inside a transaction when @@SESSION.GTID_NEXT == 'UUID:NUMBER'.

问题4.2: 多job复制不同表

tx

  • insert tb1

  • insert tb2

  • job1先复制tx.tb1

  • job2后复制tx.tb2

则job2复制tx时, 目标端MySQL认为tx已经执行过, 会跳过执行.

ghost pushed a commit that referenced this issue Sep 14, 2022
ghost pushed a commit that referenced this issue Sep 14, 2022
ghost pushed a commit that referenced this issue Oct 27, 2022
ghost pushed a commit that referenced this issue Oct 27, 2022
@ghost ghost closed this as completed Nov 10, 2022
ghost pushed a commit that referenced this issue Dec 12, 2022
ghost pushed a commit that referenced this issue Dec 12, 2022
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

0 participants