BUG#34555045 - join rejected in group bootstrapped with paxos single leader but runtime value 0

Problem
-------------------
`performance_schema.replication_group_communication_information.
WRITE_CONSENSUS_SINGLE_LEADER_CAPABLE` reflects the runtime value of
the Paxos Single Leader setup in a group. Its main purpose is to tell
users which value `group_replication_paxos_single_leader` must have
on joining members.

A group that was bootstrapped with single leader enabled, but whose
protocol version was later downgraded to one that does not support it,
reports WRITE_CONSENSUS_SINGLE_LEADER_CAPABLE=0, as expected. However,
attempting to join an instance to the group using
group_replication_paxos_single_leader=0 fails.

Analysis and Fix
-------------------

To fix this, we change the behaviour so that the value of
`group_replication_paxos_single_leader` must be consistent with
the communication protocol version that the group is running.

`group_replication_paxos_single_leader` was introduced in 8.0.27;
earlier versions neither know nor use it. As such, we
enforce the following rules:

- When a member of version >= 8.0.27 tries to join a group that is
  running a communication protocol < 8.0.27, the join must fail with
  an error stating that `group_replication_paxos_single_leader` must
  be OFF before joining the group
- When `set_communication_protocol` is run with a target version
  < 8.0.27 on a member of version >= 8.0.27, the UDF must fail with an
  error if `group_replication_paxos_single_leader` is not OFF
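
The two rules above can be modelled in a few lines of Python. This is an
illustrative sketch only, not the plugin's code: the function names, the
tuple-based version comparison and the boolean flag are assumptions made
for the example. It covers only the two rules listed here and ignores any
other checks performed when a member joins.

```python
# Illustrative model; every name here is hypothetical, not from the plugin.
SINGLE_LEADER_MIN_VERSION = (8, 0, 27)  # release that introduced the option


def parse_version(text):
    """Turn a string such as "8.0.26" into a comparable tuple (8, 0, 26)."""
    return tuple(int(part) for part in text.split("."))


def supports_single_leader(version):
    """True when the given version knows group_replication_paxos_single_leader."""
    return parse_version(version) >= SINGLE_LEADER_MIN_VERSION


def can_join(group_protocol, member_version, single_leader_on):
    """Rule 1: a >= 8.0.27 member joining a < 8.0.27 group needs the option OFF."""
    if supports_single_leader(member_version) and not supports_single_leader(group_protocol):
        return not single_leader_on
    return True


def can_set_protocol(target_protocol, member_version, single_leader_on):
    """Rule 2: choosing a protocol below 8.0.27 requires the option OFF."""
    if supports_single_leader(member_version) and not supports_single_leader(target_protocol):
        return not single_leader_on
    return True


# Mirrors the scenarios in the new test: protocol 8.0.26 with the option ON
# is rejected, while the same operation with the option OFF is accepted.
print(can_join("8.0.26", "8.0.32", single_leader_on=True))
print(can_join("8.0.26", "8.0.32", single_leader_on=False))
print(can_set_protocol("8.0.26", "8.0.32", single_leader_on=True))
```

Comparing versions as integer tuples avoids the pitfalls of comparing
version strings lexicographically (for example "8.0.9" against "8.0.27").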

This fix also changes the value that we check to decide whether we are
allowed to change the group leader after running
`set_communication_protocol`. Until now, we looked at the runtime value
of `group_replication_paxos_single_leader`. This is not
correct since, as per the WL design, this value only takes effect
after a group reboot. As such, when we run `set_communication_protocol`,
we now use the value shown in
`performance_schema.replication_group_communication_information.WRITE_CONSENSUS_SINGLE_LEADER_CAPABLE`.
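
A minimal sketch of the distinction described above, assuming a member
tracks two separate values: the one configured through the system variable
and the one the running group actually uses. The class, field and function
names are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class MemberState:
    # Hypothetical field names chosen for this sketch.
    configured_single_leader: bool  # current value of the system variable
    runtime_single_leader: bool     # value the running group uses, as shown by
                                    # WRITE_CONSENSUS_SINGLE_LEADER_CAPABLE


def may_change_group_leader(state, protocol_supports_single_leader):
    """Decide from the value in effect in the group, not the configured one."""
    return protocol_supports_single_leader and state.runtime_single_leader


# The variable was switched to ON after the group started with OFF. The change
# only takes effect after a group reboot, so the group still behaves as OFF.
state = MemberState(configured_single_leader=True, runtime_single_leader=False)
print(may_change_group_leader(state, protocol_supports_single_leader=True))
```

Keeping the two values apart makes it explicit that changing the variable
on a running member does not alter how the group currently behaves.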

Change-Id: I9a73ec0f58ddfdf36694287a2ece0c2524e2c2da
tiagoportelajorge committed Nov 11, 2022
1 parent e93768b commit 6e71b67
Showing 16 changed files with 604 additions and 13 deletions.
@@ -40,7 +40,17 @@ include/assert.inc [protocol_version (8.0.27) should be 8.0.27]
########################################################################
# 5) Switch to single-primary mode. Switch communication_protocol to
# 8.0.21. Confirm everyone is a preferred consensus leader even in
# single-primary
# single-primary. Must stop and start the group with
# group_replication_paxos_single_leader=OFF
[connection server2]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server1]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_and_bootstrap_group_replication.inc
[connection server2]
include/start_group_replication.inc
[connection server1]
SELECT group_replication_switch_to_single_primary_mode();
group_replication_switch_to_single_primary_mode()
@@ -53,5 +63,8 @@ include/assert_grep.inc [There is no warning about a member joining the group wh
include/assert_grep.inc [There is no warning about a member joining the group while a group configuration operation is occurring]
########################################################################
# 6) Cleanup.
[connection server2]
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server1]
SET GLOBAL group_replication_paxos_single_leader = "ON";
include/group_replication_end.inc
@@ -87,9 +87,24 @@ include/assert.inc [The third server's UUID should be the only UUID in WRITE_CON
include/assert.inc [The third server's UUID should be the only UUID in WRITE_CONSENSUS_LEADERS_ACTUAL]

####
# 2) Change the communication protocol to 8.0.21.
# 2) Change the communication protocol to 8.0.21. Must restart the group.
####

[connection server3]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server2]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server1]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_and_bootstrap_group_replication.inc
[connection server2]
include/start_group_replication.inc
[connection server3]
include/start_group_replication.inc
[connection server2]
SELECT group_replication_set_communication_protocol("8.0.21");
group_replication_set_communication_protocol("8.0.21")
The operation group_replication_set_communication_protocol completed successfully
@@ -155,6 +170,12 @@ include/assert.inc [All members should be in WRITE_CONSENSUS_LEADERS_ACTUAL]
# 4) Cleanup.
####

[connection server3]
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server2]
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server1]
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server1]
include/stop_group_replication.inc
include/group_replication_end.inc
@@ -0,0 +1,99 @@

####
# 0) The test requires three servers.
####

include/group_replication.inc [rpl_server_count=3]
Warnings:
Note #### Sending passwords in plain text without SSL/TLS is extremely insecure.
Note #### Storing MySQL user name or password information in the connection metadata repository is not secure and is therefore not recommended. Please consider using the USER and PASSWORD connection options for START REPLICA; see the 'START REPLICA Syntax' in the MySQL Manual for more information.
[connection server1]
[connection server3]
SET SESSION sql_log_bin = 0;
call mtr.add_suppression("This member is configured with a group_replication_paxos_single_leader option value of 1 and it is trying to join a group with Communication Protocol Version below 8.0.27.*");
call mtr.add_suppression("This member is configured with a group_replication_paxos_single_leader option value of.*");
SET SESSION sql_log_bin = 1;
##
# 1) Start the first two servers with PAXOS Single Leader = 0
# and change the protocol version to 8.0.26
##
[connection server1]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_and_bootstrap_group_replication.inc
[connection server2]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_group_replication.inc
SELECT group_replication_set_communication_protocol("8.0.26");
group_replication_set_communication_protocol("8.0.26")
The operation group_replication_set_communication_protocol completed successfully
##
# 2) Join the third server with PAXOS Single Leader = 1 and it must fail
##
[connection server3]
SET GLOBAL group_replication_paxos_single_leader = "ON";
SET GLOBAL group_replication_group_name= "GROUP_REPLICATION_GROUP_NAME";
START GROUP_REPLICATION;
ERROR HY000: The server is not configured properly to be an active member of the group. Please see more details on error log.
##
# 3) Change PAXOS Single Leader to 0 and it must join the group
##
[connection server3]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_group_replication.inc
[connection server2]
##
# 4) Change the protocol version to 8.0.32. It must be successful
##
SELECT group_replication_set_communication_protocol("8.0.32");
group_replication_set_communication_protocol("8.0.32")
The operation group_replication_set_communication_protocol completed successfully
##
# 5) Change PAXOS Single Leader to 1 on node 3. Stop and Start and it
# must fail.
##
[connection server3]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
SET GLOBAL group_replication_group_name= "GROUP_REPLICATION_GROUP_NAME";
START GROUP_REPLICATION;
ERROR HY000: The server is not configured properly to be an active member of the group. Please see more details on error log.
##
# 6) Change PAXOS Single Leader to 0 on node 3. Stop and Start and it
# must be successful.
##
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_group_replication.inc
##
# 7) Stop the whole group, change PAXOS Single Leader to 1 and start
# the group.
##
[connection server3]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server2]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server1]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
include/start_and_bootstrap_group_replication.inc
[connection server2]
include/start_group_replication.inc
[connection server3]
include/start_group_replication.inc
##
# 8) Try to change the protocol version to 8.0.26. It must fail.
##
[connection server1]
SELECT group_replication_set_communication_protocol("8.0.26");
ERROR HY000: The function 'group_replication_set_communication_protocol' failed. group_replication_paxos_single_leader must be OFF when choosing a version lower than 8.0.27.
##
# 10) Cleanup.
##
[connection server3]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server2]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server1]
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/group_replication_end.inc
@@ -48,6 +48,20 @@ include/assert.inc [group_replication_paxos_single_leader must be enabled]
include/assert.inc [group_replication_paxos_single_leader must be enabled]
[connection server3]
include/assert.inc [group_replication_paxos_single_leader must be enabled]
[connection server3]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server2]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
[connection server1]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "OFF";
include/start_and_bootstrap_group_replication.inc
[connection server2]
include/start_group_replication.inc
[connection server3]
include/start_group_replication.inc
SELECT group_replication_set_communication_protocol("8.0.21");
group_replication_set_communication_protocol("8.0.21")
The operation group_replication_set_communication_protocol completed successfully
@@ -69,6 +83,20 @@ include/assert.inc [group_replication_paxos_single_leader must be disabled]
SELECT group_replication_set_communication_protocol("8.0.27");
group_replication_set_communication_protocol("8.0.27")
The operation group_replication_set_communication_protocol completed successfully
[connection server3]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server2]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
[connection server1]
include/stop_group_replication.inc
SET GLOBAL group_replication_paxos_single_leader = "ON";
include/start_and_bootstrap_group_replication.inc
[connection server2]
include/start_group_replication.inc
[connection server3]
include/start_group_replication.inc
[connection server1]
include/assert.inc [group_replication_paxos_single_leader must be enabled]
[connection server2]
@@ -128,7 +128,28 @@ SELECT group_replication_switch_to_multi_primary_mode();
--echo ########################################################################
--echo # 5) Switch to single-primary mode. Switch communication_protocol to
--echo # 8.0.21. Confirm everyone is a preferred consensus leader even in
--echo # single-primary
--echo # single-primary. Must stop and start the group with
--echo # group_replication_paxos_single_leader=OFF

--let $rpl_connection_name= server2
--source include/rpl_connection.inc

--source include/stop_group_replication.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "OFF"

--let $rpl_connection_name= server1
--source include/rpl_connection.inc

--source include/stop_group_replication.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "OFF"

--source include/start_and_bootstrap_group_replication.inc

--let $rpl_connection_name= server2
--source include/rpl_connection.inc

--source include/start_group_replication.inc

--let $rpl_connection_name= server1
--source include/rpl_connection.inc

@@ -170,8 +191,14 @@ SELECT group_replication_set_communication_protocol("8.0.21");

--echo ########################################################################
--echo # 6) Cleanup.

--let $rpl_connection_name= server2
--source include/rpl_connection.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "ON"

--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "ON"

--let $rpl_group_replication_reset_persistent_vars= 1
--source include/group_replication_end.inc
@@ -213,9 +213,43 @@ SET SESSION sql_log_bin = 1;

--echo
--echo ####
--echo # 2) Change the communication protocol to 8.0.21.
--echo # 2) Change the communication protocol to 8.0.21. Must restart the group.
--echo ####
--echo

--let $rpl_connection_name= server3
--source include/rpl_connection.inc

--source include/stop_group_replication.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "OFF"

--let $rpl_connection_name= server2
--source include/rpl_connection.inc

--source include/stop_group_replication.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "OFF"

--let $rpl_connection_name= server1
--source include/rpl_connection.inc

--source include/stop_group_replication.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "OFF"

--source include/start_and_bootstrap_group_replication.inc

--let $rpl_connection_name= server2
--source include/rpl_connection.inc

--source include/start_group_replication.inc

--let $rpl_connection_name= server3
--source include/rpl_connection.inc

--source include/start_group_replication.inc

--let $rpl_connection_name= server2
--source include/rpl_connection.inc

--eval SELECT group_replication_set_communication_protocol("8.0.21")

--echo
@@ -385,6 +419,19 @@ SET SESSION sql_log_bin = 1;
--echo # 4) Cleanup.
--echo ####
--echo

--let $rpl_connection_name= server3
--source include/rpl_connection.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "ON"

--let $rpl_connection_name= server2
--source include/rpl_connection.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "ON"

--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--eval SET GLOBAL group_replication_paxos_single_leader = "ON"

--let $rpl_connection_name= server1
--source include/rpl_connection.inc
--source include/stop_group_replication.inc
@@ -0,0 +1,11 @@
!include ../my.cnf

[mysqld.1]

[mysqld.2]

[mysqld.3]

[ENV]
SERVER_MYPORT_3= @mysqld.3.port
SERVER_MYSOCK_3= @mysqld.3.socket