
Simulating a master instance crash: failover hangs at SSH #125

Open
@trsenzhang

Description

1. SSH check is OK
[root@trsen184 masterha]# masterha_check_ssh --conf=/usr/local/masterha/masterha_mha1.cnf
Tue Jul 30 11:57:32 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Tue Jul 30 11:57:32 2019 - [info] Reading application default configuration from /usr/local/masterha/masterha_mha1.cnf..
Tue Jul 30 11:57:32 2019 - [info] Reading server configuration from /usr/local/masterha/masterha_mha1.cnf..
Tue Jul 30 11:57:32 2019 - [info] Starting SSH connection tests..
Tue Jul 30 11:57:33 2019 - [debug]
Tue Jul 30 11:57:32 2019 - [debug] Connecting via SSH from root@172.18.0.181(172.18.0.181:22) to root@172.18.0.182(172.18.0.182:22)..
Tue Jul 30 11:57:32 2019 - [debug] ok.
Tue Jul 30 11:57:32 2019 - [debug] Connecting via SSH from root@172.18.0.181(172.18.0.181:22) to root@172.18.0.183(172.18.0.183:22)..
Tue Jul 30 11:57:33 2019 - [debug] ok.
Tue Jul 30 11:57:34 2019 - [debug]
Tue Jul 30 11:57:33 2019 - [debug] Connecting via SSH from root@172.18.0.182(172.18.0.182:22) to root@172.18.0.181(172.18.0.181:22)..
Tue Jul 30 11:57:33 2019 - [debug] ok.
Tue Jul 30 11:57:33 2019 - [debug] Connecting via SSH from root@172.18.0.182(172.18.0.182:22) to root@172.18.0.183(172.18.0.183:22)..
Tue Jul 30 11:57:33 2019 - [debug] ok.
Tue Jul 30 11:57:35 2019 - [debug]
Tue Jul 30 11:57:33 2019 - [debug] Connecting via SSH from root@172.18.0.183(172.18.0.183:22) to root@172.18.0.181(172.18.0.181:22)..
Tue Jul 30 11:57:33 2019 - [debug] ok.
Tue Jul 30 11:57:33 2019 - [debug] Connecting via SSH from root@172.18.0.183(172.18.0.183:22) to root@172.18.0.182(172.18.0.182:22)..
Tue Jul 30 11:57:34 2019 - [debug] ok.
Tue Jul 30 11:57:35 2019 - [info] All SSH connection tests passed successfully.
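
For anyone triaging this, a small sketch that lists which host pairs the output above actually covers. The log shows six directed pairs among the three database servers; the host the command is run from (trsen184) does not appear as a source. The function name `ssh_pairs` and the regular expression are illustrative assumptions based only on the log lines shown here, not part of MHA.

```python
import re

# Matches lines such as:
#   "... [debug] Connecting via SSH from root@A(A:22) to root@B(B:22).."
PAIR_RE = re.compile(
    r"Connecting via SSH from \S+@([\d.]+)\(.*?\) to \S+@([\d.]+)\("
)

def ssh_pairs(log_text):
    """Return (source, target, passed) for each pair in the pasted output."""
    results = []
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = PAIR_RE.search(line)
        if match:
            # In the output above, the result follows on the next line.
            passed = i + 1 < len(lines) and lines[i + 1].rstrip().endswith("ok.")
            results.append((match.group(1), match.group(2), passed))
    return results

sample = """\
Tue Jul 30 11:57:32 2019 - [debug] Connecting via SSH from root@172.18.0.181(172.18.0.181:22) to root@172.18.0.182(172.18.0.182:22)..
Tue Jul 30 11:57:32 2019 - [debug] ok.
Tue Jul 30 11:57:32 2019 - [debug] Connecting via SSH from root@172.18.0.181(172.18.0.181:22) to root@172.18.0.183(172.18.0.183:22)..
Tue Jul 30 11:57:33 2019 - [debug] ok.
"""

for src, dst, passed in ssh_pairs(sample):
    print(src, "->", dst, "ok" if passed else "FAILED")
```

Feeding it the full output should make it easy to see that every tested pair is server to server.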

2. Replication check is OK
[root@trsen184 masterha]# masterha_check_repl --conf=/usr/local/masterha/masterha_mha1.cnf
Tue Jul 30 12:01:28 2019 - [info] Reading default configuration from /etc/masterha_default.cnf..
Tue Jul 30 12:01:28 2019 - [info] Reading application default configuration from /usr/local/masterha/masterha_mha1.cnf..
Tue Jul 30 12:01:28 2019 - [info] Reading server configuration from /usr/local/masterha/masterha_mha1.cnf..
Tue Jul 30 12:01:28 2019 - [info] MHA::MasterMonitor version 0.58.
Tue Jul 30 12:01:29 2019 - [info] GTID failover mode = 1
Tue Jul 30 12:01:29 2019 - [info] Dead Servers:
Tue Jul 30 12:01:29 2019 - [info] Alive Servers:
Tue Jul 30 12:01:29 2019 - [info] 172.18.0.181(172.18.0.181:3309)
Tue Jul 30 12:01:29 2019 - [info] 172.18.0.182(172.18.0.182:3309)
Tue Jul 30 12:01:29 2019 - [info] 172.18.0.183(172.18.0.183:3309)
Tue Jul 30 12:01:29 2019 - [info] Alive Slaves:
Tue Jul 30 12:01:29 2019 - [info] 172.18.0.182(172.18.0.182:3309) Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Jul 30 12:01:29 2019 - [info] GTID ON
Tue Jul 30 12:01:29 2019 - [info] Replicating from 172.18.0.181(172.18.0.181:3309)
Tue Jul 30 12:01:29 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Tue Jul 30 12:01:29 2019 - [info] 172.18.0.183(172.18.0.183:3309) Version=5.7.24-log (oldest major version between slaves) log-bin:enabled
Tue Jul 30 12:01:29 2019 - [info] GTID ON
Tue Jul 30 12:01:29 2019 - [info] Replicating from 172.18.0.181(172.18.0.181:3309)
Tue Jul 30 12:01:29 2019 - [info] Primary candidate for the new Master (candidate_master is set)
Tue Jul 30 12:01:29 2019 - [info] Current Alive Master: 172.18.0.181(172.18.0.181:3309)
Tue Jul 30 12:01:29 2019 - [info] Checking slave configurations..
Tue Jul 30 12:01:29 2019 - [info] Checking replication filtering settings..
Tue Jul 30 12:01:29 2019 - [info] binlog_do_db= , binlog_ignore_db=
Tue Jul 30 12:01:29 2019 - [info] Replication filtering check ok.
Tue Jul 30 12:01:29 2019 - [info] GTID (with auto-pos) is supported. Skipping all SSH and Node package checking.
Tue Jul 30 12:01:29 2019 - [info] Checking SSH publickey authentication settings on the current master..
Tue Jul 30 12:01:29 2019 - [info] HealthCheck: SSH to 172.18.0.181 is reachable.
Tue Jul 30 12:01:29 2019 - [info]
172.18.0.181(172.18.0.181:3309) (current master)
+--172.18.0.182(172.18.0.182:3309)
+--172.18.0.183(172.18.0.183:3309)

Tue Jul 30 12:01:29 2019 - [info] Checking replication health on 172.18.0.182..
Tue Jul 30 12:01:29 2019 - [info] ok.
Tue Jul 30 12:01:29 2019 - [info] Checking replication health on 172.18.0.183..
Tue Jul 30 12:01:29 2019 - [info] ok.
Tue Jul 30 12:01:29 2019 - [info] Checking master_ip_failover_script status:
Tue Jul 30 12:01:29 2019 - [info] /usr/local/masterha/master_ip_failover --command=status --ssh_user=root --orig_master_host=172.18.0.181 --orig_master_ip=172.18.0.181 --orig_master_port=3309
Tue Jul 30 12:01:29 2019 - [info] OK.
Tue Jul 30 12:01:29 2019 - [warning] shutdown_script is not defined.
Tue Jul 30 12:01:29 2019 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

3. Configuration
[root@trsen184 masterha]# vi masterha_mha1.cnf
[server default]
#log_level=debug
#mysql user
user=trsen
password=xxx

#ssh user
ssh_user=root
ssh_port=22

#replication user
repl_user=repl
repl_password=xxx

#monitor
ping_interval=3
#shutdown_script=""

#switch scripts
master_ip_failover_script= /usr/local/masterha/master_ip_failover
master_ip_online_change_script= /usr/local/masterha/master_ip_online_change

#mha manager directory
manager_workdir = /usr/local/masterha/mha1
manager_log = /usr/local/masterha/mha1/mha1.log
remote_workdir = /usr/local/masterha/mha1

[server1]
hostname=172.18.0.181
master_binlog_dir = /data/mysql/mha/logs
candidate_master = 1
check_repl_delay = 0

[server2]
hostname=172.18.0.182
master_binlog_dir=/data/mysql/mha/logs
candidate_master=1
check_repl_delay=0

[server3]
hostname=172.18.0.183
master_binlog_dir=/data/mysql/mha/logs
candidate_master=1
check_repl_delay=0
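
To double-check what the file above declares, here is a minimal sketch that reads it with Python's `configparser`. This is a stand-in for inspection only and is an assumption on my part: it is not the parser MHA itself uses, and `load_mha_servers` is a hypothetical helper name. The sample text is abbreviated from the config above.

```python
import configparser

def load_mha_servers(cnf_text):
    """Split an MHA-style cnf into the [server default] block and per-server blocks."""
    # interpolation=None so that a '%' in a password is not treated specially.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cnf_text)
    defaults = {}
    servers = {}
    for name in parser.sections():
        if name == "server default":
            defaults = dict(parser[name])
        else:
            servers[name] = dict(parser[name])
    return defaults, servers

sample_cnf = """\
[server default]
#ssh user
ssh_user=root
ssh_port=22
ping_interval=3
master_ip_failover_script= /usr/local/masterha/master_ip_failover

[server1]
hostname=172.18.0.181
candidate_master = 1

[server2]
hostname=172.18.0.182
candidate_master=1
"""

defaults, servers = load_mha_servers(sample_cnf)
print("ssh:", defaults["ssh_user"], defaults["ssh_port"])
for name, opts in servers.items():
    print(name, opts["hostname"], "candidate_master =", opts.get("candidate_master"))
```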

4. Hangs at SSH
[image]

5. Manual SSH works
[root@trsen184 ~]# ps -ef |grep ssh
root 10 1 0 11:31 ? 00:00:00 /usr/sbin/sshd -D -e -u 0
root 54 10 0 11:31 ? 00:00:00 sshd: root@pts/0
root 18577 18576 0 12:03 pts/0 00:00:00 ssh -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5 -p 22 root@172.18.0.181 exit 0
root 20208 10 0 12:06 ? 00:00:00 sshd: root@pts/1
root 21498 20221 0 12:08 pts/1 00:00:00 grep --color=auto ssh

[root@trsen184 ~]# ssh -o StrictHostKeyChecking=no -o PasswordAuthentication=no -o BatchMode=yes -o ConnectTimeout=5 -p 22 root@172.18.0.181
Last login: Tue Jul 30 11:48:16 2019 from 172.18.0.182
[root@trsen181 ~]# exit
logout
Connection to 172.18.0.181 closed.
[root@trsen184 ~]#
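
In the `ps` output above, the `ssh ... exit 0` process was started at 12:03 and is still present at 12:08, even though `ConnectTimeout=5` is set. One way to tell a hang apart from a plain failure is to run the same command under a hard deadline. The sketch below does that; `build_ssh_check_cmd` and `run_with_deadline` are hypothetical helper names, the options are copied from the `ps` line, and the deadline value is an arbitrary assumption.

```python
import shlex
import subprocess
import sys

def build_ssh_check_cmd(host, user="root", port=22, connect_timeout=5):
    """Rebuild the ssh invocation seen in the ps output above."""
    return [
        "ssh",
        "-o", "StrictHostKeyChecking=no",
        "-o", "PasswordAuthentication=no",
        "-o", "BatchMode=yes",
        "-o", f"ConnectTimeout={connect_timeout}",
        "-p", str(port),
        f"{user}@{host}",
        "exit 0",
    ]

def run_with_deadline(cmd, deadline_s):
    """Run cmd; report 'hung' if it has not finished within deadline_s seconds."""
    try:
        done = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=deadline_s,  # the child is killed when this expires
        )
    except subprocess.TimeoutExpired:
        return "hung"
    return "ok" if done.returncode == 0 else f"failed ({done.returncode})"

print(shlex.join(build_ssh_check_cmd("172.18.0.181")))

# Stand-in for a stuck command, so the sketch can be tried without a hung host.
stuck = [sys.executable, "-c", "import time; time.sleep(5)"]
print(run_with_deadline(stuck, 1))
```

On the manager host, `run_with_deadline(build_ssh_check_cmd("172.18.0.181"), 15)` run while the master is in the simulated crash state should show whether the check returns, fails, or never finishes.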
