MHA
Posted John_2011
1. Check the SSH connection status from the MHA Manager to all MHA Nodes
[[email protected] ~]# masterha_check_ssh --conf=/etc/masterha/app1.cnf
Thu Nov 9 11:01:51 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Nov 9 11:01:51 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Nov 9 11:01:51 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Nov 9 11:01:51 2017 - [info] Starting SSH connection tests..
Thu Nov 9 11:01:52 2017 - [debug]
Thu Nov 9 11:01:51 2017 - [debug] Connecting via SSH from [email protected]192.168.1.120(192.168.1.120:22) to [email protected]192.168.1.119(192.168.1.119:22)..
Thu Nov 9 11:01:52 2017 - [debug] ok.
Thu Nov 9 11:01:52 2017 - [debug] Connecting via SSH from [email protected]192.168.1.120(192.168.1.120:22) to [email protected]192.168.1.121(192.168.1.121:22)..
Thu Nov 9 11:01:52 2017 - [debug] ok.
Thu Nov 9 11:01:52 2017 - [debug]
Thu Nov 9 11:01:51 2017 - [debug] Connecting via SSH from [email protected]192.168.1.119(192.168.1.119:22) to [email protected]192.168.1.120(192.168.1.120:22)..
Thu Nov 9 11:01:52 2017 - [debug] ok.
Thu Nov 9 11:01:52 2017 - [debug] Connecting via SSH from [email protected]192.168.1.119(192.168.1.119:22) to [email protected]192.168.1.121(192.168.1.121:22)..
Thu Nov 9 11:01:52 2017 - [debug] ok.
Thu Nov 9 11:01:53 2017 - [debug]
Thu Nov 9 11:01:52 2017 - [debug] Connecting via SSH from [email protected]192.168.1.121(192.168.1.121:22) to [email protected]192.168.1.119(192.168.1.119:22)..
Thu Nov 9 11:01:53 2017 - [debug] ok.
Thu Nov 9 11:01:53 2017 - [debug] Connecting via SSH from [email protected]192.168.1.121(192.168.1.121:22) to [email protected]192.168.1.120(192.168.1.120:22)..
Thu Nov 9 11:01:53 2017 - [debug] ok.
Thu Nov 9 11:01:53 2017 - [info] All SSH connection tests passed successfully.
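When this check is run from a script rather than by hand, it may be easier to test for the final summary line than to read the whole log. The following is a minimal sketch, not something shipped with MHA: the function names `ssh_check_passed` and `count_ssh_probes` are made up for illustration, and they only look for the text that appears in the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MHA): reads masterha_check_ssh output
# on stdin and succeeds only if the final success line is present.
ssh_check_passed() {
    grep -q 'All SSH connection tests passed successfully\.'
}

# Hypothetical helper: counts how many host-to-host SSH probes were logged.
count_ssh_probes() {
    grep -c 'Connecting via SSH from'
}
```

One possible use is `masterha_check_ssh --conf=/etc/masterha/app1.cnf 2>&1 | ssh_check_passed && echo "SSH OK"`. With three nodes, as in the run above, six probes are logged (each node connects to the other two).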
2. Check the status of the entire replication environment
[[email protected] ~]# masterha_check_repl --conf=/etc/masterha/app1.cnf
Thu Nov 9 11:29:10 2017 - [info] Reading default configuration from /etc/masterha_default.cnf..
Thu Nov 9 11:29:10 2017 - [info] Reading application default configuration from /etc/masterha/app1.cnf..
Thu Nov 9 11:29:10 2017 - [info] Reading server configuration from /etc/masterha/app1.cnf..
Thu Nov 9 11:29:10 2017 - [info] MHA::MasterMonitor version 0.57.
Thu Nov 9 11:29:11 2017 - [info] GTID failover mode = 0
Thu Nov 9 11:29:11 2017 - [info] Dead Servers:
Thu Nov 9 11:29:11 2017 - [info] Alive Servers:
Thu Nov 9 11:29:11 2017 - [info]   192.168.1.119(192.168.1.119:3306)
Thu Nov 9 11:29:11 2017 - [info]   192.168.1.120(192.168.1.120:3306)
Thu Nov 9 11:29:11 2017 - [info]   192.168.1.121(192.168.1.121:3306)
Thu Nov 9 11:29:11 2017 - [info] Alive Slaves:
Thu Nov 9 11:29:11 2017 - [info]   192.168.1.120(192.168.1.120:3306) Version=5.7.20-log (oldest major version between slaves) log-bin:enabled
Thu Nov 9 11:29:11 2017 - [info]     Replicating from 192.168.1.119(192.168.1.119:3306)
Thu Nov 9 11:29:11 2017 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Nov 9 11:29:11 2017 - [info]   192.168.1.121(192.168.1.121:3306) Version=5.7.20-log (oldest major version between slaves) log-bin:enabled
Thu Nov 9 11:29:11 2017 - [info]     Replicating from 192.168.1.119(192.168.1.119:3306)
Thu Nov 9 11:29:11 2017 - [info] Current Alive Master: 192.168.1.119(192.168.1.119:3306)
Thu Nov 9 11:29:11 2017 - [info] Checking slave configurations..
Thu Nov 9 11:29:11 2017 - [info] read_only=1 is not set on slave 192.168.1.120(192.168.1.120:3306).
Thu Nov 9 11:29:11 2017 - [info] read_only=1 is not set on slave 192.168.1.121(192.168.1.121:3306).
Thu Nov 9 11:29:11 2017 - [info] Checking replication filtering settings..
Thu Nov 9 11:29:11 2017 - [info] binlog_do_db= , binlog_ignore_db=
Thu Nov 9 11:29:11 2017 - [info] Replication filtering check ok.
Thu Nov 9 11:29:11 2017 - [info] GTID (with auto-pos) is not supported
Thu Nov 9 11:29:11 2017 - [info] Starting SSH connection tests..
Thu Nov 9 11:29:14 2017 - [info] All SSH connection tests passed successfully.
Thu Nov 9 11:29:14 2017 - [info] Checking MHA Node version..
Thu Nov 9 11:29:14 2017 - [info] Version check ok.
Thu Nov 9 11:29:14 2017 - [info] Checking SSH publickey authentication settings on the current master..
Thu Nov 9 11:29:14 2017 - [info] HealthCheck: SSH to 192.168.1.119 is reachable.
Thu Nov 9 11:29:15 2017 - [info] Master MHA Node version is 0.57.
Thu Nov 9 11:29:15 2017 - [info] Checking recovery script configurations on 192.168.1.119(192.168.1.119:3306)..
Thu Nov 9 11:29:15 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/var/lib/mysql --output_file=/data/mysql/tmp/save_binary_logs_test --manager_version=0.57 --start_file=mysql-bin.000004
Thu Nov 9 11:29:15 2017 - [info] Connecting to [email protected]192.168.1.119(192.168.1.119:22)..
  Creating /data/mysql/tmp if not exists.. ok.
  Checking output directory is accessible or not.. ok.
  Binlog found at /var/lib/mysql, up to mysql-bin.000004
Thu Nov 9 11:29:15 2017 - [info] Binlog setting check done.
Thu Nov 9 11:29:15 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Nov 9 11:29:15 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.1.120 --slave_ip=192.168.1.120 --slave_port=3306 --workdir=/data/mysql/tmp --target_version=5.7.20-log --manager_version=0.57 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Nov 9 11:29:15 2017 - [info] Connecting to [email protected]192.168.1.120(192.168.1.120:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-log.000006
    Temporary relay log file is /var/lib/mysql/relay-log.000006
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Nov 9 11:29:15 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user='mha' --slave_host=192.168.1.121 --slave_ip=192.168.1.121 --slave_port=3306 --workdir=/data/mysql/tmp --target_version=5.7.20-log --manager_version=0.57 --relay_log_info=/var/lib/mysql/relay-log.info --relay_dir=/var/lib/mysql/ --slave_pass=xxx
Thu Nov 9 11:29:15 2017 - [info] Connecting to [email protected]192.168.1.121(192.168.1.121:22)..
  Checking slave recovery environment settings..
    Opening /var/lib/mysql/relay-log.info ... ok.
    Relay log found at /var/lib/mysql, up to relay-log.000008
    Temporary relay log file is /var/lib/mysql/relay-log.000008
    Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure.
 done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Nov 9 11:29:16 2017 - [info] Slaves settings check done.
Thu Nov 9 11:29:16 2017 - [info]
192.168.1.119(192.168.1.119:3306) (current master)
 +--192.168.1.120(192.168.1.120:3306)
 +--192.168.1.121(192.168.1.121:3306)
Thu Nov 9 11:29:16 2017 - [info] Checking replication health on 192.168.1.120..
Thu Nov 9 11:29:16 2017 - [info] ok.
Thu Nov 9 11:29:16 2017 - [info] Checking replication health on 192.168.1.121..
Thu Nov 9 11:29:16 2017 - [info] ok.
Thu Nov 9 11:29:16 2017 - [warning] master_ip_failover_script is not defined.
Thu Nov 9 11:29:16 2017 - [warning] shutdown_script is not defined.
Thu Nov 9 11:29:16 2017 - [info] Got exit code 0 (Not master dead).
MySQL Replication Health is OK.
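The replication check prints a long log, but only a handful of lines usually need attention: the `[warning]` entries, the `read_only=1 is not set` notices (logged at `[info]` level), and the final health line. The sketch below filters for those. It is an illustration only; the function name `summarize_repl_check` is hypothetical and the patterns are taken from the output above, not from any documented interface.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of MHA): reads masterha_check_repl output
# on stdin, prints the lines worth a second look, and returns 0 only if
# the final "MySQL Replication Health is OK." line is present.
summarize_repl_check() {
    local log
    log=$(cat)
    # [warning] entries plus the read_only notices seen in the log above.
    printf '%s\n' "$log" | grep -E '\[warning\]|read_only=1 is not set' || true
    # The function's exit status comes from this final health test.
    printf '%s\n' "$log" | grep -q 'MySQL Replication Health is OK\.'
}
```

Applied to the run above, this would print four lines: the two `read_only=1 is not set` notices and the two warnings about `master_ip_failover_script` and `shutdown_script` not being defined.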
3. Check the status of the MHA Manager (masterha_check_status)
[[email protected] ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
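No output is shown for this command. As a sketch under stated assumptions: `masterha_check_status` normally prints a one-line summary that contains a `code:NAME` pair in parentheses, such as `running(0:PING_OK)` when the manager is monitoring or `stopped(2:NOT_RUNNING)` when it is not. The helper names below are hypothetical, and the sample lines in the usage note are illustrative rather than captured from this environment.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extracts the "code:NAME" pair from a
# masterha_check_status summary line read on stdin.
# Assumed line shapes: "... is running(0:PING_OK), ..." or
# "... is stopped(2:NOT_RUNNING)."
status_pair() {
    sed -n 's/.*(\([0-9][0-9]*:[A-Z_][A-Z_]*\)).*/\1/p'
}

# Hypothetical helper: succeeds only when the manager reports PING_OK.
manager_is_running() {
    [ "$(status_pair)" = "0:PING_OK" ]
}
```

For example, `echo 'app1 is stopped(2:NOT_RUNNING).' | status_pair` prints `2:NOT_RUNNING`. In a real check the input would come from `masterha_check_status --conf=/etc/masterha/app1.cnf`.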
4. Start MHA Manager monitoring (masterha_manager)
[[email protected] ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover &
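Parameter notes:
--remove_dead_master_conf: after a master failover, the old master's IP is removed from the configuration file.
--ignore_last_failover: by default, if MHA detects consecutive master failures less than 8 hours apart, it does not perform the failover; this limit exists to avoid a ping-pong effect. Passing this option makes MHA ignore the record of the previous failover and proceed anyway.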
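The 8-hour rule can be written out as a small decision function. This is only an illustration of the rule as described here, not MHA's actual code; the function name, the epoch-second arguments, and the `ignore` marker standing in for `--ignore_last_failover` are all assumptions made for the example.

```shell
#!/usr/bin/env bash
# Illustration of the rule described above, not MHA's implementation.
# Arguments: time of the last failover and the current time, both as Unix
# epoch seconds, plus an optional third argument "ignore" that stands in
# for --ignore_last_failover.
FAILOVER_MIN_INTERVAL=$((8 * 3600))   # 8 hours, per the description above

failover_allowed() {
    local last=$1 now=$2 ignore=${3:-}
    if [ "$ignore" = "ignore" ]; then
        return 0    # flag given: the previous failover is disregarded
    fi
    # Without the flag, failover goes ahead only if at least 8 hours passed.
    [ $((now - last)) -ge "$FAILOVER_MIN_INTERVAL" ]
}
```

With this sketch, `failover_allowed 0 3600` fails (only one hour has passed), while `failover_allowed 0 3600 ignore` succeeds.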
5. Stop MHA Manager monitoring (masterha_stop)
[[email protected] ~]# masterha_stop --conf=/etc/masterha/app1.cnf
Stopped app1 successfully.
[1]+ Exit 1 nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover (wd: /var/log/masterha/app1)
(wd now: ~)
6. Description of the sample scripts
[[email protected] scripts]# pwd
/root/mha4mysql-manager-0.57/samples/scripts
[[email protected] scripts]# ll
total 32
-rwxr-xr-x. 1 1001 1001 3648 May 31 2015 master_ip_failover
-rwxr-xr-x. 1 1001 1001 9870 May 31 2015 master_ip_online_change
-rwxr-xr-x. 1 1001 1001 11867 May 31 2015 power_manager
-rwxr-xr-x. 1 1001 1001 1360 May 31 2015 send_report
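# Script that manages the VIP during automatic failover; if you use keepalived, you can write your own script to manage the VIP
master_ip_failover
# Manages the VIP during an online switchover
master_ip_online_change
# Script that powers off the host after a failure occurs
power_manager
# Script that sends an alert after a failover
send_report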
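The sample `master_ip_failover` is a Perl script. MHA calls whatever is configured as `master_ip_failover_script` with a `--command=` argument (`status`, `stop` or `stopssh`, and `start`) together with host arguments such as `--orig_master_host` and `--new_master_host`. The following is a minimal dry-run sketch of how a shell version might dispatch on those arguments. The VIP, the interface name, and the function name are hypothetical, and the function only prints the commands it would run instead of executing them.

```shell
#!/usr/bin/env bash
# Dry-run sketch of a VIP failover hook. VIP and DEV are hypothetical
# values; nothing is executed, the planned action is only printed.
VIP="192.168.1.200/24"   # hypothetical virtual IP, not from the original setup
DEV="eth0"               # hypothetical network interface

plan_vip_action() {
    local command="" orig="" new="" arg
    for arg in "$@"; do
        case "$arg" in
            --command=*)          command=${arg#*=} ;;
            --orig_master_host=*) orig=${arg#*=} ;;
            --new_master_host=*)  new=${arg#*=} ;;
        esac
    done
    case "$command" in
        # stopssh: the old master is still reachable over SSH, so drop the VIP there.
        stopssh) echo "ssh $orig ip addr del $VIP dev $DEV" ;;
        # stop: the old master is not reachable over SSH, nothing can be run on it.
        stop)    echo "skip: $orig unreachable over SSH" ;;
        # start: bring the VIP up on the newly promoted master.
        start)   echo "ssh $new ip addr add $VIP dev $DEV" ;;
        status)  echo "status check only" ;;
        *)       echo "unknown command: $command" >&2; return 1 ;;
    esac
}
```

For instance, `plan_vip_action --command=start --orig_master_host=192.168.1.119 --new_master_host=192.168.1.120` prints `ssh 192.168.1.120 ip addr add 192.168.1.200/24 dev eth0`. A real hook would also need to handle the remaining arguments MHA passes and return an exit code that MHA can act on.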