MHA快速搭建
Posted Atlas
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了MHA快速搭建相关的知识,希望对你有一定的参考价值。
很早之前写过MHA的文章,但是常常在技术群看到有同学问MHA搭建的问题,不是权限问题就是配置问题,我在这里就再次一写下配置过程以及快速的搭建。如果想知道更多的细节与原理,请参考:MySQL高可用架构之MHA
环境:
1主1从,manager放在从库。
主库:192.168.0.10
从库:192.168.0.20
两台机器的mysql安装完成初始化以后进行复制搭建,首先登录主库(192.168.0.10),查看pos点:
mysql> show master status; +------------------+----------+--------------+------------------+-------------------+ | File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set | +------------------+----------+--------------+------------------+-------------------+ | mysql-bin.000001 | 154 | | | | +------------------+----------+--------------+------------------+-------------------+ 1 row in set (0.00 sec)
然后在主库(192.168.0.10)
添加复制账号以及mha用的账号
mysql> grant replication slave on *.* to \'repl\'@\'192.168.0.10\' identified by \'repl\'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> grant replication slave on *.* to \'repl\'@\'192.168.0.20\' identified by \'repl\'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> grant all on *.* to \'root\'@\'192.168.0.20\' identified by \'123\'; Query OK, 0 rows affected, 1 warning (0.00 sec) mysql> grant all on *.* to \'root\'@\'192.168.0.10\' identified by \'123\'; Query OK, 0 rows affected, 1 warning (0.01 sec) mysql>
从库(192.168.0.10 )change到mysql-bin.000001,pos点154
CHANGE MASTER TO MASTER_HOST=\'192.168.0.10\',MASTER_USER=\'repl\',MASTER_PASSWORD=\'repl\',MASTER_LOG_FILE=\'mysql-bin.000001\',MASTER_LOG_POS=154;
mysql> start slave; Query OK, 0 rows affected (0.15 sec) mysql> show slave status\\G *************************** 1. row *************************** Slave_IO_State: Waiting for master to send event Master_Host: 192.168.0.10 Master_User: repl Master_Port: 3306 Connect_Retry: 60 Master_Log_File: mysql-bin.000001 Read_Master_Log_Pos: 1344 Relay_Log_File: relaylog.000002 Relay_Log_Pos: 1510 Relay_Master_Log_File: mysql-bin.000001 Slave_IO_Running: Yes Slave_SQL_Running: Yes Replicate_Do_DB: Replicate_Ignore_DB: Replicate_Do_Table: Replicate_Ignore_Table:
复制到这来就搭建完成了,然后配置192.168.0.10和192.168.0.20 ssh互信。(两台机器都执行)
ssh-keygen -t rsa ssh-copy-id -i /root/.ssh/id_rsa.pub \'-p 22 192.168.0.10\' ssh-copy-id -i /root/.ssh/id_rsa.pub \'-p 22 192.168.0.20\'
安装MHA软件,首先安装epel源。(2台机器)
rpm -ivh http://dl.fedoraproject.org/pub/epel/6/x86_64/epel-release-6-8.noarch.rpm
安装依赖包
yum install perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes -y
安装MHA软件(两台机器)
tar xf mha4mysql-node-0.56.tar.gz cd mha4mysql-node-0.56 perl Makefile.PL make && make install tar xf mha4mysql-manager-0.56.tar.gz cd mha4mysql-manager-0.56 perl Makefile.PL make && make install
在从库(192.168.0.20)创建目录:
mkdir /data/mha/3306/log cd /data/mha/3306/touch mha.cnf
mha.cnf配置文件内容如下:
[server default] client_bindir=/usr/local/mysql/bin/ manager_log=/data/mha/3306/log/manager.log manager_workdir=/data/mha/3306/log master_binlog_dir=/data/mysql/3306/binlog/ master_ip_failover_script=/usr/local/bin/master_ip_failover master_ip_online_change_script= /usr/local/bin/master_ip_online_change report_script=/usr/local/bin/send_report init_conf_load_script=/usr/local/bin/load_cnf remote_workdir=/data/mysql/3306/mysqltmp #secondary_check_script= /usr/local/bin/masterha_secondary_check -s 192.168.0.30 -s 192.168.0.40 user=root ping_interval=3 repl_user=repl ssh_port=22 ssh_user=root max_ping_errors=40 [server1] hostname=192.168.0.10 port=3306 [server2] candidate_master=1 check_repl_delay=0 hostname=192.168.0.20 port=3306
编辑文件 /usr/local/bin/load_cnf 里面的密码修改成对应的密码
#!/usr/bin/perl print "password=123\\n"; print "repl_password=repl\\n";
执行chek命令查看复制是否正常:
[root@dbserver-yayun-02 3306]# masterha_check_repl --conf=/data/mha/3306/mha.cnf Mon Mar 13 21:36:54 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Mar 13 21:36:54 2017 - [info] Reading application default configuration from /data/mha/3306/mha.cnf.. Mon Mar 13 21:36:54 2017 - [info] Updating application default configuration from /usr/local/bin/load_cnf.. Mon Mar 13 21:36:54 2017 - [info] Reading server configuration from /data/mha/3306/mha.cnf.. Mon Mar 13 21:36:54 2017 - [info] Setting max_ping_errors to 40, ping_interval to 3. Mon Mar 13 21:36:54 2017 - [info] MHA::MasterMonitor version 0.56. Mon Mar 13 21:36:54 2017 - [info] GTID failover mode = 0 Mon Mar 13 21:36:54 2017 - [info] Dead Servers: Mon Mar 13 21:36:54 2017 - [info] Alive Servers: Mon Mar 13 21:36:54 2017 - [info] 192.168.0.10(192.168.0.10:3306) Mon Mar 13 21:36:54 2017 - [info] 192.168.0.20(192.168.0.20:3306) Mon Mar 13 21:36:54 2017 - [info] Alive Slaves: Mon Mar 13 21:36:54 2017 - [info] 192.168.0.20(192.168.0.20:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Mon Mar 13 21:36:54 2017 - [info] Replicating from 192.168.0.10(192.168.0.10:3306) Mon Mar 13 21:36:54 2017 - [info] Primary candidate for the new Master (candidate_master is set) Mon Mar 13 21:36:54 2017 - [info] Current Alive Master: 192.168.0.10(192.168.0.10:3306) Mon Mar 13 21:36:54 2017 - [info] Checking slave configurations.. Mon Mar 13 21:36:54 2017 - [info] read_only=1 is not set on slave 192.168.0.20(192.168.0.20:3306). Mon Mar 13 21:36:54 2017 - [warning] relay_log_purge=0 is not set on slave 192.168.0.20(192.168.0.20:3306). Mon Mar 13 21:36:54 2017 - [info] Checking replication filtering settings.. Mon Mar 13 21:36:54 2017 - [info] binlog_do_db= , binlog_ignore_db= Mon Mar 13 21:36:54 2017 - [info] Replication filtering check ok. Mon Mar 13 21:36:54 2017 - [info] GTID (with auto-pos) is not supported Mon Mar 13 21:36:54 2017 - [info] Starting SSH connection tests.. Mon Mar 13 21:36:55 2017 - [info] All SSH connection tests passed successfully. Mon Mar 13 21:36:55 2017 - [info] Checking MHA Node version.. Mon Mar 13 21:36:55 2017 - [info] Version check ok. Mon Mar 13 21:36:55 2017 - [info] Checking SSH publickey authentication settings on the current master.. Mon Mar 13 21:36:55 2017 - [info] HealthCheck: SSH to 192.168.0.10 is reachable. Mon Mar 13 21:36:56 2017 - [info] Master MHA Node version is 0.56. Mon Mar 13 21:36:56 2017 - [info] Checking recovery script configurations on 192.168.0.10(192.168.0.10:3306).. Mon Mar 13 21:36:56 2017 - [info] Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/data/mysql/3306/binlog/ --output_file=/data/mysql/3306/mysqltmp/save_binary_logs_test --manager_version=0.56 --start_file=mysql-bin.000001 Mon Mar 13 21:36:56 2017 - [info] Connecting to root@192.168.0.10(192.168.0.10:22).. Creating /data/mysql/3306/mysqltmp if not exists.. ok. Checking output directory is accessible or not.. ok. Binlog found at /data/mysql/3306/binlog/, up to mysql-bin.000001 Mon Mar 13 21:36:56 2017 - [info] Binlog setting check done. Mon Mar 13 21:36:56 2017 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers.. Mon Mar 13 21:36:56 2017 - [info] Executing command : apply_diff_relay_logs --command=test --slave_user=\'root\' --slave_host=192.168.0.20 --slave_ip=192.168.0.20 --slave_port=3306 --workdir=/data/mysql/3306/mysqltmp --target_version=5.7.17-log --manager_version=0.56 --client_bindir=/usr/local/mysql/bin/ --relay_dir=/data/mysql/3306/relaylog --current_relay_log=relaylog.000002 --slave_pass=xxx Mon Mar 13 21:36:56 2017 - [info] Connecting to root@192.168.0.20(192.168.0.20:22).. Checking slave recovery environment settings.. Relay log found at /data/mysql/3306/relaylog, up to relaylog.000002 Temporary relay log file is /data/mysql/3306/relaylog/relaylog.000002 Testing mysql connection and privileges..mysql: [Warning] Using a password on the command line interface can be insecure. done. Testing mysqlbinlog output.. done. Cleaning up test file(s).. done. Mon Mar 13 21:36:56 2017 - [info] Slaves settings check done. Mon Mar 13 21:36:56 2017 - [info] 192.168.0.10(192.168.0.10:3306) (current master) +--192.168.0.20(192.168.0.20:3306) Mon Mar 13 21:36:56 2017 - [info] Checking replication health on 192.168.0.20.. Mon Mar 13 21:36:56 2017 - [info] ok. Mon Mar 13 21:36:56 2017 - [info] Checking master_ip_failover_script status: Mon Mar 13 21:36:56 2017 - [info] /usr/local/bin/master_ip_failover --command=status --ssh_user=root --orig_master_host=192.168.0.10 --orig_master_ip=192.168.0.10 --orig_master_port=3306 IN SCRIPT TEST====/sbin/ifconfig eth1:1 down==/sbin/ifconfig eth1:1 192.168.0.88/16=== Checking the Status of the script.. OK Mon Mar 13 21:36:56 2017 - [info] OK. Mon Mar 13 21:36:56 2017 - [warning] shutdown_script is not defined. Mon Mar 13 21:36:56 2017 - [info] Got exit code 0 (Not master dead). MySQL Replication Health is OK.
主库(192.168.0.10)执行命令,启动vip:
/sbin/ifconfig eth1:1 192.168.0.88/16
在线切换,把主库切到192.168.0.20
masterha_master_switch --master_state=alive --conf=/data/mha/3306/mha.cnf --new_master_host=192.168.0.20 --new_master_port=3306 --orig_master_is_new_slave
输出如下:
Mon Mar 13 22:15:08 2017 - [info] MHA::MasterRotate version 0.56. Mon Mar 13 22:15:08 2017 - [info] Starting online master switch.. Mon Mar 13 22:15:08 2017 - [info] Mon Mar 13 22:15:08 2017 - [info] * Phase 1: Configuration Check Phase.. Mon Mar 13 22:15:08 2017 - [info] Mon Mar 13 22:15:08 2017 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping. Mon Mar 13 22:15:08 2017 - [info] Reading application default configuration from /data/mha/3306/mha.cnf.. Mon Mar 13 22:15:08 2017 - [info] Updating application default configuration from /usr/local/bin/load_cnf.. Mon Mar 13 22:15:08 2017 - [info] Reading server configuration from /data/mha/3306/mha.cnf.. Mon Mar 13 22:15:08 2017 - [info] Setting max_ping_errors to 40, ping_interval to 3. Mon Mar 13 22:15:08 2017 - [info] GTID failover mode = 0 Mon Mar 13 22:15:08 2017 - [info] Current Alive Master: 192.168.0.10(192.168.0.10:3306) Mon Mar 13 22:15:08 2017 - [info] Alive Slaves: Mon Mar 13 22:15:08 2017 - [info] 192.168.0.20(192.168.0.20:3306) Version=5.7.17-log (oldest major version between slaves) log-bin:enabled Mon Mar 13 22:15:08 2017 - [info] Replicating from 192.168.0.10(192.168.0.10:3306) Mon Mar 13 22:15:08 2017 - [info] Primary candidate for the new Master (candidate_master is set) It is better to execute FLUSH NO_WRITE_TO_BINLOG TABLES on the master before switching. Is it ok to execute on 192.168.0.10(192.168.0.10:3306)? (YES/no): yes Mon Mar 13 22:15:09 2017 - [info] Executing FLUSH NO_WRITE_TO_BINLOG TABLES. This may take long time.. Mon Mar 13 22:15:09 2017 - [info] ok. Mon Mar 13 22:15:09 2017 - [info] Checking MHA is not monitoring or doing failover.. Mon Mar 13 22:15:09 2017 - [info] Checking replication health on 192.168.0.20.. Mon Mar 13 22:15:09 2017 - [info] ok. Mon Mar 13 22:15:09 2017 - [info] 192.168.0.20 can be new master. Mon Mar 13 22:15:09 2017 - [info] From: 192.168.0.10(192.168.0.10:3306) (current master) +--192.168.0.20(192.168.0.20:3306) To: 192.168.0.20(192.168.0.20:3306) (new master) +--192.168.0.10(192.168.0.10:3306) Starting master switch from 192.168.0.10(192.168.0.10:3306) to 192.168.0.20(192.168.0.20:3306)? (yes/NO): yes Mon Mar 13 22:15:10 2017 - [info] Checking whether 192.168.0.20(192.168.0.20:3306) is ok for the new master.. Mon Mar 13 22:15:10 2017 - [info] ok. Mon Mar 13 22:15:10 2017 - [info] 192.168.0.10(192.168.0.10:3306): SHOW SLAVE STATUS returned empty result. To check replication filtering rules, temporarily executing CHANGE MASTER to a dummy host. Mon Mar 13 22:15:11 2017 - [info] 192.168.0.10(192.168.0.10:3306): Resetting slave pointing to the dummy host. Mon Mar 13 22:15:11 2017 - [info] ** Phase 1: Configuration Check Phase completed. Mon Mar 13 22:15:11 2017 - [info] Mon Mar 13 22:15:11 2017 - [info] * Phase 2: Rejecting updates Phase.. Mon Mar 13 22:15:11 2017 - [info] Mon Mar 13 22:15:以上是关于MHA快速搭建的主要内容,如果未能解决你的问题,请参考以下文章