MySQL MHA High-Availability Solution
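Introduction
MHA (Master High Availability) is currently a relatively mature solution for MySQL high availability: a toolset that handles failover and slave promotion in a MySQL replication environment. When the master fails, MHA can complete the failover automatically, typically within 0 to 30 seconds, while preserving data consistency as far as possible. It consists of two parts: MHA Manager (the management node) and MHA Node (the data node). MHA Manager can be deployed on a dedicated machine to manage several master-slave clusters, or on one of the slaves. MHA Node runs on every MySQL server. MHA Manager periodically probes the master of each cluster; when the master fails, it automatically promotes the slave with the most recent data to be the new master and repoints all other slaves to it. The whole failover is transparent to the application.
During automatic failover, MHA tries to save the binary logs from the crashed master so that as little data as possible is lost, but this is not always possible. For example, if the master has a hardware failure or cannot be reached over SSH, MHA cannot save the binary logs; it fails over anyway and the most recent data is lost. Semi-synchronous replication, available since MySQL 5.5, greatly reduces this risk, and MHA can be combined with it: even if only one slave has received the latest binary log events, MHA can apply them to all the other slaves and so keep every node consistent.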
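How it works
(1) Save the binary log events (binlog events) from the crashed master;
(2) Identify the slave that holds the most recent updates;
(3) Apply the differential relay log events to the other slaves;
(4) Apply the binary log events saved from the master;
(5) Promote one slave to be the new master;
(6) Point the other slaves at the new master and resume replication.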
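Step (2) essentially compares how far each slave's I/O thread has read the master's binary log. The following is only a minimal sketch of that comparison, not MHA's actual implementation. It assumes the values of Master_Log_File and Read_Master_Log_Pos from SHOW SLAVE STATUS have already been collected into text lines; the helper name latest_slave and the sample positions are hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: pick the slave that has received the most binlog.
# Input (stdin): one line per slave, "host master_log_file read_master_log_pos".
latest_slave() {
    # Binlog file names share a prefix and a zero-padded sequence number,
    # so a plain string sort orders the files; the position is sorted numerically.
    LC_ALL=C sort -k2,2 -k3,3n | tail -n 1 | awk '{print $1}'
}

# Sample (made-up) positions for the two slaves used in this article.
printf '%s\n' \
    '192.168.137.20 mysql-bin.000012 1532' \
    '192.168.137.30 mysql-bin.000012 980' | latest_slave
```

The slave printed here is the one whose relay logs would serve as the source for step (3).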
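MHA consists of two packages: the Manager toolkit and the Node toolkit.
The Manager toolkit mainly includes the following tools:
masterha_check_ssh       check the MHA SSH configuration
masterha_check_repl      check the MySQL replication status
masterha_manager         start MHA
masterha_check_status    check the current MHA running state
masterha_master_monitor  detect whether the master is down
masterha_master_switch   control failover (automatic or manual)
masterha_conf_host       add or remove configured server entries
The Node toolkit (these tools are normally triggered by MHA Manager scripts and need no manual operation) mainly includes the following tools:
save_binary_logs         save and copy the master's binary logs
apply_diff_relay_logs    identify differential relay log events and apply them to the other slaves
filter_mysqlbinlog       strip unnecessary ROLLBACK events (MHA no longer uses this tool)
purge_relay_logs         purge relay logs (without blocking the SQL thread)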
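I. Installing MHA
1. Create the installation directories
On the Node servers:
mkdir -p /usr/local/mha
On the Manager server:
mkdir -p /usr/local/mha/ha1/fail_script
mkdir -p /usr/local/mha/ha1/workdir
/usr/local/mha: program installation directory
/usr/local/mha/ha1: one directory per MHA setup, to tell setups apart; the current setup is ha1
/usr/local/mha/ha1/fail_script: failover scripts for setup ha1
/usr/local/mha/ha1/workdir: logs for setup ha1 and the binlogs produced during failover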
2. Install the EPEL repository and the Perl dependencies
Installing with yum requires the EPEL repository.
EPEL repository:
wget http://mirrors.ustc.edu.cn/fedora/epel/6/x86_64/epel-release-6-8.noarch.rpm
rpm -ivh epel-release-6-8.noarch.rpm
Install on all servers (the Manager needs all of the packages below; the Node servers only need perl-DBD-MySQL and cpan):
yum install -y perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Time-HiRes cpan
The modules can also be installed through Perl with cpanm:
#!/bin/bash
# Download the cpanm script and put it on the PATH
wget http://xrl.us/cpanm --no-check-certificate
mv cpanm /usr/bin
chmod 755 /usr/bin/cpanm
# One module name per line (cpanm takes module names, not "install X")
cat > /root/list << EOF
DBD::mysql
Config::Tiny
Log::Dispatch
Parallel::ForkManager
Time::HiRes
CPAN
Digest::SHA
EOF
for package in $(cat /root/list)
do
    cpanm "$package"
done
3. Install the MHA Node package (on all servers)
tar -xvf mha4mysql-node-0.54.tar.gz
cd mha4mysql-node-0.54
perl Makefile.PL INSTALL_BASE=/usr/local/mha
make && make install
4. Install the MHA Manager package (on the Manager host only)
tar -xvf mha4mysql-manager-0.55.tar.gz
cd mha4mysql-manager-0.55
perl Makefile.PL INSTALL_BASE=/usr/local/mha
make && make install
cp samples/scripts/* /usr/local/mha/bin/
master_ip_failover: script that manages the VIP during automatic failover
master_ip_online_change: script used for manual (online) switchover
power_manager: script that powers off the failed host
send_report: script that sends the alert
5. Update the environment variable
On the MHA Manager host, add /usr/local/mha/bin to the PATH environment variable.
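One possible way to do this, assuming a Bourne-style login shell, is to put lines like the following in a profile file such as /root/.bash_profile (the choice of file is an assumption, not something the original setup specifies). The case test keeps the entry from being appended twice if the file is sourced again.

```shell
# Append the MHA bin directory to PATH only if it is not already there.
MHA_BIN=/usr/local/mha/bin
case ":$PATH:" in
    *":$MHA_BIN:"*) ;;              # already present, nothing to do
    *) PATH="$PATH:$MHA_BIN" ;;     # otherwise append it
esac
export PATH
```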
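6. Add symbolic links
For simplicity, run all of these on every server. Strictly speaking, the last two links (mysql and mysqlbinlog) are only needed on the Node servers; the others are needed on all servers.
mkdir -p /usr/local/bin
mkdir -p /usr/local/share/man/man1
# Create only the parent directory: if /usr/local/share/perl5/MHA already
# existed as a directory, the link below would end up nested inside it.
mkdir -p /usr/local/share/perl5
ln -s /usr/local/mha/bin/* /usr/local/bin
ln -s /usr/local/mha/man/man1/* /usr/local/share/man/man1
ln -s /usr/local/mha/lib/perl5/MHA /usr/local/share/perl5/MHA
ln -s /usr/local/mysql/bin/mysqlbinlog /usr/local/bin/mysqlbinlog
ln -s /usr/local/mysql/bin/mysql /usr/local/bin/mysql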
II. Configuring MHA
1. Set up passwordless SSH login
(1) On the Manager, set up passwordless login to every Node
# Press Enter at every prompt; this creates id_rsa.pub under /root/.ssh/
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.10
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.20
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.30
(2) On Node 10, set up passwordless login to Nodes 20 and 30
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.20
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.30
(3) On Node 20, set up passwordless login to Nodes 10 and 30
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.10
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.30
(4) On Node 30, set up passwordless login to Nodes 10 and 20
ssh-keygen -t rsa
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.10
ssh-copy-id -i /root/.ssh/id_rsa.pub root@192.168.137.20
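The per-node command lists above all follow one pattern: each Node copies its public key to every Node except itself. As a rough sketch, the list for any node could be generated as below. The function name copy_id_commands is hypothetical, and the function only prints the commands so they can be reviewed before being run on the node in question.

```shell
#!/bin/bash
# Node addresses from this article's example topology.
NODES="192.168.137.10 192.168.137.20 192.168.137.30"

# Hypothetical helper: print the ssh-copy-id commands to run on node $1,
# that is, one command for every Node except $1 itself.
copy_id_commands() {
    local self=$1 peer
    for peer in $NODES; do
        [ "$peer" = "$self" ] && continue
        echo "ssh-copy-id -i /root/.ssh/id_rsa.pub root@$peer"
    done
}

# Commands for Node 10
copy_id_commands 192.168.137.10
```

Printing instead of executing is a deliberate choice here: ssh-copy-id prompts for the remote password, so it is usually run interactively.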
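2. Set up the replication environment
Replication was already set up beforehand (see my earlier articles). The replication user name and password are both repl. This repl account must exist on every Node, unless the Node will never become the master after a failover.
3. Create the Manager monitoring user on every Node
grant all privileges on *.* to 'root'@'192.168.137.%' identified by 'root';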
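III. Configuring the Manager
1. Create the configuration file
vim /usr/local/mha/ha1/ha1.cnf
[server default]
# Working directory of this setup
manager_workdir=/usr/local/mha/ha1/workdir
# MHA Manager log file
manager_log=/usr/local/mha/ha1/workdir/manager.log
# Binlog directory on the Node servers; if the path differs between nodes,
# set it separately under each [serverN] section below
master_binlog_dir=/mysql/log
# Script that handles the VIP during automatic failover
master_ip_failover_script=/usr/local/mha/ha1/fail_script/master_ip_failover
# Script that handles the VIP during a manual online master switch
master_ip_online_change_script=/usr/local/mha/ha1/fail_script/master_ip_online_change
# If the network between the Manager and the master has a problem,
# the Manager also tries to reach the master from the host "backup"
secondary_check_script=/usr/local/mha/bin/masterha_secondary_check -s backup -s master --user=root --master_host=master --master_ip=192.168.137.10 --master_port=3306
# Alert script run after a switch
#report_script=/usr/local/mha/ha1/fail_script/send_report
# Script that shuts down the failed master host (mainly useful when keepalived
# manages the VIP, where split-brain can make the VIP flap, so the failed master is powered off)
shutdown_script=""
# Interval, in seconds, between pings of the master
ping_interval=1
# Directory on each Node where binlogs are saved during a master switch;
# every node keeps a copy of the differential binlog here, unless there is no difference
remote_workdir=/tmp
# Replication user name; must exist on every Node
repl_user=repl
# Replication password
repl_password=repl
# MySQL root user that the Manager uses for monitoring
user=root
# Password of that root user
password=root
# SSH login user
ssh_user=root

[server1]
hostname=192.168.137.10
port=3306
candidate_master=1
check_repl_delay=0

[server2]
hostname=192.168.137.20
port=3306
#master_binlog_dir=/mysql/log
# Mark this host as a candidate master: after a failover this slave is promoted
# to master even if it is not the slave with the latest events in the cluster
candidate_master=1
# By default MHA does not choose a slave that is more than 100 MB of relay logs
# behind the master, because recovering it would take a long time. With
# check_repl_delay=0, MHA ignores replication delay when choosing the new master.
# This is very useful for a host with candidate_master=1, because that candidate
# is then certain to become the new master.
check_repl_delay=0

[server3]
hostname=192.168.137.30
port=3306
# Without this parameter, MHA cannot start when this slave host is down; with it,
# MHA ignores whether the host is healthy. Also pass --ignore_fail_on_start when starting MHA.
ignore_fail=1
# Never promote this host to master
no_master=1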
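Once the configuration file is in place, it is worth validating it with masterha_check_ssh and masterha_check_repl (both listed in the Manager toolkit above) before starting the Manager. The wrapper below is a sketch under assumptions: the DRY_RUN switch and the function names run and mha_preflight are hypothetical conveniences, and by default the commands are only printed rather than executed.

```shell
#!/bin/bash
CONF=/usr/local/mha/ha1/ha1.cnf
DRY_RUN=${DRY_RUN:-1}   # assumption: print the commands unless DRY_RUN=0

# Hypothetical wrapper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Check SSH connectivity first, then replication; stop at the first failure.
mha_preflight() {
    run masterha_check_ssh  --conf="$CONF" &&
    run masterha_check_repl --conf="$CONF"
}

mha_preflight
```

If both checks pass, the Manager can then be started with masterha_manager --conf=/usr/local/mha/ha1/ha1.cnf (adding --ignore_fail_on_start when ignore_fail=1 is used, as noted in the configuration comments), and masterha_check_status --conf=/usr/local/mha/ha1/ha1.cnf reports whether it is running.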
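Note: with the configuration above, make sure server1 and server2 always hold each other's latest binlog events. They are usually configured as dual masters with semi-synchronous replication, which guarantees this. Otherwise, if they lag far behind the master, applying the differential binlog will take a very long time.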
2. master_ip_failover
The VIP can be managed with keepalived or with a script. keepalived is demanding on network quality and is otherwise prone to split-brain; I covered its setup in my earlier article on building a dual-master environment. Here I use the script approach.
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port,
    $new_master_user,  $new_master_password
);

my $vip = '192.168.137.50/24';    # the VIP
my $key = '1';                    # alias number, distinguishes eth0:1 from eth0 itself
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig eth0:$key down";

GetOptions(
    'command=s'             => \$command,
    'ssh_user=s'            => \$ssh_user,
    'orig_master_host=s'    => \$orig_master_host,
    'orig_master_ip=s'      => \$orig_master_ip,
    'orig_master_port=i'    => \$orig_master_port,
    'new_master_host=s'     => \$new_master_host,
    'new_master_ip=s'       => \$new_master_ip,
    'new_master_port=i'     => \$new_master_port,
    'new_master_user=s'     => \$new_master_user,
    'new_master_password=s' => \$new_master_password,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        # Take the VIP down on the old master
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # Bring the VIP up on the new master
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# Bring the VIP up on the new master over SSH
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# Take the VIP down on the old master over SSH (skipped when no SSH user is given)
sub stop_vip() {
    return 0 unless ($ssh_user);
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
Note: the VIP must first be added manually on the master server:
/sbin/ifconfig eth0:1 192.168.137.50/24
3. master_ip_online_change
Perl script
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my $vip = '192.168.137.50/24';    # Virtual IP
my $key = "1";
my $ssh_start_vip = "/sbin/ifconfig eth0:$key $vip";
my $ssh_stop_vip  = "/sbin/ifconfig eth0:$key down";
my $exit_code     = 0;

my (
    $command,              $orig_master_is_new_slave, $orig_master_host,
    $orig_master_ip,       $orig_master_port,         $orig_master_user,
    $orig_master_password, $new_master_host,          $new_master_ip,
    $new_master_port,      $new_master_user,          $new_master_password,
);

GetOptions(
    'command=s'                => \$command,
    'orig_master_is_new_slave' => \$orig_master_is_new_slave,
    'orig_master_host=s'       => \$orig_master_host,
    'orig_master_ip=s'         => \$orig_master_ip,
    'orig_master_port=i'       => \$orig_master_port,
    'orig_master_user=s'       => \$orig_master_user,
    'orig_master_password=s'   => \$orig_master_password,
    'new_master_host=s'        => \$new_master_host,
    'new_master_ip=s'          => \$new_master_ip,
    'new_master_port=i'        => \$new_master_port,
    'new_master_user=s'        => \$new_master_user,
    'new_master_password=s'    => \$new_master_password,
);

exit &main();

sub main {
    #print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {
        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Disabling the VIP - $vip on old master: $orig_master_host\n";
            print "***************************************************************\n\n\n\n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "\n\n\n***************************************************************\n";
            print "Enabling the VIP - $vip on new master: $new_master_host \n";
            print "***************************************************************\n\n\n\n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        # The status check also (re)enables the VIP on the current master
        `ssh $orig_master_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enable the VIP on the new master
sub start_vip() {
    `ssh $new_master_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disable the VIP on the old_master
sub stop_vip() {
    `ssh $orig_master_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_online_change --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}
Shell script
#!/bin/bash
#source /root/.bash_profile

vip='192.168.137.50/24'    # Virtual IP
key='1'

# Arguments arrive as --name=value; keep the part after "=".
# The script relies on argument position: $1 command, $2 orig_master_host, $7 new_master_host.
command=$(echo "$1" | awk -F = '{print $2}')
orig_master_host=$(echo "$2" | awk -F = '{print $2}')
new_master_host=$(echo "$7" | awk -F = '{print $2}')

stop_vip="ssh root@$orig_master_host /sbin/ifconfig eth0:$key down"
start_vip="ssh root@$new_master_host /sbin/ifconfig eth0:$key $vip"

if [ "$command" = 'stop' ]
then
    echo -e "\n\n\n***************************************************************\n"
    echo -e "Disabling the VIP - $vip on old master: $orig_master_host\n"
    $stop_vip
    if [ $? -eq 0 ]
    then
        echo "Disabled the VIP successfully"
    else
        echo "Failed to disable the VIP"
    fi
    echo -e "***************************************************************\n\n\n\n"
fi

if [ "$command" = 'start' ] || [ "$command" = 'status' ]
then
    echo -e "\n\n\n***************************************************************\n"
    echo -e "Enabling the VIP - $vip on new master: $new_master_host \n"
    $start_vip
    if [ $? -eq 0 ]
    then
        echo "Enabled the VIP successfully"
    else
        echo "Failed to enable the VIP"
    fi
    echo -e "***************************************************************\n\n\n\n"
fi
4. send_report
#!/usr/bin/perl

#  Copyright (C) 2011 DeNA Co.,Ltd.
#
#  This program is free software; you can redistribute it and/or modify
#  it under the terms of the GNU General Public License as published by
#  the Free Software Foundation; either version 2 of the License, or
#  (at your option) any later version.
#
#  This program is distributed in the hope that it will be useful,
#  but WITHOUT ANY WARRANTY; without even the implied warranty of
#  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#  GNU General Public License for more details.
#
#  You should have received a copy of the GNU General Public License
#   along with this program; if not, write to the Free Software
#  Foundation, Inc.,
#  51 Franklin Street, Fifth Floor, Boston, MA  02110-1301  USA

## Note: This is a sample script and is not complete. Modify the script based on your environment.

use strict;
use warnings FATAL => 'all';
use Mail::Sender;
use Getopt::Long;

#new_master_host and new_slave_hosts are set only when recovering master succeeded
my ( $dead_master_host, $new_master_host, $new_slave_hosts, $subject, $body );
my $smtp      = 'smtp.163.com';
my $mail_from = 'xxxx';
my $mail_user = 'xxxxx';
my $mail_pass = 'xxxxx';
my $mail_to   = [ 'xxxx', 'xxxx' ];

GetOptions(
    'orig_master_host=s' => \$dead_master_host,
    'new_master_host=s'  => \$new_master_host,
    'new_slave_hosts=s'  => \$new_slave_hosts,
    'subject=s'          => \$subject,
    'body=s'             => \$body,
);

mailToContacts( $smtp, $mail_from, $mail_user, $mail_pass, $mail_to, $subject, $body );

sub mailToContacts {
    my ( $smtp, $mail_from, $user, $passwd, $mail_to, $subject, $msg ) = @_;
    open my $DEBUG, "> /tmp/monitormail.log"
      or die "Can't open the debug file:$!\n";
    my $sender = new Mail::Sender {
        ctype       => 'text/plain; charset=utf-8',
        encoding    => 'utf-8',
        smtp        => $smtp,
        from        => $mail_from,
        auth        => 'LOGIN',
        TLS_allowed => '0',
        authid      => $user,
        authpwd     => $passwd,
        to          => $mail_to,
        subject     => $subject,
        debug       => $DEBUG
    };
    $sender->MailMsg(
        {
            msg   => $msg,
            debug => $DEBUG
        }
    ) or print $Mail::Sender::Error;
    return 1;
}

# Do whatever you want here
exit 0;
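IV. Configuring relay log purging (on every Node)
(1) Add the following to the cnf configuration file of every Node:
relay_log_purge=0
During a switch, MHA relies on the relay logs to recover the slaves, so automatic relay log purging must be turned OFF and the relay logs purged manually instead.
By default, a slave deletes its relay logs automatically once the SQL thread has finished executing them. In an MHA environment, however, those relay logs may still be needed to recover other slaves, so automatic deletion has to be disabled. Purging the relay logs periodically then raises the issue of replication delay: on an ext3 file system, deleting a large file takes a noticeable amount of time and can cause serious replication delay. To avoid this, a hard link to the relay log is created temporarily, because on Linux deleting a large file through a hard link is very fast.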
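The purge_relay_logs tool from the Node toolkit performs this hard-link-based cleanup, and it is usually scheduled from cron on each Node. The sketch below only builds a crontab line. The function name purge_cron_line, the schedule, the log file path, and the work directory are all assumptions for illustration, and the root/root credentials are the monitoring account created earlier. Because hard links cannot cross file systems, the directory given to --workdir needs to be on the same file system as the relay logs.

```shell
#!/bin/bash
# Hypothetical helper: build a crontab line that runs purge_relay_logs.
# $1 = cron schedule, $2 = work directory on the same file system as the relay logs
purge_cron_line() {
    local schedule=$1 workdir=$2
    echo "$schedule /usr/local/mha/bin/purge_relay_logs --user=root --password=root" \
         "--disable_relay_log_purge --workdir=$workdir >> /tmp/purge_relay_logs.log 2>&1"
}

# Example: every day at 04:00; /mysql/purge_tmp is an assumed directory name.
purge_cron_line '0 4 * * *' /mysql/purge_tmp
```

Staggering the schedule between Nodes is a reasonable precaution, so that at any moment at least one slave still holds the relay logs needed for recovery.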
Tip: in a MySQL database, when dropping a large table, ...