Mysql高可用集群-解决MMM单点故障
Posted joinbestgo
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Mysql高可用集群-解决MMM单点故障相关的知识,希望对你有一定的参考价值。
preface:
MMM架构相比于MHA来说各方面都逊色不少,写这篇案例也算是整理下思路吧.
一.理论概述
MMM(Master-Master replication Manager for mysql)
是一套支持双主日常管理的脚本程序,使用Perl语言开发,主要用来监控和管理MySQL双主库复制,同一时刻只允许一个主库进行写入
方便的是,mmm不但可以为写库配置VIP,而且实现读库VIP,也可以节省一些构建读负载均衡及高可用的资源
实现的功能简单来说就是实现了主库的故障切换功能,同时也可以实现多个slave读操作的负载均衡
不适用的场景:对数据一致性要求很高不适用
组件介绍
- mmm-mond程序:监控进程,负责所有的监控工作,在管理服务器上运行
- mmm_agentd:运行在每个集群mysql节点上的,完成监控的探针工作和执行简单的远端服务设置
mmm_control:一个简单的管理脚本,用来查看和管理集群运行状态,同时管理mmm_mond进程。
二.环境
- 架构拓扑,本案例只针对于数据库集群示范
主机名称 | IP地址 | 角色 |
---|---|---|
master | 192.168.111.3 | 主库,负责写 |
masterba | 192.168.111.4 | 备用主库,主主同步 |
slave1 | 192.168.111.5 | 从库,读操作 |
slave2 | 192.168.111.6 | 从库,读操作;mmm-monitor |
writeVIP | 192.168.111.100 | |
readVIP | 192.168.111.200 |
实现思路:先部署主主高可用,安装mmm相关组件
三.部署
- 所有主机安装epel源
下载地址
https://mirrors.tuna.tsinghua.edu.cn/epel//
然后选择相应的版本下载并安装
yum clean all && yum makecache
- 基本环境
[[email protected] ~]# vim /etc/hosts
192.168.111.3 master
192.168.111.4 masterba
192.168.111.5 slave1
192.168.111.6 slave2
[[email protected] ~]# hostname master
[[email protected] ~]# bash
[[email protected] ~]# bash
[[email protected] ~]# uname -n
master
#全部修改为对应的
- 部署二进制包mysql5.7.24
yum -y install libaio
wget http://mirrors.sohu.com/mysql/MySQL-5.7/mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
useradd -M -s /sbin/nologin mysql
tar zxf mysql-5.7.24-linux-glibc2.12-x86_64.tar.gz
mv mysql-5.7.24-linux-glibc2.12-x86_64 /usr/local/mysql
chown -R mysql:mysql /usr/local/mysql
ln -s /usr/local/mysql/bin/* /usr/local/bin/
cp /usr/local/mysql/support-files/mysql.server /etc/init.d/mysqld
mysqld --user=mysql --basedir=/usr/local/mysql --datadir=/usr/local/mysql/data --initialize
#记住生成的随机密码待会启动服务之后修改
vim /etc/my.cnf
--------------------111.3--------------------------
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
symbolic-links=0
server-id=1
log-bin=mysql-bin
log_slave_updates=1
#将另一台备主同步数据给自身时,写入到binarylog
auto-increment-increment=2
auto-increment-offset=1
#auto_increment_increment:自增值
#auto_increment_offset:漂移值,也就是步长
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid
!includedir /etc/my.cnf.d
--------------------111.4--------------------------
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
symbolic-links=0
server-id=2
log-bin=mysql-bin
log_slave_updates=1
#将另一台备主同步数据给自身时,写入到binarylog
auto-increment-increment=2
auto-increment-offset=2
#auto_increment_increment:自增值
#auto_increment_offset:漂移值,也就是步长
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid
!includedir /etc/my.cnf.d
--------------------111.5--------------------------
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
symbolic-links=0
server-id=3
log-bin=mysql-bin
log_slave_updates=1
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid
!includedir /etc/my.cnf.d
--------------------111.6--------------------------
[mysqld]
datadir=/usr/local/mysql/data
socket=/tmp/mysql.sock
symbolic-links=0
server-id=4
log_slave_updates=1
log-bin=mysql-bin
[mysqld_safe]
log-error=/usr/local/mysql/data/mysql.log
pid-file=/usr/local/mysql/data/mysql.pid
!includedir /etc/my.cnf.d
[[email protected] ~]# /etc/init.d/mysqld start
[[email protected] ~]# mysqladmin -u root -p'BZn9B++V06qg' password '123456'
mysqladmin: [Warning] Using a password on the command line interface can be insecure.
Warning: Since password will be sent to server in plain text, use ssl connection to ensure password safety.
#修改默认密码
- 部署主主同步
--master:
[[email protected] ~]# mysql -u root -p123456
mysql> grant replication slave on *.* to 'myslave'@'192.168.111.%' identified by'123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 | 879 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
--masterba:
[[email protected] ~]# mysql -uroot -p123456
mysql> grant replication slave on *.* to 'myslave'@'192.168.111.%' identified by'123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.01 sec)
mysql> show master status;
+------------------+----------+--------------+------------------+-------------------+
| File | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql-bin.000002 | 879 | | | |
+------------------+----------+--------------+------------------+-------------------+
1 row in set (0.00 sec)
--master:
mysql> change master to master_host='192.168.111.4',master_user='myslave',master_password='123456',master_log_file='mysql-bin.000002',master_log_pos=879;
Query OK, 0 rows affected, 2 warnings (0.11 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.111.4
Master_User: myslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 879
Relay_Log_File: master-relay-bin.000002
Relay_Log_Pos: 320
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 879
Relay_Log_Space: 528
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 2
Master_UUID: 01008194-68b1-11e9-bf0b-000c294b0234
Master_Info_File: /usr/local/mysql/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
--masterba:
mysql> change master to master_host='192.168.111.3',master_user='myslave',master_password='123456',master_log_file='mysql-bin.000002',master_log_pos=879;
Query OK, 0 rows affected, 2 warnings (0.01 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
mysql> show slave status\\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: 192.168.111.3
Master_User: myslave
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000002
Read_Master_Log_Pos: 1036
Relay_Log_File: masterba-relay-bin.000002
Relay_Log_Pos: 477
Relay_Master_Log_File: mysql-bin.000002
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 1036
Relay_Log_Space: 687
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
Master_UUID: e13f8b12-7bda-11e9-b71b-000c2935c4a6
Master_Info_File: /usr/local/mysql/data/master.info
SQL_Delay: 0
SQL_Remaining_Delay: NULL
Slave_SQL_Running_State: Slave has read all relay log; waiting for more updates
Master_Retry_Count: 86400
Master_Bind:
Last_IO_Error_Timestamp:
Last_SQL_Error_Timestamp:
Master_SSL_Crl:
Master_SSL_Crlpath:
Retrieved_Gtid_Set:
Executed_Gtid_Set:
Auto_Position: 0
Replicate_Rewrite_DB:
Channel_Name:
Master_TLS_Version:
1 row in set (0.00 sec)
slave1:
mysql> change master to master_host='192.168.111.4',master_user='myslave',master_password='123456',master_log_file='mysql-bin.000002',master_log_pos=879;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
#slave2一样操作,一同指向masterba
- 部署mmm
我们在主库上做授权,由于现在是同步状态,所以授权信息其它数据库也会存在
mysql> grant replication client on *.* to 'mmm_monitor'@'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.11 sec)
#监控用户
mysql> grant super,replication client,process on *.* to 'mmm_agent'@'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.01 sec)
#agent用户
mysql> grant replication slave on *.* to 'myslave'@'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.00 sec)
#复制用户
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)
[[email protected] ~]# vim /etc/mysql-mmm/mmm_common.conf
#该文件所有节点配置都相同
active_master_role writer
<host default>
cluster_interface ens32
pid_path /run/mysql-mmm-agent.pid
bin_path /usr/libexec/mysql-mmm/
replication_user myslave
replication_password 123456
agent_user mmm_agent
agent_password 123456
#刚才所授权用户的账号及密码
</host>
<host db1>
ip 192.168.111.3
mode master
peer db2
</host>
<host db2>
ip 192.168.111.4
mode master
peer db1
</host>
<host db3>
ip 192.168.111.5
mode slave
</host>
<host db4>
ip 192.168.111.6
mode slave
</host>
#每个host是一个节点,按照文件格式配置
<role writer>
hosts db1, db2
ips 192.168.111.100
#写操作服务器的VIP
mode exclusive
#模式:同意时间只有一个节点可以使用资源
</role>
<role reader>
hosts db3, db4
ips 192.168.111.200
#读操作服务器的VIP,可以有多个,逗号分隔
mode balanced
#负载均衡模式
</role>
[[email protected] ~]# scp /etc/mysql-mmm/mmm_common.conf [email protected]:/etc/mysql-mmm/
[[email protected] ~]# scp /etc/mysql-mmm/mmm_common.conf [email protected]:/etc/mysql-mmm/
[[email protected] ~]# scp /etc/mysql-mmm/mmm_common.conf [email protected]:/etc/mysql-mmm/
在db1-4上修改mmm_agent.conf,只需要修改db1这里,是哪台就改成哪台,这里只给出db1的
[[email protected] ~]# vim /etc/mysql-mmm/mmm_agent.conf
include mmm_common.conf
# The 'this' variable refers to this server. Proper operation requires
# that 'this' server (db1 by default), as well as all other servers, have the
# proper IP addresses set in mmm_common.conf.
this db1
配置监控工作的服务器
[[email protected] ~]# vim /etc/mysql-mmm/mmm_mon.conf
8 ping_ips 192.168.111.2
#测试网络可用性的IP地址,一般指定网关
9 auto_set_online 60
#是否设置自动上线,如果该值大于0,抖动的主机在抖动的时间范围过后,则设置自动上线
20 monitor_user mmm_monitor
21 monitor_password 123456
#监控的用户和密码
全部节点启动agent
[[email protected] ~]# systemctl start mysql-mmm-agent.service
[[email protected] ~]# systemctl status mysql-mmm-agent.service
监控机器启动monitor
[[email protected] ~]# systemctl start mysql-mmm-monitor.service
[[email protected] ~]# systemctl status mysql-mmm-monitor.service
#检查集群状态,要全部是ONLINE才正确
[[email protected] ~]# mmm_control show
db1(192.168.111.3) master/ONLINE. Roles: writer(192.168.111.100)
db2(192.168.111.4) master/ONLINE. Roles:
db3(192.168.111.5) slave/ONLINE. Roles: reader(192.168.111.200)
db4(192.168.111.6) slave/ONLINE. Roles:
四.测试
- 查看VIP是否正确分配
写:
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.3/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.100/32 scope global ens32
读:
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.200/32 scope global ens32
- 测试VIP转移
[[email protected] ~]# /etc/init.d/mysqld stop
Shutting down MySQL............ SUCCESS!
[[email protected] ~]# ip a| grep ens32
-----写-----
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.3/24 brd 192.168.111.255 scope global noprefixroute ens32
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.4/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.100/32 scope global ens32
-----读-----
[[email protected] ~]# /etc/init.d/mysqld stop
Shutting down MySQL.... SUCCESS!
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.200/32 scope global ens32
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.200/32 scope global ens32
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.200/32 scope global ens32
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.5/24 brd 192.168.111.255 scope global noprefixroute ens32
[[email protected] ~]# ip a| grep ens32
2: ens32: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
inet 192.168.111.6/24 brd 192.168.111.255 scope global noprefixroute ens32
inet 192.168.111.200/32 scope global ens32
#切换的慢
[[email protected] ~]# mmm_control show
db1(192.168.111.3) master/HARD_OFFLINE. Roles:
db2(192.168.111.4) master/ONLINE. Roles: writer(192.168.111.100)
db3(192.168.111.5) slave/HARD_OFFLINE. Roles:
db4(192.168.111.6) slave/ONLINE. Roles: reader(192.168.111.200)
#离线的已经指明了
- 修复好的数据库服务器重新加入集群的方法
[[email protected] ~]# /etc/init.d/mysqld start
#登上两三分钟最多,比较慢,在monitor上查看状态
[[email protected] ~]# mmm_control show
db1(192.168.111.3) master/ONLINE. Roles:
db2(192.168.111.4) master/ONLINE. Roles: writer(192.168.111.100)
db3(192.168.111.5) slave/HARD_OFFLINE. Roles:
db4(192.168.111.6) slave/ONLINE. Roles: reader(192.168.111.200)
- 连接数据库VIP测试
mysql> grant all privileges on *.* to [email protected]'192.168.111.%' identified by '123456';
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> flush privileges;
Query OK, 0 rows affected (0.10 sec)
#主库上做下授权,从库也会同步授权信息
[[email protected] ~]# mysql -uroot -p123456 -h'192.168.111.100'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \\g.
Your MySQL connection id is 415
Server version: 5.7.24-log MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\\h' for help. Type '\\c' to clear the current input statement.
mysql>
[[email protected] ~]# mysql -uroot -p123456 -h'192.168.111.200'
mysql: [Warning] Using a password on the command line interface can be insecure.
Welcome to the MySQL monitor. Commands end with ; or \\g.
Your MySQL connection id is 1043
Server version: 5.7.24-log MySQL Community Server (GPL)
Copyright (c) 2000, 2018, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\\h' for help. Type '\\c' to clear the current input statement.
mysql>
#输入的必须都是VIP进行连接
五.总结
- 明显的切换速度太慢,案例环境还是没有多少数据的,以及系统环境也不是很复杂,生产环境慎用吧
- 可以做写的高可用,也可以用来负载均衡读的服务器
- 持之以恒
以上是关于Mysql高可用集群-解决MMM单点故障的主要内容,如果未能解决你的问题,请参考以下文章