High availability with corosync + DRBD + MySQL

Posted 2020-06-16

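Requirements:

1. The nodes can communicate directly within the same network segment.

2. Each node's name matches the output of uname -n, and every node name resolves to that node's IP address (configured locally in /etc/hosts).

3. SSH mutual trust (key-based login) between the nodes.

4. The clocks of the nodes are kept in sync.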
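Requirement 2 can be checked mechanically. The sketch below is a hypothetical helper (the function name `hosts_ip_for` is not part of any tool) that prints the IP address a hosts-format file assigns to a given name, assuming the usual "IP name alias..." layout of /etc/hosts.

```shell
# Hypothetical helper: print the IP that a hosts-format file maps to a name.
# Usage: hosts_ip_for NAME [HOSTS_FILE]
hosts_ip_for() {
    local name="$1" file="${2:-/etc/hosts}"
    awk -v n="$name" '
        {
            sub(/#.*/, "")                  # drop comments
            for (i = 2; i <= NF; i++)       # fields 2..NF are names and aliases
                if ($i == n) { print $1; exit }
        }
    ' "$file"
}
```

For example, `hosts_ip_for "$(uname -n)"` should print the node's own address on each node; empty output means the name does not resolve through /etc/hosts.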

Environment preparation:

Configuration on test1 (192.168.10.55)

1. Configure the IP address

# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

# uname -n
# hostname test1.local               # takes effect immediately, lost on reboot
# vim /etc/sysconfig/network         # permanent: set HOSTNAME here

3. Configure hostname resolution

# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2 Test communication by hostname

# ping test1.local
# ping test2.local

4. Configure SSH mutual trust

# ssh-keygen -t rsa -P ''
# ssh-copy-id -i .ssh/id_rsa.pub root@test2.local

5. Synchronize time with ntp

Add an ntpdate job to crontab that runs every 5 minutes to keep the server clocks in sync.

# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 > /dev/null 2>&1

Configuration on test2 (192.168.10.56)

1. Configure the IP address

# vim /etc/sysconfig/network-scripts/ifcfg-eth0

2. Configure the hostname

# uname -n
# hostname test2.local               # takes effect immediately, lost on reboot
# vim /etc/sysconfig/network         # permanent: set HOSTNAME here

3. Configure hostname resolution

# vim /etc/hosts
Add:
192.168.10.55 test1.local test1
192.168.10.56 test2.local test2

3.2 Test communication by hostname

# ping test1.local
# ping test1

4. Configure SSH mutual trust

# ssh-keygen -t rsa -P ''
# ssh-copy-id -i .ssh/id_rsa.pub root@test1.local

5. Synchronize time with ntp

Add an ntpdate job to crontab that runs every 5 minutes to keep the server clocks in sync.

# crontab -e
*/5 * * * * /sbin/ntpdate 192.168.10.1 > /dev/null 2>&1

Install and configure heartbeat

Installing heartbeat directly with yum on CentOS fails with a message that no matching package is available.

Fix: install the EPEL repository package first.

# wget http://mirrors.sohu.com/fedora-epel/6/i386/epel-release-6-8.noarch.rpm
# rpm -ivh epel-release-6-8.noarch.rpm

6.1 Install heartbeat:

# yum install heartbeat

6.2 Copy the sample configuration files:

# cp /usr/share/doc/heartbeat-3.0.4/{ha.cf,authkeys,haresources} /etc/ha.d/

6.3 Configure the authentication file:

# dd if=/dev/random count=1 bs=512 | md5sum   # generate a random string to use as the key
# vim /etc/ha.d/authkeys
auth 1
1 md5 d0f70c79eeca5293902aiamheartbeat
# chmod 600 /etc/ha.d/authkeys
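The three manual steps above (generate a random key, write the two authkeys lines, restrict the permissions) can be combined. This is a minimal sketch, not part of heartbeat itself: `write_authkeys` is a hypothetical name, and it reads /dev/urandom instead of /dev/random on the assumption that a non-blocking source is acceptable for this key.

```shell
# Hypothetical helper: write a heartbeat authkeys file with a random md5 key.
# Usage: write_authkeys [PATH]
write_authkeys() {
    local path="${1:-/etc/ha.d/authkeys}" key
    # Hash 512 random bytes to get a 32-character hex key.
    key=$(dd if=/dev/urandom bs=512 count=1 2>/dev/null | md5sum | awk '{print $1}')
    # Create the file with a restrictive umask, then force mode 600
    # in case the file already existed with looser permissions.
    ( umask 077; printf 'auth 1\n1 md5 %s\n' "$key" > "$path" )
    chmod 600 "$path"
}
```

Both nodes must end up with the same file, so generate it once and copy it to the other node rather than running the helper on each node.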

heartbeat is installed on test2 in the same way as on test1, so those steps are omitted here.

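6.4 Parameters of the main heartbeat configuration file:

# vim /etc/ha.d/ha.cf
#debugfile /var/log/ha-debug    # debug log
logfile /var/log/ha-log         # log file location
keepalive 2                     # interval between heartbeats; default 2 seconds, ms units are also accepted
deadtime 30                     # a node is declared dead after this many seconds without a heartbeat
warntime 10                     # a "late heartbeat" warning is logged after this many seconds without a heartbeat
initdead 120                    # how long the first node waits for the other nodes after it starts
baud 19200                      # baud rate of the serial link
auto_failback on                # whether resources move back once the failed node recovers
ping 10.10.10.254               # ping node: an outside host that is pinged to tell a network fault from a dead peer
ping_group group1 10.10.10.254 10.10.10.253   # ping node group: counts as reachable if any member answers
respawn hacluster /usr/lib/heartbeat/ipfail   # run ipfail as user hacluster and restart it if it exits; ipfail triggers failover when the ping nodes become unreachable
deadping 30                     # a ping node is declared dead after this many seconds without a reply
# serial serialportname ...     # which serial device to use
serial /dev/ttyS0               # Linux
serial /dev/cuaa0               # FreeBSD
serial /dev/cuad0               # FreeBSD 6.x
serial /dev/cua/a               # Solaris
# What interfaces to broadcast heartbeats over?   # on Ethernet, choose broadcast, multicast or unicast for the heartbeats
bcast eth0                      # broadcast
mcast eth0 225.0.0.1 694 1 0    # multicast
ucast eth0 192.168.1.2          # unicast; only used when there are exactly two nodes
# define the stonith hosts
stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
stonith_host ken3  rps10 /dev/ttyS1 kathy 0
stonith_host kathy rps10 /dev/ttyS1 ken3 0
# Tell what machines are in the cluster   # one "node <hostname>" line per node; the hostname must match uname -n
node ken3
node kathy

In most cases it is enough to define how heartbeats are sent and which nodes belong to the cluster:

bcast eth0
node test1.local
node test2.local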

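6.5 Define the haresources resource configuration file:

# vim /etc/ha.d/haresources
#node1 10.0.0.170 Filesystem::/dev/sda1::/data1::ext2   # fields: hostname of the preferred primary node (must match uname -n), the VIP, then the resources; here a device to mount, its mount point and its filesystem type. Parameters of a resource are separated by double colons
#just.linux-ha.org 135.9.216.110 http   # same layout; resource scripts are looked up in /etc/ha.d/resource.d/ first and then in /etc/rc.d/init.d/
test1.local IPaddr::192.168.10.2/24/eth0 mysqld   # the IPaddr script configures the VIP, then mysqld is started
#test1.local IPaddr::192.168.10.2/24/eth0 drbddisk::data Filesystem::/dev/drbd1::/data::ext3 mysqld   # variant that also promotes a DRBD resource and mounts it; use it instead of the line above, not together with it

6.6 Copy the configuration files from test1.local to test2.local

# scp -p /etc/ha.d/{ha.cf,haresources,authkeys} test2.local:/etc/ha.d/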

7. Start heartbeat

# service heartbeat start
# ssh test2.local 'service heartbeat start'   # start heartbeat on test2 over ssh from test1

7.1 Check the heartbeat startup log

# tail -f /var/log/messages
Feb 16 15:12:45 test-1 heartbeat: [16056]: info: Configuration validated. Starting heartbeat 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: heartbeat: version 3.0.4
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Heartbeat generation: 1455603909
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat started on port 694 (694) interface eth0
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: UDP Broadcast heartbeat closed on port 694 interface eth0 - Status: 1
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: glib: ping heartbeat started.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_TriggerHandler: Added signal manual handler
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: G_main_add_SignalHandler: Added signal handler for signal 17
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Local status now set to: 'up'
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link 192.168.10.1:192.168.10.1 up.
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Status update for node 192.168.10.1: status ping
Feb 16 15:12:45 test-1 heartbeat: [16057]: info: Link test1.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Link test2.local:eth0 up.
Feb 16 15:12:51 test-1 heartbeat: [16057]: info: Status update for node test2.local: status up
Feb 16 15:12:51 test-1 harc(default)[16068]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:12:52 test-1 heartbeat: [16057]: WARN: 1 lost packet(s) for [test2.local] [3:5]
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: No pkts missing from test2.local!
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Comm_now_up(): updating status to active
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Local status now set to: 'active'
Feb 16 15:12:52 test-1 heartbeat: [16057]: info: Status update for node test2.local: status active
Feb 16 15:12:52 test-1 harc(default)[16086]: info: Running /etc/ha.d//rc.d/status status
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: local resource transition completed.
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: Initial resource acquisition complete (T_RESOURCES(us))
Feb 16 15:13:02 test-1 heartbeat: [16057]: info: remote resource transition completed.
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16138]: INFO:  Resource is stopped
Feb 16 15:13:02 test-1 heartbeat: [16102]: info: Local Resource acquisition completed.
Feb 16 15:13:02 test-1 harc(default)[16219]: info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp
Feb 16 15:13:02 test-1 ip-request-resp(default)[16219]: received ip-request-resp IPaddr::192.168.10.2/24/eth0 OK yes
Feb 16 15:13:02 test-1 ResourceManager(default)[16238]: info: Acquiring resource group: test1.local IPaddr::192.168.10.2/24/eth0 mysqld
Feb 16 15:13:02 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16264]: INFO:  Resource is stopped
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/ha.d/resource.d/IPaddr 192.168.10.2/24/eth0 start
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Adding inet address 192.168.10.2/24 with broadcast address 192.168.10.255 to device eth0
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: Bringing device eth0 up
Feb 16 15:13:03 test-1 IPaddr(IPaddr_192.168.10.2)[16386]: INFO: /usr/libexec/heartbeat/send_arp -i 200 -r 5 -p /var/run/resource-agents/send_arp-192.168.10.2 eth0 192.168.10.2 auto not_used not_used
Feb 16 15:13:03 test-1 /usr/lib/ocf/resource.d//heartbeat/IPaddr(IPaddr_192.168.10.2)[16360]: INFO:  Success
Feb 16 15:13:03 test-1 ResourceManager(default)[16238]: info: Running /etc/init.d/mysqld  start
Feb 16 15:13:04 test-1 ntpd[1605]: Listen normally on 15 eth0 192.168.10.2 UDP 123

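Notes:

1. "Link test1.local:eth0 up" and "Link test2.local:eth0 up": the heartbeat links of both nodes are up.

2. "Link 192.168.10.1:192.168.10.1 up": the ping node is reachable.

3. "Running /etc/init.d/mysqld start": heartbeat has started the mysqld resource.

4. "Listen normally on 15 eth0 192.168.10.2 UDP 123": ntpd is now listening on 192.168.10.2, which shows that the VIP has been brought up.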

7.2 Check the heartbeat VIP

On test1.local:

# ip add |grep "10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

On test2.local:

# ip add |grep "10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0

Note: the VIP is currently on test1.local; test2.local does not have it.
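The pattern "10.2" used above also matches the broadcast address 192.168.10.255, which is why the node without the VIP still prints a line. A stricter check is sketched below; `has_addr` is a hypothetical helper that reads `ip addr` output on stdin and matches only the exact address.

```shell
# Hypothetical helper: succeed if the given IPv4 address is configured,
# judging from `ip addr` output supplied on stdin.
# Usage: ip addr show eth0 | has_addr 192.168.10.2
has_addr() {
    local ip="$1"
    # Fixed-string match on "inet <ip>/" so that 192.168.10.2 does not
    # match 192.168.10.20 or the broadcast address 192.168.10.255.
    grep -qF "inet ${ip}/"
}
```

On the node that holds the VIP, `ip addr show eth0 | has_addr 192.168.10.2 && echo "VIP is here"` prints the message; on the other node it prints nothing.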

8. Test the result

8.1 Connect to mysql under normal conditions

# mysql -uroot -h'192.168.10.2' -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 1     |
+---------------+-------+
1 row in set (0.00 sec)
mysql>

8.2 Stop heartbeat on test1.local

# service heartbeat stop
Stopping High-Availability services: Done.

On test1.local:

# ip add |grep "192.168.10.2"
inet 192.168.10.55/24 brd 192.168.10.255 scope global eth0

On test2.local:

# ip add |grep "192.168.10.2"
inet 192.168.10.56/24 brd 192.168.10.255 scope global eth0
inet 192.168.10.2/24 brd 192.168.10.255 scope global secondary eth0

Note: the VIP has now moved to test2.local. Connect to mysql again and check server_id:

# mysql -uroot -h'192.168.10.2' -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 2
Server version: 5.5.44 Source distribution
Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.
Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
mysql> show variables like 'server_id';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| server_id     | 2     |
+---------------+-------+
1 row in set (0.00 sec)
mysql>

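Note: server_id has changed from 1 to 2, which shows that the connection now reaches the mysql service on test2.local.

Testing is complete. Next, DRBD is configured so that both mysql servers use the same replicated filesystem, which makes mysql writes highly available.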

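9. Configure DRBD

DRBD (Distributed Replicated Block Device) is a distributed replicated block device implemented as a Linux kernel module. As a disk mirror it normally runs in a primary/secondary layout: only the primary node may read and write the device, and the secondary node can neither read, write nor mount it.

DRBD also has a dual-primary mode, and the primary and secondary roles can be switched. DRBD turns a disk or partition on each of two hosts into one mirrored device. When a client program sends a write to the primary node, the data is also replicated block by block over TCP/IP to the secondary node,

so whatever is stored on the primary node also exists as an identical copy on the secondary node. The mirroring happens between two hosts and is done by the DRBD kernel module, unlike a RAID 1 mirror, which is built within a single host.

How the dual-primary model works: when a node accesses data it loads the data and metadata into memory, and the locks its kernel places on a file are invisible to the kernel of the other node. Dual-primary access is only safe if each node can tell the other node's kernel about the locks it holds.

For that we need a message layer (heartbeat or corosync), pacemaker (with DRBD defined as a resource), and a cluster filesystem (GFS2/OCFS2) on the mirrored device of the two hosts.

The dual-primary model is therefore built on a distributed lock manager (DLM, Distributed Lock Manager) combined with a cluster filesystem. In DRBD 8.4, the version used here, a DRBD cluster has exactly two nodes, running either dual-primary or primary/secondary.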

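9.1 The three DRBD replication protocols

A. Asynchronous: a write is reported to the application as complete once the local DRBD device has stored it, without waiting for the peer. High efficiency and good performance, but data can be lost.

B. Semi-synchronous: a write is reported as complete once it has been stored locally and all of the data has been sent over TCP/IP to the secondary DRBD node. Rarely used.

C. Synchronous: a write is reported as complete only after it has been stored locally, sent over TCP/IP, and stored on the secondary DRBD node as well. Lowest efficiency and performance, but the safest and most reliable, and the most widely used.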

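9.2 DRBD resources

1. Resource name: any ASCII characters except whitespace.

2. DRBD device: the device file of this DRBD device on both nodes, usually /dev/drbdN. The major number is always 147; the minor number distinguishes the individual devices.

3. Disk configuration: the storage that each node provides. It can be a partition, an LVM volume, or any other kind of block device.

4. Network configuration: the network settings used when the two sides synchronize data.

9.3 Install DRBD

DRBD was merged into the mainline kernel only in 2.6.33. The kernel used here is 2.6.32, so the module has to be built separately.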

9.3.1 Download drbd

# wget -P /usr/local/src http://oss.linbit.com/drbd/8.4/drbd-8.4.3.tar.gz

9.3.2 Install the drbd software

# cd /usr/local/src
# tar -zxvf drbd-8.4.3.tar.gz
# cd /usr/local/src/drbd-8.4.3
# ./configure --prefix=/usr/local/drbd --with-km
# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64
# make install
# mkdir -p /usr/local/drbd/var/run/drbd
# cp /usr/local/drbd/etc/rc.d/init.d/drbd /etc/rc.d/init.d/

9.3.3 Install the drbd kernel module

# cd drbd/
# make clean
# make KDIR=/usr/src/kernels/2.6.32-573.18.1.el6.x86_64
# cp drbd.ko /lib/modules/`uname -r`/kernel/lib/
# depmod -a                      # rebuild the module index so that modprobe can find drbd.ko
# modprobe drbd
# lsmod | grep drbd

9.3.4 Create a new partition for drbd

# fdisk /dev/sdb
WARNING: DOS-compatible mode is deprecated. It's strongly recommended to
switch off the mode (command 'c') and change display units to
sectors (command 'u').
Command (m for help): n
Command action
e   extended
p   primary partition (1-4)
p
Partition number (1-4): 1
First cylinder (1-1305, default 1): 
Using default value 1
Last cylinder, +cylinders or +size{K,M,G} (1-1305, default 1305): +9G
Command (m for help): w
The partition table has been altered!
Calling ioctl() to re-read partition table.
WARNING: Re-reading the partition table failed with error 16: Device or resource busy.
The kernel still uses the old table.
The new table will be used at the next reboot.
Syncing disks.
# partprobe /dev/sdb

The drbd installation and partitioning steps on test2 are the same as on test1 and are omitted here. The drbd configuration files on test2 must be identical to those on test1; copy them over with scp.

10. Configure drbd

10.1 Edit the common drbd configuration file

# cd /usr/local/drbd/etc/drbd.d
# vim global_common.conf
global {                    # global settings
    usage-count no;         # whether to take part in the vendor's online usage statistics
    # minor-count dialog-refresh disable-ip-verification
}
common {                    # defaults shared by all resources
    protocol C;             # use protocol C (synchronous replication) by default
    handlers {              # scripts that drbd runs when specific failure events occur
        # These are EXAMPLE handlers only.
        # They may have severe implications,
        # like hard resetting the node under certain circumstances.
        # Be careful when chosing your poison.
        pri-on-incon-degr "/usr/lib/drbd/notify-pri-on-incon-degr.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";    # node is primary, degraded, and its local data is inconsistent
        pri-lost-after-sb "/usr/lib/drbd/notify-pri-lost-after-sb.sh; /usr/lib/drbd/notify-emergency-reboot.sh; echo b > /proc/sysrq-trigger ; reboot -f";    # action after a split brain
        local-io-error "/usr/lib/drbd/notify-io-error.sh; /usr/lib/drbd/notify-emergency-shutdown.sh; echo o > /proc/sysrq-trigger ; halt -f";    # action after a local I/O error
        # fence-peer "/usr/lib/drbd/crm-fence-peer.sh";
        # split-brain "/usr/lib/drbd/notify-split-brain.sh root";
        # out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        # before-resync-target "/usr/lib/drbd/snapshot-resync-target-lvm.sh -p 15 -- -c 16k";
        # after-resync-target /usr/lib/drbd/unsnapshot-resync-target-lvm.sh;
    }
    startup {
        # wfc-timeout degr-wfc-timeout outdated-wfc-timeout wait-after-sb    # how long a node waits for its peer at startup, and related timeouts
    }
    options {
        # cpu-mask on-no-data-accessible
    }
    disk {
        on-io-error detach;    # on an I/O error, detach the backing disk and stop replicating to it
        # size max-bio-bvecs on-io-error fencing disk-barrier disk-flushes
        # disk-drain md-flushes resync-rate resync-after al-extents
        # c-plan-ahead c-delay-target c-fill-target c-max-rate
        # c-min-rate disk-timeout
    }
    net {                   # network settings: buffer sizes, timeouts, peer authentication
        # protocol timeout max-epoch-size max-buffers unplug-watermark
        # connect-int ping-int sndbuf-size rcvbuf-size ko-count
        # allow-two-primaries cram-hmac-alg shared-secret after-sb-0pri
        # after-sb-1pri after-sb-2pri always-asbp rr-conflict
        # ping-timeout data-integrity-alg tcp-cork on-congestion
        # congestion-fill congestion-extents csums-alg verify-alg
        # use-rle
        cram-hmac-alg "sha1";             # HMAC algorithm used to authenticate the peer
        shared-secret "mydrbd1fa2jg8";    # shared secret for that authentication
    }
    syncer {
        rate 200M;          # data transfer rate used for synchronization
    }
}

10.2 Create the resource file; name the file after the resource it defines

# vim mydrbd.res
resource mydrbd {                   # resource name: any ASCII characters except whitespace
    on test1.local {                # node 1; each node must be able to resolve the other node by name
        device /dev/drbd0;          # name of the drbd device file
        disk /dev/sdb1;             # backing partition
        address 192.168.10.55:7789; # node IP address and listening port
        meta-disk internal;         # where the drbd metadata is stored; internal means on the backing device itself
    }
    on test2.local {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 192.168.10.56:7789;
        meta-disk internal;
    }
}

10.3 Both nodes use identical configuration files; copy them to the other node

# scp -r /usr/local/drbd/etc/drbd.* test2.local:/usr/local/drbd/etc/

10.4 Initialize the defined resource on each node

Run on both test1.local and test2.local; the output is the same on each:

# drbdadm create-md mydrbd
--== Thank you for participating in the global usage survey ==--
The server's response is:
Writing meta data...
initializing activity log
NOT initialized bitmap
New drbd meta data block successfully created.

10.5 Start the drbd service on both nodes

# service drbd start

11. Test drbd synchronization

11.1 Check the drbd status after startup

Both nodes are Secondary at this point; one of them is promoted to primary later. Inconsistent means that the data has not been synchronized yet.

# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2016-02-23 10:23:03
0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:0
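In this output, cs is the connection state, ro the roles and ds the disk states, each pair written as local/peer. When scripting against /proc/drbd it helps to pull out a single field. The following is a sketch only: `drbd_field` is a hypothetical helper, and it assumes the 8.4 status layout shown above, where the status line starts with the minor number followed by a colon.

```shell
# Hypothetical helper: print one status field (cs, ro or ds) of a drbd minor.
# Usage: drbd_field FIELD [MINOR] [STATUS_FILE]
drbd_field() {
    local field="$1" minor="${2:-0}" file="${3:-/proc/drbd}"
    awk -v f="$field" -v m="$minor:" '
        $1 == m {                                   # status line of this minor
            for (i = 2; i <= NF; i++)
                if (index($i, f ":") == 1) {        # token starts with "FIELD:"
                    print substr($i, length(f) + 2) # text after the colon
                    exit
                }
        }
    ' "$file"
}
```

With the status above, `drbd_field ro` prints the role pair and `drbd_field ds` prints the disk-state pair of minor 0.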

11.2 Promote one node to primary and overwrite the drbd data on the peer. Run this on the node that is to become primary:

# drbdadm -- --overwrite-data-of-peer primary mydrbd

11.3 Check the synchronization status on the primary node

# watch -n 1 cat /proc/drbd
Every 1.0s: cat /proc/drbd        Tue Feb 23 17:10:55 2016
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2016-02-23 10:23:03
0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r---n-
ns:619656 nr:0 dw:0 dr:627840 al:0 bm:37 lo:1 pe:8 ua:64 ap:0 ep:1 wo:b oos:369144
[=============>.......] sync'ed: 10.3% (369144/987896)K
finish: 0:00:12 speed: 25,632 (25,464) K/sec

11.4 Check the status on the secondary node

# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2016-02-22 16:05:34
0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0
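Instead of watching /proc/drbd by hand, a script can poll until both disk states are UpToDate before it continues. This is a minimal sketch under the same assumption about the 8.4 status layout; `wait_drbd_sync` is a hypothetical name, it only looks at minor 0, and the timeout is counted in one-second polls.

```shell
# Hypothetical helper: wait until minor 0 reports ds:UpToDate/UpToDate.
# Returns 0 once in sync, 1 if the timeout (in seconds) is reached first.
# Usage: wait_drbd_sync [TIMEOUT_SECONDS] [STATUS_FILE]
wait_drbd_sync() {
    local timeout="${1:-600}" file="${2:-/proc/drbd}" waited=0
    until grep -Eq '^ *0: .*ds:UpToDate/UpToDate' "$file"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
}
```

A typical use would be `wait_drbd_sync 900 && echo "initial sync finished"` on either node after the promotion in step 11.2.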

11.5 On the primary node, create a filesystem on the device, mount it, and write test data

# mke2fs -j /dev/drbd0
# mkdir /mydrbd
# mount /dev/drbd0 /mydrbd
# cd /mydrbd
# touch drbd_test_file
# ls /mydrbd/
drbd_test_file  lost+found

11.6 Demote the primary node to secondary, promote the secondary node to primary, and check that the data was replicated

11.6.1 On the primary node

11.6.1.1 Unmount the filesystem. Leave the mount directory first, otherwise umount reports that the device is busy.

# cd ~
# umount /mydrbd
# drbdadm secondary mydrbd

11.6.1.2 Check the current drbd status

# cat /proc/drbd 
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2016-02-22 16:05:34
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:4 nr:9728024 dw:9728028 dr:1025 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: both drbd nodes are now Secondary. Next, the former secondary node is promoted to primary.

11.6.2 On the secondary node

11.6.2.1 Promote the node

# drbdadm primary mydrbd

11.6.2.2 Mount the drbd device

# mkdir /mydrbd
# mount /dev/drbd0 /mydrbd/

11.6.2.3 Check that the data is there

# ls /mydrbd/
drbd_test_file  lost+found

Note: after the secondary node has become primary, the data written on the other node is present. The drbd setup is complete. Next, it is combined with corosync and mysql for dual-master high availability.
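The order of the manual switchover matters: the old primary must unmount and demote before the other node promotes and mounts. The sketch below only prints that sequence for a given resource, device and mount point so that it can be reviewed or pasted; `drbd_switchover_plan` is a hypothetical helper and does not run anything itself.

```shell
# Hypothetical helper: print the manual switchover steps in the required order.
# It only echoes the commands; run each group on the node named in the comment.
# Usage: drbd_switchover_plan RESOURCE DEVICE MOUNTPOINT
drbd_switchover_plan() {
    local res="$1" dev="$2" mnt="$3"
    cat <<EOF
# on the current primary
umount $mnt
drbdadm secondary $res
# on the node being promoted
drbdadm primary $res
mount $dev $mnt
EOF
}
```

For the setup in this article the call would be `drbd_switchover_plan mydrbd /dev/drbd0 /mydrbd`.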

12. Combine corosync + drbd + mysql for a dual-master highly available database

Configuring drbd as a resource of a two-node corosync high-availability cluster allows the primary and secondary roles to switch automatically. Note that a service managed as a cluster resource must never be started automatically at boot.

12.1 Disable drbd autostart at boot on both nodes

12.1.1 On the primary node

# chkconfig drbd off
# chkconfig --list |grep drbd
drbd          0:off    1:off    2:off    3:off    4:off    5:off    6:off

12.1.2 On the secondary node

# chkconfig drbd off
# chkconfig --list |grep drbd
drbd           0:off    1:off    2:off    3:off    4:off    5:off    6:off

12.2 Unmount the drbd filesystem and demote the primary node to secondary

12.2.1 On the secondary node. Note that this node was promoted to primary in step 11.6, so it is demoted again here.

# umount /mydrbd/
# drbdadm secondary mydrbd
# cat /proc/drbd 
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by [email protected], 2016-02-22 16:05:34
0: cs:Connected ro:Secondary/Secondary ds:UpToDate/UpToDate C r-----
ns:8 nr:9728024 dw:9728032 dr:1073 al:1 bm:577 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0

Note: make sure that both nodes are Secondary.

12.3 Stop the drbd service on both nodes

12.3.1 On the secondary node

# service drbd stop
Stopping all DRBD resources: .

12.3.2 On the primary node

# service drbd stop
Stopping all DRBD resources: .

12.4 Install corosync and create the log directory

12.4.1 On the primary node

# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
# yum install corosync pacemaker crmsh
# mkdir /var/log/cluster

12.4.2 On the secondary node

# wget -P /etc/yum.repos.d http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo
# mkdir /var/log/cluster
# yum install corosync pacemaker crmsh

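12.5 The corosync configuration file

12.5.1 On the primary node

# cd /etc/corosync/
# cp corosync.conf.example corosync.conf

12.6 Edit the configuration file on the primary node, generate the corosync key file, and copy both the key file and the main configuration file to the secondary node

12.6.1 On the primary node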

# vim corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
    version: 2

    # secauth: Enable mutual node authentication. If you choose to
    # enable this ("on"), then do remember to create a shared
    # secret with "corosync-keygen".
    secauth: on

    threads: 2

    # interface: define at least one interface to communicate
    # over. If you define more than one interface stanza, you must
    # also set rrp_mode.
    interface {
        # Rings must be consecutively numbered, starting at 0.
        ringnumber: 0
        # This is normally the *network* address of the
        # interface to bind to. This ensures that you can use
        # identical instances of this configuration file
        # across all your cluster nodes, without having to
        # modify this option.
        bindnetaddr: 192.168.10.0
        # However, if you have multiple physical network
        # interfaces configured for the same subnet, then the
        # network address alone is not sufficient to identify
        # the interface Corosync should bind to. In that case,
        # configure the *host* address of the interface
        # instead:
        # bindnetaddr: 192.168.10.0
        # When selecting a multicast address, consider RFC
        # 2365 (which, among other things, specifies that
        # 239.255.x.x addresses are left to the discretion of
        # the network administrator). Do not reuse multicast
        # addresses across multiple Corosync clusters sharing
        # the same network.
        mcastaddr: 239.212.16.19
        # Corosync uses the port you specify here for UDP
        # messaging, and also the immediately preceding
        # port. Thus if you set this to 5405, Corosync sends
        # messages over UDP ports 5405 and 5404.
        mcastport: 5405
        # Time-to-live for cluster communication packets. The
        # number of hops (routers) that this ring will allow
        # itself to pass. Note that multicast routing must be
        # specifically enabled on most network routers.
        ttl: 1    # cluster packets may not pass through a router
    }
}

logging {
    # Log the source file
    ...
