Corosync+Pacemaker+DRBD+NFS High-Availability Example Configuration
Posted by koumm
Environment:
Operating system: CentOS 6.6 x64. Corosync, Pacemaker, DRBD, and NFS are all installed from RPM packages.
This article is a counterpart to the earlier Heartbeat-based setup and implements the same functionality; which approach is better depends on your requirements and on which stack you understand more thoroughly. See: Heartbeat+DRBD+NFS high-availability example configuration, http://koumm.blog.51cto.com/703525/1737702
I. Base configuration on both nodes
1. Configure the hosts file and hostnames on app1 and app2.
[root@app1 soft]# vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.0.24 app1
192.168.0.25 app2
10.10.10.24 app1-priv
10.10.10.25 app2-priv
Note: the 10.10.10.x addresses form the heartbeat/replication network, the 192.168.0.x addresses form the service network, and the VIP is 192.168.0.26.
2. Disable SELinux and the firewall
sed -i '/SELINUX/s/enforcing/disabled/' /etc/selinux/config
setenforce 0
chkconfig iptables off
service iptables stop
3. Set up SSH trust between the nodes. This is optional, but it makes administration easier.
app1:
[root@app1 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@app1 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app2
app2:
[root@app2 ~]# ssh-keygen -t rsa -f ~/.ssh/id_rsa -P ''
[root@app2 ~]# ssh-copy-id -i .ssh/id_rsa.pub root@app1
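To confirm the trust works, each node should be able to run a remote command without a password prompt, for example:
[root@app1 ~]# ssh app2 hostname
app2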
II. DRBD installation and configuration
1. Prepare the disk partitions on app1 and app2 (the hosts file was already configured above).
app1: /dev/sdb1 —> app2: /dev/sdb1
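If /dev/sdb1 does not exist yet, it can be created on each node first. A minimal sketch, assuming /dev/sdb is an empty disk dedicated to DRBD (do not format the partition; DRBD uses it raw):
# fdisk /dev/sdb        # n, p, 1, accept the defaults, w  -> creates /dev/sdb1
# partprobe /dev/sdb    # re-read the partition table
# fdisk -l /dev/sdb     # verify that /dev/sdb1 is listed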
2. Install DRBD on app1 and app2
(1) Download the DRBD packages. On CentOS 6.6, the kmod-drbd84-8.4.5-504.1 kernel module package is the one that works. The packages can be found at:
http://rpm.pbone.net/
drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm
kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
# rpm -ivh drbd84-utils-8.9.1-1.el6.elrepo.x86_64.rpm kmod-drbd84-8.4.5-504.1.el6.x86_64.rpm
Preparing...                ########################################### [100%]
   1:drbd84-utils           ########################################### [ 50%]
   2:kmod-drbd84            ########################################### [100%]
Working. This may take some time ...
Done.
#
(2) Load the DRBD kernel module
Run this on both app1 and app2, and add it to /etc/rc.local so it is loaded on boot.
modprobe drbd
lsmod | grep drbd
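A simple way to make the module load persist across reboots (assuming the default, executable /etc/rc.local on CentOS 6):
echo "modprobe drbd" >> /etc/rc.local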
3. Create the configuration file. The configuration is identical on node 1 and node 2.
[root@app1 ~]# vi /etc/drbd.d/global_common.conf
global {
    usage-count no;
}
common {
    protocol C;
    disk {
        on-io-error detach;
        no-disk-flushes;
        no-md-flushes;
    }
    net {
        sndbuf-size 512k;
        max-buffers 8000;
        unplug-watermark 1024;
        max-epoch-size 8000;
        cram-hmac-alg "sha1";
        shared-secret "hdhwXes23sYEhart8t";
        after-sb-0pri disconnect;
        after-sb-1pri disconnect;
        after-sb-2pri disconnect;
        rr-conflict disconnect;
    }
    syncer {
        rate 300M;
        al-extents 517;
    }
}
resource data {
    on app1 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.24:7788;
        meta-disk internal;
    }
    on app2 {
        device /dev/drbd0;
        disk /dev/sdb1;
        address 10.10.10.25:7788;
        meta-disk internal;
    }
}
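Before creating any metadata, the configuration syntax can be sanity-checked on each node; drbdadm prints the parsed resource, or complains if the file cannot be parsed:
# drbdadm dump data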
4. Initialize the resource
Run on both app1 and app2:
# drbdadm create-md data
initializing activity log
NOT initializing bitmap
Writing meta data...
New drbd meta data block successfully created.
5. Start the service
Run on both app1 and app2 (or use drbdadm up data):
# service drbd start
Starting DRBD resources: [
     create res: data
   prepare disk: data
    adjust disk: data
     adjust net: data
]
..........
#
6. Check the startup status; both nodes should be in the Secondary role.
cat /proc/drbd    # or simply run drbd-overview
Node 1:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
Node 2:
[root@app2 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Secondary/Secondary ds:Inconsistent/Inconsistent C r-----
    ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:20964116
7. Promote one node to Primary
Set one node to Primary by running the following command on it:
drbdadm -- --overwrite-data-of-peer primary data
Check the synchronization progress on the primary node:
[root@app1 drbd.d]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:SyncSource ro:Primary/Secondary ds:UpToDate/Inconsistent C r-----
    ns:1229428 nr:0 dw:0 dr:1230100 al:0 bm:0 lo:0 pe:2 ua:0 ap:0 ep:1 wo:d oos:19735828
    [>...................] sync'ed:  5.9% (19272/20472)M
    finish: 0:27:58 speed: 11,744 (11,808) K/sec
[root@app1 drbd.d]#
8. Create the file system
The file system can only be mounted on the Primary node, and the DRBD device can only be formatted after a Primary has been set. Format it and do a manual mount test.
[root@app1 ~]# mkfs.ext4 /dev/drbd0
[root@app1 ~]# mount /dev/drbd0 /data
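Note that the /data mount point must exist on both nodes, and the manual test mount should be undone before Pacemaker takes over management of the device. A minimal sketch:
# mkdir -p /data          # on both app1 and app2
# umount /data            # undo the manual test mount on app1
# drbdadm secondary data  # optionally demote app1 again so the cluster starts from a clean state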
III. Install and configure NFS
1. Configure the NFS export on app1 and app2
# vi /etc/exports
/data 192.168.0.0/24(rw,no_root_squash)
2. Start the NFS services on app1 and app2 and enable them at boot
# service rpcbind start
# service nfs start
# chkconfig rpcbind on
# chkconfig nfs on
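The export can be checked locally on each node (showmount is part of nfs-utils):
# showmount -e localhost    # should list /data exported to 192.168.0.0/24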
IV. Corosync + Pacemaker
1. Install corosync and pacemaker on app1 and app2
# yum install corosync pacemaker -y
2. Install crmsh on app1 and app2
Since RHEL 6.4, the crmsh command-line cluster configuration tool is no longer shipped, so crmsh must be installed separately in order to manage cluster resources.
crmsh RPMs can be downloaded from: http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/
[root@app1 crm]# yum install python-dateutil -y
Note: python-pssh and pssh depend on the python-dateutil package.
[root@app1 crm]# rpm -ivh pssh-2.3.1-4.2.x86_64.rpm python-pssh-2.3.1-4.2.x86_64.rpm crmsh-2.1-1.6.x86_64.rpm
warning: pssh-2.3.1-4.2.x86_64.rpm: Header V3 RSA/SHA1 Signature, key ID 17280ddf: NOKEY
Preparing...                ########################################### [100%]
   1:python-pssh            ########################################### [ 33%]
   2:pssh                   ########################################### [ 67%]
   3:crmsh                  ########################################### [100%]
[root@app1 crm]#
3. Create the corosync configuration file; it is identical on app1 and app2.
cd /etc/corosync/
cp corosync.conf.example corosync.conf
vi /etc/corosync/corosync.conf
# Please read the corosync.conf.5 manual page
compatibility: whitetank

totem {
    version: 2
    secauth: on
    threads: 0
    interface {
        ringnumber: 0
        bindnetaddr: 10.10.10.0
        mcastaddr: 226.94.8.8
        mcastport: 5405
        ttl: 1
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    to_syslog: no
    logfile: /var/log/cluster/corosync.log
    debug: off
    timestamp: on
    logger_subsys {
        subsys: AMF
        debug: off
    }
}

amf {
    mode: disabled
}

service {
    ver: 1
    name: pacemaker
}

aisexec {
    user: root
    group: root
}
4. Create the authentication key; it is the same on app1 and app2
Inter-node communication requires authentication with a shared key. corosync-keygen generates it and saves it in the current directory as authkey with mode 400.
[root@app1 corosync]# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Press keys on your keyboard to generate entropy (bits = 128).
...
Press keys on your keyboard to generate entropy (bits = 960).
Writing corosync key to /etc/corosync/authkey.
[root@app1 corosync]#
5. Copy the two files just configured to app2
# scp authkeys corosync.conf root@app2:/etc/corosync/
6. Start the corosync and pacemaker services and verify that they come up
Node 1:
[root@app1 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [OK]
[root@app1 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [OK]
Enable the services at boot:
chkconfig corosync on
chkconfig pacemaker on
Node 2:
[root@app2 ~]# service corosync start
Starting Corosync Cluster Engine (corosync): [OK]
[root@app2 ~]# service pacemaker start
Starting Pacemaker Cluster Manager [OK]
Enable the services at boot:
chkconfig corosync on
chkconfig pacemaker on
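Before configuring any resources, it is worth confirming that corosync and pacemaker see both nodes; two quick checks:
# corosync-cfgtool -s    # ring status; should report "ring 0 active with no faults"
# crm_mon -1             # one-shot cluster status; both nodes should be online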
7. Verify the corosync, pacemaker, and crmsh installation
(1) Check the node status
[root@app1 ~]# crm status
Last updated: Tue Jan 26 13:13:19 2016
Last change: Mon Jan 25 17:46:04 2016 via cibadmin on app1
Stack: classic openais (with plugin)
Current DC: app1 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
0 Resources configured
Online: [ app1 app2 ]
(2) Check the listening ports
# netstat -tunlp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address        Foreign Address   State   PID/Program name
udp        0      0 10.10.10.25:5404     0.0.0.0:*                 2828/corosync
udp        0      0 10.10.10.25:5405     0.0.0.0:*                 2828/corosync
udp        0      0 226.94.8.8:5405      0.0.0.0:*                 2828/corosync
(3) Check the log
[root@app1 corosync]# tail -f /var/log/cluster/corosync.log
Key messages to look for in the log:
Jan 23 16:09:30 corosync [MAIN  ] Corosync Cluster Engine ('1.4.7'): started and ready to provide service.
Jan 23 16:09:30 corosync [MAIN  ] Successfully read main configuration file '/etc/corosync/corosync.conf'.
....
Jan 23 16:09:30 corosync [TOTEM ] Initializing transmit/receive security: libtomcrypt SOBER128/SHA1HMAC (mode 0).
Jan 23 16:09:31 corosync [TOTEM ] The network interface [10.10.10.24] is now up.
Jan 23 16:09:31 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
Jan 23 16:09:48 corosync [TOTEM ] A processor joined or left the membership and a new membership was formed.
[root@app1 corosync]#
V. Configure Pacemaker
1. Basic settings
stonith is enabled by default, but this cluster has no stonith device configured yet, so disable it in the cluster-wide properties (it is added back in step 6 below).
# crm
crm(live)# configure                                      ## enter configuration mode
crm(live)configure# property stonith-enabled=false        ## disable stonith
crm(live)configure# property no-quorum-policy=ignore      ## action to take when the partition has no quorum
crm(live)configure# rsc_defaults resource-stickiness=100  ## default resource stickiness: resources prefer to stay where they are running
crm(live)configure# verify                                ## validate the configuration
crm(live)configure# commit                                ## commit only after verify reports no errors
crm(live)configure# show                                  ## show the current configuration
node app1
node app2
property cib-bootstrap-options: \\
dc-version=1.1.11-97629de \\
cluster-infrastructure="classic openais (with plugin)" \\
expected-quorum-votes=2 \\
stonith-enabled=false \\
default-resource-stickiness=100 \\
no-quorum-policy=ignore
Or:
# crm configure property stonith-enabled=false
# crm configure property no-quorum-policy=ignore
# crm configure property default-resource-stickiness=100
2. Resource configuration
# Tip: if verify reports an error, you can quit without committing, or use edit to fix the configuration until it passes.
# crm configure edit    # edits the configuration directly in an editor
(1) Add the VIP
Do not commit resources one at a time; commit once all resources and constraints have been defined.
crm(live)configure# primitive vip ocf:heartbeat:IPaddr params ip=192.168.0.26 cidr_netmask=24 nic=eth0:1 op monitor interval=30s timeout=20s on-fail=restart
crm(live)configure# verify    # check that the parameters are correct
Explanation:
primitive              : command that defines a resource
vip                    : resource ID, user-defined
ocf:heartbeat:IPaddr   : resource agent (RA)
params ip=192.168.0.26 : the VIP address
op monitor             : monitor the resource
interval=30s           : monitoring interval
timeout=20s            : operation timeout
on-fail=restart        : restart the resource if it fails abnormally; if it cannot be restarted, move it to the other node
(2) Add the DRBD resource
crm(live)configure# primitive mydrbd ocf:linbit:drbd params drbd_resource=data op monitor role=Master interval=20 timeout=30 op monitor role=Slave interval=30 timeout=30 op start timeout=240 op stop timeout=100
crm(live)configure# verify
Define DRBD as a master/slave resource:
crm(live)configure# ms ms_mydrbd mydrbd meta master-max=1 master-node-max=1 clone-max=2 clone-node-max=1 notify=true
crm(live)configure# verify
(3) File system mount resource:
crm(live)configure# primitive mystore ocf:heartbeat:Filesystem params device=/dev/drbd0 directory=/data fstype=ext4 op start timeout=60s op stop timeout=60s op monitor interval=30s timeout=40s on-fail=restart
crm(live)configure# verify
(4) Create the constraints. This is the key part: the VIP, the DRBD master, and the file system mount must all run on the same node, and the VIP and the mount both depend on the DRBD master.
Create a group resource containing vip and mystore:
crm(live)configure# group g_service vip mystore
crm(live)configure# verify
Create a colocation constraint so that the group runs on the DRBD master node:
crm(live)configure# colocation c_g_service inf: g_service ms_mydrbd:Master
Create a colocation constraint so that the mystore mount runs on the DRBD master node:
crm(live)configure# colocation mystore_with_drbd_master inf: mystore ms_mydrbd:Master
Create an order constraint so that g_service starts only after the DRBD resource has been promoted:
crm(live)configure# order o_g_service inf: ms_mydrbd:promote g_service:start
crm(live)configure# verify
crm(live)configure# commit
3. Check the status after the configuration is complete
[root@app1 ~]# crm status
Last updated: Mon Jan 25 22:24:55 2016
Last change: Mon Jan 25 22:24:46 2016
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.11-97629de
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ app1 app2 ]
 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app1 ]
     Slaves: [ app2 ]
 Resource Group: g_service
     vip        (ocf::heartbeat:IPaddr):        Started app1
     mystore    (ocf::heartbeat:Filesystem):    Started app1
[root@app1 ~]#
# Note: during switchover tests you may see failed-action warnings that obscure the real state. Clear them per resource: clean up whichever resource is reported, then run crm status again and the state displays normally.
Failed actions:
    mystore_stop_0 on app1 'unknown error' (1): call=97, status=complete, last-rc-change='Tue Jan 26 14:39:21 2016', queued=6390ms, exec=0ms
[root@app1 ~]# crm resource cleanup mystore
Cleaning up mystore on app1
Cleaning up mystore on app2
Waiting for 2 replies from the CRMd.. OK
[root@app1 ~]#
(1) Check the DRBD mount
[root@app2 ~]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/vg_app2-lv_root   35G  3.7G   30G  11% /
tmpfs                        497M   45M  452M  10% /dev/shm
/dev/sda1                    477M   34M  418M   8% /boot
192.168.1.26:/data            20G   44M   19G   1% /mnt
/dev/drbd0                    20G   44M   19G   1% /data
[root@app2 ~]#
(2) Check the DRBD primary/secondary roles
[root@app2 ~]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r-----
    ns:20484 nr:336 dw:468 dr:21757 al:4 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
[root@app1 ~]# cat /proc/drbd
version: 8.4.5 (api:1/proto:86-101)
GIT-hash: 1d360bde0e095d495786eaeb2a1ac76888e4db96 build by root@node1.magedu.com, 2015-01-02 12:06:20
 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r-----
    ns:0 nr:20484 dw:20484 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0
(3) NFS client mount, read, and write all work
[root@vm15 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3              21G  4.6G   15G  24% /
/dev/sda1              99M   23M   72M  25% /boot
tmpfs                 7.4G     0  7.4G   0% /dev/shm
/dev/mapper/vg-data    79G   71G  4.2G  95% /data
192.168.0.26:/data/   5.0G  138M  4.6G   3% /mnt
[root@vm15 ~]# cd /mnt
[root@vm15 mnt]# ls
abc.txt  lost+found
[root@vm15 mnt]# cp abc.txt a.txt
[root@vm15 mnt]# ls
a.txt  abc.txt  lost+found
[root@vm15 mnt]#
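The mount command used on the NFS client is not shown above; a typical invocation against the VIP, matching the df output, would be:
[root@vm15 ~]# mount -t nfs 192.168.0.26:/data /mnt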
4. Power-off test of node 1
(1) Shut down app1; all resources start on node 2
[root@app2 ~]# crm status
Last updated: Tue Jan 26 13:31:54 2016
Last change: Tue Jan 26 13:30:21 2016 via cibadmin on app1
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Online: [ app2 ]
OFFLINE: [ app1 ]
 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app2 ]
     Stopped: [ app1 ]
 Resource Group: g_service
     vip        (ocf::heartbeat:IPaddr):        Started app2
     mystore    (ocf::heartbeat:Filesystem):    Started app2
[root@app2 ~]#
(2) The file system is mounted successfully
[root@app2 ~]# df -h
Filesystem                   Size  Used Avail Use% Mounted on
/dev/mapper/vg_app2-lv_root   36G  3.7G   30G  11% /
tmpfs                       1004M   44M  960M   5% /dev/shm
/dev/sda1                    485M   39M  421M   9% /boot
/dev/drbd0                   5.0G  138M  4.6G   3% /data
[root@app2 ~]#
(3) DRBD has also switched to Primary:
[root@app2 ~]# cat /proc/drbd
version: 8.4.3 (api:1/proto:86-101)
GIT-hash: 89a294209144b68adb3ee85a73221f964d3ee515 build by gardner@, 2013-11-29 12:28:00
 0: cs:WFConnection ro:Primary/Unknown ds:UpToDate/DUnknown C r-----
    ns:0 nr:144 dw:148 dr:689 al:1 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:4
[root@app2 ~]#
After node 1 boots back up, it rejoins the cluster directly and the resources do not need to be switched back.
5. Node switchover test
# crm node standby app2    # take app2 offline (standby)
Check the resources: they switch directly over to app1, although the power-off test above gives a cleaner result.
[root@app1 ~]# crm status
Last updated: Tue Jan 26 14:30:05 2016
Last change: Tue Jan 26 14:29:59 2016 via crm_attribute on app2
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
4 Resources configured
Node app2: standby
Online: [ app1 ]
 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app1 ]
     Stopped: [ app2 ]
 Resource Group: g_service
     vip        (ocf::heartbeat:IPaddr):        Started app1
     mystore    (ocf::heartbeat:Filesystem):    Started app1
[root@app1 ~]#
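To bring app2 back into the cluster after the standby test, set it online again:
# crm node online app2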
6. Configure stonith. It was disabled earlier; this section adds it, mainly to test the functionality. In a physical environment, for example on IBM System x servers, an IPMI-based stonith device can be used instead.
This setup uses VMware ESXi 5.1 virtual machines, so stonith uses the VMware ESXi fence agent fence_vmware_soap.
Note: during corosync+pacemaker testing, nodes sometimes cannot be rebooted or shut down quickly; configuring stonith is very useful for servers that do not restart cleanly.
Install the fence-agents package on app1 and app2:
# yum install fence-agents
After installation, test the fence agent and list the virtual machines it can see:
[root@app1 ~]# /usr/sbin/fence_vmware_soap -a 192.168.0.61 -z -l root -p 876543 -o list
...
...
DRBD_HEARTBEAT_APP1,564d09c3-e8ee-9a01-e5f4-f1b11f03c810
DRBD_HEARTBEAT_APP2,564dddb8-f4bf-40e6-dbad-9b97b97d3d25
...
...
For example, to reboot a virtual machine:
[root@app1 ~]# /usr/sbin/fence_vmware_soap -a 192.168.0.61 -z -l root -p 876543 -n DRBD_HEARTBEAT_APP2 -o reboot
[root@app1 ~]# crm
crm(live)# configure
crm(live)configure# primitive vm-fence-app1 stonith:fence_vmware_soap params ipaddr=192.168.0.61 login=root passwd=876543 port=app1 ssl="1" pcmk_host_list="DRBD_HEARTBEAT_APP1" retry_on="10" shell_timeout="120" login_timeout="120" action="reboot" op start interval="0" timeout="120"
crm(live)configure# primitive vm-fence-app2 stonith:fence_vmware_soap params ipaddr=192.168.0.61 login=root passwd=876543 port=app2 ssl="1" pcmk_host_list="DRBD_HEARTBEAT_APP2" retry_on="10" shell_timeout="120" login_timeout="120" action="reboot" op start interval="0" timeout="120"
crm(live)configure# location l-vm-fence-app1 vm-fence-app1 -inf: app1
crm(live)configure# location l-vm-fence-app2 vm-fence-app2 -inf: app2
crm(live)configure# property stonith-enabled=true
crm(live)configure# verify
crm(live)configure# commit
[root@app1 ~]# crm status
Last updated: Tue Jan 26 16:50:53 2016
Last change: Tue Jan 26 16:50:27 2016 via crmd on app2
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
6 Resources configured
Online: [ app1 app2 ]
 Master/Slave Set: ms_mydrbd [mydrbd]
     Masters: [ app2 ]
     Slaves: [ app1 ]
 Resource Group: g_service
     vip        (ocf::heartbeat:IPaddr):        Started app2
     mystore    (ocf::heartbeat:Filesystem):    Started app2
 vm-fence-app1  (stonith:fence_vmware_soap):    Started app2
 vm-fence-app2  (stonith:fence_vmware_soap):    Started app1
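With stonith enabled, fencing can also be triggered from the cluster side to verify it end to end; a minimal check using stonith_admin (shipped with pacemaker), which should power-cycle app2 through its fence device:
[root@app1 ~]# stonith_admin --reboot app2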
View the full configuration:
[root@app1 ~]# crm
crm(live)# configure
crm(live)configure# show xml
<?xml version="1.0" ?>
<cib num_updates="4" dc-uuid="app2" update-origin="app2" crm_feature_set="3.0.7" validate-with="pacemaker-1.2" update-client="crmd" epoch="91" admin_epoch="0" cib-last-written="Tue Jan 26 16:50:27 2016" have-quorum="1">
  <configuration>
    <crm_config>
      <cluster_property_set id="cib-bootstrap-options">
        <nvpair id="cib-bootstrap-options-dc-version" name="dc-version" value="1.1.10-14.el6-368c726"/>
        <nvpair id="cib-bootstrap-options-cluster-infrastructure" name="cluster-infrastructure" value="classic openais (with plugin)"/>
        <nvpair id="cib-bootstrap-options-expected-quorum-votes" name="expected-quorum-votes" value="2"/>
        <nvpair name="stonith-enabled" value="false" id="cib-bootstrap-options-stonith-enabled"/>
        <nvpair name="no-quorum-policy" value="ignore" id="cib-bootstrap-options-no-quorum-policy"/>
        <nvpair name="default-resource-stickiness" value="100" id="cib-bootstrap-options-default-resource-stickiness"/>
        <nvpair id="cib-bootstrap-options-last-lrm-refresh" name="last-lrm-refresh" value="1453798227"/>
      </cluster_property_set>
    </crm_config>
    <nodes>
      <node id="app2" uname="app2">
        <instance_attributes id="nodes-app2">
          <nvpair id="nodes-app2-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
      <node id="app1" uname="app1">
        <instance_attributes id="nodes-app1">
          <nvpair id="nodes-app1-standby" name="standby" value="off"/>
        </instance_attributes>
      </node>
    </nodes>
    <resources>
      <primitive id="vm-fence-app1" class="stonith" type="fence_vmware_soap">
        <instance_attributes id="vm-fence-app1-instance_attributes">
          <nvpair name="ipaddr" value="192.168.0.61" id="vm-fence-app1-instance_attributes-ipaddr"/>
          <nvpair name="login" value="root" id="vm-fence-app1-instance_attributes-login"/>
          <nvpair name="passwd" value="xjj876543" id="vm-fence-app1-instance_attributes-passwd"/>
          <nvpair name="port" value="app1" id="vm-fence-app1-instance_attributes-port"/>
          <nvpair name="ssl" value="1" id="vm-fence-app1-instance_attributes-ssl"/>
          <nvpair name="pcmk_host_list" value="DRBD_HEARTBEAT_APP1" id="vm-fence-app1-instance_attributes-pcmk_host_list"/>
          <nvpair name="retry_on" value="10" id="vm-fence-app1-instance_attributes-retry_on"/>
          <nvpair name="shell_timeout" value="120" id="vm-fence-app1-instance_attributes-shell_timeout"/>
          <nvpair name="login_timeout" value="120" id="vm-fence-app1-instance_attributes-login_timeout"/>
          <nvpair name="action" value="reboot" id="vm-fence-app1-instance_attributes-action"/>
        </instance_attributes>
        <operations>
          <op name="start" interval="0" timeout="120" id="vm-fence-app1-start-0"/>
        </operations>
      </primitive>
      <primitive id="vm-fence-app2" class="stonith" type="fence_vmware_soap">
        <instance_attributes id="vm-fence-app2-instance_attributes">
          <nvpair name="ipaddr" value="192.168.0.61" id="vm-fence-app2-instance_attributes-ipaddr"/>
          <nvpair name="login" value="root" id="vm-fence-app2-instance_attributes-login"/>
          <nvpair name="passwd" value="xjj876543" id="vm-fence-app2-instance_attributes-passwd"/>
          <nvpair name="port" value="app2" id="vm-fence-app2-instance_attributes-port"/>
          <nvpair name="ssl" value="1" id="vm-fence-app2-instance_attributes-ssl"/>
          <nvpair name="pcmk_host_list" value="DRBD_HEARTBEAT_APP2" id="vm-fence-app2-instance_attributes-pcmk_host_list"/>
          <nvpair name="retry_on" value="10" id="vm-fence-app2-instance_attributes-retry_on"/>
          <nvpair name="shell_timeout" value="120" id="vm-fence-app2-instance_attributes-shell_timeout"/>
          <nvpair name="login_timeout" value="120" id="vm-fence-app2-instance_attributes-login_timeout"/>
          <nvpair name="action" value="reboot" id="vm-fence-app2-instance_attributes-action"/>
        </instance_attributes>
        <operations>
          <op name="start" interval="0" timeout="120" id="vm-fence-app2-start-0"/>
        </operations>
      </primitive>
      <group id="g_service">
        <primitive id="vip" class="ocf" provider="heartbeat" type="IPaddr">
          <instance_attributes id="vip-instance_attributes">
            <nvpair name="ip" value="192.168.0.26" id="vip-instance_attributes-ip"/>
            <nvpair name="cidr_netmask" value="24" id="vip-instance_attributes-cidr_netmask"/>
            <nvpair name="nic" value="eth0:1" id="vip-instance_attributes-nic"/>
          </instance_attributes>
          <operations>
            <op name="monitor" interval="30s" timeout="20s" on-fail="restart" id="vip-monitor-30s"/>
          </operations>
        </primitive>
        <primitive id="mystore" class="ocf" provider="heartbeat" type="Filesystem">
          <instance_attributes id="mystore-instance_attributes">
            <nvpair name="device" value="/dev/drbd0" id="mystore-instance_attributes-device"/>
            <nvpair name="directory" value="/data" id="mystore-instance_attributes-directory"/>
            <nvpair name="fstype" value="ext4" id="mystore-instance_attributes-fstype"/>
          </instance_attributes>
          <operations>
            <op name="start" timeout="60s" interval="0" id="mystore-start-0"/>
            <op name="stop" timeout="60s" interval="0" id="mystore-stop-0"/>
            <op name="monitor" interval="30s" timeout="40s" on-fail="restart" id="mystore-monitor-30s"/>
          </operations>
        </primitive>
      </group>
      <master id="ms_mydrbd">
        <meta_attributes id="ms_mydrbd-meta_attributes">
          <nvpair name="master-max" value="1" id="ms_mydrbd-meta_attributes-master-max"/>
          <nvpair name="master-node-max" value="1" id="ms_mydrbd-meta_attributes-master-node-max"/>
          <nvpair name="clone-max" value="2" id="ms_mydrbd-meta_attributes-clone-max"/>
          <nvpair name="clone-node-max" value="1" id="ms_mydrbd-meta_attributes-clone-node-max"/>
          <nvpair name="notify" value="true" id="ms_mydrbd-meta_attributes-notify"/>
        </meta_attributes>
        <primitive id="mydrbd" class="ocf" provider="linbit" type="drbd">
          <instance_attributes id="mydrbd-instance_attributes">
            <nvpair name="drbd_resource" value="data" id="mydrbd-instance_attributes-drbd_resource"/>
          </instance_attributes>
          <operations>
            <op name="monitor" role="Master" interval="20" timeout="30" id="mydrbd-monitor-20"/>
            <op name="monitor" role="Slave" interval="30" timeout="30" id="mydrbd-monitor-30"/>
            <op name="start" timeout="240" interval="0" id="mydrbd-start-0"/>
            <op name="stop" timeout="100" interval="0" id="mydrbd-stop-0"/>
          </operations>
        </primitive>
      </master>
    </resources>
    <constraints>
      <rsc_colocation id="c_g_service" score="INFINITY" rsc="g_service" with-rsc="ms_mydrbd" with-rsc-role="Master"/>
      <rsc_colocation id="mystore_with_drbd_master" score="INFINITY" rsc="mystore" with-rsc="ms_mydrbd" with-rsc-role="Master"/>
      <rsc_order id="o_g_service" score="INFINITY" first="ms_mydrbd" first-action="promote" then="g_service" then-action="start"/>
      <rsc_location id="l-vm-fence-app1" rsc="vm-fence-app1" score="-INFINITY" node="app1"/>
      <rsc_location id="l-vm-fence-app2" rsc="vm-fence-app2" score="-INFINITY" node="app2"/>
    </constraints>
  </configuration>
</cib>
7. How to clear the resources and reconfigure from scratch
[root@app2 ~]# crm status
Last updated: Wed Jan 27 10:39:24 2016
Last change: Tue Jan 26 16:50:27 2016 via crmd on app2
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
6 Resources configured
Online: [ app1 app2 ]
Master/Slave Set: ms_mydrbd [mydrbd]
Masters: [ app2 ]
Slaves: [ app1 ]
Resource Group: g_service
vip (ocf::heartbeat:IPaddr): Started app2
mystore (ocf::heartbeat:Filesystem): Started app2
vm-fence-app1 (stonith:fence_vmware_soap): Started app2
vm-fence-app2 (stonith:fence_vmware_soap): Started app1
[root@app2 ~]#
First stop the resources one by one:
[root@app2 ~]#
[root@app2 ~]# crm resource stop vm-fence-app2
[root@app2 ~]# crm resource stop vm-fence-app1
[root@app2 ~]# crm resource stop mystore
[root@app2 ~]# crm resource stop vip
[root@app2 ~]# crm resource stop ms_mydrbd
[root@app2 ~]# crm status
Last updated: Wed Jan 27 10:40:28 2016
Last change: Wed Jan 27 10:40:23 2016 via cibadmin on app2
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
6 Resources configured
Online: [ app1 app2 ]
[root@app2 ~]#
Then erase the configuration:
[root@app2 ~]# crm configure erase
INFO: resource references in colocation:c_g_service updated
INFO: resource references in colocation:mystore_with_drbd_master updated
INFO: resource references in order:o_g_service updated
[root@app2 ~]#
[root@app2 ~]#
[root@app2 ~]# crm status
Last updated: Wed Jan 27 10:40:58 2016
Last change: Wed Jan 27 10:40:52 2016 via crmd on app2
Stack: classic openais (with plugin)
Current DC: app2 - partition with quorum
Version: 1.1.10-14.el6-368c726
2 Nodes configured, 2 expected votes
0 Resources configured
Online: [ app1 app2 ]
[root@app2 ~]#
[root@app2 ~]#
Now the cluster can be configured again from scratch.
8. Configuration summary
My earlier failed attempts all came down to the resource colocation and ordering constraints; without them, failover and startup do not work, and understanding them is the key point of a corosync+pacemaker configuration. DRBD can be combined with many other services; this article is only one reference implementation.