Centos7上利用corosync+pacemaker+crmsh构建高可用集群
Posted
tags:
篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Centos7上利用corosync+pacemaker+crmsh构建高可用集群相关的知识,希望对你有一定的参考价值。
一、高可用集群框架
资源类型:
primitive(native):表示主资源
group:表示组资源,组资源里包含多个主资源
clone:表示克隆资源
master/slave:表示主从资源
资源约束方式:
位置约束:定义资源对节点的倾向性
排序约束:定义资源彼此能否运行在同一节点的倾向性
顺序约束:多个资源启动顺序的依赖关系
HA集群常用的工作模型:
A/P:两节点,active/passive,工作于主备模型
A/A:两节点,active/active,工作于主主模型
N-M:N>M,N个节点,M个服务,假设每个节点运行一个服务,活动节点数为N,备用节点数为N-M
在集群分裂(split-brain)时需要使用到资源隔离,有两种隔离级别:
STONITH:节点级别的隔离,通过断开一个节点的电源或者重新启动节点
fencing:资源级别的隔离,类似通过向交换机发出隔离信号,特意让数据无法通过此接口
当集群分裂,即分裂后的一个集群的法定票数小于总票数一半时采取对资源的控制策略,
二、在centos7上建立Ha cluster
centos7(corosync v2 + pacemaker)
集群的全生命周期管理工具:
pcs: agent(pcsd)
crmsh: agentless (pssh)
1、集群配置前提
时间同步,基于当前正在使用的主机名互相访问,是否会用到仲裁设备
#更改主机名,两台主机的都要修改(192.168.1.114为ns2.xinfeng.com)(192.168.1.113为ns3.xinfeng.com) [[email protected] ~]# hostnamectl set-hostname ns2.xinfeng.com [[email protected] ~]# uname -n ns2.xinfeng.com [[email protected] ~]# vim /etc/hosts 192.168.1.114 ns2.xinfeng.com 192.168.1.113 ns3.xinfeng.com #同步时间 [[email protected] ~]# ntpdate s1a.time.edu.cn [[email protected] ~]# ssh 192.168.1.113 ‘date‘;date The authenticity of host ‘192.168.1.113 (192.168.1.113)‘ can‘t be established. ECDSA key fingerprint is 09:f9:39:8c:35:4d:ba:2d:13:4f:3c:9c:b1:58:54:ec. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added ‘192.168.1.113‘ (ECDSA) to the list of known hosts. [email protected]‘s password: 2016年 05月 28日 星期六 13:18:07 CST 2016年 05月 28日 星期六 13:18:07 CST
2、安装pcs并启动集群
#192.168.1.113 [[email protected] ~]# yum install pcs #192.168.1.114 [[email protected] ~]# yum install pcs #用ansible开起服务并使服务开机运行 [[email protected] ~]# vim /etc/ansible/hosts [ha] 192.168.1.114 192.168.1.113 [[email protected] ~]# ansible ha -m service -a ‘name=pcsd state=started enabled=yes‘ 192.168.1.113 | SUCCESS => { "changed": false, "enabled": true, "name": "pcsd", "state": "started" } 192.168.1.114 | SUCCESS => { "changed": true, "enabled": true, "name": "pcsd", "state": "started" } #用ansible查看服务是否启动 [[email protected] ~]# ansible ha -m shell -a ‘systemctl status pcsd‘ 192.168.1.114 | SUCCESS | rc=0 >> ● pcsd.service - PCS GUI and remote configuration interface Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled) Active: active (running) since 六 2016-05-28 13:36:19 CST; 2min 32s ago Main PID: 2736 (pcsd) CGroup: /system.slice/pcsd.service ├─2736 /bin/sh /usr/lib/pcsd/pcsd start ├─2740 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb └─2741 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb 5月 28 13:36:16 ns2.xinfeng.com systemd[1]: Starting PCS GUI and remote configuration interface... 5月 28 13:36:19 ns2.xinfeng.com systemd[1]: Started PCS GUI and remote configuration interface. 192.168.1.113 | SUCCESS | rc=0 >> ● pcsd.service - PCS GUI and remote configuration interface Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled) Active: active (running) since 六 2016-05-28 13:35:26 CST; 3min 24s ago Main PID: 2620 (pcsd) CGroup: /system.slice/pcsd.service ├─2620 /bin/sh /usr/lib/pcsd/pcsd start ├─2624 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb └─2625 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb 5月 28 13:35:24 ns3.xinfeng.com systemd[1]: Starting PCS GUI and remote configuration interface... 5月 28 13:35:26 ns3.xinfeng.com systemd[1]: Started PCS GUI and remote configuration interface. #给hacluster用户增加密码 [[email protected] ~]# ansible ha -m shell -a ‘echo "123" | passwd --stdin hacluster‘ 192.168.1.113 | SUCCESS | rc=0 >> 更改用户 hacluster 的密码 。 passwd:所有的身份验证令牌已经成功更新。 192.168.1.114 | SUCCESS | rc=0 >> 更改用户 hacluster 的密码 。 passwd:所有的身份验证令牌已经成功更新。 #认证节点身份,用户名和密码为上面设置的hacluster和123,注意iptables规则,否则会出现无法联系的情况 [[email protected] ~]# pcs cluster auth ns2.xinfeng.com ns3.xinfeng.com Username: hacluster Password: ns2.xinfeng.com: Authorized ns3.xinfeng.com: Authorized #配置集群,集群名字为xinfengcluster,集群中有2个节点 [[email protected] ~]# pcs cluster setup --name xinfengcluster ns2.xinfeng.com ns3.xinfeng.com Shutting down pacemaker/corosync services... Redirecting to /bin/systemctl stop pacemaker.service Redirecting to /bin/systemctl stop corosync.service Killing any remaining services... Removing all cluster configuration files... ns2.xinfeng.com: Succeeded ns3.xinfeng.com: Succeeded Synchronizing pcsd certificates on nodes ns2.xinfeng.com, ns3.xinfeng.com... ns2.xinfeng.com: Success ns3.xinfeng.com: Success Restaring pcsd on the nodes in order to reload the certificates... ns2.xinfeng.com: Success ns3.xinfeng.com: Success #查看下配置文件 [[email protected] ~]# cat /etc/corosync/corosync.conf totem { #集群的信息 version: 2 #版本 secauth: off #安全功能是否开起 cluster_name: xinfengcluster #集群名称 transport: udpu #传输协议udpu也可以设置为udp } nodelist { #集群中的所有节点 node { ring0_addr: ns2.xinfeng.com nodeid: 1 #节点ID } node { ring0_addr: ns3.xinfeng.com nodeid: 2 } } quorum { #仲裁投票 provider: corosync_votequorum #投票系统 two_node: 1 #是否为2节点集群 } logging { #日志 to_logfile: yes #是否记录日志 logfile: /var/log/cluster/corosync.log #日志文件位置 to_syslog: yes #是否记录系统日志 } #启动集群 [[email protected] ~]# pcs cluster start --all ns3.xinfeng.com: Starting Cluster... ns2.xinfeng.com: Starting Cluster... #查看ns2.xinfeng.com节点是否启动 [[email protected] ~]# corosync-cfgtool -s Printing ring status. Local node ID 1 RING ID 0 id = 192.168.1.114 status = ring 0 active with no faults #查看ns3.xinfeng.com节点是否启动 [[email protected] ~]# corosync-cfgtool -s Printing ring status. Local node ID 2 RING ID 0 id = 192.168.1.113 status = ring 0 active with no faults #查看集群信息 [[email protected] ~]# corosync-cmapctl | grep members runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0 runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.1.114) runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1 runtime.totem.pg.mrp.srp.members.1.status (str) = joined runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0 runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.1.113) runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1 runtime.totem.pg.mrp.srp.members.2.status (str) = joined [[email protected] ~]# pcs status Cluster name: xinfengcluster WARNING: no stonith devices and stonith-enabled is not false Last updated: Sat May 28 14:38:23 2016 Last change: Sat May 28 14:33:15 2016 by hacluster via crmd on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum #DC为全局仲裁节点 2 nodes and 0 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: PCSD Status: ns2.xinfeng.com: Online ns3.xinfeng.com: Online Daemon Status: corosync: active/disabled pacemaker: active/disabled pcsd: active/enabled
3、使用crmsh配置集群
安装opensuse上的yum源
centos6
cd /etc/yum.repos.d/ wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-6/network:ha-clustering:Stable.repo cd yum -y install crmsh
centos7
cd /etc/yum.repos.d/ wget http://download.opensuse.org/repositories/network:/ha-clustering:/Stable/CentOS_CentOS-7/network:ha-clustering:Stable.repo cd yum -y install crmsh
在其中一台主机上配置crmsh(192.1681.1.114)
#显示当前的集群状态 [[email protected] yum.repos.d]# crm status Last updated: Sat May 28 16:52:52 2016 Last change: Sat May 28 14:33:15 2016 by hacluster via crmd on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 0 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources:
4、两个节点上分别装上httpd
[[email protected] ~]# ansible ha -m shell -a ‘yum install httpd -y‘ [[email protected] ~]# echo "<h1>ns2.xinfeng.com</h1>" > /var/www/html/index.html [[email protected] ~]# echo "<h1>ns3.xinfeng.com</h1>" > /var/www/html/index.html #测试能否正常启动,页面是否正常 [[email protected] ~]# ansible ha -m service -a ‘name=httpd state=started enabled=yes‘ #centos6必须关闭服务,关闭开机启动,之后服务会变成资源,所以请确保服务不启动也不开机启动 [[email protected] ~]# ansible ha -m service -a ‘name=httpd state=stopped enabled=no‘ #centos7必须关闭服务,开起开机启动 [[email protected] ~]# ansible ha -m service -a ‘name=httpd state=stopped enabled=yes‘
5、配置集群
VIP为192.168.1.91,服务是httpd,将VIP和httpd作为资源来进行配置
[[email protected] ~]# crm crm(live)# ra #进入资源代理 crm(live)ra# classes #查看可以代理的资源类型 lsb ocf / .isolation heartbeat openstack pacemaker service stonith systemd crm(live)ra# list systemd #查看systemd类型可代理的服务,其中有httpd NetworkManager NetworkManager-wait-online auditd brandbot corosync cpupower crond dbus display-manager dm-event dracut-shutdown ebtables emergency exim firewalld [email protected] httpd ip6tables iptables irqbalance kdump kmod-static-nodes ldconfig libvirtd lvm2-activation lvm2-lvmetad lvm2-lvmpolld lvm2-monitor [email protected]:2 microcode network pacemaker pcsd plymouth-quit plymouth-quit-wait plymouth-read-write plymouth-start polkit postfix rc-local rescue rhel-autorelabel rhel-autorelabel-mark rhel-configure rhel-dmesg rhel-import-state rhel-loadmodules rhel-readonly rsyslog sendmail sshd sshd-keygen syslog systemd-ask-password-console systemd-ask-password-plymouth systemd-ask-password-wall systemd-binfmt systemd-firstboot systemd-fsck-root systemd-hwdb-update systemd-initctl systemd-journal-catalog-update systemd-journal-flush systemd-journald systemd-logind systemd-machine-id-commit systemd-modules-load systemd-random-seed systemd-random-seed-load systemd-readahead-collect systemd-readahead-done systemd-readahead-replay systemd-reboot systemd-remount-fs systemd-shutdownd systemd-sysctl systemd-sysusers systemd-tmpfiles-clean systemd-tmpfiles-setup systemd-tmpfiles-setup-dev systemd-udev-trigger systemd-udevd systemd-update-done systemd-update-utmp systemd-update-utmp-runlevel systemd-user-sessions systemd-vconsole-setup tuned wpa_supplicant crm(live)ra# cd crm(live)# configure #配置资源,资源名为webip,ip为192.168.1.91 crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=192.168.1.91 crm(live)configure# show node 1: ns2.xinfeng.com node 2: ns3.xinfeng.com primitive webip IPaddr params ip=192.168.1.91 property cib-bootstrap-options: have-watchdog=false dc-version=1.1.13-10.el7_2.2-44eb2dd cluster-infrastructure=corosync cluster-name=xinfengcluster crm(live)configure# verify #校验,因为没有隔离设备所以报错 ERROR: error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity Errors found during check: config not valid crm(live)configure# property stonith-enabled=false #关闭隔离设备的设置 crm(live)configure# verify #再次校验 crm(live)configure# commit #提交,是配置生效 crm(live)configure# cd crm(live)# status Last updated: Sat May 28 17:41:46 2016 Last change: Sat May 28 17:41:31 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 1 resource configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns2.xinfeng.com #VIP已经启动在了ns2上 crm(live)# quit bye [[email protected] ~]# ip addr #验证一下VIP是否已经配置到网卡上 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eno16777728: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:91:57:d1 brd ff:ff:ff:ff:ff:ff inet 192.168.1.114/24 brd 192.168.1.255 scope global dynamic eno16777728 valid_lft 5918sec preferred_lft 5918sec inet 192.168.1.91/24 brd 192.168.1.255 scope global secondary eno16777728 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe91:57d1/64 scope link valid_lft forever preferred_lft forever #将当前节点切换为备用节点 [[email protected] ~]# crm crm(live)# node crm(live)node# standby crm(live)node# cd crm(live)# status Last updated: Sat May 28 17:45:41 2016 Last change: Sat May 28 17:45:34 2016 by root via crm_attribute on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 1 resource configured Node ns2.xinfeng.com: standby Online: [ ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns3.xinfeng.com #资源重新上线 crm(live)# node online crm(live)# status Last updated: Sat May 28 17:46:40 2016 Last change: Sat May 28 17:46:37 2016 by root via crm_attribute on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 1 resource configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns3.xinfeng.com #配置httpd资源,资源名为httpd [[email protected] ~]# crm crm(live)# configure crm(live)configure# primitive webser systemd:httpd crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sat May 28 17:50:15 2016 Last change: Sat May 28 17:49:56 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns3.xinfeng.com webser (systemd:httpd): Started ns2.xinfeng.com #将两个资源放在webhttp组中,资源启动顺序是webip,webser crm(live)# configure crm(live)configure# group webhttp webip webser crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sat May 28 17:52:48 2016 Last change: Sat May 28 17:52:41 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: Resource Group: webhttp webip (ocf::heartbeat:IPaddr): Started ns3.xinfeng.com webser (systemd:httpd): Started ns3.xinfeng.com
由于是2个节点,会存在法定票数不足导致的资源不转移的情况,解决此问题的方法有四种:
1、可以增加一个ping node节点。
2、可以增加一个仲裁磁盘。
3、让集群中的节点数成奇数个。
4、直接忽略当集群没有法定票数时直接忽略。
这里我用的是第四种方式
[[email protected] ~]# crm crm(live)# configure crm(live)configure# property no-quorum-policy=ignore crm(live)configure# verify crm(live)configure# commit
这样做还不够,因为没有对资源进行监控,所以资源出现问题依然不会转移
现在只能测试下资源是否在ns3.xinfeng.com上启动了
要对资源进行监控需要在全局下命令primitive定义资源时一同定义,因此先把之前定义的资源删掉后重新定义
[[email protected] ~]# crm crm(live)# resource crm(live)resource# show Resource Group: webhttp webip (ocf::heartbeat:IPaddr): Started webser (systemd:httpd): Started crm(live)resource# stop webhttp #停掉所有资源 crm(live)resource# show Resource Group: webhttp webip (ocf::heartbeat:IPaddr): (target-role:Stopped) Stopped webser (systemd:httpd): (target-role:Stopped) Stopped crm(live)configure# edit #编辑资源定义配置文件 node 1: ns2.xinfeng.com attributes standby=off node 2: ns3.xinfeng.com primitive webip IPaddr \ #删除 params ip=192.168.1.91 #删除 primitive webser systemd:httpd #删除 group webhttp webip webser \ #删除 meta target-role=Stopped #删除 property cib-bootstrap-options: have-watchdog=false dc-version=1.1.13-10.el7_2.2-44eb2dd cluster-infrastructure=corosync cluster-name=xinfengcluster stonith-enabled=false no-quorum-policy=ignore crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sat May 28 23:32:01 2016 Last change: Sat May 28 23:31:49 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 0 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources:
6、重新定义带有监控的资源
crm(live)# configure #每60秒监控一次,超时时长为20秒,时间不能小于建议时长,否则会报错 crm(live)configure# primitive webip ocf:IPaddr params ip=192.168.1.91 op monitor timeout=20s interval=60s crm(live)configure# primitive webser systemd:httpd op monitor timeout=20s interval=60s crm(live)configure# group webhttp webip webser crm(live)configure# property no-quorum-policy=ignore crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sat May 28 23:41:03 2016 Last change: Sat May 28 23:40:36 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns2.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: Resource Group: webhttp webip (ocf::heartbeat:IPaddr): Started ns2.xinfeng.com webser (systemd:httpd): Started ns2.xinfeng.com
测试一下,将服务停掉,20秒后服务又自动会启动
[[email protected] ~]# ip addr 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo valid_lft forever preferred_lft forever inet6 ::1/128 scope host valid_lft forever preferred_lft forever 2: eno16777728: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000 link/ether 00:0c:29:91:57:d1 brd ff:ff:ff:ff:ff:ff inet 192.168.1.114/24 brd 192.168.1.255 scope global dynamic eno16777728 valid_lft 6374sec preferred_lft 6374sec inet 192.168.1.91/24 brd 192.168.1.255 scope global secondary eno16777728 valid_lft forever preferred_lft forever inet6 fe80::20c:29ff:fe91:57d1/64 scope link valid_lft forever preferred_lft forever [[email protected] ~]# service httpd stop Redirecting to /bin/systemctl stop httpd.service
编辑配置文件,随便乱加几行让服务不能启动
[[email protected] ~]# vim /etc/httpd/conf/httpd.conf [[email protected] ~]# service httpd stop Redirecting to /bin/systemctl stop httpd.service
服务成功切换到ns3上
7、【注意】当重新恢复httpd服务后记得清除资源的错误信息,否则无法启动资源
crm(live)# resource crm(live)resource# cleanup webser #清楚webser之前的错误信息 Cleaning up webser on ns2.xinfeng.com, removing fail-count-webser Cleaning up webser on ns3.xinfeng.com, removing fail-count-webser Waiting for 2 replies from the CRMd.. OK crm(live)resource# show webip (ocf::heartbeat:IPaddr): Started webser (systemd:httpd): Started
8、定义资源约束
删除组资源
[[email protected] ~]# crm crm(live)# configure crm(live)configure# delete webhttp #删除webhttp组 crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sun May 29 09:22:57 2016 Last change: Sun May 29 09:22:48 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns2.xinfeng.com webser (systemd:httpd): Started ns3.xinfeng.com
排列约束
[[email protected] ~]# crm crm(live)# configure crm(live)configure# colocation webser_with_webip inf: webser webip #定义webser和webip两个资源必须在一起 crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sun May 29 09:40:36 2016 Last change: Sun May 29 09:40:28 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns2.xinfeng.com #可以看到两个资源都在ns2上启动了 webser (systemd:httpd): Started ns2.xinfeng.com
顺序约束
crm(live)# configure #webip先于webser启动,强制的先启动webip在启动webser crm(live)configure# order webip_before_webser mandatory: webip webser crm(live)configure# verify crm(live)configure# commit crm(live)configure# show xml #查看之前详细定义
位置约束
#先看下当前的位置 crm(live)# status Last updated: Sun May 29 09:48:42 2016 Last change: Sun May 29 09:44:35 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns2.xinfeng.com webser (systemd:httpd): Started ns2.xinfeng.com #定义位置约束让资源更倾向于ns3上 crm(live)# configure 定义一个webservice的位置约束在节点2上,资源webip对ns3的倾向性是100 crm(live)configure# location webservice_pref_node2 webip 100: ns3.xinfeng.com crm(live)configure# verify crm(live)configure# commit crm(live)configure# cd crm(live)# status Last updated: Sun May 29 10:12:12 2016 Last change: Sun May 29 10:12:04 2016 by root via cibadmin on ns2.xinfeng.com Stack: corosync Current DC: ns3.xinfeng.com (version 1.1.13-10.el7_2.2-44eb2dd) - partition with quorum 2 nodes and 2 resources configured Online: [ ns2.xinfeng.com ns3.xinfeng.com ] Full list of resources: webip (ocf::heartbeat:IPaddr): Started ns3.xinfeng.com #很明显,资源都转移到了ns3上了 webser (systemd:httpd): Started ns3.xinfeng.com
三、总结
1、当重新恢复资源的服务后一定记得清除资源的错误信息,否则无法启动资源
2、在利用corosync+pacemaker且是两个节点实现高可用时,需要注意的是要设置全局属性把stonith设备关闭,忽略法定票数不大于一半的机制
3、注意selinux和iptables对服务的影响
4、注意节点相互用/etc/hosts来解析
5、节点时间一定要保持同步
6、节点相互间进行无密钥通信
本文出自 “信风” 博客,请务必保留此出处http://xsllqs.blog.51cto.com/2308669/1785043
以上是关于Centos7上利用corosync+pacemaker+crmsh构建高可用集群的主要内容,如果未能解决你的问题,请参考以下文章
CentOS7/RHEL7 pacemaker+corosync高可用集群搭建