Configuring high availability with pcs + pacemaker + corosync + NFS
Pacemaker:
Pacemaker is a cluster resource manager. It uses the messaging and membership capabilities provided by the cluster infrastructure (OpenAIS, heartbeat, or corosync) to detect and recover from node-level and resource-level failures, keeping cluster services (also called resources) as available as possible.
Corosync:
Corosync is part of the cluster management suite; through a simple configuration file it defines how messages are passed between the nodes and which transport and protocol are used.
Lab environment:
Software deployed      | IP         | Hostname
nfs                    | 10.0.0.128 | nfs
pcs+pacemaker+corosync | 10.0.0.129 | zxb2
pcs+pacemaker+corosync | 10.0.0.130 | zxb3
Implementation:
Prerequisites:
① Time synchronization
[root@zxb2 ~]# crontab -l
* * * * * ntpdate cn.pool.ntp.org
[root@zxb3 ~]# crontab -l
* * * * * ntpdate cn.pool.ntp.org
② Hostname resolution between the nodes
[root@zxb2 ~]# cat /etc/hosts
10.0.0.129 zxb2
10.0.0.130 zxb3
[root@zxb3 ~]# cat /etc/hosts
10.0.0.129 zxb2
10.0.0.130 zxb3
[root@zxb2 ~]# hostnamectl set-hostname zxb2
[root@zxb3 ~]# hostnamectl set-hostname zxb3
③ Passwordless SSH
[root@zxb2 ~]# ssh-keygen
[root@zxb2 ~]# ssh-copy-id 10.0.0.130
Configuration procedure
1. Install the required packages on both nodes
[root@zxb2 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
[root@zxb3 ~]# yum install -y pacemaker pcs psmisc policycoreutils-python
2. Start pcsd and enable it to start at boot
[root@zxb2 ~]# systemctl start pcsd.service
[root@zxb2 ~]# systemctl enable pcsd.service
[root@zxb3 ~]# systemctl start pcsd.service
[root@zxb3 ~]# systemctl enable pcsd.service
3. Set the password for the hacluster user (the user is created by default when the packages are installed)
[root@zxb2 ~]# echo 1 | passwd --stdin hacluster
[root@zxb3 ~]# echo 1 | passwd --stdin hacluster
The following steps are executed on just one of the nodes (zxb2 in this example).
4. Authenticate the cluster hosts with pcs (using the hacluster user and password set above)
[root@zxb2 ~]# pcs cluster auth zxb2 zxb3
zxb2: authorized
zxb3: authorized
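On the pcs 0.9 series shipped with CentOS 7, the credentials can also be passed on the command line instead of entering them interactively; shown here only as an alternative, using the same hacluster user and password set above:
[root@zxb2 ~]# pcs cluster auth zxb2 zxb3 -u hacluster -p 1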
5. Create the cluster from the two nodes
[root@zxb2 ~]# pcs cluster setup --name mycluster zxb2 zxb3 --force
6. Look at the generated corosync configuration file
[root@zxb2 ~]# ls /etc/corosync/
corosync.conf corosync.conf.example corosync.conf.example.udpu corosync.xml.example uidgid.d
7. Confirm that the corosync configuration file contains the nodes that were just registered
[root@zxb2 corosync]# cat corosync.conf
totem {
    version: 2
    secauth: off
    cluster_name: mycluster
    transport: udpu
}

nodelist {
    node {
        ring0_addr: zxb2
        nodeid: 1
    }

    node {
        ring0_addr: zxb3
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: yes
}
8. Start the cluster
[root@zxb2 ~]# pcs cluster start --all
zxb2: Starting Cluster...
zxb3: Starting Cluster...
## This is equivalent to starting pacemaker and corosync
[root@zxb2 ~]# ps -ef| grep pacemaker
root     10412     1  0 07:55 ?        00:00:02 /usr/sbin/pacemakerd -f
haclust+ 10413 10412  0 07:55 ?        00:00:03 /usr/libexec/pacemaker/cib
root     10414 10412  0 07:55 ?        00:00:01 /usr/libexec/pacemaker/stonithd
root     10415 10412  0 07:55 ?        00:00:01 /usr/libexec/pacemaker/lrmd
haclust+ 10416 10412  0 07:55 ?        00:00:01 /usr/libexec/pacemaker/attrd
haclust+ 10417 10412  0 07:55 ?        00:00:02 /usr/libexec/pacemaker/pengine
haclust+ 10418 10412  0 07:55 ?        00:00:03 /usr/libexec/pacemaker/crmd
root     24087 23406  0 13:01 pts/0    00:00:00 grep --color=auto pacemaker
[root@zxb2 ~]# ps -ef| grep corosync
root     10405     1  0 07:55 ?        00:02:04 corosync
root     24093 23406  0 13:01 pts/0    00:00:00 grep --color=auto corosync
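Note that pcs cluster start does not enable corosync and pacemaker at boot, which is why pcs status below reports them as active/disabled. If the cluster stack should come up automatically after a reboot, it can be enabled on all nodes (optional, not part of the original steps):
[root@zxb2 ~]# pcs cluster enable --all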
9. Check the corosync ring status ("no faults" means the ring is healthy)
[root@zxb2 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 10.0.0.129
        status  = ring 0 active with no faults
[root@zxb3 ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
        id      = 10.0.0.130
        status  = ring 0 active with no faults
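Another quick membership check is to query corosync's runtime database and make sure both node IPs show up as members; the exact key names depend on the corosync version, so treat this as a sketch:
[root@zxb2 ~]# corosync-cmapctl | grep members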
10. Check the cluster status
[root@zxb2 ~]# pcs status
Cluster name: mycluster
Stack: corosync
Current DC: zxb3 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 15:48:12 2017
Last change: Wed Oct 25 13:51:05 2017 by root via crm_attribute on zxb3

2 nodes configured
0 resources configured

Online: [ zxb2 zxb3 ]

No resources          # no resources defined yet

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
11. Check the cluster configuration for errors
[root@zxb2 ~]# crm_verify -L -V
   error: unpack_resources:     Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources:     Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources:     NOTE: Clusters with shared data need STONITH to ensure data integrity
## No STONITH device has been configured, so STONITH needs to be disabled
[root@zxb2 ~]# pcs property set stonith-enabled=false
[root@zxb2 ~]# crm_verify -L -V
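To confirm that the property was applied, list the cluster properties and check that stonith-enabled is now false:
[root@zxb2 ~]# pcs property list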
12. Configure the NFS server (10.0.0.128)
① Create the shared directory and write a test file into it
[root@nfs ~]# mkdir /nfs
[root@nfs ~]# echo "test--nfs" >/nfs/index.html
② Install NFS and grant access in /etc/exports
[root@nfs ~]# yum install -y nfs-utils
[root@nfs ~]# cat /etc/exports
/nfs *(rw,sync,no_root_squash)
③ Start the NFS service and stop the firewall
[root@nfs ~]# systemctl start nfs
[root@nfs ~]# systemctl stop firewalld
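If the NFS server might be rebooted during testing, it is also worth enabling the service at boot (an extra step, not in the original):
[root@nfs ~]# systemctl enable nfs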
# Check that the directory exported by the NFS server can be seen
[root@zxb2 ~]# showmount -e 10.0.0.128
Export list for 10.0.0.128:
/nfs *
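Before letting the cluster manage the mount, a quick manual mount from a cluster node helps rule out NFS or network problems; this assumes nfs-utils is installed on the cluster nodes, which the Filesystem resource will need in any case:
[root@zxb2 ~]# mount -t nfs 10.0.0.128:/nfs /mnt
[root@zxb2 ~]# cat /mnt/index.html
test--nfs
[root@zxb2 ~]# umount /mnt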
13. Install the httpd service on both nodes (do not start it)
[root@zxb2 ~]# yum install -y httpd
[root@zxb3 ~]# yum install -y httpd
14. Install crmsh to manage the cluster (download it from GitHub, then extract and install it); it only needs to be installed on one node
[root@zxb2 ~]# ls
crmsh-2.3.2.tar
[root@zxb2 ~]# tar xf crmsh-2.3.2.tar
[root@zxb2 ~]# cd crmsh-2.3.2
[root@zxb2 crmsh-2.3.2]# python setup.py install
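A simple way to verify that crmsh installed correctly and can talk to the running cluster is its status command, which should show both nodes online:
[root@zxb2 ~]# crm status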
15. To make a service highly available we need a resource agent; for the systemd resource class, the unit has to be enabled before crm will recognize it, so:
[root@zxb2 ~]# systemctl enable httpd
[root@zxb3 ~]# systemctl enable httpd
16. Check how the resource agent is invoked, then configure the resource directly
[root@zxb2 ~]# crm
crm(live)# ra
crm(live)ra# info systemd:httpd
systemd unit file for httpd (systemd:httpd)

The Apache HTTP Server

Operations' defaults (advisory minimum):

    start         timeout=100
    stop          timeout=100
    status        timeout=100
    monitor       timeout=100 interval=60
crm(live)ra# cd ..
crm(live)# configure
crm(live)configure# primitive webserver systemd:httpd
17. Configure the highly available web IP (VIP: 10.0.0.200)
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=10.0.0.200
18. Define automatic mounting of the /nfs export
crm(live)configure# primitive webstore ocf:heartbeat:Filesystem params device="10.0.0.128:/nfs" directory="/var/www/html" fstype="nfs"
crm(live)configure# verify     ## after configuring, check for errors; warnings can be ignored
crm(live)configure# commit     ## save
19. Define a colocation constraint
crm(live)configure# colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
crm(live)configure# commit
20. View the configuration
crm(live)configure# show
node 1: zxb2 \
        attributes standby=off
node 2: zxb3 \
        attributes standby=off
primitive webip IPaddr \
        params ip=10.0.0.200 \
        meta target-role=Started
primitive webserver systemd:httpd
primitive webstore Filesystem \
        params device="10.0.0.128:/nfs" directory="/var/www/html" fstype=nfs
order webserver_after_webstore Mandatory: webstore webserver
colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
order webstore_after_webip Mandatory: webip webstore
property cib-bootstrap-options: \
        have-watchdog=false \
        dc-version=1.1.16-12.el7_4.4-94ff4df \
        cluster-infrastructure=corosync \
        cluster-name=mycluster \
        stonith-enabled=false
21. Define the ordering constraints
crm(live)configure# order webstore_after_webip Mandatory: webip webstore
crm(live)configure# order webserver_after_webstore Mandatory: webstore webserver
crm(live)configure# verify
crm(live)configure# commit
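For reference, roughly the same resources and constraints could be created with pcs instead of crmsh; the following is only a sketch of the equivalent commands, reusing the resource names defined above:
[root@zxb2 ~]# pcs resource create webip ocf:heartbeat:IPaddr ip=10.0.0.200
[root@zxb2 ~]# pcs resource create webstore ocf:heartbeat:Filesystem device="10.0.0.128:/nfs" directory="/var/www/html" fstype="nfs"
[root@zxb2 ~]# pcs resource create webserver systemd:httpd
[root@zxb2 ~]# pcs constraint colocation add webserver with webip
[root@zxb2 ~]# pcs constraint colocation add webserver with webstore
[root@zxb2 ~]# pcs constraint order webip then webstore
[root@zxb2 ~]# pcs constraint order webstore then webserver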
22. Check the status
crm(live)# status
Stack: corosync
Current DC: zxb2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 16:35:15 2017
Last change: Wed Oct 25 16:35:10 2017 by root via crm_attribute on zxb3

2 nodes configured
3 resources configured

Online: [ zxb2 zxb3 ]

Full list of resources:

 webip          (ocf::heartbeat:IPaddr):        Started zxb2
 webserver      (systemd:httpd):                Started zxb2
 webstore       (ocf::heartbeat:Filesystem):    Started zxb2
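Since all three resources are running on zxb2, they can be double-checked directly on that node: the VIP should be bound to one of its interfaces and the NFS export should be mounted on /var/www/html (a quick verification, not part of the original write-up):
[root@zxb2 ~]# ip addr show | grep 10.0.0.200
[root@zxb2 ~]# mount | grep /var/www/html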
Testing:
Normal access:
[root@nfs ~]# curl 10.0.0.200
test--nfs
Simulating a failure
## The start order is webip, webstore, webserver
crm(live)# status
Stack: corosync
Current DC: zxb2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 16:43:41 2017
Last change: Wed Oct 25 16:35:10 2017 by root via crm_attribute on zxb3

2 nodes configured
3 resources configured

Online: [ zxb2 zxb3 ]

Full list of resources:

 webip          (ocf::heartbeat:IPaddr):        Started zxb2
 webserver      (systemd:httpd):                Started zxb2
 webstore       (ocf::heartbeat:Filesystem):    Started zxb2
crm(live)# node standby
crm(live)# status
Stack: corosync
Current DC: zxb2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 16:43:55 2017
Last change: Wed Oct 25 16:43:52 2017 by root via crm_attribute on zxb2

2 nodes configured
3 resources configured

Node zxb2: standby
Online: [ zxb3 ]

Full list of resources:

 webip          (ocf::heartbeat:IPaddr):        Stopped
 webserver      (systemd:httpd):                Stopped
 webstore       (ocf::heartbeat:Filesystem):    Stopped
crm(live)# status
Stack: corosync
Current DC: zxb2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 16:44:17 2017
Last change: Wed Oct 25 16:43:52 2017 by root via crm_attribute on zxb2

2 nodes configured
3 resources configured

Node zxb2: standby
Online: [ zxb3 ]

Full list of resources:

 webip          (ocf::heartbeat:IPaddr):        Started zxb3
 webserver      (systemd:httpd):                Started zxb3
 webstore       (ocf::heartbeat:Filesystem):    Started zxb3
Access again:
[root@nfs ~]# curl 10.0.0.200
test--nfs
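After the test, zxb2 can be taken out of standby again; the original write-up stops here, but the matching crmsh command would be:
[root@zxb2 ~]# crm node online zxb2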
This article is from the "XiaoBingZ" blog; please keep this source when reposting: http://1767340368.blog.51cto.com/13407496/1976071