Deploying nfs+nginx high availability with corosync+pacemaker
nfs+nginx high availability with corosync+pacemaker (managed via crm) on CentOS 7
About pcs: on CentOS 7 pcs is well supported, while crmsh is more involved to set up, so this guide bootstraps the cluster with pcs and then manages resources with crmsh.
Environment (CentOS 7): node1: 172.25.0.29, node2: 172.25.0.30
Prerequisites for building the cluster:
1. Time synchronization between the nodes
2. The nodes can reach each other by hostname
3. Decide whether to use a quorum device
(A quick check of the first two is sketched below.)
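A minimal sanity check for the first two prerequisites (a sketch, run on both nodes; assumes the hosts entries and NTP server configured in the next step):
[root@node1 ~]# ping -c1 node2               ### hostname resolution and reachability
[root@node1 ~]# ntpdate -q cn.pool.ntp.org   ### query-only mode: prints the clock offset without setting the time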
Cluster lifecycle management tools include:
pcs: uses an agent (pcsd); pairs with corosync+pacemaker
crmsh: agentless, based on pssh (the same agentless model as ansible-style tools)
I. Install corosync+pacemaker and the crm management package
1. Configure the hosts file and time synchronization:
node1:
[root@node1 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.0.29 node1
172.25.0.30 node2
[root@node1 ~]# crontab -e
*/5 * * * * ntpdate cn.pool.ntp.org   ### add the sync job
node2:
[root@node2 ~]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.25.0.29 node1
172.25.0.30 node2
[root@node2 ~]# crontab -e
*/5 * * * * ntpdate cn.pool.ntp.org   ### add the sync job
On node1 and node2 you can confirm the cron job was added:
[root@node1 ~]# crontab -l
*/5 * * * * ntpdate cn.pool.ntp.org
[root@node2 ~]# crontab -l
*/5 * * * * ntpdate cn.pool.ntp.org
Set up SSH trust between node1 and node2:
[root@node1 ~]# ssh-keygen
[root@node1 ~]# ssh-copy-id node2
The authenticity of host 'node2 (172.25.0.30)' can't be established.
ECDSA key fingerprint is ae:88:02:59:f9:7f:e9:4f:48:8d:78:d2:6f:c7:7a:f1.
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: WARNING: All keys were skipped because they already exist on the remote system.
The warning appears only because I had already copied the key over earlier.
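A quick check that the trust works (a sketch; it should print the remote hostname without asking for a password):
[root@node1 ~]# ssh node2 hostname
node2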
2. On both node1 and node2, install the packages:
[root@node1 corosync]# yum install -y pacemaker pcs psmisc policycoreutils-python
[root@node2 corosync]# yum install -y pacemaker pcs psmisc policycoreutils-python
3. On node1 and node2, start pcsd and enable it at boot:
[root@node1 corosync]# systemctl start pcsd.service
[root@node1 corosync]# systemctl enable pcsd
[root@node2 corosync]# systemctl start pcsd.service
[root@node2 corosync]# systemctl enable pcsd
4. On both hosts, set the password for the hacluster user:
[root@node1 corosync]# echo 123456 | passwd --stdin hacluster
[root@node2 corosync]# echo 123456 | passwd --stdin hacluster
From here on, the configuration can be done on a single host and synced to the other.
On node1:
5. Authenticate the cluster hosts with pcs (by default using the hacluster user and the password set above):
[root@node1 corosync]# pcs cluster auth node1 node2   ## authenticate the cluster nodes
node2: Already authorized
node1: Already authorized
6. Set up the cluster with the two nodes:
[root@node1 corosync]# pcs cluster setup --name mycluster node1 node2 --force   ## create the cluster
7. The corosync configuration file has now been generated on the node:
[root@node1 ~]# cd /etc/corosync/   ## enter the corosync directory
[root@node1 corosync]# ls
corosync.conf corosync.conf.example corosync.conf.example.udpu corosync.xml.example uidgid.d
# The generated corosync.conf configuration file is there.
8. Start the cluster:
[root@node1 corosync]# pcs cluster start --all
node1: Starting Cluster...
node2: Starting Cluster...
## This is equivalent to starting pacemaker and corosync:
[root@node1 corosync]# ps -ef | grep corosync
root 19586 1 0 18:05 ? 00:00:40 corosync
root 29230 21295 0 19:13 pts/1 00:00:00 grep --color=auto corosync
[root@node1 corosync]# ps -ef | grep pacemaker
root 1843 1 0 11:21 ? 00:00:04 /usr/libexec/pacemaker/lrmd
haclust+ 1845 1 0 11:21 ? 00:00:03 /usr/libexec/pacemaker/pengine
root 19593 1 0 18:05 ? 00:00:01 /usr/sbin/pacemakerd -f
haclust+ 19594 19593 0 18:05 ? 00:00:01 /usr/libexec/pacemaker/cib
root 19595 19593 0 18:05 ? 00:00:00 /usr/libexec/pacemaker/stonithd
haclust+ 19596 19593 0 18:05 ? 00:00:00 /usr/libexec/pacemaker/attrd
haclust+ 19597 19593 0 18:05 ? 00:00:01 /usr/libexec/pacemaker/crmd
root 29288 21295 0 19:14 pts/1 00:00:00 grep --color=auto pacemaker
### corosync and pacemaker are both up.
9. Check the cluster status:
[root@node1 corosync]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
id = 172.25.0.29
status = ring 0 active with no faults
[root@node1 corosync]# ssh node2 corosync-cfgtool -s
Printing ring status.
Local node ID 2
RING ID 0
id = 172.25.0.30
status = ring 0 active with no faults
### The rings on node1 and node2 are both up.
10. Now check the cluster configuration for errors:
[root@node1 corosync]# crm_verify -L -V
error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
## There are errors because no STONITH devices are defined. To avoid problems in the next steps, disable stonith-enabled for now:
[root@node1 corosync]# pcs property set stonith-enabled=false
[root@node1 corosync]# crm_verify -L -V
[root@node1 corosync]# pcs property list
Cluster Properties:
cluster-infrastructure: corosync
cluster-name: mycluster
dc-version: 1.1.16-12.el7_4.2-94ff4df
have-watchdog: false
stonith-enabled: false
11. Now download and install crmsh (fetch it from GitHub, then extract and install):
https://codeload.github.com/ClusterLabs/crmsh/tar.gz/2.3.2
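One way to fetch it, for example (a sketch; assumes the node has internet access, and names the file to match the listing below):
[root@node1 ~]# wget -O /usr/local/src/crmsh-2.3.2.tar https://codeload.github.com/ClusterLabs/crmsh/tar.gz/2.3.2
(codeload actually serves a gzipped tarball, but tar xvf auto-detects the compression, so the .tar name extracts fine.)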
On node1:
[root@node1 ~]# cd /usr/local/src/
[root@node1 src]# ls
crmsh-2.3.2.tar
[root@node1 src]# tar xvf crmsh-2.3.2.tar
[root@node1 src]# ls
crmsh-2.3.2.tar crmsh-2.3.2
[root@node1 src]# cd crmsh-2.3.2
[root@node1 crmsh-2.3.2]# python setup.py install   ## build and install
On node2: repeat the same steps as on node1.
II. Install nginx from source and set up NFS
### Install nginx on both node1 and node2; the steps below are on node1.
1. Install the nginx build dependencies:
yum -y groupinstall "Development Tools" "Server Platform Development"
yum -y install openssl-devel pcre-devel
2. On all hosts, install wget for downloading the nginx tarball:
[root@node1 src]# yum install wget -y   ## install the wget tool
3. Download the nginx tarball:
[root@node1 src]# wget http://nginx.org/download/nginx-1.12.0.tar.gz
4. Add the user nginx will run as:
[root@node1 sbin]# useradd nginx
5. Extract the nginx tarball:
[root@node1 src]# tar zxvf nginx-1.12.0.tar.gz
[root@node1 src]# cd nginx-1.12.0/
6. Configure, compile, and install nginx:
[root@node1 nginx-1.12.0]# ./configure --prefix=/usr/local/nginx --user=nginx --group=nginx --with-http_ssl_module --with-http_flv_module --with-http_stub_status_module --with-http_gzip_static_module --with-pcre   ### configure the build
[root@node1 nginx-1.12.0]# make && make install
Once node1 and node2 are both built, test nginx.
7. Test nginx:
On node1:
[root@node1 nginx]# cd /usr/local/nginx/
[root@node1 nginx]# echo node1 > html/index.html
[root@node1 nginx]# /usr/local/nginx/sbin/nginx
On node2:
[root@node2 nginx]# cd /usr/local/nginx/
[root@node2 nginx]# echo node2 > html/index.html
[root@node2 nginx]# /usr/local/nginx/sbin/nginx
Check the web service:
[root@node1 nginx]# curl 172.25.0.29
node1
[root@node1 nginx]# curl 172.25.0.30
node2
node1 and node2 both serve correctly.
Now stop nginx on both nodes, since corosync and pacemaker will manage it automatically from here on (a sketch of the stop commands follows).
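Assuming nginx was started from the binary as above, stopping it on both nodes looks like:
[root@node1 ~]# /usr/local/nginx/sbin/nginx -s stop
[root@node2 ~]# /usr/local/nginx/sbin/nginx -s stop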
Create an nginx unit file, which is needed so the cluster can start nginx later; create it on both node1 and node2:
[root@node1 ~]# cat /etc/systemd/system/nginx.service
[Unit]
Description=nginx
After=network.target
[Service]
Type=forking
ExecStart=/usr/local/nginx/sbin/nginx
ExecReload=/usr/local/nginx/sbin/nginx -s reload
ExecStop=/usr/local/nginx/sbin/nginx -s quit
PrivateTmp=true
[Install]
WantedBy=multi-user.target
## Do the same on node2.
The script needs execute permission:
[root@node1 ~]# chmod a+x /etc/systemd/system/nginx.service
[root@node2 ~]# chmod a+x /etc/systemd/system/nginx.service
[root@node1 ~]# systemctl enable nginx
[root@node2 ~]# systemctl enable nginx
## Under the systemd resource agent, a unit must be enabled before crm can recognize it, hence the enable.
Setting up NFS:
The role of NFS here is clear, so it only needs to be installed on one host; I install it on node1.
[root@node1 ~]# yum install -y rpcbind nfs-utils
[root@node1 ~]# mkdir /www   ### create the /www directory, to be shared later
[root@node1 ~]# cat /etc/exports
/www *(rw,async,no_root_squash)
[root@node1 ~]# systemctl restart nfs   ### restart nfs
[root@node1 ~]# showmount -e 172.25.0.29
Export list for 172.25.0.29:
/www *
## The /www directory is now exported.
[root@node1 ~]# echo node > /www/index.html   ### add an index.html to the share, for access via the virtual IP
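Before handing the mount over to pacemaker, it can be worth confirming from node2 that the export mounts (a sketch; /mnt is just a scratch mount point):
[root@node2 ~]# mount -t nfs 172.25.0.29:/www /mnt
[root@node2 ~]# cat /mnt/index.html
node
[root@node2 ~]# umount /mnt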
III. nfs+nginx high availability
1. How to use resource agents:
Configure on node1:
[root@node1 ~]# crm ra
crm(live)ra# info systemd:nginx
systemd unit file for nginx (systemd:nginx)
Cluster Controlled nginx
Operations' defaults (advisory minimum):
start timeout=100
stop timeout=100
status timeout=100
monitor timeout=100 interval=60
2. Enter configure mode:
crm(live)ra# cd
crm(live)# cd configure
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=172.25.0.100   ### add the virtual IP
## After configuring, check with show:
crm(live)configure# show
node 1: node1
node 2: node2
primitive webip IPaddr params ip=172.25.0.100
property cib-bootstrap-options: have-watchdog=false dc-version=1.1.16-12.el7_4.2-94ff4df cluster-infrastructure=corosync cluster-name=mycluster stonith-enabled=false
crm(live)configure# verify   # check the configuration for errors
crm(live)configure# commit   ## commit and save
crm(live)configure# cd
3. Define the web service resource:
In configure mode:
crm(live)configure# primitive webserver systemd:nginx   ## add the nginx service
crm(live)configure# verify
WARNING: webserver: default timeout 20s for start is smaller than the advised 100
WARNING: webserver: default timeout 20s for stop is smaller than the advised 100
### Timeouts below the advised values produce warnings; they can be ignored.
crm(live)configure# commit
WARNING: webserver: default timeout 20s for start is smaller than the advised 100
WARNING: webserver: default timeout 20s for stop is smaller than the advised 100
## The warnings on commit can be ignored; alternatively, set explicit op timeouts as sketched below.
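If defining webserver from scratch, the ops could be given explicit timeouts to satisfy the advised minimums (a sketch; the show output below reflects settings of this kind):
crm(live)configure# primitive webserver systemd:nginx op start timeout=100s interval=0 op stop timeout=100s interval=0 op monitor interval=30s timeout=100s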
crm(live)configure# show
node 1: node1
attributes standby=off
node 2: node2
primitive webip IPaddr params ip=172.25.0.100
primitive webserver systemd:nginx op monitor interval=30s timeout=100s op start timeout=100s interval=0 op stop timeout=100s interval=0
property cib-bootstrap-options: have-watchdog=false dc-version=1.1.16-12.el7_4.4-94ff4df cluster-infrastructure=corosync cluster-name=mycluster stonith-enabled=false
## Check: there are now two resources:
crm(live)configure# cd
crm(live)# status
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Sat Oct 14 21:20:59 2017
Last change: Sat Oct 14 21:17:43 2017 by root via cibadmin on node1
2 nodes configured
2 resources configured
Online: [ node1 node2 ]
Full list of resources:
webip (ocf::heartbeat:IPaddr): Started node2
webserver (systemd:nginx): Started node1
## By default the cluster balances resources across the nodes (webip on node2, webserver on node1), but for this setup they must run together, so the two resources are put into one group for high availability.
Add both resources to the same group:
crm(live)# configure
crm(live)configure# group webservice webip webserver   ## put webip and webserver into the webservice group
crm(live)configure# verify
crm(live)configure# commit
crm(live)configure# cd ..
crm(live)# status
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.2-94ff4df) - partition with quorum
Last updated: Sat Oct 14 21:24:17 2017
Last change: Sat Oct 14 21:24:12 2017 by root via cibadmin on node1
2 nodes configured
2 resources configured
Online: [ node1 node2 ]
Full list of resources:
Resource Group: webservice
webip (ocf::heartbeat:IPaddr): Started node1
webserver (systemd:nginx): Started node1
## webip and webserver are now in the same group.
4. Define the NFS resource:
Check the Filesystem resource agent:
crm(live)ra# info ocf:heartbeat:Filesystem
device* (string): block device
The name of block device for the filesystem, or -U, -L options for mount, or NFS mount specification.
directory* (string): mount point
The mount point for the filesystem.
fstype* (string): filesystem type
The type of filesystem to be mounted.
### There are three required parameters.
## Start configuring:
crm(live)configure# primitive webstore ocf:heartbeat:Filesystem params device="172.25.0.29:/www" directory="/usr/local/nginx/html" fstype="nfs" op start timeout=60s op stop timeout=60s op monitor interval=20s timeout=40s   ### mount /www onto /usr/local/nginx/html
5. Define a colocation constraint:
crm(live)configure# colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
crm(live)configure# verify
WARNING: webserver_with_webstore_and_webip: resource webserver is grouped, constraints should apply to the group
WARNING: webserver_with_webstore_and_webip: resource webip is grouped, constraints should apply to the group
crm(live)configure# commit
## Check the configuration:
crm(live)configure# show
node 1: node1
attributes standby=off
node 2: node2
primitive webip IPaddr params ip=172.25.0.100
primitive webserver systemd:nginx op monitor interval=30s timeout=100s op start timeout=60s interval=0 op stop timeout=60s interval=0
primitive webstore Filesystem params device="172.25.0.29:/www" directory="/usr/local/nginx/html" fstype=nfs op start timeout=60s interval=0 op stop timeout=60s interval=0 op monitor interval=20s timeout=40s
group webservice webip webserver
colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
property cib-bootstrap-options: have-watchdog=false dc-version=1.1.16-12.el7_4.4-94ff4df cluster-infrastructure=corosync cluster-name=mycluster stonith-enabled=false
6. Define the start order:
crm(live)configure# order webstore_after_webip Mandatory: webip webstore
crm(live)configure# verify
crm(live)configure# order webserver_after_webstore Mandatory: webstore webserver
### Check the status:
crm(live)# status
Stack: corosync
Current DC: node1 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 20:46:41 2017
Last change: Wed Oct 25 16:56:52 2017 by root via cibadmin on node1
2 nodes configured
3 resources configured
Online: [ node1 node2 ]
Full list of resources:
Resource Group: webservice
webip (ocf::heartbeat:IPaddr): Started node1
webserver (systemd:nginx): Started node1
webstore (ocf::heartbeat:Filesystem): Started node1
## The configured start order is webip, then webstore, then webserver.
7. Testing
[root@node1 ~]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:49:e9:da brd ff:ff:ff:ff:ff:ff
inet 172.25.0.29/24 brd 172.25.0.255 scope global ens33
valid_lft forever preferred_lft forever
inet 172.25.0.100/24 brd 172.25.0.255 scope global secondary ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe49:e9da/64 scope link
The VIP is up on node1.
Next, access the web service:
[root@node1 ~]# curl 172.25.0.100
node
The response is the content of /www/index.html, served over the NFS mount.
Now stop pacemaker and corosync on node1:
[root@node1 ~]# systemctl stop pacemaker   ## stop pacemaker first
[root@node1 ~]# systemctl stop corosync
On node2 you can see that node2 has taken over:
[root@node2 crmsh-2.3.2]# crm
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 20:54:33 2017
Last change: Wed Oct 25 16:56:52 2017 by root via cibadmin on node1
2 nodes configured
3 resources configured
Online: [ node2 ]
OFFLINE: [ node1 ]
Full list of resources:
Resource Group: webservice
webip (ocf::heartbeat:IPaddr): Started node2
webserver (systemd:nginx): Started node2
webstore (ocf::heartbeat:Filesystem): Started node2
crm(live)# exit
[root@node2 crmsh-2.3.2]# ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN qlen 1
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens33: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 00:0c:29:64:00:b1 brd ff:ff:ff:ff:ff:ff
inet 172.25.0.30/24 brd 172.25.0.255 scope global ens33
valid_lft forever preferred_lft forever
inet 172.25.0.100/24 brd 172.25.0.255 scope global secondary ens33
valid_lft forever preferred_lft forever
inet6 fe80::20c:29ff:fe64:b1/64 scope link
## The VIP has moved to node2.
[root@node2 crmsh-2.3.2]# df -h
Filesystem           Size  Used Avail Use% Mounted on
/dev/mapper/cl-root   18G  2.5G   16G  14% /
devtmpfs             226M     0  226M   0% /dev
tmpfs                237M   86M  151M  37% /dev/shm
tmpfs                237M  8.6M  228M   4% /run
tmpfs                237M     0  237M   0% /sys/fs/cgroup
/dev/sda1           1014M  197M  818M  20% /boot
tmpfs                 48M     0   48M   0% /run/user/0
172.25.0.29:/www      18G  2.5G   16G  14% /usr/local/nginx/html
### /www is also mounted on /usr/local/nginx/html.
[root@node2 crmsh-2.3.2]# curl 172.25.0.100
node
### The web resource is still reachable: failover works.
Now restart pacemaker and corosync on node1 (the restart itself is sketched below):
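The restart commands are not shown in the original capture; presumably something like the following on node1, starting corosync before pacemaker:
[root@node1 ~]# systemctl start corosync
[root@node1 ~]# systemctl start pacemaker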
[root@node1 ~]# crm
crm(live)# status
Stack: corosync
Current DC: node2 (version 1.1.16-12.el7_4.4-94ff4df) - partition with quorum
Last updated: Wed Oct 25 21:00:40 2017
Last change: Wed Oct 25 16:56:52 2017 by root via cibadmin on node1
2 nodes configured
3 resources configured
Online: [ node1 node2 ]
Full list of resources:
Resource Group: webservice
webip (ocf::heartbeat:IPaddr): Started node2
webserver (systemd:nginx): Started node2
webstore (ocf::heartbeat:Filesystem): Started node2
### node1 is back online, but the resources stay on node2: there is no automatic failback.
IV. Other tuning
To make a preferred node take the resources back (failback), you can pin them with a location constraint:
crm(live)configure# location webservice_on_node1 webservice inf: node1   ### pin the webservice group to node1; use with care
Service management:
crm(live)configure# property migration-limit=1   ### when the local service fails, restart it locally once; if it cannot start, move the service to the other host.
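To watch the monitor operation catch a failure, one could kill nginx behind the cluster's back (a sketch; killall comes from the psmisc package installed earlier, run on whichever node currently holds the group):
[root@node2 ~]# killall nginx          ### simulate a local service failure
[root@node2 ~]# sleep 35; crm status   ### after the next 30s monitor run, the cluster restarts nginx (or moves it if the restart fails)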
Editing the configuration with crm:
crm(live)# configure
crm(live)configure# edit   ### opens the configuration in a vim-style editor
node 1: node1
attributes standby=off
node 2: node2
primitive webip IPaddr params ip=172.25.0.100
primitive webserver systemd:nginx op monitor interval=30s timeout=100s op start timeout=60s interval=0 op stop timeout=60s interval=0
primitive webstore Filesystem params device="172.25.0.29:/www" directory="/usr/local/nginx/html" fstype=nfs op start timeout=60s interval=0 op stop timeout=60s interval=0 op monitor interval=20s timeout=40s
group webservice webip webserver
order webserver_after_webstore Mandatory: webstore webserver
colocation webserver_with_webstore_and_webip inf: webserver ( webip webstore )
order webstore_after_webip Mandatory: webip webstore
property cib-bootstrap-options: have-watchdog=false dc-version=1.1.16-12.el7_4.4-94ff4df cluster-infrastructure=corosync cluster-name=mycluster stonith-enabled=false migration-limit=1
### Everything configured above is visible here and can be added to, deleted, or modified.
That completes the nfs+nginx deployment based on pacemaker+corosync.
This article is from the "我的运维" blog; please keep this attribution: http://xiaozhagn.blog.51cto.com/13264135/1976185