5-4 keepalived and nginx: high-availability failover in practice


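Review:
keepalived: an implementation of an HA (high-availability) cluster
vrrp: Virtual Router Redundancy Protocol
Virtual router vs. physical router
VRID: Virtual Router ID
Master/Backup
One master with one backup, or one master with several backups
priority
Preemptive mode / non-preemptive mode
ipvs wrapper (checkers);
checkers: health checks on every RS of every VS
Application-layer checks: HTTP_GET, SSL_GET, SMTP_CHECK
Transport-layer check: TCP_CHECK
Custom check: MISC_CHECK (for example a MySQL data check), i.e. a user-supplied script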

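keepalived has no built-in support for making nginx highly available. To get it, make sure nginx is running on both nodes, whether or not a node is currently the master. An external script starts or restarts nginx, and when nginx fails, the node's priority is lowered (so it can no longer act as master) and the address fails over to the other node.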

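Slides from the video:
keepalived can call external helper scripts to monitor a resource and, based on the result, make limited dynamic adjustments to a node's priority.
Two steps: (1) define a script; (2) track it in an instance.
vrrp_script <SCRIPT_NAME> {    # define a script
script "a one-line command or the path to an external script"
interval INT    # how often, in seconds, the script runs
weight -INT    # how much to subtract from the priority when the script fails
}

track_script {    # call the scripts from inside a vrrp_instance; several may be listed
    SCRIPT_NAME_1
    SCRIPT_NAME_2
    ...
}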
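
To make the effect of a negative weight concrete, here is a minimal shell sketch of the arithmetic: each tracked script that is currently failing adds its negative weight to the configured priority. The function name effective_priority is hypothetical and only illustrates the calculation; keepalived does this internally.

```shell
#!/bin/bash
# Hypothetical helper: effective_priority BASE WEIGHT...
# BASE is the configured priority; each WEIGHT is the (negative) weight
# of one tracked script that is currently failing.
effective_priority() {
    local prio=$1
    shift
    local w
    for w in "$@"; do
        prio=$((prio + w))
    done
    echo "$prio"
}

effective_priority 100        # no failing script
effective_priority 100 -5     # one failing script with weight -5
effective_priority 100 -5 -5  # two failing scripts with weight -5 each
```

A node stays master only while its effective priority remains above that of its peers, so the weight has to be larger than the gap between the configured priorities.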

Example: a highly available nginx service
!Configuration File for keepalived

global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.100.19
}

vrrp_script chk_down {
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"    # fails if the file exists, succeeds otherwise; touch the down file to lower this node's priority
interval 1
weight -5
}

vrrp_script chk_nginx {
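script "killall -0 nginx && exit 0 || exit 1"    # signal 0 kills nothing; it only tests whether an nginx process exists: exit 0 if so, 1 if not
interval 1
weight -5
fall 2    # the check must fail twice before the script counts as failed
rise 1    # one success after a failure restores the subtracted weight at once, and the node preempts again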
}

vrrp_instance VI_1 {
state MASTER
interface eno16777736
virtual_router_id 14
priority 100
advert_int 1
authentication {
auth_type PASS
auth_pass 571f97h2
}
virtual_ipaddress {
10.1.0.93/16 dev eno16777736
}
track_script {    # call the scripts
chk_down
chk_nginx
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"
}
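
The inline chk_down test can also be written as a function or an external script, which is easier to try by hand. A minimal sketch, assuming a hypothetical function check_down that takes the directory as an argument so it can be exercised outside /etc/keepalived:

```shell
#!/bin/bash
# Hypothetical equivalent of the inline chk_down test.
# Returns 1 (failure) when DIR/down exists, 0 (success) otherwise.
check_down() {
    local dir=${1:-/etc/keepalived}
    if [[ -f "$dir/down" ]]; then
        return 1
    fi
    return 0
}

# Try it in a scratch directory instead of /etc/keepalived.
tmp=$(mktemp -d)
check_down "$tmp" && echo "up"
touch "$tmp/down"
check_down "$tmp" || echo "down requested"
rm -rf "$tmp"
```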
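Blog homework:
(1) A dual-master ipvs high-availability cluster
(2) A dual-master nginx proxy high-availability cluster

Test: when ipvs uses the sh scheduler or persistent connections, does the same client still reach the RS it was bound to after a failover?
When nginx uses ip_hash or hash $request_uri, does the same client still reach the upstream server it was bound to after a failover?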
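
One way to reason about the nginx half of this test: a hash-based choice depends only on the client address and the upstream list, not on which proxy node computes it. The sketch below is illustrative only. pick_upstream is a hypothetical function, and cksum is a stand-in for nginx's real hash, which works differently.

```shell
#!/bin/bash
# Illustrative only: map a client IP to one upstream with a stateless hash.
# cksum is a stand-in for nginx's own hash function.
upstreams=(192.168.10.11 192.168.10.12 192.168.10.13)

pick_upstream() {
    local ip=$1
    local sum
    sum=$(printf '%s' "$ip" | cksum | cut -d' ' -f1)
    echo "${upstreams[sum % ${#upstreams[@]}]}"
}

# Either proxy node, given the same list, computes the same answer.
pick_upstream 10.1.0.50
pick_upstream 10.1.0.50
```

This says nothing about ipvs persistent connections, which keep per-client state on the director; whether that state survives a failover is exactly what the test is meant to find out.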

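Demo in the video: two nginx nodes, plus one host that runs several web services (listening on several addresses) to simulate multiple backend hosts

First synchronize the clocks on all nodes and install keepalived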
yum -y install keepalived
ntpdate 172.16.0.1

===================================================================
node1:172.16.0.6
ntpdate 172.16.0.1
vim /etc/keepalived/keepalived.conf
!Configuration File for keepalived

global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.101.33
}

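vrrp_script chk_down {    # scripts must be defined outside the vrrp_instance
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"    # exit with an error if the file exists, exit successfully if not
weight -10    # lower the priority when the script fails
interval 1    # check every 1 second
fall 1    # failures needed before the check counts as failed
rise 1    # successes needed before the check counts as healthy
}

vrrp_script chk_ngx {
script "killall -0 nginx && exit 0 || exit 1"    # succeeds if an nginx process exists, fails if not
weight -10    # lower the priority when the script fails
interval 2    # check every 2 seconds
fall 3    # failures needed before the check counts as failed
rise 3    # successes needed before the check counts as healthy
}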

vrrp_instance VI_1 {
state MASTER
priority 100
interface eno16777736
virtual_router_id 33
advert_int 1
authentication {
auth_type PASS
auth_pass RT3SKUI2
}
virtual_ipaddress {
172.16.0.77/16 dev eno16777736 label eno16777736:0
}

track_script {    # track the scripts below
    chk_down
    chk_ngx
}

notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"

}
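systemctl start keepalived.service
systemctl status keepalived.service    # check the service status
ifconfig    # the virtual address is now configured
Now create the file down under /etc/keepalived/ on node1, and node1 becomes the backup node.
Run the following command on node2:
tcpdump -i eno16777736 -nn host 224.0.101.33    # shows the VRRP advertisements sent to this multicast address
Run the following command on node1:
rm -f /etc/keepalived/down    # once the file is gone, the address moves back to node1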
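
To confirm which node holds the virtual address after each step, the interface listing can be checked from a script. A minimal sketch, assuming a hypothetical helper has_vip that reads `ip -o -4 addr show` output on stdin; the sample line only approximates that format:

```shell
#!/bin/bash
# Hypothetical helper: succeed if the given VIP appears in the
# `ip -o -4 addr show` output supplied on stdin.
has_vip() {
    local vip=$1
    grep -q -F "inet $vip/"
}

# A line shaped like the output of `ip -o -4 addr show`:
sample='2: eno16777736    inet 172.16.0.77/16 scope global secondary eno16777736:0'
if printf '%s\n' "$sample" | has_vip 172.16.0.77; then
    echo "VIP present"
fi

# On a live node: ip -o -4 addr show dev eno16777736 | has_vip 172.16.0.77
```

The trailing slash in the pattern keeps 172.16.0.7 from matching 172.16.0.77.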

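The next demo uses a single master: nginx comes online when a node becomes master and goes offline when it becomes backup. Apply the following configuration on both nodes.
Install nginx first:
yum -y install nginx
vim /etc/nginx/nginx.conf    # nginx acts mainly as a reverse proxy here
Add the location block in the server context, and the upstream block in the http context:
location / {
proxy_pass http://websrvs;
}
upstream websrvs {
server 192.168.10.11:80;
server 192.168.10.12:80;
server 192.168.10.13:80;
}
nginx -t
systemctl start nginx.service
curl http://172.16.0.6/    # requests rotate round-robin across the three hosts
curl http://172.16.0.7/    # requests rotate round-robin across the three hosts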
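
Round-robin is easier to see by counting responses than by reading them one at a time. A minimal sketch, assuming a hypothetical helper tally and backends that each return a distinct one-line page:

```shell
#!/bin/bash
# Hypothetical helper: count how many times each distinct line appears on stdin.
tally() {
    awk '{ count[$0]++ } END { for (k in count) print count[k], k }' | sort -k2
}

# Canned responses stand in for repeated curl output here.
printf '%s\n' '<h1>Vhost1</h1>' '<h1>Vhost2</h1>' '<h1>Vhost3</h1>' '<h1>Vhost1</h1>' | tally

# Against the proxy: for i in 1 2 3 4 5 6; do curl -s http://172.16.0.6/; done | tally
```

With three healthy backends and six requests, each page should be counted twice.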

First verify that nginx is started once a node becomes master (stop nginx on both nodes beforehand with systemctl stop nginx.service).
vim /etc/keepalived/notify.sh
#!/bin/bash
#
contact='[email protected]'

notify() {
local mailsubject="$(hostname) to be $1,vip floating"
local mailbody="$(date +'%F %T'):vrrp transition,$(hostname) changed to be $1"
echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case $1 in
master)
systemctl start nginx.service    # start nginx on becoming master
notify master
;;
backup)
systemctl start nginx.service    # start nginx on becoming backup as well
notify backup
;;
fault)
systemctl stop nginx.service    # stop nginx on entering the fault state
notify fault
;;
*)
echo "Usage:$(basename $0) {master|backup|fault}"
exit 1
;;
esac

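At this point, creating the down file moves the address to node2, and deleting it moves the address back to node1.
Note: do not stop or restart nginx casually, because a failed check lowers the node's priority, on the master and the backup alike. For the same reason, the backup branch of the notify script should also start nginx, so that the service stays up on both nodes while only the address moves. To lower the priority when nginx itself fails, the nginx process must be monitored as well, which is why the configuration file needs the additional script vrrp_script chk_ngx.
Note: how can nginx be made unable to start? Start httpd so that it takes port 80:
killall nginx && systemctl start httpd
After forcing the service offline this way, the address comes back only once nginx is started again by hand, or once the other node goes offline.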

The following is the dual-master model
vim /etc/keepalived/keepalived.conf
!Configuration File for keepalived

global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node1
vrrp_mcast_group4 224.0.101.33
}

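vrrp_script chk_down {    # scripts must be defined outside the vrrp_instance
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"    # exit with an error if the file exists, exit successfully if not
weight -10    # lower the priority when the script fails
interval 1    # check every 1 second
fall 1    # failures needed before the check counts as failed
rise 1    # successes needed before the check counts as healthy
}

vrrp_script chk_ngx {
script "killall -0 nginx && exit 0 || exit 1"    # succeeds if an nginx process exists, fails if not
weight -10    # lower the priority when the script fails
interval 2    # check every 2 seconds
fall 3    # failures needed before the check counts as failed
rise 3    # successes needed before the check counts as healthy
}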

vrrp_instance VI_1 {
state MASTER
priority 100
interface eno16777736
virtual_router_id 33
advert_int 1
authentication {
auth_type PASS
auth_pass RT3SKUI2
}
virtual_ipaddress {
172.16.0.77/16 dev eno16777736 label eno16777736:0
}

track_script {    # track the scripts below
    chk_down
    chk_ngx
}

notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"    # in the dual-master model nginx must no longer be stopped

}

vrrp_instance VI_2 {
state BACKUP    # MASTER on the other node
priority 96    # 100 on the other node
interface eno16777736
virtual_router_id 43
advert_int 1
authentication {
auth_type PASS
auth_pass RT7SKUI2
}
virtual_ipaddress {
172.16.0.78/16 dev eno16777736 label eno16777736:1
}

track_script {    # track the scripts below
    chk_down
    chk_ngx
}

track_interface {    # production setups also monitor the interfaces
    eno16777736
    eno33554984
}
notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"    # in the dual-master model nginx must no longer be stopped

}

systemctl stop keepalived.service
systemctl start keepalived.service
systemctl status keepalived.service    # each node now holds one of the addresses and service is normal
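
A worked example of how the two instances behave when nginx fails on node1. This is a sketch under stated assumptions: node2 carries the mirrored configuration described in the comments (BACKUP/96 for VI_1, MASTER/100 for VI_2), its own checks are passing, and the hypothetical function winner simply compares effective priorities without handling ties.

```shell
#!/bin/bash
# Hypothetical helper: print which node wins an instance, given both
# nodes' effective priorities. Ties are not handled in this sketch.
winner() {
    local p1=$1 p2=$2
    if (( p1 > p2 )); then echo node1; else echo node2; fi
}

penalty=0    # chk_ngx is passing on node1
echo "VI_1 (172.16.0.77): $(winner $((100 + penalty)) 96)"
echo "VI_2 (172.16.0.78): $(winner $((96 + penalty)) 100)"

penalty=-10  # chk_ngx is failing on node1 (weight -10)
echo "VI_1 (172.16.0.77): $(winner $((100 + penalty)) 96)"
echo "VI_2 (172.16.0.78): $(winner $((96 + penalty)) 100)"
```

In the healthy case each node holds one address; once node1 is penalized, node2 holds both until node1's check passes again.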

=====================================================================
node2:172.16.0.7
ntpdate 172.16.0.1
vim /etc/keepalived/keepalived.conf
!Configuration File for keepalived

global_defs {
notification_email {
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id node2
vrrp_mcast_group4 224.0.101.33
}

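vrrp_script chk_down {    # scripts must be defined outside the vrrp_instance
script "[[ -f /etc/keepalived/down ]] && exit 1 || exit 0"    # exit with an error if the file exists, exit successfully if not
weight -10    # lower the priority when the script fails
interval 1    # check every 1 second
fall 1    # failures needed before the check counts as failed
rise 1    # successes needed before the check counts as healthy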
}

vrrp_instance VI_1 {
state BACKUP
priority 96
interface eno16777736
virtual_router_id 33
advert_int 1
authentication {
auth_type PASS
auth_pass RT3SKUI2
}
virtual_ipaddress {
172.16.0.77/16 dev eno16777736 label eno16777736:0
}

track_script {    # track the script below
    chk_down
}

notify_master "/etc/keepalived/notify.sh master"
notify_backup "/etc/keepalived/notify.sh backup"
notify_fault "/etc/keepalived/notify.sh fault"

}

server: configure three IP addresses, 192.168.10.11/24, 192.168.10.12/24 and 192.168.10.13/24
ntpdate 172.16.0.1
vim /etc/httpd/conf.d/vhosts.conf
<VirtualHost 192.168.10.11:80>
ServerName 192.168.10.11
DocumentRoot "/data/web/vhost1"
<Directory "/data/web/vhost1">
Options FollowSymLinks
AllowOverride None
Require all granted
</Directory>
</VirtualHost>

<VirtualHost 192.168.10.12:80>
ServerName 192.168.10.12
DocumentRoot "/data/web/vhost2"
<Directory "/data/web/vhost2">
Options FollowSymLinks
AllowOverride None
Require all granted
</Directory>
</VirtualHost>

<VirtualHost 192.168.10.13:80>
ServerName 192.168.10.13
DocumentRoot "/data/web/vhost3"
<Directory "/data/web/vhost3">
Options FollowSymLinks
AllowOverride None
Require all granted
</Directory>
</VirtualHost>

After editing, test the syntax:
httpd -t    # syntax test; it reports that the directories do not exist
mkdir -pv /data/web/vhost{1,2,3}
vim /data/web/vhost1/index.html
<h1>Vhost1</h1>
vim /data/web/vhost2/index.html
<h1>Vhost2</h1>
vim /data/web/vhost3/index.html
<h1>Vhost3</h1>

systemctl start httpd.service
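
The three test pages can also be created without an editor. A minimal sketch, assuming a hypothetical helper make_vhost_pages that takes the document root as an argument; it is demonstrated in a scratch directory, whereas the walkthrough uses /data/web:

```shell
#!/bin/bash
# Hypothetical helper: create vhost1..vhost3 under ROOT, each with an
# index.html that names the vhost.
make_vhost_pages() {
    local root=$1
    local i
    for i in 1 2 3; do
        mkdir -p "$root/vhost$i"
        echo "<h1>Vhost$i</h1>" > "$root/vhost$i/index.html"
    done
}

# Demonstrated in a scratch directory; the walkthrough uses /data/web.
scratch=$(mktemp -d)
make_vhost_pages "$scratch"
cat "$scratch/vhost2/index.html"
rm -rf "$scratch"
```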
