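I. Introduction to keepalived

keepalived is software that works much like layer 3, 4 & 5 switching, that is, what is commonly called layer-3, layer-4, and layer-5 switching. Its job is to monitor the state of web servers. If a web server crashes or malfunctions, keepalived detects the failure and removes the faulty server from the pool; once the server is working again, keepalived automatically adds it back to the server pool. All of this happens automatically with no manual intervention; the only manual work is repairing the failed web server.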

 

How it works


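Layer 3, 4 & 5 checks operate at the IP layer, the TCP layer, and the application layer of the TCP/IP protocol stack, respectively. Each works as follows:

Layer3: In Layer3 mode, keepalived periodically sends an ICMP packet (the same thing the ping program does) to each server in the pool. If a server's IP address does not respond, keepalived reports the server as failed and removes it from the pool. A typical case is a server that was shut down unexpectedly. Layer3 mode uses reachability of the server's IP address as the criterion for whether the server is healthy.

Layer4: Once Layer3 is clear, Layer4 is easy. Layer4 decides whether a server is healthy mainly from the state of a TCP port. For example, a web server usually listens on port 80; if keepalived finds that port 80 is not open, it removes that server from the pool.

Layer5: Layer5 works at the application layer. It is somewhat more complex than Layer3 and Layer4 and uses more network bandwidth. keepalived checks whether the server program is running correctly according to user-defined settings; if the result does not match those settings, keepalived removes the server from the pool.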
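
The three check levels share one pattern: run a check against every server and keep only the servers that pass. The following is a minimal sketch of that pattern, not keepalived's actual implementation. The function names (filter_pool, check_icmp, check_tcp80, check_fake) are hypothetical, and check_fake is a stand-in so the demo runs without any network access.

```shell
#!/bin/bash
# filter_pool CHECK SERVER...
# Prints the servers for which the CHECK command succeeds, one per line.
filter_pool() {
    local check=$1
    shift
    local server
    for server in "$@"; do
        if "$check" "$server" >/dev/null 2>&1; then
            echo "$server"
        fi
    done
}

# Layer 3 style check: one ICMP echo request with a 1 second timeout.
check_icmp() { ping -c 1 -W 1 "$1"; }

# Layer 4 style check: try to open a TCP connection to port 80
# (uses bash's /dev/tcp pseudo-device).
check_tcp80() { timeout 1 bash -c "exec 3<>/dev/tcp/$1/80"; }

# Stand-in check for the demo only: every server except 172.16.2.16 is "up".
check_fake() { [ "$1" != "172.16.2.16" ]; }

# Demo: the failed server is dropped from the pool.
filter_pool check_fake 172.16.2.1 172.16.2.16
```

On a real network, check_icmp or check_tcp80 could be passed in place of check_fake.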

 

II. Lab steps

1. Use node1 as the management node, set up mutual SSH trust between node1 and node2, synchronize their clocks, and then install keepalived

[[email protected]~]# ansible all -m yum -a 'name=keepalived state=present'
[[email protected]]# rpm -qc keepalived
/etc/keepalived/keepalived.conf    # the generated main configuration file
/etc/sysconfig/keepalived
 

 

2. Modify the configuration file on node1 as follows

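global_defs {
   notification_email {
        [email protected]         # mail recipient; more than one can be listed
   }
   notification_email_from [email protected]       # sender address; it can be made up
   smtp_server 127.0.0.1      # address of the mail server used for sending
   smtp_connect_timeout 30    # connection timeout in seconds
   router_id LVS_DEVEL
}
vrrp_instance VI_1 {    # each vrrp_instance defines one virtual router
    state MASTER        # initial state of this node is MASTER
    interface eth0
    virtual_router_id 51    # virtual router ID; it cannot be greater than 255
    priority 100    # initial priority
    advert_int 1    # advertisement interval in seconds
    authentication {   # authentication settings
        auth_type PASS
        auth_pass 1111   # password
    }
    virtual_ipaddress {     # virtual IP address (VIP)
       172.16.2.8
    }
}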
 

3. Copy the configuration file to node2, then change the initial state and the priority there

[[email protected]]# scp keepalived.conf node2:/etc/keepalived/
[[email protected]~]# cd /etc/keepalived/
[[email protected]]# ls
keepalived.conf
[[email protected]]# vim keepalived.conf
vrrp_instance VI_1 {
    state BACKUP     # initial state
    interface eth0
    virtual_router_id 51
    priority 99      # priority; it must be lower than the master's
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.2.8
    }
}
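
With these two configurations, the node with the higher priority holds the VIP. The following is a minimal sketch of that selection rule, assuming only the priorities shown above (100 on node1, 99 on node2). The function name elect_master is hypothetical, and the sketch ignores the rest of the VRRP protocol (advertisements, timers, preemption).

```shell
#!/bin/bash
# elect_master NAME:PRIORITY...
# Prints the name with the highest priority among the nodes listed.
# Ties are not handled here (real VRRP breaks ties by comparing the
# routers' primary IP addresses).
elect_master() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

elect_master node1:100 node2:99   # both nodes running: prints node1
elect_master node2:99             # node1 stopped: prints node2
```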
 

Start the service on node1:

[[email protected] ~]# service keepalived start

Then check the IP addresses

[[email protected]~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
    inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 2e:79:b3:b2:3e:31 brd ff:ff:ff:ff:ff:ff
 

4. Now stop keepalived on node1

[[email protected]]# service keepalived stop

Stopping keepalived: [ OK ]

Verify that node2 has taken over the virtual_ipaddress

[[email protected]~]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet6 fe80::20c:29ff:fe74:c77b/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 0a:b1:ef:7b:93:18 brd ff:ff:ff:ff:ff:ff
 

Verification succeeded
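
Reading the full ip addr show output by eye gets tedious when the test is repeated. As a rough sketch, the check can be scripted by searching the output for the VIP. The helper name has_vip is hypothetical, and the sample text is copied from the node2 output shown above so the demo runs on any machine.

```shell
#!/bin/bash
# has_vip VIP
# Reads `ip addr show` output on stdin and succeeds if VIP is configured.
# The trailing "/" keeps 172.16.2.8 from matching 172.16.2.80.
has_vip() {
    grep -qF "inet $1/"
}

# Sample lines taken from the node2 output in this article.
sample='    inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0'

if printf '%s\n' "$sample" | has_vip 172.16.2.8; then
    echo "VIP present"
else
    echo "VIP absent"
fi
```

On a live node the same helper could be fed real output, for example: ip addr show eth0 | has_vip 172.16.2.8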

 

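You can also define an external check in the configuration file with vrrp_script, and reference it in vrrp_instance with track_script so that keepalived tracks the script's result and moves the VIP to another node when the check fails

To test this, modify /etc/keepalived/keepalived.conf as follows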

global_defs {
   notification_email {
        [email protected]
   }
   notification_email_from [email protected]
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id LVS_DEVEL
}
vrrp_script chk_maintainace {    # the check script is named chk_maintainace
        script "[[ -e /etc/keepalived/down ]] && exit 1 || exit 0"    # either a script path or an inline command
        interval 1    # run the check every second
        weight -2     # lower the priority by 2 while the check fails
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.2.8
    }
    track_script {    # reference the external script and track its result
        chk_maintainace
    }
}
[[email protected] keepalived]# touch down    # create the down file on node1
[[email protected] keepalived]# ls
down  keepalived.conf  keepalived.conf.bak
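
The reason the down file causes a failover is simple arithmetic: node1's priority drops from 100 to 98, which is below node2's 99. The sketch below illustrates that rule under the documented meaning of weight (a negative weight is subtracted while the script fails, a positive weight is added while it succeeds). The function name effective_priority is hypothetical and this is not keepalived's own code.

```shell
#!/bin/bash
# effective_priority BASE WEIGHT SCRIPT_RC
# Negative WEIGHT: applied while the script fails (rc != 0).
# Positive WEIGHT: applied while the script succeeds (rc == 0).
effective_priority() {
    local base=$1 weight=$2 rc=$3
    if [ "$weight" -lt 0 ] && [ "$rc" -ne 0 ]; then
        echo $((base + weight))
    elif [ "$weight" -gt 0 ] && [ "$rc" -eq 0 ]; then
        echo $((base + weight))
    else
        echo "$base"
    fi
}

effective_priority 100 -2 1   # down file present, script exits 1: prints 98
effective_priority 100 -2 0   # down file absent, script exits 0: prints 100
```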
 

Do the same on node2, but do not create the down file there, then restart the service on both nodes

[[email protected] keepalived]# ansible all -m shell -a 'service keepalived restart'
node2.magedu.com | success | rc=0 >>
Stopping keepalived: [FAILED]
Starting keepalived: [  OK  ]
node1.magedu.com | success | rc=0 >>
Stopping keepalived: [  OK  ]
Starting keepalived: [  OK  ]
 

Check the result (the MAC address in the output is node2's, so node2 now holds the VIP)

[[email protected]]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet6 fe80::20c:29ff:fe74:c77b/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 0a:b1:ef:7b:93:18 brd ff:ff:ff:ff:ff:ff
 

Now delete the down file under /etc/keepalived/ on node1 and check again

[[email protected]]# ls
down  keepalived.conf  keepalived.conf.bak
[[email protected]]# rm down
rm: remove regular empty file `down'? y
[[email protected]]# ls
keepalived.conf  keepalived.conf.bak
[[email protected] keepalived]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
    inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 0a:bd:4f:a9:ed:67 brd ff:ff:ff:ff:ff:ff
 

 

Verification succeeded
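
The inline check used in vrrp_script can be tried outside keepalived to see which exit code it produces in each case. This is a small sketch of the same logic. The function name check_maintenance is hypothetical, and a temporary directory stands in for /etc/keepalived so nothing on the real system is touched.

```shell
#!/bin/bash
# check_maintenance DIR
# Mirrors the vrrp_script logic: fail (return 1) while DIR/down exists.
check_maintenance() {
    if [ -e "$1/down" ]; then
        return 1
    fi
    return 0
}

demo_dir=$(mktemp -d)    # throwaway directory standing in for /etc/keepalived

rc=0; check_maintenance "$demo_dir" || rc=$?
echo "without down file: rc=$rc"

touch "$demo_dir/down"
rc=0; check_maintenance "$demo_dir" || rc=$?
echo "with down file: rc=$rc"

rm -rf "$demo_dir"
```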

 

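III. The following four tasks in detail

1. How to send notifications on state transitions?

2. How to configure ipvs?

3. How to make a specific service highly available?

4. How to implement a master/master model based on multiple virtual routers?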

 

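1. To send notifications on state transitions, define a notify script. It can be defined in a vrrp_sync_group { } block or in a vrrp_instance { } block.

The man page (man keepalived.conf) describes two ways to define notify scripts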

Method 1:

# to MASTER transition
notify_master /path/to_master.sh
# to BACKUP transition
notify_backup /path/to_backup.sh
# FAULT transition
notify_fault "/path/fault.sh VG_1"

Method 2:

# arguments
# $1 = "GROUP"|"INSTANCE"
# $2 = name of group or instance
# $3 = target state of transition
#      ("MASTER"|"BACKUP"|"FAULT")
notify /path/notify.sh
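
With the second method, one script receives all three arguments and decides what to do. The following is a minimal sketch of how such a script might read them. The function name describe_transition is hypothetical, and the sketch only prints a line where a real script would send mail or start and stop services.

```shell
#!/bin/bash
# describe_transition TYPE NAME STATE
# TYPE is "GROUP" or "INSTANCE", NAME is the group or instance name,
# STATE is the target state of the transition.
describe_transition() {
    local type=$1 name=$2 state=$3
    case "$state" in
        MASTER|BACKUP|FAULT)
            echo "$type $name entered $state state"
            ;;
        *)
            echo "unknown state: $state" >&2
            return 1
            ;;
    esac
}

# Demo call with the arguments keepalived would pass for instance VI_1.
describe_transition INSTANCE VI_1 MASTER
```

In a script file, the last line would typically be describe_transition "$1" "$2" "$3" so that the arguments come from keepalived.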

 

For example:

A notification script for the transition to MASTER

#!/bin/bash
#
vip=172.16.2.8
contact='[email protected]'
thisip=`ifconfig eth0 | awk '/inet addr:/{print $2}' | awk -F: '{print $2}'`
notify() {
      mailbody="vrrp transition, $vip floated to $thisip."
      subject="$thisip is to be $vip master"
      echo "$mailbody" | mail -s "$subject" "$contact"
}
notify
 

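The other state transitions are handled in the same way

Below is a simple example that handles all state-transition notifications with a single script, notify.sh: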

#!/bin/bash
#Author: MageEdu <[email protected]>
#description: An example of notify script
#
vip=172.16.2.8
contact='[email protected]'
notify() {
    mailsubject="`hostname` to be $1: $vip floating"
    mailbody="`date '+%F %H:%M:%S'`: vrrp transition, `hostname` changed to be $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case "$1" in
    master)
        notify master
        exit 0
    ;;
    backup)
        notify backup
        exit 0
    ;;
    fault)
        notify fault
        exit 0
    ;;
    *)
        echo "Usage: $(basename "$0") {master|backup|fault}"
        exit 1
    ;;
esac
 

Test it

[[email protected]]# ./notify.sh backup
[[email protected]]# mail
Heirloom Mail version 12.4 7/29/08.  Type ? for help.
"/var/spool/mail/root": 6 messages 1 new 6 unread
 U  [email protected]  Sat Aug 17 09:34  17/644   "*** SECURITY"
 U  2 Cron Daemon           Tue Aug 27 00:01  22/747   "Cron <[email protected]"
 U  3 Cron Daemon           Fri Aug 30 00:01  22/747   "Cron <[email protected]"
 U  4 Mail Delivery System  Fri Aug 30 17:42  91/2751  "Undelivered "
 U  5 Cron Daemon           Tue Sep  3 00:01  22/747   "Cron <[email protected]"
>N  6 root                  Thu Sep 26 21:19  18/700   "node1.magedu"
& 6
Message  6:
[email protected]  Thu Sep 26 21:19:32 2013
Return-Path: <[email protected]>
X-Original-To: [email protected]
Delivered-To: [email protected]
Date: Thu, 26 Sep 2013 21:19:32 +0800
To: [email protected]
Subject: node1.magedu.com to be backup: 172.16.2.8 floating
User-Agent: Heirloom mailx 12.4 7/29/08
Content-Type: text/plain; charset=us-ascii
From: [email protected] (root)
Status: R
2013-09-26 21:19:32: vrrp transition, node1.magedu.com changed to be backup
& quit
Held 6 messages in /var/spool/mail/root
You have mail in /var/spool/mail/root
 

Passing master, backup, or fault as the argument works in each case

Call the script from the configuration file keepalived.conf

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.2.8
    }
    track_script {
        chk_maintainace
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
 

Give node2 the same configuration, then test

[[email protected]]# ls

down keepalived.conf keepalived.conf.bak notify.sh

[[email protected]]# rm -f down

[[email protected]]# mail

>N 18 root  Thu Sep 26 21:57  18/700  "node1.magedu.com to be master: 172.16.2.8 floating"

(only one message is shown here)

All transitions are verified to work

 

2. How to configure ipvs

virtual_server 172.16.2.8 80 {
    delay_loop 6
    lb_algo rr
    lb_kind NAT
    nat_mask 255.255.0.0
    persistence_timeout 0
    protocol TCP
#
    real_server 172.16.2.1 80 {
        weight 1
        HTTP_GET {
            url {
              path /
              status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
    real_server 172.16.2.16 80 {
        weight 1
        HTTP_GET {
            url {
              path /
              status_code 200
            }
            connect_timeout 3
            nb_get_retry 3
            delay_before_retry 3
        }
    }
}
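
The HTTP_GET block combines a check (expect status 200 on /) with a retry policy (nb_get_retry, delay_before_retry). The sketch below shows the general retry idea only; how keepalived counts attempts internally may differ. The function names check_with_retry and http_status_is_200 are hypothetical, and the HTTP helper assumes curl is installed. The demo uses the true and false commands so it needs no web server.

```shell
#!/bin/bash
# check_with_retry RETRIES DELAY COMMAND [ARGS...]
# Runs COMMAND up to 1 + RETRIES times, sleeping DELAY seconds between
# attempts. Succeeds as soon as COMMAND succeeds once.
check_with_retry() {
    local retries=$1 delay=$2 attempt=0
    shift 2
    while true; do
        if "$@"; then
            return 0
        fi
        if [ "$attempt" -ge "$retries" ]; then
            return 1
        fi
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Hypothetical HTTP check (assumes curl): succeed only on status 200,
# with a 3 second overall timeout.
http_status_is_200() {
    [ "$(curl -s -o /dev/null -m 3 -w '%{http_code}' "http://$1/")" = "200" ]
}

# Demo with stand-in commands instead of a real web server.
if check_with_retry 3 0 true; then echo "healthy"; fi
if ! check_with_retry 3 0 false; then echo "failed after retries"; fi
```

Against a real server the call might look like: check_with_retry 3 3 http_status_is_200 172.16.2.1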
 

Make the same change on node2 and start the httpd service. keepalived generates the ipvs rules automatically; view them with ipvsadm

[[email protected]]# ipvsadm -L -n
IP Virtual Server version 1.2.1 (size=4096)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  172.16.2.8:80 rr
  -> 172.16.2.1:80                Local   1      0          0
  -> 172.16.2.16:80               Masq    1      0          0
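
The rr scheduler in the listing hands out new connections to the real servers in turn. With equal weights this reduces to taking the request number modulo the number of servers, which the following sketch illustrates. The function name rr_pick is hypothetical, and this is only an illustration of the idea, not the kernel's IPVS code.

```shell
#!/bin/bash
# rr_pick INDEX SERVER...
# Prints the server that plain round robin selects for request number
# INDEX (0-based), assuming all servers have equal weight.
rr_pick() {
    local index=$1
    shift
    local count=$#
    shift $((index % count))
    echo "$1"
}

# Four consecutive requests alternate between the two real servers.
for i in 0 1 2 3; do
    rr_pick "$i" 172.16.2.1 172.16.2.16
done
```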
 

3. How to make a specific service highly available? nginx is used as the example

Install nginx on both nodes

[[email protected]~]# ansible all -m yum -a 'name=nginx state=present'

Start the nginx service; stop the httpd service first

[[email protected]~]# ansible all -m shell -a 'service nginx start'
node2.magedu.com | success | rc=0 >>
Starting nginx: [  OK  ]
node1.magedu.com | success | rc=0 >>
Starting nginx: [  OK  ]
 

Modify the notify.sh script under /etc/keepalived/ on both node1 and node2

#!/bin/bash
#Author: MageEdu <[email protected]>
#description: An example of notify script
#
vip=172.16.2.8
contact='[email protected]'
notify() {
    mailsubject="`hostname` to be $1: $vip floating"
    mailbody="`date '+%F %H:%M:%S'`: vrrp transition, `hostname` changed to be $1"
    echo "$mailbody" | mail -s "$mailsubject" "$contact"
}
case "$1" in
    master)
        notify master
        /etc/rc.d/init.d/nginx start
        exit 0
    ;;
    backup)
        notify backup
        /etc/rc.d/init.d/nginx stop
        exit 0
    ;;
    fault)
        notify fault
        /etc/rc.d/init.d/nginx stop
        exit 0
    ;;
    *)
        echo "Usage: $(basename "$0") {master|backup|fault}"
        exit 1
    ;;
esac
 

Then start the keepalived service; port 80 is now listening on node1

[[email protected]]# ss -tanl | grep :80

LISTEN 0 128 *:80 *:*

Next, create the down file under /etc/keepalived/ and check whether the nginx service moves to node2

[[email protected]]# ls
keepalived.conf  keepalived.conf.bak  notify.sh
[[email protected]]# touch down
[[email protected]]# ss -tanl | grep :80
[[email protected]]#
Check on node2
[[email protected]]# ss -tanl | grep :80
LISTEN     0     128                      *:80                       *:*


The test succeeds: nginx is now a highly available service

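Summary: making a specific service highly available requires two things

First: provide a script that monitors the service

Second: track that script in the vrrp instance

Modify the configuration file keepalived.conf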

vrrp_script chk_nginx {
        script "killall -0 nginx"
        interval 1
        weight -2
}
vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.2.8
    }
    track_script {
        chk_maintainace
        chk_nginx
    }
}
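
The chk_nginx script relies on signal 0: it delivers nothing to the process and only reports whether a matching process exists and can be signalled. The sketch below shows the same idea with kill -0 on a single PID. The function name is_running is hypothetical, and a background sleep stands in for nginx so the demo is harmless to run.

```shell
#!/bin/bash
# is_running PID
# Succeeds if a process with this PID exists and we are allowed to signal it.
# Signal 0 performs the existence and permission check without sending anything.
is_running() {
    kill -0 "$1" 2>/dev/null
}

sleep 30 &               # stand-in for the monitored service
pid=$!

if is_running "$pid"; then
    echo "process $pid is running"
fi

kill "$pid"              # simulate the service dying
wait "$pid" 2>/dev/null || true

if ! is_running "$pid"; then
    echo "process $pid is gone"
fi
```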
 

Make the same change on node2

Test:

[[email protected]]# killall nginx

You have new mail in /var/spool/mail/root

[[email protected]]# ss -tanl | grep :80

[[email protected]]#

On node1

[[email protected]]# ss -tanl | grep :80

LISTEN 0 128 *:80 *:*

Verification succeeded

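4. How to implement a master/master model based on multiple virtual routers?

A dual-master model requires two vrrp_instance definitions. Modify the configuration file on node1 as follows: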

vrrp_instance VI_1 {
    state MASTER
    interface eth0
    virtual_router_id 51
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.2.8
    }
    track_script {
        chk_maintainace
        chk_nginx
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
vrrp_instance VI_2 {
    state BACKUP
    interface eth0
    virtual_router_id 55
    priority 99
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 2111
    }
    virtual_ipaddress {
        172.16.2.18
    }
    track_script {
        chk_maintainace
        chk_nginx
    }
    notify_master "/etc/keepalived/notify.sh master"
    notify_backup "/etc/keepalived/notify.sh backup"
    notify_fault "/etc/keepalived/notify.sh fault"
}
 

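Make the corresponding change on node2 (for a master/master pair the roles are normally reversed there: VI_1 as BACKUP and VI_2 as MASTER with the higher priority), restart keepalived, and test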

[[email protected]]# service nginx status
nginx(pid  28688) is running...
[[email protected]]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
    inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 6a:7a:4f:e0:c1:8a brd ff:ff:ff:ff:ff:ff
You have new mail in /var/spool/mail/root
 

On node2

[[email protected]]# service nginx start
Starting nginx:                                           [  OK  ]
[[email protected]]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:74:c7:7b brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.16/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.18/32 scope global eth0
    inet6 fe80::20c:29ff:fe74:c77b/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 3a:4e:e8:4c:57:04 brd ff:ff:ff:ff:ff:ff
Stop keepalived on node2 and check whether the address moves
[[email protected]]# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
    link/ether 00:0c:29:4e:22:fb brd ff:ff:ff:ff:ff:ff
    inet 172.16.2.1/16 brd 172.16.255.255 scope global eth0
    inet 172.16.2.8/32 scope global eth0
    inet 172.16.2.18/32 scope global eth0
    inet 172.16.10.8/16 brd 172.16.255.255 scope global secondary eth0:0
    inet6 fe80::20c:29ff:fe4e:22fb/64 scope link
       valid_lft forever preferred_lft forever
3: pan0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN
    link/ether 6a:7a:4f:e0:c1:8a brd ff:ff:ff:ff:ff:ff
You have new mail in /var/spool/mail/root
 

Verification succeeded
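
The dual-master behaviour can be summarized as two independent elections, one per virtual router. The sketch below replays the test above under stated assumptions: node1's priorities (100 for VI_1, 99 for VI_2) come from the configuration shown, while node2's priorities (99 for VI_1, 100 for VI_2) are assumed, since node2's file is not listed. The function name highest_priority is hypothetical.

```shell
#!/bin/bash
# highest_priority NAME:PRIORITY...
# Prints the name with the highest priority among the nodes that are up.
highest_priority() {
    local best_name="" best_prio=-1 entry name prio
    for entry in "$@"; do
        name=${entry%%:*}
        prio=${entry##*:}
        if [ "$prio" -gt "$best_prio" ]; then
            best_name=$name
            best_prio=$prio
        fi
    done
    echo "$best_name"
}

# node2's priorities below are assumptions (reverse of node1's).
echo "both nodes up:"
echo "  172.16.2.8  (VI_1) -> $(highest_priority node1:100 node2:99)"
echo "  172.16.2.18 (VI_2) -> $(highest_priority node1:99 node2:100)"

echo "keepalived stopped on node2:"
echo "  172.16.2.8  (VI_1) -> $(highest_priority node1:100)"
echo "  172.16.2.18 (VI_2) -> $(highest_priority node1:99)"
```

With both nodes up, each node holds one VIP; when node2 stops, node1 ends up holding both, matching the ip addr show output above.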