4 keepalived: highly available nginx load balancing


keepalived:

HTTP_GET        //keepalived checks back-end real server health with an HTTP GET request

SSL_GET (https)  //the same check over HTTPS, meaning the back end speaks HTTPS

TCP_CHECK       

The demonstration below uses TCP_CHECK for health checking.

# man keepalived.conf    //see the TCP_CHECK configuration section

 

# TCP healthchecker
TCP_CHECK
{
    # ======== generic connection options
    # Optional IP address to connect to.
    # The default is the realserver IP     //defaults to the real server's IP
    connect_ip <IP ADDRESS>     //may be omitted
    # Optional port to connect to
    # The default is the realserver port
    connect_port <PORT>         //may be omitted
    # Optional interface to use to
    # originate the connection
    bindto <IP ADDRESS>
    # Optional source port to
    # originate the connection from
    bind_port <PORT>
    # Optional connection timeout in seconds.
    # The default is 5 seconds
    connect_timeout <INTEGER>
    # Optional fwmark to mark all outgoing
    # checker packets with
    fwmark <INTEGER>

    # Optional random delay to start the initial check
    # for maximum N seconds.
    # Useful to scatter multiple simultaneous
    # checks to the same RS. Enabled by default, with
    # the maximum at delay_loop. Specify 0 to disable
    warmup <INT>
    # Retry count to make additional checks if check
    # of an alive server fails. Default: 1
    retry <INT>
    # Delay in seconds before retrying. Default: 1
    delay_before_retry <INT>
} #TCP_CHECK

 

# cd /etc/keepalived

# vim keepalived.conf   //configure this on both keepalived nodes

 

virtual_server 192.168.184.150 80 {    //multiple virtual server definitions can be merged here
    delay_loop 6
    lb_algo wrr
    lb_kind DR
    nat_mask 255.255.0.0
    protocol TCP
    sorry_server 127.0.0.1 80

    real_server 192.168.184.143 80 {
        weight 1
        TCP_CHECK {
            connect_timeout 3
        }
    }

    real_server 192.168.184.144 80 {
        weight 2
        TCP_CHECK {
            connect_timeout 3
        }
    }
}
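For context, the virtual_server block above only defines the IPVS service; the VIP itself is announced by a vrrp_instance section earlier in keepalived.conf. A minimal sketch of what that section typically looks like on the MASTER node, where the instance name and interface follow the systemctl status output below and the priority, router id and password are illustrative assumptions (the BACKUP node would use state BACKUP and a lower priority):

vrrp_instance VI_1 {
    state MASTER                  # BACKUP on the second keepalived node
    interface eth0                # interface that will carry the VIP
    virtual_router_id 51          # must be identical on both nodes
    priority 100                  # e.g. 96 on the backup node
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass s3cr3t          # same shared secret on both nodes
    }
    virtual_ipaddress {
        192.168.184.150/16 dev eth0
    }
}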

# systemctl restart keepalived

# systemctl status keepalived

● keepalived.service - LVS and VRRP High Availability Monitor
   Loaded: loaded (/usr/lib/systemd/system/keepalived.service; disabled; vendor preset: disabled)
   Active: active (running) since Thu 2018-12-13 23:11:06 CST; 1min 32s ago
  Process: 6233 ExecStart=/usr/sbin/keepalived $KEEPALIVED_OPTIONS (code=exited, status=0/SUCCESS)
 Main PID: 6234 (keepalived)
   CGroup: /system.slice/keepalived.service
           ├─6234 /usr/sbin/keepalived -D
           ├─6235 /usr/sbin/keepalived -D
           └─6236 /usr/sbin/keepalived -D

Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Check on service [192.168.184.144]:80 failed after 1 retry.
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Removing service [192.168.184.144]:80 from VS [192.168.184.150]:80
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: Remote SMTP server [127.0.0.1]:25 connected.
Dec 13 23:11:11 node1 Keepalived_healthcheckers[6235]: SMTP alert successfully sent.
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: VRRP_Instance(VI_1) Sending/queueing gratuitous ARPs on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150
Dec 13 23:11:14 node1 Keepalived_vrrp[6236]: Sending gratuitous ARP on eth0 for 192.168.184.150    //gratuitous ARP broadcasts announcing that the VIP has been added
You have new mail in /var/spool/mail/root


Example:

HTTP_GET {
    url {
        path /
        status_code 200
    }
    connect_timeout 3
    nb_get_retry 3
    delay_before_retry 3
}

 

TCP_CHECK {
    connect_timeout 3
}

 

HA Services:

nginx

 

 

100: -25

96: -20 79 --> 99 --> 79
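The figures above read like VRRP priority-adjustment notes (a node's base priority minus the weight of a failed check). A hedged sketch of the common pattern for keeping nginx highly available with keepalived: a vrrp_script that watches the nginx process and lowers this node's priority when the check fails (the script, interval and weight are illustrative assumptions):

vrrp_script chk_nginx {
    script "killall -0 nginx"     # exits 0 as long as an nginx process exists
    interval 2                    # run the check every 2 seconds
    weight -25                    # subtract 25 from the node's priority on failure
}

# then reference it inside the existing vrrp_instance block:
#     track_script {
#         chk_nginx
#     }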

 

Blog assignment:

keepalived providing high availability for ipvs

nginx

 

active/active

 

Linux HA Cluster

 

LB, HA, HP, hadoop

LB:

Transport layer: lvs

Application layer: nginx, haproxy, httpd, perlbal, ats, varnish

HA:

vrrp: keepalived

AIS: heartbeat, OpenAIS, corosync/pacemaker, cman/rgmanager(conga) RHCS

 

HA:

Failure scenarios:

Hardware failures:

design flaws

natural wear-out from prolonged use

human-caused damage

…… ……

Software failures:

design flaws

bugs

human error (misoperation)

……

 

A=MTBF/(MTBF+MTTR)

MTBF: Mean Time Between Failure

MTTR: Mean Time To Repair

 

0 < A < 1, usually expressed as a percentage

90%, 95%, 99%

99.9%, 99.99%, 99.999%
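Worked example (figures are illustrative): with MTBF = 999 hours and MTTR = 1 hour, A = 999 / (999 + 1) = 99.9%, which still allows roughly 8.8 hours of downtime per year; 99.999% ("five nines") allows only about 5 minutes per year.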

 

Provide redundancy:

 

network partition: vote system

Isolation (fencing):

STONITH: Shoot The Other Node In The Head, node-level fencing

Fence: resource-level fencing

 

failover domain:

fda: node1, node5

fdb: node2, node5

fdc: node3, node5

fdd: node4, node5

 

Resource constraints:

Location constraints: a resource's preference for particular nodes;

Colocation constraints: whether resources tend to run on the same node as one another;

Order constraints: start-order dependencies among multiple resources;

 

vote system:

The majority rules: quorum

> total/2

with quorum: holds more than half of the votes

without quorum: does not hold more than half of the votes

 

Two nodes (or any even number of nodes):

Ping node

qdisk
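For example, in a five-node cluster a partition needs at least 3 votes (> 5/2) to keep quorum; in a two-node cluster a 1:1 split leaves neither side with more than total/2 votes, which is why an external tie-breaker such as a ping node or a qdisk is used.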

 

 

failover

failback

 

Messaging Layer:

heartbeat

v1

v2

v3

corosync

cman

 

Cluster Resource Manager (CRM):

heartbeat v1: haresources (configuration interface: the haresources config file)

heartbeat v2: crm (runs a crmd daemon (5560/tcp) on every node; command-line interface: crmsh; GUI: hb_gui)

heartbeat v3, pacemaker (configuration interfaces: crmsh, pcs; GUI: hawk (SUSE), LCMC, pacemaker-gui)

rgmanager (configuration interfaces: cluster.conf, system-config-cluster, conga (web GUI), cman_tool, clustat)

 

Common combinations:

heartbeat v1 (haresources)

heartbeat v2 (crm)

heartbeat v3 + pacemaker

corosync + pacemaker

corosync v1 + pacemaker (plugin)

corosync v2 + pacemaker (standalone service)

 

cman + rgmanager

corosync v1 + cman + pacemaker

 

RHCS: Red Hat Cluster Suite

RHEL5: cman + rgmanager + conga (ricci/luci)

RHEL6: cman + rgmanager + conga (ricci/luci)

corosync + pacemaker

corosync + cman + pacemaker

RHEL7: corosync + pacemaker

 

Resource Agent:

service: scripts under the /etc/ha.d/haresources.d/ directory;

LSB: scripts under the /etc/rc.d/init.d/ directory;

OCF: Open Cluster Framework

provider:

STONITH:

Systemd:

 

Resource types:

primitive: a primary (primitive) resource; only one instance runs in the cluster;

clone: a cloned resource; multiple instances can run in the cluster;

anonymous clones, globally unique clones, stateful clones (active/passive)

multi-state (master/slave): a special implementation of clone resources; multi-state resources;

group: a resource group;

started or stopped as a unit;

resource monitoring

dependencies:

 

Resource attributes:

priority: the resource's priority;

target-role: started, stopped, master;

is-managed: whether the cluster is allowed to manage this resource;

resource-stickiness: how strongly the resource prefers to stay on its current node;

allow-migrate: whether migration is allowed;

 

Constraints: expressed as a score

Location constraints: a resource's preference for particular nodes;

(-oo, +oo):

any score + infinity = infinity

any score + negative infinity = negative infinity

infinity + negative infinity = negative infinity

Colocation constraints: whether resources tend to run on the same node as one another;

(-oo, +oo)

Order constraints: start-order dependencies among multiple resources;

(-oo, +oo)

Mandatory
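A hedged crmsh illustration of the three constraint types, using the web resource names that appear later in these notes (constraint ids and scores are illustrative assumptions):

crm(live)configure# location webip_on_node1 webip 100: node1
crm(live)configure# colocation webserver_with_webip inf: webserver webip
crm(live)configure# order webip_before_webserver Mandatory: webip webserver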

 

Installation and configuration:

CentOS 7: corosync v2 + pacemaker

corosync v2: vote system

pacemaker: runs as a standalone service

 

Full-lifecycle cluster management tools:

pcs: agent(pcsd)

crmsh: agentless (pssh)

 

Prerequisites for configuring a cluster:

(1) time synchronization;

(2) nodes can reach one another by the hostnames they are currently using;

(3) decide whether a quorum (arbitration) device will be used;
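A hedged sketch of the usual preparation on each node (the NTP source, hostnames and addresses are illustrative assumptions):

# ntpdate cn.pool.ntp.org          # one-shot time sync; or keep chronyd/ntpd running
# cat >> /etc/hosts << EOF
172.16.100.67 node1
172.16.100.68 node2
EOF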

 

web service:

vip: 172.16.100.91

httpd

 

Review: AIS HA

Messaging Layer:

heartbeat v1, v2, v3

corosync v1, v2(votequorum)

OpenAIS

CRM:

pacemaker

configuration interfaces: crmsh (agentless), pssh

pcs (agent), pcsd

conga(ricci/luci)

 

group, constraint

 

rgmanager(cman)

resource group:

failover domain

 

Configuration:

Global properties: property, stonith-enabled, and so on;

Highly available services: resources, managed through RAs

 

RA:

LSB: /etc/rc.d/init.d/

systemd: units under /etc/systemd/system/multi-user.target.wants/

services that are in the enabled state;

OCF: [provider]

heartbeat

pacemaker

linbit

service

stonith

 

Available HA cluster solutions:

heartbeat v1

heartbeat v2

heartbeat v3 + pacemaker X

corosync + pacemaker

cman + rgmanager

corosync + cman + pacemaker

 

corosync + pacemaker

keepalived

 

HA Cluster(2)

 

Heartbeat message delivery:

Unicast, udpu

Multicast, udp

Broadcast

 

Multicast addresses: identify an IP multicast group; IANA reserves the class D range for multicast: 224.0.0.0-239.255.255.255

Permanent (well-known) multicast addresses: 224.0.0.0-224.0.0.255

Transient multicast addresses: 224.0.1.0-238.255.255.255

Administratively scoped (local) multicast addresses: 239.0.0.0-239.255.255.255

 

Example configuration file:

 

totem {
    version: 2

    crypto_cipher: aes128
    crypto_hash: sha1
    secauth: on

    interface {
        ringnumber: 0
        bindnetaddr: 172.16.0.0
        mcastaddr: 239.185.1.31
        mcastport: 5405
        ttl: 1
    }
}

nodelist {
    node {
        ring0_addr: 172.16.100.67
        nodeid: 1
    }
    node {
        ring0_addr: 172.16.100.68
        nodeid: 2
    }
    node {
        ring0_addr: 172.16.100.69
        nodeid: 3
    }
}

logging {
    fileline: off
    to_stderr: no
    to_logfile: yes
    logfile: /var/log/cluster/corosync.log
    to_syslog: no
    debug: off
    timestamp: on
    logger_subsys {
        subsys: QUORUM
        debug: off
    }
}

quorum {
    provider: corosync_votequorum
}
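Once this file is in place on every node (secauth is on, so a matching /etc/corosync/authkey generated with corosync-keygen is also required), a hedged sketch of starting the stack and verifying membership:

# systemctl start corosync.service pacemaker.service
# corosync-cfgtool -s                      # ring status of the local node
# corosync-cmapctl | grep members          # runtime membership list
# crm_mon -1                               # one-shot view of the pacemaker cluster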

 

HA Web Service:

vip: 172.16.100.92, ocf:heartbeat:IPaddr

httpd: systemd

nfs shared storage: ocf:heartbeat:Filesystem
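A hedged crmsh sketch of these three resources grouped so they always run together (the NFS server address, export path and mount point are illustrative assumptions):

# crm configure
crm(live)configure# primitive webip ocf:heartbeat:IPaddr params ip=172.16.100.92
crm(live)configure# primitive webstore ocf:heartbeat:Filesystem params device="172.16.100.10:/web/htdocs" directory="/var/www/html" fstype="nfs"
crm(live)configure# primitive webserver systemd:httpd
crm(live)configure# group webservice webip webstore webserver
crm(live)configure# verify
crm(live)configure# commit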

 

HA cluster working models:

A/P: two-node cluster; active/passive;

no-quorum-policy={stop|ignore|suicide|freeze}

A/A: dual-active (active/active) model

N-M: N nodes, M services, N > M;

N-N: N nodes, N services;

 

network partition:

split-brain: extremely dangerous when block-level shared storage is involved;

vote quorum:

with quorum > total/2

without quorum <= total/2

stop

ignore

suicide

freeze

 

CAP:

C: consistency

A: availability

P: partition tolerance

 

webip, webstore, webserver

node1: 100 + 0 + 0

node2: 0 + 0 + 0

node3: 0 + 0 + 0

 

node2: 50+50+50

 

start order: A --> B --> C

stop order: C --> B --> A

 

pcs:
    cluster
        auth
        setup
    resource
        describe
        list
        create
        delete
    constraint
        colocation
        order
        location
    property
        list
        set
    status
    config
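Putting the subcommands above together, a hedged pcs walk-through for a web service of the same shape (node names, cluster name, and disabling stonith in a lab without fencing hardware are illustrative assumptions):

# pcs cluster auth node1 node2 -u hacluster
# pcs cluster setup --name webcluster node1 node2
# pcs cluster start --all
# pcs property set stonith-enabled=false            # lab only: no fencing devices available
# pcs resource create webip ocf:heartbeat:IPaddr ip=172.16.100.92
# pcs resource create webserver systemd:httpd
# pcs constraint colocation add webserver with webip INFINITY
# pcs constraint order webip then webserver
# pcs status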

 

Blog assignment:

(1) Manual configuration, multicast: corosync + pacemaker + crmsh; build a highly available MySQL cluster whose datadir points to an NFS-exported path;

(2) pcs/pcsd, unicast: corosync + pacemaker; build a highly available web cluster;

 

Unicast configuration example:

Some environments may not support multicast. In that case Corosync should be configured to use unicast; below is part of a Corosync configuration file that uses unicast:

 

totem {
    #...
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.42.0
        broadcast: yes
        mcastport: 5405
    }
    interface {
        ringnumber: 1
        bindnetaddr: 10.0.42.0
        broadcast: yes
        mcastport: 5405
    }
    transport: udpu
}

nodelist {
    node {
        ring0_addr: 192.168.42.1
        ring1_addr: 10.0.42.1
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.42.2
        ring1_addr: 10.0.42.2
        nodeid: 2
    }
}

 

If broadcast is set to yes, the cluster heartbeat is carried over broadcast. When this parameter is set, mcastaddr must not be set.

 

The transport directive determines how the cluster communicates. To disable multicast entirely, configure the unicast transport udpu. This requires listing every node in nodelist, which means the cluster membership must be decided before the HA cluster is deployed. The default is udp; the supported transport types also include udpu and iba.

 

Under nodelist you can define settings that apply only to a specific node. These options may only appear inside a node block, i.e. they can only be set for servers that belong to the cluster, and should only include parameters that differ from the defaults. Every server must have ring0_addr configured.
