Deploying a Highly Available RabbitMQ Cluster Service

Posted by 鸾舞春秋


The message queue is a fundamental, critical service. To provide high availability and load balancing for the company's queue service, it is implemented as follows: RabbitMQ Cluster + Queue HA + Haproxy + Keepalived


Three RabbitMQ servers form a broker cluster, so the service keeps working even if two of the servers fail;


on top of that, queue mirroring provides high availability for the queues themselves. In this example queues are mirrored to all servers, i.e. one master and two slaves;


to give clients a single, stable entry address, haproxy acts as a layer-4 proxy in front of the MQ service, doing simple round-robin load balancing, with health checks to shield clients from failed nodes;


and two haproxy servers run keepalived so that the client entry point itself is highly available.




For basic RabbitMQ queue concepts, see:
http://baike.baidu.com/link?url=ySoVSgecyl7dcLNqyjvwXVW-nNTSA7tIHmhwTHx37hL_H4wnYa70VCqmOZ59AaSEz2DYyfUiSMnQV2tHKD7OQK


Official documentation:
http://www.rabbitmq.com/clustering.html
http://www.rabbitmq.com/ha.html
http://www.rabbitmq.com/man/rabbitmqctl.1.man.html


http://www.rabbitmq.com/production-checklist.html


 


I. Basics


RabbitMQ cluster: a RabbitMQ broker cluster is a logical group of Erlang nodes, each running the RabbitMQ application and sharing users, virtual hosts, queues, exchanges, bindings and runtime parameters.


What is replicated across the cluster: all data and state needed to operate a RabbitMQ broker is replicated to every node, with the exception of message queues. By default a queue lives on one node (it is visible and reachable from all nodes); replicating the queue contents themselves requires queue HA (mirroring).


Prerequisites for running a cluster:
1. All nodes must run the same Erlang and RabbitMQ versions.
2. Hostname resolution: nodes communicate with each other by hostname. This article builds a 3-node cluster and uses /etc/hosts entries.


Ports and their purposes:
  5672  client connections
 15672  web management UI
 25672  inter-node cluster communication


Ways to form a cluster:
1. Manually with rabbitmqctl (the approach used in this article)
2. Declaratively via a configuration file (a minimal sketch follows this list)
3. Declaratively via the rabbitmq-autocluster plugin
4. Declaratively via the rabbitmq-clusterer plugin
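A sketch of option 2, assuming RabbitMQ 3.6's classic Erlang-term config file at /etc/rabbitmq/rabbitmq.config and the node names used later in this article; on first boot a blank node would try to join these peers as a disc node:

[
  {rabbit, [
    {cluster_nodes, {['rabbit@xx_rabbitMQ135',
                      'rabbit@xx_rabbitMQ136',
                      'rabbit@xx_rabbitMQ137'], disc}}
  ]}
].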


Cluster failure handling:
1. A RabbitMQ broker cluster tolerates the failure of individual nodes.
2. Network partitions: clustering is recommended for LAN environments and is not suitable for WAN.
To connect brokers across a WAN, the Shovel or Federation plugins are the better solution;
note that Shovel and Federation are not the same thing as clustering.


RabbitMQ clustering has several modes of dealing with network partitions, primarily consistency oriented. Clustering is meant to be used across LAN. It is not recommended to run clusters that span WAN. The Shovel or Federation plugins are better solutions for connecting brokers across a WAN. Note that Shovel and Federation are not equivalent to clustering.


Node operating mode:
To ensure data durability, all nodes currently run as disc nodes; if load grows and more performance is needed, RAM nodes can be considered.




How cluster nodes authenticate each other:
Via the Erlang cookie, which acts like a shared secret. It can be any string, as long as it is identical on all nodes.
When the RabbitMQ server first starts, the Erlang VM automatically creates a random cookie file.
Cookie file location: /var/lib/rabbitmq/.erlang.cookie or /root/.erlang.cookie.
Ours lives at /root/.erlang.cookie.
To guarantee the cookie is identical everywhere, copy it from one node to the others.


 


II. RabbitMQ cluster deployment


First install standalone RabbitMQ on each node; see the separate document a colleague wrote on deploying the Erlang environment and RabbitMQ.


The cluster configuration steps:
1. Configure hosts resolution, identical on all nodes
[root@xx_rabbitMQ135 ~]# tail -n4 /etc/hosts
### rabbitmq cluster communication; identical on all nodes   laijingli 20160220
192.168.100.135 xx_rabbitMQ135
192.168.100.136 xx_rabbitMQ136
192.168.100.137 xx_rabbitMQ137


2. Distribute the cookie used for inter-node authentication
[root@xx_rabbitMQ135 ~]# scp /root/.erlang.cookie 192.168.100.136:~
[root@xx_rabbitMQ135 ~]# scp /root/.erlang.cookie 192.168.100.137:~
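Note that the Erlang VM insists that the cookie file be readable by its owner only; if the copy loosened the mode, tighten it again on each target node, for example:
[root@xx_rabbitMQ136 ~]# chmod 400 /root/.erlang.cookie
[root@xx_rabbitMQ137 ~]# chmod 400 /root/.erlang.cookie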




3. Start an independent, standalone RabbitMQ broker on each node:
[root@xx_rabbitMQ135 ~]# rabbitmq-server -detached
[root@xx_rabbitMQ136 ~]# rabbitmq-server -detached
[root@xx_rabbitMQ137 ~]# rabbitmq-server -detached


This gives us an independent RabbitMQ broker on each node.


Check broker status:
[root@xx_rabbitMQ135 ~]# rabbitmqctl status
Status of node rabbit@xx_rabbitMQ135 ...
[{pid,116968},
{running_applications,
[{rabbitmq_shovel_management,"Shovel Status","3.6.0"},
{rabbitmq_management,"RabbitMQ Management Console","3.6.0"},


Check broker cluster status:
[root@xx_rabbitMQ135 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@xx_rabbitMQ135 ...
[{nodes,[{disc,[rabbit@xx_rabbitMQ135]}]},
{running_nodes,[rabbit@xx_rabbitMQ135]},
{cluster_name,<<"rabbit@xx_rabbitMQ135">>},
{partitions,[]}]




4. Create the broker cluster:
To link the three nodes into one cluster, join 136 and 137 to 135's cluster.


First stop the RabbitMQ application on 136 and join it to 135's cluster (join_cluster implicitly resets the node, deleting all resources and data on it), then check that the cluster status now lists two nodes.
[root@xx_rabbitMQ136 ~]# rabbitmqctl stop_app
Stopping node rabbit@xx_rabbitMQ136 ...


[root@xx_rabbitMQ136 ~]# rabbitmqctl join_cluster rabbit@xx_rabbitMQ135
Clustering node rabbit@xx_rabbitMQ136 with rabbit@xx_rabbitMQ135 ...


[root@xx_rabbitMQ136 ~]# rabbitmqctl start_app


[root@xx_rabbitMQ136 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@xx_rabbitMQ136 ...
[{nodes,[{disc,[rabbit@xx_rabbitMQ135,rabbit@xx_rabbitMQ136]}]}]


Do the same with 137; it makes no difference whether it joins via node 135 or 136.
[root@xx_rabbitMQ135 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@xx_rabbitMQ135 ...
[{nodes,[{disc,[rabbit@xx_rabbitMQ135,rabbit@xx_rabbitMQ136,
rabbit@xx_rabbitMQ137]}]},
{running_nodes,[rabbit@xx_rabbitMQ136,rabbit@xx_rabbitMQ137,
rabbit@xx_rabbitMQ135]},
{cluster_name,<<"rabbit@xx_rabbitMQ135">>},
{partitions,[]}]


 


Rename the cluster to xx_rabbitMQ_cluster (by default it takes the name of the first node):
[root@xx_rabbitMQ135 ~]# rabbitmqctl set_cluster_name xx_rabbitMQ_cluster


[root@xx_rabbitMQ135 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@xx_rabbitMQ135 ...
[{nodes,[{disc,[rabbit@xx_rabbitMQ135,rabbit@xx_rabbitMQ136,
rabbit@xx_rabbitMQ137]}]},
{running_nodes,[rabbit@xx_rabbitMQ135,rabbit@xx_rabbitMQ136,
rabbit@xx_rabbitMQ137]},
{cluster_name,<<"xx_rabbitMQ_cluster">>},
{partitions,[]}]




5. Restarting the cluster:
Restart nodes with rabbitmqctl stop followed by rabbitmq-server -detached, and watch how the cluster status changes (see the example below).
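For instance, restarting node 136 and watching it rejoin (same node names as above):
[root@xx_rabbitMQ136 ~]# rabbitmqctl stop
[root@xx_rabbitMQ136 ~]# rabbitmq-server -detached
[root@xx_rabbitMQ136 ~]# rabbitmqctl cluster_status
The restarted node should reappear under running_nodes on its own.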


Important notes:
(1) When the entire cluster goes down, the last node to go down must be the first one brought back online. Otherwise the other nodes wait 30 seconds for that last disc node to return, and then fail to start.
If the last node to go offline cannot be brought back, it can be removed from the cluster with the forget_cluster_node command (see the example below).
When the entire cluster is brought down, the last node to go down must be the first node to be brought online. If this doesn't happen, the nodes will wait 30 seconds for the last disc node to come back online, and fail afterwards.
If the last node to go offline cannot be brought back up, it can be removed from the cluster using the forget_cluster_node command - consult the rabbitmqctl manpage for more information.
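For example, assuming rabbit@xx_rabbitMQ137 were the unrecoverable node, it could be forgotten from any surviving node:
[root@xx_rabbitMQ135 ~]# rabbitmqctl forget_cluster_node rabbit@xx_rabbitMQ137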


(2) If all cluster nodes stop simultaneously and uncontrolled (for example a power cut), every node may believe some other node stopped after it did. In that case the force_boot command can be used on one node to make it bootable again (see the example below).
If all cluster nodes stop in a simultaneous and uncontrolled manner (for example with a power cut) you can be left with a situation in which all nodes think that some other node stopped after them. In this case you can use the force_boot command on one node to make it bootable again - consult the rabbitmqctl manpage for more information.
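A sketch of that recovery, run on whichever node should be brought up first, while it is still stopped:
[root@xx_rabbitMQ135 ~]# rabbitmqctl force_boot
[root@xx_rabbitMQ135 ~]# rabbitmq-server -detached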




6. Breaking up the cluster:
When a node should no longer belong to the cluster, it has to be removed explicitly, either locally (reset on the node itself, as below) or remotely (forget_cluster_node from another node).


[root@xx_rabbitMQ137 ~]# rabbitmqctl stop_app
Stopping node rabbit@xx_rabbitMQ137 ...


[root@xx_rabbitMQ137 ~]# rabbitmqctl reset
Resetting node rabbit@xx_rabbitMQ137 ...


[root@xx_rabbitMQ137 ~]# rabbitmqctl start_app
Starting node rabbit@xx_rabbitMQ137 ...


[root@xx_rabbitMQ137 ~]# rabbitmqctl cluster_status
Cluster status of node rabbit@xx_rabbitMQ137 ...
[{nodes,[{disc,[rabbit@xx_rabbitMQ137]}]},
{running_nodes,[rabbit@xx_rabbitMQ137]},
{cluster_name,<<"rabbit@xx_rabbitMQ137">>},
{partitions,[]}]




7. Testing client connections to the cluster


Through the web management UI you can create queues, publish messages, create users, create policies, and so on:
http://192.168.100.137:15672/
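Since RabbitMQ 3.3 the default guest user can only log in from localhost, so for remote access to the management UI you will usually need an administrative user first; a sketch (user name and password are placeholders):
[root@xx_rabbitMQ135 ~]# rabbitmqctl add_user admin StrongPassword
[root@xx_rabbitMQ135 ~]# rabbitmqctl set_user_tags admin administrator
[root@xx_rabbitMQ135 ~]# rabbitmqctl set_permissions -p / admin ".*" ".*" ".*"
Because users are part of the replicated cluster state, creating the user on one node makes it available on all three.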


Or test from the command line with rabbitmqadmin:


[root@xx_rabbitMQ136 ~]# wget http://192.168.100.136:15672/cli/rabbitmqadmin


[root@xx_rabbitMQ136 ~]# chmod +x rabbitmqadmin


[root@xx_rabbitMQ136 ~]# mv rabbitmqadmin /usr/local/rabbitmq_server-3.6.0/sbin/


Declare an exchange
$ rabbitmqadmin declare exchange name=my-new-exchange type=fanout
exchange declared
Declare a queue, with optional parameters
$ rabbitmqadmin declare queue name=my-new-queue durable=false
queue declared
Publish a message
$ rabbitmqadmin publish exchange=my-new-exchange routing_key=test payload="hello, world"
Message published
And get it back
$ rabbitmqadmin get queue=test requeue=false
+-------------+----------+---------------+--------------+------------------+-------------+
| routing_key | exchange | message_count | payload      | payload_encoding | redelivered |
+-------------+----------+---------------+--------------+------------------+-------------+
| test        |          | 0             | hello, world | string           | False       |
+-------------+----------+---------------+--------------+------------------+-------------+


 


A problem found during testing:
[root@xx_rabbitMQ135 ~]# rabbitmqctl stop_app
[root@xx_rabbitMQ135 ~]# rabbitmqctl stop
After stop_app (or stopping the broker entirely) on node 135, the queues that lived on 135 become unavailable. After restarting the app or the broker on 135, the cluster works again, but the messages in those queues are gone (the queues themselves still exist).


For a production environment this is unacceptable: if queues cannot be made highly available, clustering loses much of its point. Fortunately RabbitMQ supports highly available (mirrored) queues, described next.


 


III. Queue HA configuration


By default a queue lives on a single node of the cluster (whichever node it was declared on), while exchanges and bindings exist on all nodes.
Queues can be made highly available through mirroring. Queue HA depends on RabbitMQ clustering, so mirrored queues are likewise unsuitable for WAN deployment. Each mirrored queue consists of one master and one or more slaves; if the master fails for any reason, the oldest slave is promoted to the new master.
Messages published to the queue are replicated to all slaves. Consumers are always served by the master, regardless of which node they connect to; once the master confirms that a message can be deleted, all slaves delete it as well.
Queue mirroring improves availability but does not distribute load, because every participating node still does all the work.




1. Configure queue mirroring
Mirroring is configured through policies. A policy can be created at any time: you can create a non-mirrored queue first and mirror it later, or vice versa.
The difference between a mirrored and a non-mirrored queue is that the latter has no slaves and therefore runs faster.


A policy sets ha-mode, which has three modes: all, exactly and nodes (examples of the latter two follow below).
Every queue has a home node, called the queue master node.
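A sketch of the other two modes, with illustrative policy names and queue-name patterns:
rabbitmqctl set_policy ha-two "^two\." '{"ha-mode":"exactly","ha-params":2,"ha-sync-mode":"automatic"}'
rabbitmqctl set_policy ha-nodes "^nodes\." '{"ha-mode":"nodes","ha-params":["rabbit@xx_rabbitMQ135","rabbit@xx_rabbitMQ136"]}'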


(1) Set a policy so that queues whose names start with ha. are mirrored to all other nodes in the cluster; after a node goes down and comes back, the queue contents have to be synchronised manually:
rabbitmqctl set_policy ha-all-queue "^ha\." '{"ha-mode":"all"}'


(2) Set a policy so that queues whose names start with ha. are mirrored to all other nodes and synchronised automatically after a node comes back (this is what we use in production):
rabbitmqctl set_policy ha-all-queue "^ha\." '{"ha-mode":"all","ha-sync-mode":"automatic"}'
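To confirm that the policy matches and the mirrors are synchronised, the following can be checked on any node:
rabbitmqctl list_policies
rabbitmqctl list_queues name policy slave_pids synchronised_slave_pids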




2. A question:


With mirrored queues configured, queue contents survive the failure of a single node, yet after a full cluster restart the messages were still lost. How can message contents be persisted?
Our nodes already run as disc nodes and persistence was declared when the messages were created, so why does it not work?


Because persistence has to be requested per message by the publisher: if messages are published as persistent and the queue itself was declared as durable, the messages survive a cluster restart as well (see the sketch below).
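A sketch with the rabbitmqadmin tool from section II (the queue name is illustrative; durable=true makes the queue durable, and delivery_mode=2 marks the message persistent):
rabbitmqadmin declare queue name=ha.test durable=true
rabbitmqadmin publish exchange=amq.default routing_key=ha.test payload="hello, durable world" properties='{"delivery_mode":2}'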




IV. How clients connect to the RabbitMQ cluster service:
1. Clients can connect to any node in the cluster; if that node fails, the client must reconnect to another available node by itself (not recommended, since it is not transparent to the client).
2. Via dynamic DNS with a short TTL.
3. Via HA plus a layer-4 load balancer (e.g. haproxy + keepalived).


 


V. Deploying haproxy + keepalived


As a key piece of the company's infrastructure, the message queue should present clients with a stable, transparent service. Haproxy + keepalived therefore provide a highly available, unified entry point to RabbitMQ, plus basic load balancing.


To simplify installation, haproxy and keepalived are installed via yum; see also: 基于keepalived+nginx部署强健的高可用7层负载均衡方案 (deploying a robust, highly available layer-7 load balancer with keepalived + nginx).


1. Installation


yum install haproxy keepalived -y


2. Enable the key services at boot


[root@xxhaproxy101 keepalived]# chkconfig --list|grep haproxy
haproxy 0:off 1:off 2:off 3:off 4:off 5:off 6:off
[root@xxhaproxy101 keepalived]# chkconfig haproxy on
[root@xxhaproxy101 keepalived]# chkconfig --list|grep haproxy
haproxy 0:off 1:off 2:on 3:on 4:on 5:on 6:off
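keepalived can presumably be enabled the same way:
[root@xxhaproxy101 keepalived]# chkconfig keepalived on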


3. Send the haproxy log to /var/log/haproxy.log


[root@xxhaproxy101 haproxy]# more /etc/rsyslog.d/haproxy.conf
$ModLoad imudp
$UDPServerRun 514


local0.* /var/log/haproxy.log


[root@xxhaproxy101 haproxy]# /etc/init.d/rsyslog restart


 


4. The haproxy configuration (identical on both machines)


[root@xxhaproxy101 keepalived]# more /etc/haproxy/haproxy.cfg


#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------


#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 127.0.0.1 local2 notice


chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon


# turn on stats unix socket
stats socket /var/lib/haproxy/stats


#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
option tcplog
option dontlognull
option http-server-close
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
timeout client 1m
timeout server 1m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000




###haproxy statistics monitor by laijingli 20160222
listen statics 0.0.0.0:8888
mode http
log 127.0.0.1 local0 debug
transparent
stats refresh 60s
stats uri /haproxy-stats
stats realm Haproxy\ statistic
stats auth laijingli:xxxxx


#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend xx_rabbitMQ_cluster_frontend
mode tcp
option tcpka
log 127.0.0.1 local0 debug
bind 0.0.0.0:5672
use_backend xx_rabbitMQ_cluster_backend


frontend xx_rabbitMQ_cluster_management_frontend
mode tcp
option tcpka
log 127.0.0.1 local0 debug
bind 0.0.0.0:15672
use_backend xx_rabbitMQ_cluster_management_backend


#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend xx_rabbitMQ_cluster_backend
balance roundrobin
server xx_rabbitMQ135 192.168.100.135:5672 check inter 3s rise 1 fall 2
server xx_rabbitMQ136 192.168.100.136:5672 check inter 3s rise 1 fall 2
server xx_rabbitMQ137 192.168.100.137:5672 check inter 3s rise 1 fall 2


backend xx_rabbitMQ_cluster_management_backend
balance roundrobin
server xx_rabbitMQ135 192.168.100.135:15672 check inter 3s rise 1 fall 2
server xx_rabbitMQ136 192.168.100.136:15672 check inter 3s rise 1 fall 2
server xx_rabbitMQ137 192.168.100.137:15672 check inter 3s rise 1 fall 2


[root@xxhaproxy101 keepalived]#


 


5. The keepalived configuration. Because the HA servers front both the HTTP API gateway and RabbitMQ, both VRRP instances are shown. Note in particular that the keepalived configuration differs between the two servers (state, priority, router_id and the unicast addresses are swapped, while auth_pass must be identical between the two peers of the same instance).


[root@xxhaproxy101 keepalived]# more /etc/keepalived/keepalived.conf
####Configuration File for keepalived
####keepalived HA configuration for the company's internal API gateway
####keepalived HA configuration for the company's RabbitMQ cluster
#### laijingli 20151213


global_defs {
notification_email {
[email protected]
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id xxhaproxy101 ## xxhaproxy101 on master , xxhaproxy102 on backup
}




###simple check with killall -0 which is less expensive than pidof to verify that nginx is running
vrrp_script chk_nginx {
script "killall -0 nginx"
interval 1
weight 2
fall 2
rise 1
}


vrrp_instance YN_API_GATEWAY {
state MASTER ## MASTER on master , BACKUP on backup
interface em1
virtual_router_id 101 ## YN_API_GATEWAY virtual_router_id
priority 200 ## 200 on master , 199 on backup
advert_int 1
### use unicast to avoid interference between multiple keepalived groups on the same LAN
unicast_src_ip 192.168.100.101 ## local ip
unicast_peer {
192.168.100.102 ## peer ip
}
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
192.168.100.99 ## VIP
}
### monitoring the network interface is unnecessary when there is only one NIC
#track_interface {
# em1
#}
track_script {
chk_nginx
}
### on state transitions, send an email notification and log locally; SMS notification will be added later
notify_master /usr/local/bin/keepalived_notify.sh notify_master
notify_backup /usr/local/bin/keepalived_notify.sh notify_backup
notify_fault /usr/local/bin/keepalived_notify.sh notify_fault
notify /usr/local/bin/keepalived_notify.sh notify
smtp_alert
}




###simple check with killall -0 which is less expensive than pidof to verify that haproxy is running
vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 1
weight 2
fall 2
rise 1
}
vrrp_instance xx_rabbitMQ_GATEWAY {
state BACKUP ## MASTER on master , BACKUP on backup
interface em1
virtual_router_id 111 ## xx_rabbitMQ_GATEWAY virtual_router_id
priority 199 ## 200 on master , 199 on backup
advert_int 1
### use unicast to avoid interference between multiple keepalived groups on the same LAN
unicast_src_ip 192.168.100.101 ## local ip
unicast_peer {
192.168.100.102 ## peer ip
}
authentication {
auth_type PASS
auth_pass 123456
}
virtual_ipaddress {
192.168.100.100 ## VIP
}
### monitoring the network interface is unnecessary when there is only one NIC
#track_interface {
# em1
#}
track_script {
chk_haproxy
}
### on state transitions, send an email notification and log locally; SMS notification will be added later
notify_master /usr/local/bin/keepalived_notify_for_haproxy.sh notify_master
notify_backup /usr/local/bin/keepalived_notify_for_haproxy.sh notify_backup
notify_fault /usr/local/bin/keepalived_notify_for_haproxy.sh notify_fault
notify /usr/local/bin/keepalived_notify_for_haproxy.sh notify
smtp_alert
}


 


[root@xxhaproxy102 keepalived]# more /etc/keepalived/keepalived.conf
####Configuration File for keepalived
####keepalived HA configuration for the company's internal API gateway
####keepalived HA configuration for the company's RabbitMQ cluster
#### laijingli 20151213


global_defs {
notification_email {
[email protected]
[email protected]
}
notification_email_from [email protected]
smtp_server 127.0.0.1
smtp_connect_timeout 30
router_id xxhaproxy102 ## xxhaproxy101 on master , xxhaproxy102 on backup
}




###simple check with killall -0 which is less expensive than pidof to verify that nginx is running
vrrp_script chk_nginx {
script "killall -0 nginx"
interval 1
weight 2
fall 2
rise 1
}


vrrp_instance YN_API_GATEWAY {
state BACKUP ## MASTER on master , BACKUP on backup
interface em1
virtual_router_id 101 ## YN_API_GATEWAY virtual_router_id
priority 199 ## 200 on master , 199 on backup
advert_int 1
### use unicast to avoid interference between multiple keepalived groups on the same LAN
unicast_src_ip 192.168.100.102 ## local ip
unicast_peer {
192.168.100.101 ## peer ip
}
authentication {
auth_type PASS
auth_pass YN_API_HA_PASS
}
virtual_ipaddress {
192.168.100.99 ## VIP
}
### monitoring the network interface is unnecessary when there is only one NIC
#track_interface {
# em1
#}
track_script {
chk_nginx
}
### on state transitions, send an email notification and log locally; SMS notification will be added later
notify_master /usr/local/bin/keepalived_notify.sh notify_master
notify_backup /usr/local/bin/keepalived_notify.sh notify_backup
notify_fault /usr/local/bin/keepalived_notify.sh notify_fault
notify /usr/local/bin/keepalived_notify.sh notify
smtp_alert
}




###simple check with killall -0 which is less expensive than pidof to verify that haproxy is running
vrrp_script chk_haproxy {
script "killall -0 haproxy"
interval 1
weight 2
fall 2
rise 1
}
vrrp_instance xx_rabbitMQ_GATEWAY {
state MASTER ## MASTER on master , BACKUP on backup
interface em1
virtual_router_id 111 ## xx_rabbitMQ_GATEWAY virtual_router_id
priority 200 ## 200 on master , 199 on backup
advert_int 1
### use unicast to avoid interference between multiple keepalived groups on the same LAN
unicast_src_ip 192.168.100.102 ## local ip
unicast_peer {
192.168.100.101 ## peer ip
}
authentication {
auth_type PASS
auth_pass YN_MQ_HA_PASS
}
virtual_ipaddress {
192.168.100.100 ## VIP
}
### monitoring the network interface is unnecessary when there is only one NIC
#track_interface {
# em1
#}
track_script {
chk_haproxy
}
### on state transitions, send an email notification and log locally; SMS notification will be added later
notify_master /usr/local/bin/keepalived_notify_for_haproxy.sh notify_master
notify_backup /usr/local/bin/keepalived_notify_for_haproxy.sh notify_backup
notify_fault /usr/local/bin/keepalived_notify_for_haproxy.sh notify_fault
notify /usr/local/bin/keepalived_notify_for_haproxy.sh notify
smtp_alert
}


 


The notification scripts referenced above, identical on both servers:


[root@xxhaproxy101 keepalived]# more /usr/local/bin/keepalived_notify.sh
#!/bin/bash
###keepalived notify script for recording HA state transitions to log files


### record state transitions to a log file to help with troubleshooting
logfile=/var/log/keepalived.notify.log
echo --------------- >> $logfile
echo `date` [`hostname`] keepalived HA role state transition: $1 $2 $3 $4 $5 $6 >> $logfile


### also record the state in a file served by nginx, so the HA state can be checked in a browser (take care not to expose it to the public internet)
echo `date` `hostname` $1 $2 $3 $4 $5 $6 " <br>" > /usr/share/nginx/html/index_for_nginx.html


### merge the nginx api and rabbitmq HA logs into one page
cat /usr/share/nginx/html/index_for* > /usr/share/nginx/html/index.html


 




[root@xxhaproxy101 keepalived]# more /usr/local/bin/keepalived_notify_for_haproxy.sh
#!/bin/bash
###keepalived notify script for recording HA state transitions to log files


### record state transitions to a log file to help with troubleshooting
logfile=/var/log/keepalived.notify.log
echo --------------- >> $logfile
echo `date` [`hostname`] keepalived HA role state transition: $1 $2 $3 $4 $5 $6 >> $logfile


### also record the state in a file served by nginx, so the HA state can be checked in a browser (take care not to expose it to the public internet)
echo `date` `hostname` $1 $2 $3 $4 $5 $6 " <br>" > /usr/share/nginx/html/index_for_haproxy.html


### merge the nginx api and rabbitmq HA logs into one page
cat /usr/share/nginx/html/index_for* > /usr/share/nginx/html/index.html
[root@xxhaproxy101 keepalived]#


 


6. The haproxy statistics page:


http://192.168.100.101:8888/


7. Check which server the keepalived-managed VIPs are currently active on:


http://192.168.100.101


8. Access the RabbitMQ service through the VIP:


192.168.100.100:5672
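A quick end-to-end check through the VIP, reusing rabbitmqadmin (credentials are whatever was created earlier; the management port 15672 is also proxied by haproxy):
rabbitmqadmin -H 192.168.100.100 -P 15672 -u admin -p StrongPassword list queues
rabbitmqadmin -H 192.168.100.100 -u admin -p StrongPassword declare queue name=ha.viptest durable=true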




VI. Further notes:


See the company's RabbitMQ client usage guidelines:
1. Use vhosts to isolate different applications, users and business groups (a sketch follows this list).
2. For message durability, persistence of exchanges, queues and messages must be declared explicitly by the client.
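A sketch for item 1, isolating one application in its own vhost (names are placeholders):
rabbitmqctl add_vhost /app1
rabbitmqctl add_user app1_user app1_password
rabbitmqctl set_permissions -p /app1 app1_user ".*" ".*" ".*"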
