etcd frequently re-elects its leader

Posted Wshile


etcd in the cluster raised the following alert:

Alert Name: A high number of leader changes within the etcd cluster are happening
Severity: warning
Cluster Name: shdmz-prod-diamond (ID: c-n6wc4)
Namespace: cattle-prometheus
Expression: increase(etcd_server_leader_changes_seen_total[1h])>3
Description: Threshold Crossed: datapoint value 4.067796610169491 was greater than to the threshold (3) for (3m)
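The fractional value in the alert can look odd for a counter that only moves in whole steps. A rough sketch of why: Prometheus' `increase()` extrapolates the observed counter delta to the full window length. The function below is a simplified model of that behaviour (it ignores counter resets and Prometheus' boundary rules), and the one-minute scrape interval is an assumption, not something the alert states.

```python
def simplified_increase(samples, window_seconds):
    """Approximate PromQL increase() over one window.

    samples: list of (timestamp_seconds, counter_value), sorted by time.
    Simplification: no counter-reset handling, and the raw delta is always
    scaled from the sampled span up to the full window.
    """
    if len(samples) < 2:
        return 0.0
    t_first, v_first = samples[0]
    t_last, v_last = samples[-1]
    raw_delta = v_last - v_first
    covered_seconds = t_last - t_first
    return raw_delta * window_seconds / covered_seconds


# Assumed scenario: 60 samples one minute apart (covering 59 minutes),
# during which the leader-change counter rises by 4.
samples = [(60 * i, 10 + (4 if i >= 30 else 0)) for i in range(60)]
value = simplified_increase(samples, 3600)
alert_fires = value > 3  # same comparison as the alert expression
```

Under these assumptions the result is about 4.068, which is consistent with four real leader changes in the last hour.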

The etcd logs showed the following warning, along with messages that looked like heartbeat timeouts:

2020-07-08 11:32:11.730958 W | rafthttp: the clock difference against peer db40725e6f94d8e3 is too high [13.717094955s > 1s] (prober "ROUND_TRIPPER_RAFT_MESSAGE")
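To find which peers are affected, warnings like the one above can be pulled out of the log with a regular expression. This is a minimal sketch: the pattern is written against the single sample line shown here and assumes the drift is printed in seconds; `parse_clock_drift` is a hypothetical helper name, not part of etcd.

```python
import re

# Matches the rafthttp clock-difference warning as it appears in the sample log line.
CLOCK_DRIFT_RE = re.compile(
    r"the clock difference against peer (?P<peer>[0-9a-f]+) is too high "
    r"\[(?P<drift>[\d.]+)s > (?P<limit>[\d.]+)s\]"
)


def parse_clock_drift(line):
    """Return (peer_id, drift_seconds, limit_seconds), or None if the line does not match."""
    match = CLOCK_DRIFT_RE.search(line)
    if match is None:
        return None
    return match.group("peer"), float(match.group("drift")), float(match.group("limit"))


sample = (
    "2020-07-08 11:32:11.730958 W | rafthttp: the clock difference against peer "
    "db40725e6f94d8e3 is too high [13.717094955s > 1s] "
    '(prober "ROUND_TRIPPER_RAFT_MESSAGE")'
)
result = parse_clock_drift(sample)
```

For the sample line, the peer is `db40725e6f94d8e3` and the measured drift is roughly 13.7 seconds against a 1 second limit.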

Solutions

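1. Some machines in the cluster have unsynchronized clocks (the log above shows a 13.7 s difference against a 1 s limit). Synchronize the time on every node.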

2. Increase the heartbeat interval and the election timeout (both values are in milliseconds):

- --election-timeout=5000
- --heartbeat-interval=500
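Before rolling out new timing values, it can help to sanity-check them. The sketch below uses a hypothetical helper, `check_etcd_timing`, and encodes two rules as I recall them from etcd's tuning guidance: the election timeout should be at least 5 times the heartbeat interval, and it should not exceed 50000 ms. Treat both limits as assumptions and verify them against the documentation for your etcd version.

```python
def check_etcd_timing(heartbeat_ms, election_ms):
    """Return a list of problems with the given timing values (empty if none found).

    The 5x ratio and the 50000 ms ceiling are assumptions based on etcd's
    tuning guidance; confirm them for the etcd version in use.
    """
    problems = []
    if heartbeat_ms <= 0 or election_ms <= 0:
        problems.append("both values must be positive")
        return problems
    if election_ms < 5 * heartbeat_ms:
        problems.append("election timeout should be at least 5x the heartbeat interval")
    if election_ms > 50000:
        problems.append("election timeout should not exceed 50000 ms")
    return problems


# Values from this post: --heartbeat-interval=500, --election-timeout=5000
post_values = check_etcd_timing(500, 5000)
too_tight = check_etcd_timing(500, 1000)
```

The values used here keep a 10x ratio, so the check finds no problems. Note the trade-off: a longer election timeout makes the cluster more tolerant of slow heartbeats, but it also takes longer to notice that a leader has really failed.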

 
