ovn-central raft HA (by quqi99)
Author: Zhang Hua  Posted: 2022-10-12
Copyright: This article may be reproduced freely, but please credit the original source and author with a hyperlink and keep this copyright notice.
What is Raft
Raft (https://raft.github.io/) is a consensus algorithm. A node can be in one of three states:
- Follower - the initial state when a node starts.
- Candidate - if there is no leader, a follower starts an election.
- Leader - if a candidate gets enough votes, it becomes the leader and all other nodes become followers.
Using Raft for ovn-central HA plays the same role that haproxy or pacemaker would (see https://bugs.launchpad.net/neutron/+bug/1969354/comments/3).
Set up ovn-central raft HA env
Quickly set up an ovn-central raft HA environment with three LXD containers. See also: https://bugzilla.redhat.com/show_bug.cgi?id=1929690#c9
cd ~ && lxc launch faster:ubuntu/focal v1
lxc launch faster:ubuntu/focal v2
lxc launch faster:ubuntu/focal v3
#the subnet is 192.168.121.0/24
lxc config device override v1 eth0 ipv4.address=192.168.121.2
lxc config device override v2 eth0 ipv4.address=192.168.121.3
lxc config device override v3 eth0 ipv4.address=192.168.121.4
lxc stop v1 && lxc start v1 && lxc stop v2 && lxc start v2 && lxc stop v3 && lxc start v3
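A quick optional sanity check before installing anything: confirm each container actually picked up its static address (-c ns4 selects the name, state, and IPv4 columns).
lxc list -c ns4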
#on v1
lxc exec `lxc list |grep v1 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS=" \\
--db-nb-addr=192.168.121.2 \\
--db-sb-addr=192.168.121.2 \\
--db-nb-cluster-local-addr=192.168.121.2 \\
--db-sb-cluster-local-addr=192.168.121.2 \\
--db-nb-create-insecure-remote=yes \\
--db-sb-create-insecure-remote=yes \\
--ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \\
--ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642"
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v1:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
6943
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 6943 (69432f05-2d37-44fd-8869-2ec365bb0b4c)
Address: tcp:192.168.121.2:6643
Status: cluster member
Role: leader
Term: 2
Leader: self
Vote: self
Election timer: 1000
Log: [2, 5]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: <-0000 <-0000
Servers:
6943 (6943 at tcp:192.168.121.2:6643) (self) next_index=4 match_index=4
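Optionally, before joining v2 and v3, confirm that the single-node NB DB already answers on its insecure remote (this assumes port 6641 was opened by --db-nb-create-insecure-remote=yes above):
ovn-nbctl --db=tcp:192.168.121.2:6641 show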
#on v2
lxc exec `lxc list |grep v2 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS=" \\
--db-nb-addr=192.168.121.3 \\
--db-sb-addr=192.168.121.3 \\
--db-nb-cluster-local-addr=192.168.121.3 \\
--db-sb-cluster-local-addr=192.168.121.3 \\
--db-nb-create-insecure-remote=yes \\
--db-sb-create-insecure-remote=yes \\
--ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \\
--ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 \\
--db-nb-cluster-remote-addr=192.168.121.2 \\
--db-sb-cluster-remote-addr=192.168.121.2"
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v2:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
158b
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 158b (158b0aea-ba5d-42e0-b69b-2fc05204f622)
Address: tcp:192.168.121.3:6643
Status: cluster member
Role: follower
Term: 2
Leader: 6943
Vote: unknown
Election timer: 1000
Log: [2, 7]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 <-6943
Servers:
6943 (6943 at tcp:192.168.121.2:6643)
158b (158b at tcp:192.168.121.3:6643) (self)
#on v3
lxc exec `lxc list |grep v3 |awk -F '|' '{print $2}'` bash
sudo apt install ovn-central -y
cat << EOF |tee /etc/default/ovn-central
OVN_CTL_OPTS=" \\
--db-nb-addr=192.168.121.4 \\
--db-sb-addr=192.168.121.4 \\
--db-nb-cluster-local-addr=192.168.121.4 \\
--db-sb-cluster-local-addr=192.168.121.4 \\
--db-nb-create-insecure-remote=yes \\
--db-sb-create-insecure-remote=yes \\
--ovn-northd-nb-db=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 \\
--ovn-northd-sb-db=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 \\
--db-nb-cluster-remote-addr=192.168.121.2 \\
--db-sb-cluster-remote-addr=192.168.121.2"
EOF
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
systemctl restart ovn-central
root@v3:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
298d
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 298d (298de33b-1b92-47c4-95aa-ebcf8e80f567)
Address: tcp:192.168.121.4:6643
Status: cluster member
Role: follower
Term: 2
Leader: 6943
Vote: unknown
Election timer: 1000
Log: [2, 8]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: ->0000 ->158b <-6943 <-158b
Servers:
6943 (6943 at tcp:192.168.121.2:6643)
158b (158b at tcp:192.168.121.3:6643)
298d (298d at tcp:192.168.121.4:6643) (self)
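A small sketch to compare the raft role of all three members from the LXD host (it assumes the v1/v2/v3 container names used above):
for n in v1 v2 v3; do
  echo "== $n =="
  lxc exec $n -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound |grep -E 'Role|Term|Leader'
done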
OVN_NB_DB=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641 ovn-nbctl show
OVN_SB_DB=tcp:192.168.121.2:6642,tcp:192.168.121.3:6642,tcp:192.168.121.4:6642 ovn-sbctl show
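As a quick smoke test, write a record through the clustered NB DB and read it back; test-ls is just an example name, and any of the three endpoints can serve the request:
OVN_NB_DB=tcp:192.168.121.2:6641,tcp:192.168.121.3:6641,tcp:192.168.121.4:6641
ovn-nbctl --db=$OVN_NB_DB ls-add test-ls
ovn-nbctl --db=$OVN_NB_DB ls-list
ovn-nbctl --db=$OVN_NB_DB ls-del test-ls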
Cluster Failover Testing
Do a failover test: stop container v1 (lxc stop v1). The following logs appear on v2 and v3; v2 has become the new leader, and the Term has changed from 2 to 3.
root@v2:~# tail -f /var/log/ovn/ovsdb-server-nb.log
2022-10-12T03:38:18.092Z|00085|raft|INFO|received leadership transfer from 6943 in term 2
2022-10-12T03:38:18.092Z|00086|raft|INFO|term 3: starting election
2022-10-12T03:38:18.095Z|00088|raft|INFO|term 3: elected leader by 2+ of 3 servers
root@v3:~# tail -f /var/log/ovn/ovsdb-server-nb.log
2022-10-12T03:38:18.095Z|00021|raft|INFO|server 158b is leader for term 3
root@v2:~# ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound
158b
Name: OVN_Northbound
Cluster ID: 51b9 (51b9f953-989f-4f90-9add-73dbabe3fe06)
Server ID: 158b (158b0aea-ba5d-42e0-b69b-2fc05204f622)
Address: tcp:192.168.121.3:6643
Status: cluster member
Role: leader
Term: 3
Leader: self
Vote: self
Election timer: 1000
Log: [2, 9]
Entries not yet committed: 0
Entries not yet applied: 0
Connections: (->0000) <-298d ->298d
Servers:
6943 (6943 at tcp:192.168.121.2:6643) next_index=9 match_index=0
158b (158b at tcp:192.168.121.3:6643) (self) next_index=8 match_index=8
298d (298d at tcp:192.168.121.4:6643) next_index=9 match_index=8
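To finish the failover test, start v1 again; it should rejoin as a follower in the new term (a quick check, run from the LXD host):
lxc start v1 && sleep 5
lxc exec v1 -- ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound |grep -E 'Role|Term|Leader'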
Set inactivity-probe for the raft port (6644)
ovn-sbctl list connection
#ovn-sbctl --no-leader-only list connection
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6642 pssl:6644 pssl:16642
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6642 pssl:6644 pssl:16642 punix:/var/run/ovn/ovnsb_db.sock
ovn-sbctl --inactivity-probe=30000 set-connection read-write role="ovn-controller" pssl:6642 read-write role="ovn-controller" pssl:6644 pssl:16642
Or use the following (it is equivalent to: ovn-nbctl --inactivity-probe=57 set-connection pssl:6642 pssl:6644):
#https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049743.html
#https://opendev.org/x/charm-ovn-central/commit/9dcd53bb75805ff733c8f10b99724ea16a2b5f25
ovn-sbctl -- --id=@connection create Connection target="pssl\\:6644" inactivity_probe=55 -- set SB_Global . connections=@connection
ovn-sbctl set connection . inactivity_probe=56
#the 'set SB_Global' above deletes all existing connections and creates a new one; the 'add SB_Global' below only adds one more
ovn-sbctl -- --id=@connection create Connection target="pssl\\:6648" -- add SB_Global . connections @connection
ovn-sbctl --inactivity-probe=30000 set-connection pssl:6648
NOTE (updated 2022-10-19): the above 'ovn-sbctl --inactivity-probe=30000 set-connection pssl:6648' also overwrites the connection configuration of every port other than 6648; this wiped a customer's other connections and led to an L1 case. The correct way to set it is one of the following two:
ovn-sbctl --inactivity-probe=60001 set-connection read-write role="ovn-controller" pssl:6644 pssl:6641 pssl:6642 pssl:16642
or
ovn-sbctl -- --id=@connection create Connection role=ovn-controller target="pssl\\:6644" inactivity_probe=6000 -- add SB_Global . connections @connection
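Whichever form is used, it is worth verifying the resulting rows afterwards (--no-leader-only allows querying a follower as well):
ovn-sbctl --no-leader-only list connection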
Use ovsdb-tool to set up the cluster
https://mail.openvswitch.org/pipermail/ovs-discuss/2020-February/049743.html
When first using ovsdb-tool to create the cluster with the commands below, it did not succeed: on v2, 'ovs-appctl -t /var/run/ovn/ovnnb_db.ctl cluster/status OVN_Northbound' showed 'Remotes for joining: tcp:192.168.121.4:6643 tcp:192.168.121.3:6643' and the node could not join the cluster. The reason is that when running 'join-cluster' on v2, v2's own address (192.168.121.3:6644) must come first, so for v2 the list should be "tcp:192.168.121.3:6644 tcp:192.168.121.2:6644 tcp:192.168.121.4:6644" rather than "tcp:192.168.121.2:6644 tcp:192.168.121.3:6644 tcp:192.168.121.4:6644".
#reset env in all nodes(v1, v2, v3)
systemctl stop ovn-central
rm -rf /var/lib/ovn/* && rm -rf /var/lib/ovn/.ovn*
rm -rf /etc/default/ovn-central
# on v1
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool create-cluster /var/lib/openvswitch/ovnsb_db.db /usr/share/ovn/ovn-sb.ovsschema tcp:192.168.121.2:6644
ovsdb-tool create-cluster /var/lib/openvswitch/ovnnb_db.db /usr/share/ovn/ovn-nb.ovsschema tcp:192.168.121.2:6643
# on v2
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool join-cluster /var/lib/openvswitch/ovnsb_db.db OVN_Southbound tcp:192.168.121.3:6644 tcp:192.168.121.2:6644 tcp:192.168.121.4:6644
ovsdb-tool join-cluster /var/lib/openvswitch/ovnnb_db.db OVN_Northbound tcp:192.168.121.3:6643 tcp:192.168.121.2:6643 tcp:192.168.121.4:6643
# on v3
rm -rf /var/lib/openvswitch/ovn*b_db.db
ovsdb-tool join-cluster /var/lib/openvswitch/ovnsb_db.db OVN_Southbound tcp:192.168.121.4:6644 tcp:192.168.121.2:6644 tcp:192.168.121.3:6644
ovsdb-tool join-cluster /var/lib/openvswitch/ovnnb_db.db OVN_Northbound tcp:192.168.121.4:6643 tcp:192.168.121.2:6643 tcp:192.168.121.3:6643
# then append the following content to /etc/default/ovn-central, and finally restart ovn-central
--db-nb-file=/var/lib/openvswitch/ovnnb_db.db --db-sb-file=/var/lib/openvswitch/ovnsb_db.db
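Before restarting ovn-central, the freshly created or joined DB files can be sanity-checked with ovsdb-tool (on a joining member the cluster ID may only become known after it has actually joined):
ovsdb-tool db-cid /var/lib/openvswitch/ovnnb_db.db
ovsdb-tool db-sid /var/lib/openvswitch/ovnnb_db.db
ovsdb-tool db-local-address /var/lib/openvswitch/ovnnb_db.db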
A big problem
Setting the inactivity-probe for raft port 6644 as above causes a big problem: giving it a value creates a Connection row for 6644, so when ovn-ovsdb-server-sb.service restarts it tries to listen on 6644 a second time and fails with the error '6644:10.5.3.254: bind: Address already in use'. The SB DB then stops working, and both 'neutron list' and 'nova list' hang.
It was finally recovered with the commands below: a) convert the SB DB from the raft format to a standalone DB; b) start ovsdb-server from the command line (starting via systemd would still start in cluster mode); c) delete the 6644 connection row.
ovsdb-tool cluster-to-standalone /var/lib/ovn/ovnsb_db.db_standalone /var/lib/ovn/ovnsb_db.db
cp /var/lib/ovn/ovnsb_db.db /var/lib/ovn/ovnsb_db.db_bk2
cp /var/lib/ovn/ovnsb_db.db_standalone /var/lib/ovn/ovnsb_db.db
ovsdb-server --remote=punix:/var/run/ovn/ovnsb_db.sock --pidfile=/var/run/ovn/ovnsb_db.pid --unixctl=/var/run/ovn/ovnsb_db.ctl --remote=db:OVN_Southbound,SB_Global,connections --private-key=/etc/ovn/key_host --certificate=/etc/ovn/cert_host --ca-cert=/etc/ovn/ovn-central.crt --ssl-protocols=db:OVN_Southbound,SSL,ssl_protocols --ssl-ciphers=db:OVN_Southbound,SSL,ssl_ciphers /var/lib/ovn/ovnsb_db.db > temp 2>&1
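For step c), a sketch of deleting the 6644 connection row; the UUID below is a placeholder, take the real _uuid from 'list connection':
ovn-sbctl --no-leader-only list connection
UUID=<_uuid of the pssl:6644 row>   # placeholder, not a real value
ovn-sbctl remove SB_Global . connections $UUID -- destroy connection $UUID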
The heartbeat on 6644 is somewhat different from 6641 and 6642: it is only the raft heartbeat between the three ovn-central nodes, and 5 seconds is normally enough (for raft ports the only things transmitted are consensus messages and DB diffs, which should not take more than 5 seconds). If it is made larger, it will take longer to notice that a member has issues, and you could end up with split brain or DB corruption, so changing the inactivity probe for raft ports is discouraged.
https://bugs.launchpad.net/ubuntu/+source/openvswitch/+bug/1990978
https://bugs.launchpad.net/openvswitch/+bug/1985062
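If the raft timing really does need tuning, a safer knob is the cluster election timer (cluster/change-election-timer must be run against the leader, and each call can at most double the current value):
ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/status OVN_Southbound |grep 'Election timer'
ovs-appctl -t /var/run/ovn/ovnsb_db.ctl cluster/change-election-timer OVN_Southbound 2000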
Raft heartbeats are one-way: when the inactivity-probe is disabled, the leader keeps sending heartbeats to the followers, and a follower that receives no heartbeat starts an election; but if the leader receives no replies it will not do anything until there is a quorum, so it keeps blindly sending packets and wasting CPU. To avoid this, the inactivity-probe was reintroduced, so that if the leader has not received a reply from a follower for a long time it drops the TCP connection to that follower.
Running the following iptables rules on all ovn-central nodes makes the connections below one-way:
iptables -A INPUT -i eth0 -p tcp --match multiport --dports 6641:6644 -j DROP
iptables -A OUTPUT -p tcp --match multiport --dports 6641:6644 -j DROP
ovn1:<random port> <------ ovn3:6641
ovn1:6641 ------> ovn3:<random port>
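After the test, the same rules can be removed with the matching -D form:
iptables -D INPUT -i eth0 -p tcp --match multiport --dports 6641:6644 -j DROP
iptables -D OUTPUT -p tcp --match multiport --dports 6641:6644 -j DROP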