k8s 1.26.x Binary High-Availability Deployment
Tags: kubernetes series
1: System Environment Initialization
1.1 System Environment
OS: AlmaLinux 8.7 x86_64
cat /etc/hosts
----
172.16.10.81 flyfish81
172.16.10.82 flyfish82
172.16.10.83 flyfish83
172.16.10.84 flyfish84
172.16.10.85 flyfish85
-----
This deployment uses the five AlmaLinux 8.7 x64 hosts listed above.
It continues the deployment from the previous article: https://blog.51cto.com/flyfish225/5988774
flyfish81 is deployed as the master.
flyfish82 and flyfish83 are already deployed as worker nodes.
flyfish84 and flyfish85 will be added as new worker nodes.
flyfish82 will also be promoted to a standby master node.
1.2 System Initialization on flyfish84 and flyfish85
# Install dependency packages
yum -y install wget jq psmisc vim net-tools nfs-utils telnet yum-utils device-mapper-persistent-data lvm2 git network-scripts tar curl
# Disable the firewall and SELinux
systemctl disable --now firewalld
setenforce 0
sed -i s#SELINUX=enforcing#SELINUX=disabled#g /etc/selinux/config
# Disable swap
sed -ri s/.*swap.*/#&/ /etc/fstab
swapoff -a && sysctl -w vm.swappiness=0
cat /etc/fstab
# /dev/mapper/centos-swap swap swap defaults 0 0
#
# Raise the system file-handle limits
ulimit -SHn 65535
cat >> /etc/security/limits.conf <<EOF
* soft nofile 655360
* hard nofile 655360
* soft nproc 655350
* hard nproc 655350
* soft memlock unlimited
* hard memlock unlimited
EOF
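The new limits apply to fresh login sessions; a quick sanity check after logging in again:
# soft and hard open-file limits should report the values configured above
ulimit -Sn
ulimit -Hn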
# Set up passwordless SSH between all nodes
yum install -y sshpass
ssh-keygen -f /root/.ssh/id_rsa -P ""
export IP="172.16.10.81 172.16.10.82 172.16.10.83 172.16.10.84 172.16.10.85"
export SSHPASS=flyfish225
for HOST in $IP;do
sshpass -e ssh-copy-id -o StrictHostKeyChecking=no $HOST
done
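To confirm passwordless login actually works, a small verification loop can reuse the $IP list above (BatchMode makes ssh fail instead of prompting if key auth is broken):
for HOST in $IP;do
ssh -o BatchMode=yes root@$HOST hostname
done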
# Upgrade the kernel
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
yum install https://www.elrepo.org/elrepo-release-8.el8.elrepo.noarch.rpm
Switch the repo to the Aliyun mirror:
mv /etc/yum.repos.d/elrepo.repo /etc/yum.repos.d/elrepo.repo.bak
vim /etc/yum.repos.d/elrepo.repo
----
[elrepo-kernel]
name=elrepoyum
baseurl=https://mirrors.aliyun.com/elrepo/kernel/el8/x86_64/
enabled=1
gpgcheck=0
----
yum --enablerepo=elrepo-kernel install kernel-ml -y
# Boot the kernel at index 0 (the newly installed kernel-ml)
grub2-set-default 0
# Regenerate the grub config and reboot
grub2-mkconfig -o /boot/grub2/grub.cfg
reboot
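After the reboot, it is worth confirming the machine actually booted the new kernel:
# both should report the freshly installed kernel-ml version
uname -r
grubby --default-kernel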
# Enable IPVS
yum install ipvsadm ipset sysstat conntrack libseccomp -y
mkdir -p /etc/modules-load.d/
cat >> /etc/modules-load.d/ipvs.conf <<EOF
ip_vs
ip_vs_rr
ip_vs_wrr
ip_vs_sh
nf_conntrack
ip_tables
ip_set
xt_set
ipt_set
ipt_rpfilter
ipt_REJECT
ipip
EOF
systemctl restart systemd-modules-load.service
lsmod | grep -e ip_vs -e nf_conntrack
ip_vs_sh 16384 0
ip_vs_wrr 16384 0
ip_vs_rr 16384 0
ip_vs 180224 6 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_conntrack 176128 1 ip_vs
nf_defrag_ipv6 24576 2 nf_conntrack,ip_vs
nf_defrag_ipv4 16384 1 nf_conntrack
libcrc32c 16384 3 nf_conntrack,xfs,ip_vs
# Tune kernel parameters
cat <<EOF > /etc/sysctl.d/k8s.conf
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-iptables = 1
vm.overcommit_memory = 1
vm.panic_on_oom = 0
fs.inotify.max_user_watches = 89100
fs.file-max = 52706963
fs.nr_open = 52706963
net.netfilter.nf_conntrack_max = 2310720
net.ipv4.tcp_keepalive_time = 600
net.ipv4.tcp_keepalive_probes = 3
net.ipv4.tcp_keepalive_intvl = 15
net.ipv4.tcp_max_tw_buckets = 36000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_max_orphans = 327680
net.ipv4.tcp_orphan_retries = 3
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_max_syn_backlog = 16384
net.ipv4.tcp_timestamps = 0
net.core.somaxconn = 16384
net.ipv6.conf.all.disable_ipv6 = 0
net.ipv6.conf.default.disable_ipv6 = 0
net.ipv6.conf.lo.disable_ipv6 = 0
net.ipv6.conf.all.forwarding = 1
EOF
modprobe br_netfilter
lsmod |grep conntrack
modprobe nf_conntrack
sysctl -p /etc/sysctl.d/k8s.conf
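A quick spot-check that the critical parameters took effect (sysctl accepts several keys at once):
sysctl net.ipv4.ip_forward net.bridge.bridge-nf-call-iptables net.netfilter.nf_conntrack_max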
# Install Docker (binary tarball)
# Unpack
tar xf docker-*.tgz
# Copy the binaries into place
cp docker/* /usr/bin/
# Create the containerd service unit and start it
cat >/etc/systemd/system/containerd.service <<EOF
[Unit]
Description=containerd container runtime
Documentation=https://containerd.io
After=network.target local-fs.target
[Service]
ExecStartPre=-/sbin/modprobe overlay
ExecStart=/usr/bin/containerd
Type=notify
Delegate=yes
KillMode=process
Restart=always
RestartSec=5
LimitNPROC=infinity
LimitCORE=infinity
LimitNOFILE=1048576
TasksMax=infinity
OOMScoreAdjust=-999
[Install]
WantedBy=multi-user.target
EOF
systemctl enable --now containerd.service
# Create the docker service unit (the 'EOF' delimiter is quoted so $MAINPID is written literally)
cat > /etc/systemd/system/docker.service <<'EOF'
[Unit]
Description=Docker Application Container Engine
Documentation=https://docs.docker.com
After=network-online.target firewalld.service containerd.service
Wants=network-online.target
Requires=docker.socket containerd.service
[Service]
Type=notify
ExecStart=/usr/bin/dockerd -H fd:// --containerd=/run/containerd/containerd.sock
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
OOMScoreAdjust=-500
[Install]
WantedBy=multi-user.target
EOF
# Create the docker socket unit
cat > /etc/systemd/system/docker.socket <<EOF
[Unit]
Description=Docker Socket for the API
[Socket]
ListenStream=/var/run/docker.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
# Create the docker group
groupadd docker
# Start docker
systemctl enable --now docker.socket && systemctl enable --now docker.service
# Verify
docker info
cat >/etc/docker/daemon.json <<EOF
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": [
    "https://docker.mirrors.ustc.edu.cn",
    "http://hub-mirror.c.163.com"
  ],
  "max-concurrent-downloads": 10,
  "log-driver": "json-file",
  "log-level": "warn",
  "log-opts": {
    "max-size": "10m",
    "max-file": "3"
  },
  "data-root": "/var/lib/docker"
}
EOF
systemctl restart docker
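To confirm the daemon actually picked up daemon.json, check the cgroup driver and data root, which should match the file above:
docker info 2>/dev/null | grep -iE "cgroup driver|docker root"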
Install cri-dockerd
# Kubernetes 1.24 and later dropped dockershim, so cri-dockerd is needed to keep using Docker as the runtime
# Download cri-dockerd
# wget https://ghproxy.com/https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.0/cri-dockerd-0.3.0.amd64.tgz
# Unpack cri-dockerd
tar -zxvf cri-dockerd-0.3.0.amd64.tgz
cp cri-dockerd/cri-dockerd /usr/bin/
chmod +x /usr/bin/cri-dockerd
# Write the cri-docker service unit (the 'EOF' delimiter is quoted so $MAINPID is written literally)
cat > /usr/lib/systemd/system/cri-docker.service <<'EOF'
[Unit]
Description=CRI Interface for Docker Application Container Engine
Documentation=https://docs.mirantis.com
After=network-online.target firewalld.service docker.service
Wants=network-online.target
Requires=cri-docker.socket
[Service]
Type=notify
ExecStart=/usr/bin/cri-dockerd --network-plugin=cni --pod-infra-container-image=registry.aliyuncs.com/google_containers/pause:3.7
ExecReload=/bin/kill -s HUP $MAINPID
TimeoutSec=0
RestartSec=2
Restart=always
StartLimitBurst=3
StartLimitInterval=60s
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
TasksMax=infinity
Delegate=yes
KillMode=process
[Install]
WantedBy=multi-user.target
EOF
# Write the cri-docker socket unit
cat > /usr/lib/systemd/system/cri-docker.socket <<EOF
[Unit]
Description=CRI Docker Socket for the API
PartOf=cri-docker.service
[Socket]
ListenStream=%t/cri-dockerd.sock
SocketMode=0660
SocketUser=root
SocketGroup=docker
[Install]
WantedBy=sockets.target
EOF
# Start cri-docker
systemctl daemon-reload ; systemctl enable cri-docker --now
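A quick check that the CRI endpoint is answering; the crictl call assumes crictl (from the separate cri-tools project) is installed:
systemctl is-active cri-docker.socket cri-docker.service
crictl --runtime-endpoint unix:///run/cri-dockerd.sock version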
2: Adding Worker Nodes flyfish84 and flyfish85
2.1 Syncing Files to the New Nodes
1. Copy the deployed worker-node files to the new nodes
On the master, copy the worker-node files to the new nodes 172.16.10.84/85:
scp -r /opt/kubernetes root@172.16.10.84:/opt/
scp /opt/kubernetes/ssl/ca.pem root@172.16.10.84:/opt/kubernetes/ssl
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@172.16.10.84:/usr/lib/systemd/system
scp -r /opt/kubernetes root@172.16.10.85:/opt/
scp /opt/kubernetes/ssl/ca.pem root@172.16.10.85:/opt/kubernetes/ssl
scp -r /usr/lib/systemd/system/{kubelet,kube-proxy}.service root@172.16.10.85:/usr/lib/systemd/system
Delete the old kubelet certificate and kubeconfig files:
rm -rf /opt/kubernetes/cfg/kubelet.kubeconfig
rm -rf /opt/kubernetes/ssl/kubelet*
rm -rf /opt/kubernetes/logs/*
Note: these files are generated automatically after the certificate request is approved; they differ on every node and must be deleted.
Set the hostname override to the node's own hostname.
flyfish84:
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=flyfish84
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: flyfish84
Set the hostname override to the node's own hostname.
flyfish85:
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=flyfish85
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: flyfish85
Start the services and enable them at boot:
systemctl daemon-reload
systemctl start kubelet kube-proxy
systemctl enable kubelet kube-proxy
On the master, approve the new nodes' kubelet certificate requests:
kubectl get csr
# Approve the requests
kubectl certificate approve node-csr-L4Ka9Ku3_M0JDVuSi331b2Jb729vvxHaO4Vjd-XUuLo
kubectl certificate approve node-csr-yzuzQ6tj-rSqY5jzGtXgP1JuAMqTGHxhHFEO3Zgc_Hc
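The CSR names differ in every environment; to approve whatever is still pending in one pass:
kubectl get csr | grep Pending | awk '{print $1}' | xargs -r kubectl certificate approve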
kubectl get pod -n kube-system
kubectl get node
3: k8s 1.26.x High Availability
3.1 Adding flyfish82 as a Second Master
High availability for the kubernetes master nodes.
Deploy a second master on flyfish82 (172.16.10.82), running the same services as flyfish81:
scp -r /root/TLS/ root@flyfish82:/root/
scp -r /opt/kubernetes/ root@172.16.10.82:/opt/
scp /usr/bin/kubectl root@172.16.10.82:/usr/bin/
scp /usr/lib/systemd/system/kube-* root@172.16.10.82:/usr/lib/systemd/system/
Edit the flyfish82 config:
cd /opt/kubernetes/cfg
vim kube-apiserver.conf
---
--bind-address=172.16.10.82
--advertise-address=172.16.10.82
Drain and remove the old flyfish82 worker from the cluster:
kubectl cordon flyfish82
kubectl drain flyfish82 --ignore-daemonsets --delete-emptydir-data
kubectl delete node flyfish82
Change flyfish82's node identity and rejoin the cluster:
rm -rf /opt/kubernetes/cfg/kubelet.kubeconfig
rm -rf /opt/kubernetes/ssl/kubelet*
rm -rf /opt/kubernetes/logs/*
Set the hostname override:
flyfish82:
vi /opt/kubernetes/cfg/kubelet.conf
--hostname-override=flyfish82
vi /opt/kubernetes/cfg/kube-proxy-config.yml
hostnameOverride: flyfish82
Start the services and enable them at boot:
systemctl daemon-reload
systemctl start kubelet kube-proxy
systemctl enable kubelet kube-proxy
kubectl certificate approve node-csr-stxnPCqzzIEMfnsW6S467m3KxRvfBe_ur-vCWD5gzLw
Start the master services and enable them at boot:
systemctl start kube-apiserver kube-controller-manager kube-scheduler
systemctl enable kube-apiserver kube-controller-manager kube-scheduler
cd /root/TLS/k8s
Generate the kubeconfig file:
mkdir /root/.kube
KUBE_CONFIG="/root/.kube/config"
KUBE_APISERVER="https://172.16.10.82:6443"
kubectl config set-cluster kubernetes \
  --certificate-authority=/opt/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=$KUBE_APISERVER \
  --kubeconfig=$KUBE_CONFIG
kubectl config set-credentials cluster-admin \
  --client-certificate=./admin.pem \
  --client-key=./admin-key.pem \
  --embed-certs=true \
  --kubeconfig=$KUBE_CONFIG
kubectl config set-context default \
  --cluster=kubernetes \
  --user=cluster-admin \
  --kubeconfig=$KUBE_CONFIG
kubectl config use-context default --kubeconfig=$KUBE_CONFIG
kubectl get cs
kubectl get node
kubectl top node
3.2 Deploying Nginx in Front of the API Servers
Install nginx on the flyfish84 node.
nginx server address: flyfish84 (172.16.10.84)
Compile and install nginx.
Unpack the source:
tar -zxvf nginx-1.23.2.tar.gz
cd nginx-1.23.2/
./configure \
--prefix=/usr/local/nginx \
--http-proxy-temp-path=/usr/local/nginx/proxy_temp \
--http-fastcgi-temp-path=/usr/local/nginx/fastcgi_temp \
--with-http_ssl_module \
--with-threads \
--with-file-aio \
--with-http_realip_module \
--with-http_gzip_static_module \
--with-http_secure_link_module \
--with-http_stub_status_module \
--with-http_auth_request_module \
--with-http_random_index_module \
--with-http_image_filter_module \
--with-stream
make && make install
cd /usr/local/nginx/conf
cp -p nginx.conf nginx.conf.bak
vim nginx.conf
Add:
---
stream {
    log_format main "$remote_addr $upstream_addr $time_local $status";
    access_log /var/log/nginx/k8s-access.log main;
    upstream k8s-apiserver {
        server 172.16.10.81:6443;
        server 172.16.10.82:6443;
    }
    server {
        listen 172.16.10.84:6443;
        proxy_pass k8s-apiserver;
    }
}
---
mkdir -p /var/log/nginx/
sbin/nginx -t
sbin/nginx
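Before repointing any nodes, the pass-through can be verified directly against the load balancer; the apiserver should answer through the forwarded port:
curl -k https://172.16.10.84:6443/version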
Point all worker nodes at the load balancer instead of the master.
Log in to 172.16.10.81:
cd /opt/kubernetes/cfg/
vim bootstrap.kubeconfig
---
change server: https://172.16.10.81:6443 to:
server: https://172.16.10.84:6443
---
vim kubelet.kubeconfig
----
change server: https://172.16.10.81:6443 to:
server: https://172.16.10.84:6443
----
vim kube-proxy.kubeconfig
----
change server: https://172.16.10.81:6443 to:
server: https://172.16.10.84:6443
----
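The same change can be applied to all three files in one pass with sed instead of editing them by hand:
cd /opt/kubernetes/cfg/
sed -i 's#https://172.16.10.81:6443#https://172.16.10.84:6443#' bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig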
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish82:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish83:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish84:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish85:/opt/kubernetes/cfg/
Restart kubelet and kube-proxy on every node:
systemctl restart kubelet kube-proxy
Test with the master's kubeconfig:
kubectl get nodes
Check the nginx access log.
Log in to the flyfish84 node:
cd /var/log/nginx/
tail -f k8s-access.log
3.3 Deploying the Second Nginx Load Balancer
Install nginx on flyfish85 exactly as on flyfish84.
nginx server address: flyfish85 (172.16.10.85)
Compile and install nginx.
Unpack the source:
tar -zxvf nginx-1.23.2.tar.gz
cd nginx-1.23.2/
./configure \
--prefix=/usr/local/nginx \
--http-proxy-temp-path=/usr/local/nginx/proxy_temp \
--http-fastcgi-temp-path=/usr/local/nginx/fastcgi_temp \
--with-http_ssl_module \
--with-threads \
--with-file-aio \
--with-http_realip_module \
--with-http_gzip_static_module \
--with-http_secure_link_module \
--with-http_stub_status_module \
--with-http_auth_request_module \
--with-http_random_index_module \
--with-http_image_filter_module \
--with-stream
make && make install
cd /usr/local/nginx/conf
cp -p nginx.conf nginx.conf.bak
vim nginx.conf
Add:
---
stream {
    log_format main "$remote_addr $upstream_addr $time_local $status";
    access_log /var/log/nginx/k8s-access.log main;
    upstream k8s-apiserver {
        server 172.16.10.81:6443;
        server 172.16.10.82:6443;
    }
    server {
        listen 172.16.10.85:6443;
        proxy_pass k8s-apiserver;
    }
}
---
mkdir -p /var/log/nginx/
sbin/nginx -t
sbin/nginx
3.4 Keepalived for the Nginx Load Balancers
Deploy keepalived alongside nginx on flyfish84 and flyfish85:
yum install epel-release -y
yum install keepalived -y
keepalived configuration (Nginx master, flyfish84 host):
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_MASTER
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state MASTER
    interface ens160
    virtual_router_id 51   # VRRP router ID; unique per instance
    priority 100           # priority; set to 90 on the backup server
    advert_int 1           # VRRP advertisement interval, default 1 second
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    # virtual IP
    virtual_ipaddress {
        172.16.10.200/24
    }
    track_script {
        check_nginx
    }
}
EOF
vrrp_script: the nginx health-check script (keepalived uses its result to decide whether to fail over)
virtual_ipaddress: the virtual IP (VIP)
The nginx health-check script:
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ps -ef |grep nginx |egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
exit 1
else
exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
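With nginx running, the script should exit 0; a quick test:
/etc/keepalived/check_nginx.sh; echo $?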
keepalived configuration (Nginx backup, flyfish85 host):
cat > /etc/keepalived/keepalived.conf << EOF
global_defs {
   notification_email {
     acassen@firewall.loc
     failover@firewall.loc
     sysadmin@firewall.loc
   }
   notification_email_from Alexandre.Cassen@firewall.loc
   smtp_server 127.0.0.1
   smtp_connect_timeout 30
   router_id NGINX_BACKUP
}
vrrp_script check_nginx {
    script "/etc/keepalived/check_nginx.sh"
}
vrrp_instance VI_1 {
    state BACKUP
    interface ens160
    virtual_router_id 51   # VRRP router ID; must match the master
    priority 90
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass 1111
    }
    virtual_ipaddress {
        172.16.10.200/24
    }
    track_script {
        check_nginx
    }
}
EOF
The same nginx health-check script as above:
cat > /etc/keepalived/check_nginx.sh << "EOF"
#!/bin/bash
count=$(ps -ef |grep nginx |egrep -cv "grep|$$")
if [ "$count" -eq 0 ];then
exit 1
else
exit 0
fi
EOF
chmod +x /etc/keepalived/check_nginx.sh
Note: keepalived decides whether to fail over based on the script's exit code (0 = healthy, non-zero = failed).
Start and enable at boot:
systemctl daemon-reload
systemctl start keepalived
systemctl enable keepalived
Check the keepalived state:
ip addr
flyfish84 should hold the virtual IP 172.16.10.200,
while flyfish85 should not:
ip addr
3.5 Testing Nginx + Keepalived Failover
Stop nginx on the master LB node and check whether the VIP floats to the backup.
Kill nginx on flyfish84:
pkill nginx
Check whether the floating IP has moved to flyfish85:
it should now be on flyfish85.
Once nginx on flyfish84 is brought back up, the VIP floats back:
cd /usr/local/nginx/
sbin/nginx
ip addr
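During the failover test it helps to watch the VIP live on both LB nodes (ens160 is the interface named in the keepalived config):
watch -n1 "ip addr show ens160 | grep 172.16.10.200"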
Change the listen address on flyfish84/flyfish85 to 0.0.0.0:
cd /usr/local/nginx/conf
vim nginx.conf
----
events {
    worker_connections 1024;
}
stream {
    log_format main "$remote_addr $upstream_addr $time_local $status";
    access_log /var/log/nginx/k8s-access.log main;
    upstream k8s-apiserver {
        server 172.16.10.81:6443;
        server 172.16.10.82:6443;
    }
    server {
        listen 0.0.0.0:6443;
        proxy_pass k8s-apiserver;
    }
}
----
Then restart nginx:
sbin/nginx -s stop
sbin/nginx
Verify the VIP answers:
curl -k https://172.16.10.200:6443/version
3.6 Pointing All Worker Nodes at the LB VIP
Although a second master and a pair of load balancers are now in place, the cluster was scaled out from a single-master architecture, and all node components still connect through the single nginx on flyfish84. Unless they are switched to the VIP behind the keepalived pair, that address remains a single point of failure.
The next step is therefore to change every node component's server address from 172.16.10.84 to the VIP 172.16.10.200.
Edit the configs on each node.
Log in to 172.16.10.81:
cd /opt/kubernetes/cfg/
vim bootstrap.kubeconfig
---
change server: https://172.16.10.84:6443 to:
server: https://172.16.10.200:6443
---
vim kubelet.kubeconfig
----
change server: https://172.16.10.84:6443 to:
server: https://172.16.10.200:6443
----
vim kube-proxy.kubeconfig
----
change server: https://172.16.10.84:6443 to:
server: https://172.16.10.200:6443
----
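As in section 3.2, sed can apply the change to all three files at once:
cd /opt/kubernetes/cfg/
sed -i 's#https://172.16.10.84:6443#https://172.16.10.200:6443#' bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig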
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish82:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish83:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish84:/opt/kubernetes/cfg/
scp bootstrap.kubeconfig kubelet.kubeconfig kube-proxy.kubeconfig root@flyfish85:/opt/kubernetes/cfg/
Restart kubelet and kube-proxy on every node:
systemctl restart kubelet kube-proxy
Verify on the flyfish81 and flyfish82 hosts:
kubectl get node
kubectl get pod -n kube-system