Installing a Kubernetes High-Availability Cluster on CentOS 7.9 (Three Masters, Three Workers)
The server plan is shown in the table below (all machines have 4 CPU cores and 4 GB of RAM):
After preparing the servers per the table, upgrade the OS kernel on every node from 3.10 to 5.4+ (required by HAProxy, Keepalived, Ceph storage and other components). The steps are as follows:
# Import the GPG key of the ELRepo repository used for the kernel upgrade
rpm --import https://www.elrepo.org/RPM-GPG-KEY-elrepo.org
# Enable the ELRepo yum repository
rpm -Uvh http://www.elrepo.org/elrepo-release-7.0-2.el7.elrepo.noarch.rpm
# List the kernel versions currently available for upgrade
yum --disablerepo="*" --enablerepo="elrepo-kernel" list available
# Install the long-term support kernel (5.4.231-1.el7.elrepo in this example)
yum --enablerepo=elrepo-kernel install kernel-lt -y
# Make GRUB boot the newly installed kernel by default (back up first, then edit)
cp /etc/default/grub /etc/default/grub.bak
vi /etc/default/grub
The original content of /etc/default/grub is as follows:
GRUB_TIMEOUT=5
GRUB_DISTRIBUTOR="$(sed 's, release .*$,,g' /etc/system-release)"
GRUB_DEFAULT=saved
GRUB_DISABLE_SUBMENU=true
GRUB_TERMINAL_OUTPUT="console"
GRUB_CMDLINE_LINUX="crashkernel=auto rd.lvm.lv=centos_jumpserver/root rd.lvm.lv=centos_jumpserver/swap rhgb quiet"
GRUB_DISABLE_RECOVERY="true"
Change GRUB_DEFAULT=saved to GRUB_DEFAULT=0, save and exit, then run the following commands:
# Regenerate the boot configuration; this command reads /etc/default/grub
grub2-mkconfig -o /boot/grub2/grub.cfg
# Reboot the system
reboot
重启完成后,查看一下当前的操作系统及内核版本:
[root@master1 ~]# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)
[root@master1 ~]# uname -rs
Linux 5.4.231-1.el7.elrepo.x86_64
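Before moving on, it can be worth confirming that GRUB will keep booting the new kernel on later reboots as well; a quick check, assuming the grubby tool shipped with CentOS 7 is installed:
# Print the kernel GRUB will boot by default; it should point at the 5.4 kernel
grubby --default-kernel
# Expected output similar to: /boot/vmlinuz-5.4.231-1.el7.elrepo.x86_64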
Next, repoint the CentOS-Base.repo yum repository at the Aliyun mirror to speed up downloading the components that follow:
# Install wget first
yum install -y wget
# Back up CentOS-Base.repo
mv /etc/yum.repos.d/CentOS-Base.repo /etc/yum.repos.d/CentOS-Base.repo.bak
# Recreate CentOS-Base.repo from the Aliyun mirror
wget -O /etc/yum.repos.d/CentOS-Base.repo https://mirrors.aliyun.com/repo/Centos-7.repo
# Rebuild the yum cache
yum makecache
Disable the firewall on all servers. When the exact ports required by some components (for example the Dashboard) are not fully known, it is simpler to turn the firewall off; in production it is better to keep the firewall on and open only the ports documented by K8S and the related components:
# Stop the firewall immediately
systemctl stop firewalld
# Disable the firewall permanently so it does not come back after a reboot
systemctl disable firewalld
Configure /etc/hosts on all servers to map hostnames to IPs:
cat >> /etc/hosts << EOF
192.168.17.3 master1
192.168.17.4 master2
192.168.17.5 master3
192.168.17.11 node1
192.168.17.12 node2
192.168.17.13 node3
192.168.17.200 lb # VIP (virtual IP) managed by Keepalived later; for a non-HA cluster this can simply be master1's IP
EOF
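To make sure the mappings work on every node, a small check loop can be run from master1 (an illustrative sketch using the hostnames defined above):
# Ping each hostname once to confirm it resolves via /etc/hosts
for h in master1 master2 master3 node1 node2 node3; do
  ping -c 1 -W 1 $h > /dev/null && echo "$h OK" || echo "$h FAILED"
done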
Set up time synchronization on all servers (K8S relies on a consistent clock when evaluating cluster state, e.g. node liveness):
# Install the time-sync tool
yum install -y ntpdate
# Sync the current time on every server
ntpdate time.windows.com
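ntpdate only performs a one-off sync; to keep the clocks aligned, a cron entry can be added on every node (a sketch, assuming ntpdate is installed at /usr/sbin/ntpdate):
# Re-sync the clock every 30 minutes
(crontab -l 2>/dev/null; echo '*/30 * * * * /usr/sbin/ntpdate time.windows.com >/dev/null 2>&1') | crontab -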
Disable SELinux (K8S support for running under SELinux is still not very mature):
# Disable SELinux temporarily
setenforce 0
# Disable SELinux permanently (set it to permissive)
sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
Turn off swap (swap can defeat the K8S memory QoS policies and interfere with Pod eviction under memory pressure):
# Turn off swap temporarily
swapoff -a
# Turn off swap permanently
sed -ri 's/.*swap.*/#&/' /etc/fstab
On every node, configure bridged IPv4 traffic to be passed to the iptables chains (kube-proxy, which implements the K8S Service mechanism, relies on iptables to forward traffic to the target Pods):
# Create /etc/sysctl.d/k8s.conf
cat > /etc/sysctl.d/k8s.conf << EOF
net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
vm.swappiness = 0
EOF
# Create /etc/modules-load.d/k8s.conf
cat > /etc/modules-load.d/k8s.conf << EOF
br_netfilter
overlay
EOF
# Apply the sysctl settings
sysctl --system
# Load the br_netfilter module (makes iptables rules apply to traffic crossing Linux bridges)
# Without br_netfilter, Pod-to-Pod traffic across nodes still works, but Pods on the same node may fail to reach each other through a Service
modprobe br_netfilter
# Load the overlay network module
modprobe overlay
# Verify that the bridge-filter and overlay modules are loaded
lsmod | grep -e br_netfilter -e overlay
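The sysctl values can be double-checked at the same time (a quick verification; all three should print 1):
sysctl net.bridge.bridge-nf-call-iptables net.bridge.bridge-nf-call-ip6tables net.ipv4.ip_forward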
The K8S Service proxy has two modes, iptables and ipvs; ipvs performs better but requires the ipvs kernel modules to be loaded manually:
# Install ipset and ipvsadm
yum install -y ipset ipvsadm
# Write the modules to load into a script (on newer kernels nf_conntrack_ipv4 has been replaced by nf_conntrack)
cat > /etc/sysconfig/modules/ipvs.modules << EOF
#!/bin/bash
modprobe -- ip_vs
modprobe -- ip_vs_rr
modprobe -- ip_vs_wrr
modprobe -- ip_vs_sh
modprobe -- nf_conntrack
EOF
# Make the script executable
chmod +x /etc/sysconfig/modules/ipvs.modules
# Run the script
/bin/bash /etc/sysconfig/modules/ipvs.modules
# Verify that the modules listed in the script loaded successfully
lsmod | grep -e ip_vs -e nf_conntrack
On master1, generate an RSA key pair and copy the public key to every node (this enables password-free SSH from master1, which is handy for distributing identical configuration files and for managing the other servers):
# Generate the key pair; press Enter at every prompt
ssh-keygen -t rsa
# Copy the public key to each node (answer yes to every (yes/no) prompt and enter the root password of each server)
for i in master1 master2 master3 node1 node2 node3;do ssh-copy-id -i .ssh/id_rsa.pub $i;done
Are you sure you want to continue connecting (yes/no)? yes
/usr/bin/ssh-copy-id: INFO: attempting to log in with the new key(s), to filter out any that are already installed
/usr/bin/ssh-copy-id: INFO: 1 key(s) remain to be installed -- if you are prompted now it is to install the new keys
root@node3's password:
Number of key(s) added: 1
Now try logging into the machine, with: "ssh 'node3'"
and check to make sure that only the key(s) you wanted were added.
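Password-free login can then be verified from master1 with a short loop (an illustrative check; BatchMode makes ssh fail instead of prompting if a key is missing):
# Each line should print the remote hostname without asking for a password
for i in master1 master2 master3 node1 node2 node3; do ssh -o BatchMode=yes $i hostname; done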
Configure resource limits on all nodes:
# Temporarily raise the limits on all nodes
ulimit -SHn 65536
# Permanently modify the limits, first on master1
vi /etc/security/limits.conf
# Append the following lines at the end
* soft nofile 65536
* hard nofile 65536
* soft nproc 4096
* hard nproc 4096
* soft memlock unlimited
* hard memlock unlimited
Then copy /etc/security/limits.conf from master1 to the other nodes:
scp /etc/security/limits.conf root@master2:/etc/security/limits.conf
scp /etc/security/limits.conf root@master3:/etc/security/limits.conf
scp /etc/security/limits.conf root@node3:/etc/security/limits.conf
scp /etc/security/limits.conf root@node2:/etc/security/limits.conf
scp /etc/security/limits.conf root@node1:/etc/security/limits.conf
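To confirm the new limits are picked up (limits.conf applies to new login sessions via PAM), a quick check from master1 might look like this (an illustrative sketch):
# Open a fresh SSH session on each node and print its open-file limit; expect 65536
for i in master1 master2 master3 node1 node2 node3; do echo -n "$i: "; ssh $i "ulimit -n"; done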
Install Docker on all nodes. (Since K8S v1.24 a CRI-compliant runtime such as containerd or CRI-O is recommended; dockershim, which allowed Docker to act as the runtime, was removed from K8S in that release. Sticking with Docker then requires the cri-dockerd adapter, which performs worse than containerd or CRI-O.)
# Remove any old Docker components
yum remove -y docker*
yum remove -y containerd*
# Download the docker-ce yum repository definition
wget https://mirrors.aliyun.com/docker-ce/linux/centos/docker-ce.repo -O /etc/yum.repos.d/docker-ce.repo
# Install a specific Docker version (it must satisfy the requirements of the target K8S version)
yum install -y docker-ce-19.03.15 docker-ce-cli-19.03.15 containerd.io
# Configure Docker (USTC registry mirror, systemd cgroup driver, log rotation)
mkdir -p /etc/docker
sudo tee /etc/docker/daemon.json <<-'EOF'
{
  "exec-opts": ["native.cgroupdriver=systemd"],
  "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"],
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "500m",
    "max-file": "3"
  }
}
EOF
# Start Docker now and enable it at boot
systemctl enable docker --now
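After installation it is worth confirming that Docker runs with the intended cgroup driver and registry mirror; a quick verification sketch:
docker --version
# Should report: Cgroup Driver: systemd
docker info | grep -i "cgroup driver"
# Should list https://docker.mirrors.ustc.edu.cn
docker info | grep -A1 "Registry Mirrors"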
Point the yum repository for the K8S components at a domestic mirror (the official one is too slow; here we use Aliyun):
cat > /etc/yum.repos.d/kubernetes.repo << EOF
[kubernetes]
name=Kubernetes
baseurl=https://mirrors.aliyun.com/kubernetes/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=0
repo_gpgcheck=0
gpgkey=https://mirrors.aliyun.com/kubernetes/yum/doc/yum-key.gpg https://mirrors.aliyun.com/kubernetes/yum/doc/rpm-package-key.gpg
EOF
Install kubeadm, kubectl and kubelet on master1, master2 and master3 (avoid a version that is too new or too old, otherwise many tools built on top of K8S, such as KubeSphere or Rook, will not have a matching release):
yum install -y kubeadm-1.20.2 kubectl-1.20.2 kubelet-1.20.2 --disableexcludes=kubernetes
Install kubeadm and kubelet on node1, node2 and node3:
yum install -y kubeadm-1.20.2 kubelet-1.20.2 --disableexcludes=kubernetes
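A quick version check on any node confirms that the expected 1.20.2 binaries were installed (illustrative commands):
kubeadm version -o short   # expect v1.20.2
kubelet --version          # expect Kubernetes v1.20.2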
On all nodes, align the cgroup driver used by kubelet with the one configured for Docker above, i.e. systemd:
# First edit /etc/sysconfig/kubelet on master1
vi /etc/sysconfig/kubelet
# Use the systemd cgroup driver
KUBELET_EXTRA_ARGS="--cgroup-driver=systemd"
# Use the ipvs proxy mode for kube-proxy
KUBE_PROXY_MODE="ipvs"
Then copy /etc/sysconfig/kubelet from master1 to the other nodes:
scp /etc/sysconfig/kubelet root@master2:/etc/sysconfig/kubelet
scp /etc/sysconfig/kubelet root@master3:/etc/sysconfig/kubelet
scp /etc/sysconfig/kubelet root@node3:/etc/sysconfig/kubelet
scp /etc/sysconfig/kubelet root@node2:/etc/sysconfig/kubelet
scp /etc/sysconfig/kubelet root@node1:/etc/sysconfig/kubelet
Start kubelet on all nodes and enable it at boot:
# Start kubelet immediately and enable it at boot
systemctl enable kubelet --now
# Check kubelet on each node (it will keep restarting until the network plugin is installed later, which is expected)
systemctl status kubelet
Install the HAProxy and Keepalived high-availability components on master1, master2 and master3 (only needed for an HA cluster):
yum install -y keepalived haproxy
On master1, edit the HAProxy configuration file /etc/haproxy/haproxy.cfg, then distribute it to master2 and master3:
# Back up the HAProxy configuration file
cp /etc/haproxy/haproxy.cfg /etc/haproxy/haproxy.cfg.bak
# Edit the HAProxy configuration file
vi /etc/haproxy/haproxy.cfg
global
  maxconn 2000
  ulimit-n 16384
  log 127.0.0.1 local0 err
  stats timeout 30s

defaults
  log global
  mode http
  option httplog
  timeout connect 5000
  timeout client 50000
  timeout server 50000
  timeout http-request 15s
  timeout http-keep-alive 15s

frontend monitor-in
  bind *:33305
  mode http
  option httplog
  monitor-uri /monitor

listen stats
  bind *:8006
  mode http
  stats enable
  stats hide-version
  stats uri /stats
  stats refresh 30s
  stats realm Haproxy\ Statistics
  stats auth admin:admin

frontend k8s-master
  bind 0.0.0.0:16443
  bind 127.0.0.1:16443
  mode tcp
  option tcplog
  tcp-request inspect-delay 5s
  default_backend k8s-master

backend k8s-master
  mode tcp
  option tcplog
  option tcp-check
  balance roundrobin
  default-server inter 10s downinter 5s rise 2 fall 2 slowstart 60s maxconn 250 maxqueue 256 weight 100
  # Adjust the server lines below to match your environment
  server master1 192.168.17.3:6443 check
  server master2 192.168.17.4:6443 check
  server master3 192.168.17.5:6443 check
Then copy /etc/haproxy/haproxy.cfg from master1 to master2 and master3:
scp /etc/haproxy/haproxy.cfg root@master2:/etc/haproxy/haproxy.cfg
scp /etc/haproxy/haproxy.cfg root@master3:/etc/haproxy/haproxy.cfg
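Before starting HAProxy it can be useful to validate the configuration syntax on each master using HAProxy's built-in check mode:
haproxy -c -f /etc/haproxy/haproxy.cfg
# Expected output: Configuration file is valid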
On master1, master2 and master3, edit the Keepalived configuration file (the values that differ between nodes are called out in the comments below):
# Back up the Keepalived configuration file
cp /etc/keepalived/keepalived.conf /etc/keepalived/keepalived.conf.bak
# Edit the Keepalived configuration file
vi /etc/keepalived/keepalived.conf
! Configuration File for keepalived
global_defs {
    ## String identifying this node, usually its hostname (differs per node)
    router_id master1
    script_user root
    enable_script_security
}

## Health-check script
## Keepalived runs the script periodically and adjusts the priority of the vrrp_instance based on the result:
## if the script exits 0 and weight > 0, the priority is increased; if it exits non-zero and weight < 0,
## the priority is decreased; in all other cases the priority configured below is kept.
vrrp_script chk_apiserver {
    script "/etc/keepalived/check_apiserver.sh"
    # Check every 2 seconds
    interval 2
    # Reduce the priority by 5 when the check fails
    weight -5
    fall 3
    rise 2
}

## Virtual router definition; VR_1 is a freely chosen identifier
vrrp_instance VR_1 {
    ## MASTER on the primary node, BACKUP on the backup nodes (differs per node)
    state MASTER
    ## Network interface the virtual IP is bound to; the same interface that carries the node's own IP
    interface ens32
    # This node's IP address (differs per node)
    mcast_src_ip 192.168.17.3
    # Virtual router id; must be identical on all nodes
    virtual_router_id 100
    ## Node priority, 0-254; MASTER must be higher than BACKUP (differs per node)
    priority 100
    ## nopreempt on the higher-priority node prevents it from grabbing the VIP back after recovering from a failure
    nopreempt
    ## Advertisement interval; must be identical on all nodes (default 1s)
    advert_int 2
    ## Authentication settings; must be identical on all nodes
    authentication {
        auth_type PASS
        auth_pass K8SHA_KA_AUTH
    }
    ## Virtual IP pool; must be identical on all nodes
    virtual_ipaddress {
        ## The VIP; more than one can be listed
        192.168.17.200
    }
    track_script {
        chk_apiserver
    }
}
On master1, create the liveness-check script /etc/keepalived/check_apiserver.sh:
vi /etc/keepalived/check_apiserver.sh
#!/bin/bash
err=0
for k in $(seq 1 5)
do
    check_code=$(pgrep kube-apiserver)
    if [[ $check_code == "" ]]; then
        err=$(expr $err + 1)
        sleep 5
        continue
    else
        err=0
        break
    fi
done

if [[ $err != "0" ]]; then
    echo "systemctl stop keepalived"
    /usr/bin/systemctl stop keepalived
    exit 1
else
    exit 0
fi
Then copy /etc/keepalived/check_apiserver.sh from master1 to master2 and master3:
scp /etc/keepalived/check_apiserver.sh root@master2:/etc/keepalived/check_apiserver.sh
scp /etc/keepalived/check_apiserver.sh root@master3:/etc/keepalived/check_apiserver.sh
On each master node, make /etc/keepalived/check_apiserver.sh executable:
chmod +x /etc/keepalived/check_apiserver.sh
Start HAProxy and Keepalived on each master node and enable them at boot:
# Start haproxy now and enable it at boot
systemctl enable haproxy --now
# Start keepalived now and enable it at boot
systemctl enable keepalived --now
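To see which master currently holds the VIP, check the ens32 interface (given its higher Keepalived priority, it should normally be master1 at first):
# Run on each master; only the current VIP holder prints a line containing 192.168.17.200
ip addr show ens32 | grep 192.168.17.200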
Test that the VIP managed by Keepalived is reachable:
# From the host machine (Windows in this example)
ping 192.168.17.200

Pinging 192.168.17.200 with 32 bytes of data:
Reply from 192.168.17.200: bytes=32 time=1ms TTL=64
Reply from 192.168.17.200: bytes=32 time<1ms TTL=64
Reply from 192.168.17.200: bytes=32 time=1ms TTL=64
Reply from 192.168.17.200: bytes=32 time<1ms TTL=64

Ping statistics for 192.168.17.200:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 1ms, Average = 0ms

# From the K8S nodes
ping 192.168.17.200 -c 4

PING 192.168.17.200 (192.168.17.200) 56(84) bytes of data.
64 bytes from 192.168.17.200: icmp_seq=1 ttl=64 time=0.058 ms
64 bytes from 192.168.17.200: icmp_seq=2 ttl=64 time=0.055 ms
64 bytes from 192.168.17.200: icmp_seq=3 ttl=64 time=0.077 ms
64 bytes from 192.168.17.200: icmp_seq=4 ttl=64 time=0.064 ms

--- 192.168.17.200 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3112ms
rtt min/avg/max/mdev = 0.055/0.063/0.077/0.011 ms
Next, on master1, generate and adjust the configuration file that kubeadm will use to initialize the K8S control plane (here named kubeadm-config.yaml):
# Go to /etc/kubernetes/
cd /etc/kubernetes/
# Export kubeadm's default control-plane initialization configuration
kubeadm config print init-defaults > kubeadm-config.yaml
Edit the generated kubeadm-config.yaml so that it looks as follows (the fields that need to be adapted per node or environment, and the newly added ones, are called out in the inline comments):
apiVersion: kubeadm.k8s.io/v1beta2
bootstrapTokens:
- groups:
  - system:bootstrappers:kubeadm:default-node-token
  token: abcdef.0123456789abcdef
  ttl: 24h0m0s
  usages:
  - signing
  - authentication
kind: InitConfiguration
localAPIEndpoint:
  advertiseAddress: 192.168.17.3   # IP of the current node (differs per node)
  bindPort: 6443
nodeRegistration:
  criSocket: /var/run/dockershim.sock
  name: master1                    # hostname of the current node (differs per node)
  taints:
  - effect: NoSchedule
    key: node-role.kubernetes.io/master
---
apiServer:
  timeoutForControlPlane: 4m0s
apiVersion: kubeadm.k8s.io/v1beta2
certificatesDir: /etc/kubernetes/pki
clusterName: kubernetes
controlPlaneEndpoint: "192.168.17.200:16443"   # VIP managed by Keepalived and port exposed by HAProxy
controllerManager: {}
dns:
  type: CoreDNS
etcd:
  local:
    dataDir: /var/lib/etcd
imageRepository: registry.aliyuncs.com/google_containers   # registry K8S will use when pulling its component images
kind: ClusterConfiguration
kubernetesVersion: v1.20.0
networking:
  dnsDomain: cluster.local
  serviceSubnet: 10.96.0.0/16   # CIDR range used for Service IPs
  podSubnet: 10.244.0.0/16      # added: CIDR range used for Pod IPs; the host, Service and Pod ranges must not overlap
scheduler: {}
The added podSubnet option is the same setting as the --pod-network-cidr command-line flag; it declares the address range for a CNI-compliant Pod network plugin (Calico, installed later).
Then copy /etc/kubernetes/kubeadm-config.yaml from master1 to master2 and master3 and adjust the per-node fields noted above (advertiseAddress and name):
scp /etc/kubernetes/kubeadm-config.yaml root@master2:/etc/kubernetes/kubeadm-config.yaml
scp /etc/kubernetes/kubeadm-config.yaml root@master3:/etc/kubernetes/kubeadm-config.yaml
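The per-node fields can also be adjusted remotely from master1; a hypothetical example for master2 (192.168.17.4), with master3 handled the same way:
ssh master2 "sed -i 's/advertiseAddress: 192.168.17.3/advertiseAddress: 192.168.17.4/; s/name: master1/name: master2/' /etc/kubernetes/kubeadm-config.yaml"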
On all master nodes, use /etc/kubernetes/kubeadm-config.yaml to pre-pull the images needed to initialize the control plane:
kubeadm config images pull --config /etc/kubernetes/kubeadm-config.yaml
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-apiserver:v1.20.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-controller-manager:v1.20.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.0
[config/images] Pulled registry.aliyuncs.com/google_containers/kube-proxy:v1.20.0
[config/images] Pulled registry.aliyuncs.com/google_containers/pause:3.2
[config/images] Pulled registry.aliyuncs.com/google_containers/etcd:3.4.13-0
[config/images] Pulled registry.aliyuncs.com/google_containers/coredns:1.7.0
Then pick master1 as the primary node (matching the Keepalived configuration) and initialize the K8S control plane with /etc/kubernetes/kubeadm-config.yaml (this generates the certificates and configuration files under /etc/kubernetes):
kubeadm init --config /etc/kubernetes/kubeadm-config.yaml --upload-certs
[init] Using Kubernetes version: v1.20.0
[preflight] Running pre-flight checks
[preflight] Pulling images required for setting up a Kubernetes cluster
[preflight] This might take a minute or two, depending on the speed of your internet connection
[preflight] You can also perform this action in beforehand using 'kubeadm config images pull'
[certs] Using certificateDir folder "/etc/kubernetes/pki"
[certs] Generating "ca" certificate and key
[certs] Generating "apiserver" certificate and key
[certs] apiserver serving cert is signed for DNS names [kubernetes kubernetes.default kubernetes.default.svc kubernetes.default.svc.cluster.local master1] and IPs [10.96.0.1 192.168.17.3 192.168.17.200]
[certs] Generating "apiserver-kubelet-client" certificate and key
[certs] Generating "front-proxy-ca" certificate and key
[certs] Generating "front-proxy-client" certificate and key
[certs] Generating "etcd/ca" certificate and key
[certs] Generating "etcd/server" certificate and key
[certs] etcd/server serving cert is signed for DNS names [localhost master1] and IPs [192.168.17.3 127.0.0.1 ::1]
[certs] Generating "etcd/peer" certificate and key
[certs] etcd/peer serving cert is signed for DNS names [localhost master1] and IPs [192.168.17.3 127.0.0.1 ::1]
[certs] Generating "etcd/healthcheck-client" certificate and key
[certs] Generating "apiserver-etcd-client" certificate and key
[certs] Generating "sa" key and public key
[kubeconfig] Using kubeconfig folder "/etc/kubernetes"
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "admin.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "kubelet.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "controller-manager.conf" kubeconfig file
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[kubeconfig] Writing "scheduler.conf" kubeconfig file
[kubelet-start] Writing kubelet environment file with flags to file "/var/lib/kubelet/kubeadm-flags.env"
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[kubelet-start] Starting the kubelet
[control-plane] Using manifest folder "/etc/kubernetes/manifests"
[control-plane] Creating static Pod manifest for "kube-apiserver"
[control-plane] Creating static Pod manifest for "kube-controller-manager"
[control-plane] Creating static Pod manifest for "kube-scheduler"
[etcd] Creating static Pod manifest for local etcd in "/etc/kubernetes/manifests"
[wait-control-plane] Waiting for the kubelet to boot up the control plane as static Pods from directory "/etc/kubernetes/manifests". This can take up to 4m0s
[apiclient] All control plane components are healthy after 16.635364 seconds
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
[upload-certs] Using certificate key: 1f0eae8f50417882560ecc7ec7f08596ed1910d9f90d237f00c69eb07b8fb64c
[mark-control-plane] Marking the node master1 as control-plane by adding the labels "node-role.kubernetes.io/master=''" and "node-role.kubernetes.io/control-plane='' (deprecated)"
[mark-control-plane] Marking the node master1 as control-plane by adding the taints [node-role.kubernetes.io/master:NoSchedule]
[bootstrap-token] Using token: abcdef.0123456789abcdef
[bootstrap-token] Configuring bootstrap tokens, cluster-info ConfigMap, RBAC Roles
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[bootstrap-token] Creating the "cluster-info" ConfigMap in the "kube-public" namespace
[kubelet-finalize] Updating "/etc/kubernetes/kubelet.conf" to point to a rotatable kubelet client certificate and key
[addons] Applied essential addon: CoreDNS
[endpoint] WARNING: port specified in controlPlaneEndpoint overrides bindPort in the controlplane address
[addons] Applied essential addon: kube-proxy

Your Kubernetes control-plane has initialized successfully!

To start using your cluster, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Alternatively, if you are the root user, you can run:

  export KUBECONFIG=/etc/kubernetes/admin.conf

You should now deploy a pod network to the cluster.
Run "kubectl apply -f [podnetwork].yaml" with one of the options listed at:
  https://kubernetes.io/docs/concepts/cluster-administration/addons/

You can now join any number of the control-plane node running the following command on each as root:

  kubeadm join 192.168.17.200:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4316261bbf7392937c47cc2e0f4834df9d6ad51fea47817cc37684a21add36e1 \
    --control-plane --certificate-key 1f0eae8f50417882560ecc7ec7f08596ed1910d9f90d237f00c69eb07b8fb64c

Please note that the certificate-key gives access to cluster sensitive data, keep it secret!
As a safeguard, uploaded-certs will be deleted in two hours; If necessary, you can use
"kubeadm init phase upload-certs --upload-certs" to reload certs afterward.

Then you can join any number of worker nodes by running the following on each as root:

kubeadm join 192.168.17.200:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4316261bbf7392937c47cc2e0f4834df9d6ad51fea47817cc37684a21add36e1
If initialization fails, clean up with the following commands and then run the initialization again:
kubeadm reset -f
ipvsadm --clear
rm -rf ~/.kube
The directories and files generated under /etc/kubernetes after the control plane has been initialized on master1 are shown in the figure below.
Following the hints printed when the control plane on master1 finished initializing, we now complete the remaining steps of the HA cluster setup.
As a regular user, run:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
As the root user, run:
export KUBECONFIG=/etc/kubernetes/admin.conf
Make sure you actually run the commands above, otherwise you may hit the following error:
[root@master1 kubernetes]# kubectl get cs
The connection to the server localhost:8080 was refused - did you specify the right host or port?
On master2 and master3, run the following command to join each of them to the cluster as a control-plane (master) node:
kubeadm join 192.168.17.200:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4316261bbf7392937c47cc2e0f4834df9d6ad51fea47817cc37684a21add36e1 \
    --control-plane --certificate-key 1f0eae8f50417882560ecc7ec7f08596ed1910d9f90d237f00c69eb07b8fb64c

This node has joined the cluster and a new control plane instance was created:

* Certificate signing request was sent to apiserver and approval was received.
* The Kubelet was informed of the new secure connection details.
* Control plane (master) label and taint were applied to the new node.
* The Kubernetes control plane instances scaled up.
* A new etcd member was added to the local/stacked etcd cluster.

To start administering your cluster from this node, you need to run the following as a regular user:

  mkdir -p $HOME/.kube
  sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
  sudo chown $(id -u):$(id -g) $HOME/.kube/config

Run 'kubectl get nodes' to see this node join the cluster.
As prompted, run the following on master2 and master3:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
On node1, node2 and node3, run the following command to join each of them to the cluster as a worker node:
kubeadm join 192.168.17.200:16443 --token abcdef.0123456789abcdef \
    --discovery-token-ca-cert-hash sha256:4316261bbf7392937c47cc2e0f4834df9d6ad51fea47817cc37684a21add36e1
At this point the node list shows every node in the NotReady state:
[root@master1 ~]# kubectl get nodes
NAME      STATUS     ROLES                  AGE     VERSION
master1   NotReady   control-plane,master   5m59s   v1.20.2
master2   NotReady   control-plane,master   4m48s   v1.20.2
master3   NotReady   control-plane,master   4m1s    v1.20.2
node1     NotReady   <none>                 30s     v1.20.2
node2     NotReady   <none>                 11s     v1.20.2
node3     NotReady   <none>                 6s      v1.20.2
And the Pod list shows the coredns Pods stuck in the Pending state:
[root@master1 ~]# kubectl get pods -ALL
NAMESPACE     NAME                              READY   STATUS    RESTARTS   AGE     L
kube-system   coredns-7f89b7bc75-252wl          0/1     Pending   0          6m12s
kube-system   coredns-7f89b7bc75-7952s          0/1     Pending   0          6m12s
kube-system   etcd-master1                      1/1     Running   0          6m19s
kube-system   etcd-master2                      1/1     Running   0          5m6s
kube-system   etcd-master3                      1/1     Running   0          4m20s
kube-system   kube-apiserver-master1            1/1     Running   0          6m19s
kube-system   kube-apiserver-master2            1/1     Running   0          5m9s
kube-system   kube-apiserver-master3            1/1     Running   1          4m21s
kube-system   kube-controller-manager-master1   1/1     Running   1          6m19s
kube-system   kube-controller-manager-master2   1/1     Running   0          5m10s
kube-system   kube-controller-manager-master3   1/1     Running   0          3m17s
kube-system   kube-proxy-68q8f                  1/1     Running   0          34s
kube-system   kube-proxy-6m986                  1/1     Running   0          5m11s
kube-system   kube-proxy-8fklt                  1/1     Running   0          29s
kube-system   kube-proxy-bl2rx                  1/1     Running   0          53s
kube-system   kube-proxy-pth46                  1/1     Running   0          4m24s
kube-system   kube-proxy-r65vj                  1/1     Running   0          6m12s
kube-system   kube-scheduler-master1            1/1     Running   1          6m19s
kube-system   kube-scheduler-master2            1/1     Running   0          5m9s
kube-system   kube-scheduler-master3            1/1     Running   0          3m22s
Both symptoms are caused by the missing Pod network plugin, so next we install one: Calico.
Use "kubectl apply -f [podnetwork].yaml" to install a Pod network plugin, which provides network connectivity between Pods across the cluster (Pods on different nodes talk to each other directly via their Pod IPs, so the nodes are effectively transparent to the Pods). Here we choose Calico, a plugin that follows the CNI (Container Network Interface) standard (Flannel would also work):
First, on the Calico release-notes site https://docs.tigera.io/archive, find the Calico release that is compatible with the K8S version you installed; the same pages contain the installation instructions, as shown in the screenshot below.
Following those instructions, run the following command to install the Calico operator:
# Install the Calico operator
kubectl create -f https://docs.projectcalico.org/archive/v3.19/manifests/tigera-operator.yaml
If applying the online manifest directly does not work, open the URL in a browser, copy the content into a local yaml file and apply that instead (for example, create /etc/kubernetes/calico-operator.yaml containing the content of tigera-operator.yaml):
kubectl create -f /etc/kubernetes/calico-operator.yaml
Next comes the Calico custom resources manifest. Here we do not apply the online file directly; instead we open the URL in a browser, copy the content into a local file (for example /etc/kubernetes/calico-resources.yaml, containing the content of custom-resources.yaml) and change the cidr setting so that it matches the podSubnet value in kubeadm-config.yaml:
# Applying the online file directly (we do not do this here, because the cidr needs to be changed)
kubectl create -f https://docs.projectcalico.org/archive/v3.19/manifests/custom-resources.yaml

# This section includes base Calico installation configuration.
# For more information, see: https://docs.projectcalico.org/v3.19/reference/installation/api#operator.tigera.io/v1.Installation
apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
    - blockSize: 26
      cidr: 10.244.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()
kubectl create -f /etc/kubernetes/calico-resources.yaml
After running the commands above, monitor the rollout of the Calico Pods with the following command (pulling the images takes a while, so be patient):
watch kubectl get pods -n calico-system
Once the Calico deployment has finished, check the status of all Pods in the cluster; they should all be Running now:
[root@master1 kubernetes]# kubectl get pods -ALL
NAMESPACE         NAME                                       READY   STATUS    RESTARTS   AGE     L
calico-system     calico-kube-controllers-6564f5db75-9smlm   1/1     Running   0          7m34s
calico-system     calico-node-5xg5s                          1/1     Running   0          7m34s
calico-system     calico-node-dzmrl                          1/1     Running   0          7m34s
calico-system     calico-node-fh5tt                          1/1     Running   0          7m34s
calico-system     calico-node-ngpgx                          1/1     Running   0          7m34s
calico-system     calico-node-qxkbd                          1/1     Running   0          7m34s
calico-system     calico-node-zmtdc                          1/1     Running   0          7m34s
calico-system     calico-typha-99996595b-lm7rh               1/1     Running   0          7m29s
calico-system     calico-typha-99996595b-nwbcz               1/1     Running   0          7m29s
calico-system     calico-typha-99996595b-vnzpl               1/1     Running   0          7m34s
kube-system       coredns-7f89b7bc75-252wl                   1/1     Running   0          39m
kube-system       coredns-7f89b7bc75-7952s                   1/1     Running   0          39m
kube-system       etcd-master1                               1/1     Running   0          39m
kube-system       etcd-master2                               1/1     Running   0          38m
kube-system       etcd-master3                               1/1     Running   0          37m
kube-system       kube-apiserver-master1                     1/1     Running   0          39m
kube-system       kube-apiserver-master2                     1/1     Running   0          38m
kube-system       kube-apiserver-master3                     1/1     Running   1          37m
kube-system       kube-controller-manager-master1            1/1     Running   1          39m
kube-system       kube-controller-manager-master2            1/1     Running   0          38m
kube-system       kube-controller-manager-master3            1/1     Running   0          36m
kube-system       kube-proxy-68q8f                           1/1     Running   0          33m
kube-system       kube-proxy-6m986                           1/1     Running   0          38m
kube-system       kube-proxy-8fklt                           1/1     Running   0          33m
kube-system       kube-proxy-bl2rx                           1/1     Running   0          33m
kube-system       kube-proxy-pth46                           1/1     Running   0          37m
kube-system       kube-proxy-r65vj                           1/1     Running   0          39m
kube-system       kube-scheduler-master1                     1/1     Running   1          39m
kube-system       kube-scheduler-master2                     1/1     Running   0          38m
kube-system       kube-scheduler-master3                     1/1     Running   0          36m
tigera-operator   tigera-operator-6fdc4c585-lk294            1/1     Running   0          8m1s
[root@master1 kubernetes]#
Also check the node status; all nodes should now be Ready:
[root@master1 kubernetes]# kubectl get nodes
NAME      STATUS   ROLES                  AGE   VERSION
master1   Ready    control-plane,master   49m   v1.20.2
master2   Ready    control-plane,master   48m   v1.20.2
master3   Ready    control-plane,master   47m   v1.20.2
node1     Ready    <none>                 43m   v1.20.2
node2     Ready    <none>                 43m   v1.20.2
node3     Ready    <none>                 43m   v1.20.2
If you want the master nodes to take part in regular Pod scheduling (usually avoided in production, to keep the masters stable), remove the taint from them with:
kubectl taint nodes --all node-role.kubernetes.io/master-
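If you later want to restore the default behaviour, the taint can be re-applied per node; an illustrative example for master1:
kubectl taint nodes master1 node-role.kubernetes.io/master=:NoSchedule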
Finally, checking the cluster component status shows the Scheduler and Controller Manager reported as unhealthy:
[root@master1 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS      MESSAGE                                                                                       ERROR
scheduler            Unhealthy   Get "http://127.0.0.1:10251/healthz": dial tcp 127.0.0.1:10251: connect: connection refused
controller-manager   Unhealthy   Get "http://127.0.0.1:10252/healthz": dial tcp 127.0.0.1:10252: connect: connection refused
etcd-0               Healthy     {"health":"true"}
To fix this, comment out the - --port=0 line (prepend a #) in both /etc/kubernetes/manifests/kube-scheduler.yaml and /etc/kubernetes/manifests/kube-controller-manager.yaml on every master node. Taking /etc/kubernetes/manifests/kube-scheduler.yaml as an example:
[root@master1 ~]# vi /etc/kubernetes/manifests/kube-scheduler.yaml
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    component: kube-scheduler
    tier: control-plane
  name: kube-scheduler
  namespace: kube-system
spec:
  containers:
  - command:
    - kube-scheduler
    - --authentication-kubeconfig=/etc/kubernetes/scheduler.conf
    - --authorization-kubeconfig=/etc/kubernetes/scheduler.conf
    - --bind-address=127.0.0.1
    - --kubeconfig=/etc/kubernetes/scheduler.conf
    - --leader-elect=true
    #- --port=0
    image: registry.aliyuncs.com/google_containers/kube-scheduler:v1.20.0
    imagePullPolicy: IfNotPresent
    livenessProbe:
      failureThreshold: 8
      httpGet:
        host: 127.0.0.1
        path: /healthz
        port: 10259
        scheme: HTTPS
      initialDelaySeconds: 10
Then restart kubelet on each master node:
systemctl restart kubelet
[root@master1 ~]# kubectl get cs
Warning: v1 ComponentStatus is deprecated in v1.19+
NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-0               Healthy   {"health":"true"}
Finally, we deploy Ingress, the service routing and load-balancing component commonly used by web applications on K8S.
K8S Ingress is a layer-7 load balancer that forwards requests entering the cluster from outside (north-south traffic) to the backend Pods of in-cluster Services. The traffic it forwards bypasses kube-proxy and reaches the target Pods directly (kube-proxy is the in-cluster layer-4 load balancer that handles east-west traffic between Pods). The figure below is the Ingress traffic-flow diagram from the official K8S documentation.
When routing with Ingress, the Ingress Controller forwards client requests straight to the Endpoints (Pods) behind a Service according to the Ingress rules, skipping the forwarding rules that kube-proxy sets up and thereby improving forwarding efficiency. Ingress currently serves only HTTP and HTTPS; for plain TCP/IP, external clients can instead be served by setting the Service type to NodePort or LoadBalancer. Note that an Ingress resource only defines the routing policy; the forwarding itself is done by the Ingress Controller, which continuously watches the ingress resources exposed by the K8S API Server (i.e. the user-defined forwarding rules for backend services) and automatically updates its forwarding rules whenever the backend service information changes.
The K8S project currently maintains three Ingress controllers: AWS, GCE and Nginx Ingress. The first two target the Amazon and Google public clouds, so here we choose the vendor-neutral Nginx Ingress (there are also many third-party controllers, such as Istio Ingress, Kong Ingress, HAProxy Ingress and so on).
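As an illustration of how routing rules are expressed, a minimal Ingress resource (the hostname and Service name here are hypothetical) that the controller installed below could act on might look like this:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web-ingress
spec:
  ingressClassName: nginx      # matches the IngressClass created by the deploy.yaml below
  rules:
  - host: demo.example.com     # requests for this host...
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: web-svc      # ...are forwarded to the Pods behind this Service
            port:
              number: 80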
The readme of the Ingress NGINX Controller project on GitHub shows that the Ingress-Nginx versions supporting our K8S version (v1.20.2) are v1.0.0 through v1.3.1; here we choose v1.3.1 (which embeds Nginx 1.19.10+). Download the corresponding Source code archive from the GitHub tag list, unpack it and locate the deploy.yaml file under deploy/static/provider/cloud; it contains all the K8S resource definitions for Ingress-Nginx.
The Ingress Nginx Controller can be deployed as a DaemonSet or as a Deployment, and it is common to pin it to a few fixed nodes with a nodeSelector or an affinity rule. In other words, the Ingress Nginx Controller itself runs as Pods inside the cluster, so how does it receive traffic from external clients? One way is to set hostPort at the container level to map ports 80 and 443 onto the host, so that external clients can reach the controller at http://<NodeIP>:80 or https://<NodeIP>:443 (hostPort is the port the container's host listens on; it defaults to containerPort when unset, and because the port is then occupied, a second replica cannot start on the same host once hostPort is set explicitly). This way only some of the cluster nodes need to act as edge (front-end) routers for north-south traffic. Next we modify the following resource definitions in deploy.yaml (the changes are the image addresses, the added hostPort entries, the extra nodeSelector label and the default IngressClass annotation):
---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.3.1
  name: ingress-nginx-controller
  namespace: ingress-nginx
spec:
  ...omitted...
        image: registry.aliyuncs.com/google_containers/nginx-ingress-controller:v1.3.1
        imagePullPolicy: IfNotPresent
        lifecycle:
          ...omitted...
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
          hostPort: 80
        - containerPort: 443
          name: https
          protocol: TCP
          hostPort: 443
        - containerPort: 8443
          name: webhook
          protocol: TCP
          hostPort: 8443
        ...omitted...
      dnsPolicy: ClusterFirst
      nodeSelector:
        kubernetes.io/os: linux
        role: ingress-nginx-controller
      serviceAccountName: ingress-nginx
      ...omitted...
---
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app.kubernetes.io/component: admission-webhook
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.3.1
  name: ingress-nginx-admission-create
  namespace: ingress-nginx
spec:
  ...omitted...
        image: registry.aliyuncs.com/google_containers/kube-webhook-certgen:v1.3.0
        imagePullPolicy: IfNotPresent
        name: create
        ...omitted...
---
apiVersion: batch/v1
kind: Job
metadata:
  labels:
    app.kubernetes.io/component: admission-webhook
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.3.1
  name: ingress-nginx-admission-patch
  namespace: ingress-nginx
spec:
  ...omitted...
        image: registry.aliyuncs.com/google_containers/kube-webhook-certgen:v1.3.0
        imagePullPolicy: IfNotPresent
        name: patch
        ...omitted...
---
apiVersion: networking.k8s.io/v1
kind: IngressClass
metadata:
  labels:
    app.kubernetes.io/component: controller
    app.kubernetes.io/instance: ingress-nginx
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    app.kubernetes.io/version: 1.3.1
  name: nginx
  annotations:
    ingressclass.kubernetes.io/is-default-class: "true"
spec:
  controller: k8s.io/ingress-nginx
...omitted...
The annotation ingressclass.kubernetes.io/is-default-class: "true" makes ingress-nginx the default Ingress Class of the cluster.
While editing deploy.yaml we bound host ports 80, 443 and 8443 to the Ingress Nginx Controller, so make sure those ports are free on the target node. We also added an extra label requirement (role: ingress-nginx-controller) to the controller's nodeSelector, so the target node must be labeled first; for example, to use node1 as the edge-router node:
kubectl label nodes node1 role=ingress-nginx-controller
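The label can be verified before deploying (node1 should be listed):
kubectl get nodes -l role=ingress-nginx-controller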
Once everything is ready, apply the modified deploy.yaml to the cluster and confirm that the Ingress Nginx Controller Pod starts on node1.
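A minimal sketch of that final step, assuming the modified file is applied from the directory where it was edited:
kubectl apply -f deploy.yaml
# Watch the controller come up; the controller pod should be scheduled onto node1
kubectl get pods -n ingress-nginx -o wide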