Kubernetes Cluster HA in a Private Cloud

Posted by WaltonWang


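Abstract: Many newcomers to Kubernetes are interested in high-availability options for the Kubernetes Master, but the official documentation only describes an HA deployment on GCE. I therefore think it is worth sharing the Kubernetes Master HA solution I built earlier.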

Kubernetes Master HA Architecture Diagram

Configuration and Notes

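  1. All components can be started and managed as kubelet static pods; the static pod mechanism keeps each component on its host highly available. Note that kubelet must be started with --allow-privileged=true.
  2. The kubelet that manages the static pods is itself kept highly available by systemd.
  3. Alternatively, you can deploy these components directly as processes and let systemd manage them. (This is the approach we chose, to reduce complexity.)
  4. In the architecture diagram, etcd is co-located with the Masters: each of the three Master nodes runs one etcd member, and the three members form a single cluster. (If conditions allow, deploy the etcd cluster separately from the Master nodes.)
  5. On each Master, apiserver, controller-manager, and scheduler all use hostNetwork. controller-manager and scheduler connect to the apiserver on their own node via localhost and do not connect to the apiservers on the other two Master nodes.
  6. External clients such as rest-client, kubectl, kubelet, and kube-proxy connect with TLS certificates. TLS is terminated at the LB node, and the LB forwards plain HTTP requests to an apiserver instance chosen by its balancing policy (round robin, RR).
  7. Access from apiserver to the kubelet server and kube-proxy server works in a similar way: HTTPS goes to the LB, which terminates TLS and forwards HTTP to the kubelet/kube-proxy server on the target node.
  8. apiserver HA is provided by the classic haproxy + keepalived combination, and the cluster exposes a VIP externally.
  9. controller-manager and scheduler HA relies on their built-in leader election (--leader-elect=true): of the 3 controller-manager instances and the 3 scheduler instances, exactly one of each is the leader and does the work. When a leader fails, a new leader is elected to take over.
  10. In summary, this HA solution uses haproxy + keepalived for apiserver LB and HA; controller-manager and scheduler achieve HA through their own leader election; and etcd achieves HA through the Raft protocol, which keeps the etcd cluster's data consistent.
  11. A reference keepalived configuration: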

    vrrp_script check_script {
        script "/etc/keepalived/check_haproxy.py http://caicloud:caicloud@127.0.0.1/haproxy?stats"
        interval 5   # check every 5 seconds
        weight 5
        fall 2       # require 2 failures for KO
        rise 1       # require 1 success for OK
    }

    vrrp_instance VI_01 {
        state MASTER         # use BACKUP on the standby LB nodes
        interface eth1
        track_interface {
            eth1
        }

        vrrp_garp_master_repeat 5
        vrrp_garp_master_refresh 10

        virtual_router_id 51
        priority 100         # use 97 on the standby LB nodes

        advert_int 1

        authentication {
            auth_type PASS
            auth_pass username
        }

        virtual_ipaddress {
            192.168.205.254 dev eth1 label eth1:vip
        }

        track_script {
            check_script
        }

        notify "/etc/keepalived/notify_state.sh"
    }
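The keepalived configuration above pairs priorities 100 and 97 with a check-script weight of 5. The following is a minimal sketch of why those numbers cause a failover, assuming keepalived's usual semantics that a positive weight is added to the priority while the tracked script succeeds. The node names `lb-primary` and `lb-standby` and all function names are hypothetical, introduced only for illustration.

```python
def effective_priority(base: int, weight: int, check_ok: bool) -> int:
    """Priority a node advertises, given the result of its tracked script."""
    if weight > 0:
        # Positive weight: added while the script succeeds.
        return base + weight if check_ok else base
    # Negative (or zero) weight: applied only while the script fails.
    return base if check_ok else base + weight


def elect_master(priorities: dict) -> str:
    """The node advertising the highest priority holds the VIP."""
    return max(priorities, key=priorities.get)


def simulate(primary_ok: bool, standby_ok: bool) -> str:
    """Who holds the VIP for a given pair of haproxy health results."""
    return elect_master({
        "lb-primary": effective_priority(100, 5, primary_ok),  # priority 100
        "lb-standby": effective_priority(97, 5, standby_ok),   # priority 97
    })


if __name__ == "__main__":
    for primary_ok in (True, False):
        for standby_ok in (True, False):
            print(primary_ok, standby_ok, simulate(primary_ok, standby_ok))
```

Because the gap between the two base priorities (3) is smaller than the weight (5), a failed haproxy check on the primary drops it to 100 while a healthy standby advertises 102, so the VIP moves.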
  12. A reference haproxy configuration:

    global
        log 127.0.0.1 local0
        maxconn 32768
        pidfile /run/haproxy.pid
        # turn on stats unix socket
        stats socket /run/haproxy.stats
        tune.ssl.default-dh-param  2048
    
    defaults
        log global
        mode http
        option httplog
        option dontlognull
        retries 3
        timeout connect 5000ms
        timeout client 50000ms
        timeout server 50000ms
        timeout check 50000ms
        timeout queue 50000ms
    
    frontend frontend-apiserver-http
        bind *:8080
        option forwardfor
    
        acl local_net src 192.168.205.0/24
    
        http-request allow if local_net
        http-request deny
    
        default_backend backend-apiserver-http
    
    frontend frontend-apiserver-https
        # haproxy enable ssl
        bind *:443 ssl crt /etc/kubernetes/master-lb.pem
        option forwardfor
        default_backend backend-apiserver-http
    
    backend backend-apiserver-http
        balance roundrobin
        option forwardfor
    
        server master-1 192.168.205.11:8080  check
        server master-2 192.168.205.12:8080  check
        server master-3 192.168.205.13:8080  check
    
    listen  admin_stats
           bind  0.0.0.0:80
           log  global
           mode  http
           maxconn  10
           stats  enable
           # Hide the HAProxy version, a necessity for any public-facing site
           stats  hide-version
           stats  refresh 30s
           stats  show-node
           stats  realm Haproxy\ Statistics
           stats  auth caicloud:caicloud
           stats  uri /haproxy?stats
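The keepalived `vrrp_script` calls `/etc/keepalived/check_haproxy.py` against the stats page defined in `admin_stats`, but the post does not show that script. The sketch below is not the author's script; it only illustrates one way such a check could decide health, assuming the stats page is fetched in HAProxy's CSV form (the stats URI with `;csv` appended). The sample data is an abbreviated, made-up excerpt with only three of the real columns, and the function names are hypothetical.

```python
import csv
import io


def servers_up(stats_csv: str, backend: str) -> list:
    """Return the names of servers in `backend` whose status starts with UP."""
    # HAProxy's CSV header line begins with "# ", which must be removed
    # before the header can be used as column names.
    reader = csv.DictReader(io.StringIO(stats_csv.lstrip("# ")))
    return [
        row["svname"]
        for row in reader
        if row["pxname"] == backend
        and row["svname"] not in ("FRONTEND", "BACKEND")  # skip aggregate rows
        and row["status"].startswith("UP")
    ]


def exit_code(stats_csv: str, backend: str) -> int:
    """keepalived treats exit status 0 as success, anything else as failure."""
    return 0 if servers_up(stats_csv, backend) else 1


# Abbreviated sample; a real stats CSV has many more columns.
SAMPLE = """# pxname,svname,status
backend-apiserver-http,master-1,UP
backend-apiserver-http,master-2,DOWN
backend-apiserver-http,master-3,UP
backend-apiserver-http,BACKEND,UP
"""

if __name__ == "__main__":
    print(servers_up(SAMPLE, "backend-apiserver-http"))
```

A real check would first fetch the CSV over HTTP with the `stats auth` credentials and exit non-zero on any connection error, so that a dead haproxy also counts as a failure.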
  13. On the LB nodes, make sure the ip_vs kernel modules are loaded and that ip_forward and ip_nonlocal_bind are enabled:

    
    # make sure the ip_vs kernel modules are loaded
    modprobe ip_vs
    modprobe ip_vs_rr
    modprobe ip_vs_wrr

    # enable ip_forward and ip_nonlocal_bind (persisted in /etc/sysctl.conf)
    echo "net.ipv4.ip_forward = 1" >> /etc/sysctl.conf
    echo "net.ipv4.ip_nonlocal_bind = 1" >> /etc/sysctl.conf

    # apply the settings now instead of waiting for a reboot
    sysctl -p
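After running the commands above, it can be useful to confirm that the kernel state actually matches. The sketch below shows one possible way to check, assuming a Linux host where loaded modules are listed in `/proc/modules` and sysctl keys map to files under `/proc/sys`. The helper names are hypothetical and the sample module listing is made up.

```python
from pathlib import PurePosixPath

REQUIRED_MODULES = ("ip_vs", "ip_vs_rr", "ip_vs_wrr")
REQUIRED_SYSCTLS = ("net.ipv4.ip_forward", "net.ipv4.ip_nonlocal_bind")


def sysctl_path(key: str) -> str:
    """Map a sysctl key such as net.ipv4.ip_forward to its /proc/sys file."""
    return str(PurePosixPath("/proc/sys") / key.replace(".", "/"))


def missing_modules(proc_modules_text: str, required=REQUIRED_MODULES) -> list:
    """Return the required modules that are absent from /proc/modules content."""
    # The first whitespace-separated field of each line is the module name.
    loaded = {
        line.split()[0]
        for line in proc_modules_text.splitlines()
        if line.strip()
    }
    return [name for name in required if name not in loaded]


def is_enabled(sysctl_file_content: str) -> bool:
    """A sysctl boolean is enabled when its file contains 1."""
    return sysctl_file_content.strip() == "1"
```

On an LB node, the content of `/proc/modules` and of each file returned by `sysctl_path` would be read and passed to these helpers; an empty list from `missing_modules` and `True` from `is_enabled` for both keys indicates the node is prepared.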
  14. If you deploy the K8S components as pods, you can refer to the YAML provided in the official documentation:

    • apiserver
    apiVersion: v1
    kind: Pod
    metadata:
      name: kube-apiserver
    spec:
      hostNetwork: true
      containers:
      - name: kube-apiserver
        image: gcr.io/google_containers/kube-apiserver:9680e782e08a1a1c94c656190011bd02
        command:
        - /bin/sh
        - -c
        - /usr/local/bin/kube-apiserver --address=127.0.0.1 --etcd-servers=http://127.0.0.1:4001
          --cloud-provider=gce   --admission-control=NamespaceLifecycle,LimitRanger,SecurityContextDeny,ServiceAccount,ResourceQuota
          --service-cluster-ip-range=10.0.0.0/16 --client-ca-file=/srv/kubernetes/ca.crt
          --basic-auth-file=/srv/kubernetes/basic_auth.csv --cluster-name=e2e-test-bburns
          --tls-cert-file=/srv/kubernetes/server.cert --tls-private-key-file=/srv/kubernetes/server.key
          --secure-port=443 --token-auth-file=/srv/kubernetes/known_tokens.csv  --v=2
          --allow-privileged=False 1>>/var/log/kube-apiserver.log 2>&1
        ports:
        - containerPort: 443
          hostPort: 443
          name: https
        - containerPort: 7080
          hostPort: 7080
          name: http
        - containerPort: 8080
          hostPort: 8080
          name: local
        volumeMounts:
        - mountPath: /srv/kubernetes
          name: srvkube
          readOnly: true
        - mountPath: /var/log/kube-apiserver.log
          name: logfile
        - mountPath: /etc/ssl
          name: etcssl
          readOnly: true
        - mountPath: /usr/share/ssl
          name: usrsharessl
          readOnly: true
        - mountPath: /var/ssl
          name: varssl
          readOnly: true
        - mountPath: /usr/ssl
          name: usrssl
          readOnly: true
        - mountPath: /usr/lib/ssl
          name: usrlibssl
          readOnly: true
        - mountPath: /usr/local/openssl
          name: usrlocalopenssl
          readOnly: true
        - mountPath: /etc/openssl
          name: etcopenssl
          readOnly: true
        - mountPath: /etc/pki/tls
          name: etcpkitls
          readOnly: true
      volumes:
      - hostPath:
          path: /srv/kubernetes
        name: srvkube
      - hostPath:
          path: /var/log/kube-apiserver.log
        name: logfile
      - hostPath:
          path: /etc/ssl
        name: etcssl
      - hostPath:
          path: /usr/share/ssl
        name: usrsharessl
      - hostPath:
          path: /var/ssl
        name: varssl
      - hostPath:
          path: /usr/ssl
        name: usrssl
      - hostPath:
          path: /usr/lib/ssl
        name: usrlibssl
      - hostPath:
          path: /usr/local/openssl
        name: usrlocalopenssl
      - hostPath:
          path: /etc/openssl
        name: etcopenssl
      - hostPath:
          path: /etc/pki/tls
        name: etcpkitls
    • controller-manager
    apiVersion: v1
    kind: Pod
    metadata:
      name: kube-controller-manager
    spec:
      containers:
      - command:
        - /bin/sh
        - -c
        - /usr/local/bin/kube-controller-manager --master=127.0.0.1:8080 --cluster-name=e2e-test-bburns
          --cluster-cidr=10.245.0.0/16 --allocate-node-cidrs=true --cloud-provider=gce  --service-account-private-key-file=/srv/kubernetes/server.key
          --v=2 --leader-elect=true 1>>/var/log/kube-controller-manager.log 2>&1
        image: gcr.io/google_containers/kube-controller-manager:fda24638d51a48baa13c35337fcd4793
        livenessProbe:
          httpGet:
            path: /healthz
            port: 10252
          initialDelaySeconds: 15
          timeoutSeconds: 1
        name: kube-controller-manager
        volumeMounts:
        - mountPath: /srv/kubernetes
          name: srvkube
          readOnly: true
        - mountPath: /var/log/kube-controller-manager.log
          name: logfile
        - mountPath: /etc/ssl
          name: etcssl
          readOnly: true
        - mountPath: /usr/share/ssl
          name: usrsharessl
          readOnly: true
        - mountPath: /var/ssl
          name: varssl
          readOnly: true
        - mountPath: /usr/ssl
          name: usrssl
          readOnly: true
        - mountPath: /usr/lib/ssl
          name: usrlibssl
          readOnly: true
        - mountPath: /usr/local/openssl
          name: usrlocalopenssl
          readOnly: true
        - mountPath: /etc/openssl
          name: etcopenssl
          readOnly: true
        - mountPath: /etc/pki/tls
          name: etcpkitls
          readOnly: true
      hostNetwork: true
      volumes:
      - hostPath:
          path: /srv/kubernetes
        name: srvkube
      - hostPath:
          path: /var/log/kube-controller-manager.log
        name: logfile
      - hostPath:
          path: /etc/ssl
        name: etcssl
      - hostPath:
          path: /usr/share/ssl
        name: usrsharessl
      - hostPath:
          path: /var/ssl
        name: varssl
      - hostPath:
          path: /usr/ssl
        name: usrssl
      - hostPath:
          path: /usr/lib/ssl
        name: usrlibssl
      - hostPath:
          path: /usr/local/openssl
        name: usrlocalopenssl
      - hostPath:
          path: /etc/openssl
        name: etcopenssl
      - hostPath:
          path: /etc/pki/tls
        name: etcpkitls
    • scheduler
    apiVersion: v1
    kind: Pod
    metadata:
      name: kube-scheduler
    spec:
      hostNetwork: true
      containers:
      - name: kube-scheduler
        image: gcr.io/google_containers/kube-scheduler:34d0b8f8b31e27937327961528739bc9
        command:
        - /bin/sh
        - -c
        - /usr/local/bin/kube-scheduler --master=127.0.0.1:8080 --v=2 --leader-elect=true 1>>/var/log/kube-scheduler.log
          2>&1
        livenessProbe:
          httpGet:
            path: /healthz
            port: 10251
          initialDelaySeconds: 15
          timeoutSeconds: 1
        volumeMounts:
        - mountPath: /var/log/kube-scheduler.log
          name: logfile
      volumes:
      - hostPath:
          path: /var/log/kube-scheduler.log
        name: logfile
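The controller-manager and scheduler manifests both pass `--leader-elect=true`. As a rough illustration of the idea behind it, here is a simplified lease-style election: a shared lock record names the current holder, the holder must keep renewing it, and another candidate may take over only once the lease has expired. This is a toy model, not the Kubernetes implementation; the class, function, and the 15-second lease value are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderLock:
    """Shared lock record, standing in for state kept behind the apiserver."""
    holder: Optional[str] = None
    renew_time: float = 0.0
    lease_duration: float = 15.0  # seconds; illustrative value


def try_acquire_or_renew(lock: LeaderLock, candidate: str, now: float) -> bool:
    """Return True if `candidate` is the leader after this attempt."""
    expired = now - lock.renew_time > lock.lease_duration
    if lock.holder is None or lock.holder == candidate or expired:
        # Free lock, own lock, or stale lock: take or refresh the lease.
        lock.holder = candidate
        lock.renew_time = now
        return True
    # Someone else holds a lease that is still valid.
    return False


if __name__ == "__main__":
    lock = LeaderLock()
    print(try_acquire_or_renew(lock, "master-1", 0.0))   # master-1 takes the lock
    print(try_acquire_or_renew(lock, "master-2", 5.0))   # lease still valid
    print(try_acquire_or_renew(lock, "master-2", 16.0))  # master-1 stopped renewing
```

Only the current holder does work; the other two instances keep retrying, which is why a failed leader is replaced after roughly one lease period rather than instantly.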
    • etcd
    apiVersion: v1
    kind: Pod
    metadata:
      name: etcd-server
    spec:
      hostNetwork: true
      containers:
      - image: gcr.io/google_containers/etcd:2.0.9
        name: etcd-container
        command:
        - /usr/local/bin/etcd
        - --name
        - $NODE_NAME
        - --initial-advertise-peer-urls
        - http://$NODE_IP:2380
        - --listen-peer-urls
        - http://$NODE_IP:2380
        - --advertise-client-urls
        - http://$NODE_IP:4001
        - --listen-client-urls
        - http://127.0.0.1:4001
        - --data-dir
        - /var/etcd/data
        - --discovery
        - $DISCOVERY_TOKEN
        ports:
        - containerPort: 2380
          hostPort: 2380
          name: serverport
        - containerPort: 4001
          hostPort: 4001
          name: clientport
        volumeMounts:
        - mountPath: /var/etcd
          name: varetcd
        - mountPath: /etc/ssl
          name: etcssl
          readOnly: true
        - mountPath: /usr/share/ssl
          name: usrsharessl
          readOnly: true
        - mountPath: /var/ssl
          name: varssl
          readOnly: true
        - mountPath: /usr/ssl
          name: usrssl
          readOnly: true
        - mountPath: /usr/lib/ssl
          name: usrlibssl
          readOnly: true
        - mountPath: /usr/local/openssl
          name: usrlocalopenssl
          readOnly: true
        - mountPath: /etc/openssl
          name: etcopenssl
          readOnly: true
        - mountPath: /etc/pki/tls
          name: etcpkitls
          readOnly: true
      volumes:
      - hostPath:
          path: /var/etcd/data
        name: varetcd
      - hostPath:
          path: /etc/ssl
        name: etcssl
      - hostPath:
          path: /usr/share/ssl
        name: usrsharessl
      - hostPath:
          path: /var/ssl
        name: varssl
      - hostPath:
          path: /usr/ssl
        name: usrssl
      - hostPath:
          path: /usr/lib/ssl
        name: usrlibssl
      - hostPath:
          path: /usr/local/openssl
        name: usrlocalopenssl
      - hostPath:
          path: /etc/openssl
        name: etcopenssl
      - hostPath:
          path: /etc/pki/tls
        name: etcpkitls
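The design runs three etcd members and relies on Raft for consistency. The arithmetic behind choosing three can be sketched as follows: Raft needs a majority of members to agree, so the cluster keeps working as long as a majority survives. The function names are hypothetical.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still available."""
    return members - quorum(members)


if __name__ == "__main__":
    for n in (1, 2, 3, 4, 5):
        print(n, quorum(n), fault_tolerance(n))
```

With three members the quorum is two, so one Master node can be lost. A fourth member raises the quorum to three without tolerating any additional failure, which is why odd cluster sizes are the usual choice.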
