Kubernetes Cluster Installation Guide: Deploying the Master Component kube-controller-manager


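The kube-controller-manager cluster has 3 nodes. After startup, the nodes compete in a leader election to produce one leader, and the other nodes block in standby. When the leader becomes unavailable, the remaining nodes hold a new election and produce a new leader, which keeps the service highly available.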

1 Installation preparation

Note: all operations in this guide are executed on the devops machine.

1.1 Environment variable definitions

#################### Variable parameter setting ######################
KUBE_NAME=kube-controller-manager
K8S_INSTALL_PATH=/data/apps/k8s/kubernetes
K8S_BIN_PATH=$K8S_INSTALL_PATH/sbin
K8S_LOG_DIR=$K8S_INSTALL_PATH/logs
K8S_CONF_PATH=/etc/k8s/kubernetes
KUBE_CONFIG_PATH=/etc/k8s/kubeconfig
CA_DIR=/etc/k8s/ssl
SOFTWARE=/root/software
VERSION=v1.14.2
PACKAGE="kubernetes-server-$VERSION-linux-amd64.tar.gz"
DOWNLOAD_URL="https://github.com/devops-apps/download/raw/master/kubernetes/$PACKAGE"
ETCD_ENDPOINTS=https://10.10.10.22:2379,https://10.10.10.23:2379,https://10.10.10.24:2379
ETH_INTERFACE=eth1
LISTEN_IP=$(ifconfig | grep -A 1 $ETH_INTERFACE | grep inet | awk '{print $2}')
USER=k8s
SERVICE_CIDR=10.254.0.0/22
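
The LISTEN_IP line above depends on the output format of ifconfig, which differs between net-tools versions. As an alternative, here is a minimal sketch that parses the one-line output of the iproute2 ip command instead. The helper name extract_ipv4 and the sample line are illustrative assumptions, not part of the original guide.

```shell
#!/bin/sh
# extract_ipv4: hypothetical helper. Reads `ip -4 -o addr show` output on
# stdin and prints the first address without its prefix length.
extract_ipv4() {
    # Field 4 of the one-line format is ADDRESS/PREFIX, e.g. 10.10.10.22/24.
    awk '{ split($4, a, "/"); print a[1]; exit }'
}

# Sample line in the format printed by `ip -4 -o addr show dev eth1`
# (illustrative data only).
sample='3: eth1    inet 10.10.10.22/24 brd 10.10.10.255 scope global eth1'
echo "$sample" | extract_ipv4

# On a real host (assumes the iproute2 `ip` command is installed):
# LISTEN_IP=$(ip -4 -o addr show dev "$ETH_INTERFACE" | extract_ipv4)
```

Whichever method is used, it is worth echoing $LISTEN_IP before generating the unit file, because an empty value would produce an invalid --bind-address flag.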

1.2 Download and distribute the Kubernetes binaries

Download a stable release package from the official Kubernetes GitHub location to the local machine:

wget  $DOWNLOAD_URL -P $SOFTWARE

Distribute the Kubernetes package to each master node:

sudo ansible master_k8s_vgs -m copy -a "src=$SOFTWARE/$PACKAGE dest=$SOFTWARE/" -b

2 Deploy the kube-controller-manager cluster

2.1 Install the kube-controller-manager binary

### 1.Check if the install directory exists.
if [ ! -d "$K8S_BIN_PATH" ]; then
     mkdir -p $K8S_BIN_PATH
fi

if [ ! -d "$K8S_LOG_DIR/$KUBE_NAME" ]; then
     mkdir -p $K8S_LOG_DIR/$KUBE_NAME
fi

if [ ! -d "$K8S_CONF_PATH" ]; then
     mkdir -p $K8S_CONF_PATH
fi

if [ ! -d "$KUBE_CONFIG_PATH" ]; then
     mkdir -p $KUBE_CONFIG_PATH
fi

### 2.Install kube-controller-manager binary of kubernetes.
if [ ! -f "$SOFTWARE/kubernetes-server-$VERSION-linux-amd64.tar.gz" ]; then
     wget $DOWNLOAD_URL -P $SOFTWARE >>/tmp/install.log  2>&1
fi
cd $SOFTWARE && tar -xzf kubernetes-server-$VERSION-linux-amd64.tar.gz -C ./
cp -fp kubernetes/server/bin/$KUBE_NAME $K8S_BIN_PATH
ln -sf $K8S_BIN_PATH/$KUBE_NAME /usr/local/bin
chown -R $USER:$USER $K8S_INSTALL_PATH
chmod -R 755 $K8S_INSTALL_PATH
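
The four directory checks at the top of the script above can be collapsed, since mkdir -p already succeeds when the directory exists. The sketch below shows that idea with a hypothetical helper named ensure_dirs, demonstrated against a temporary directory rather than the real install paths.

```shell
#!/bin/sh
# ensure_dirs: hypothetical helper that creates every directory passed to it.
# `mkdir -p` is idempotent, so no prior existence test is needed.
ensure_dirs() {
    for dir in "$@"; do
        mkdir -p "$dir" || return 1
    done
}

# Demonstration against a throwaway directory instead of the real paths.
demo_root=$(mktemp -d)
ensure_dirs "$demo_root/sbin" "$demo_root/logs/kube-controller-manager"
ensure_dirs "$demo_root/sbin"    # second call is a harmless no-op
ls "$demo_root"
```

In the install script this would become a single call such as ensure_dirs "$K8S_BIN_PATH" "$K8S_LOG_DIR/$KUBE_NAME" "$K8S_CONF_PATH" "$KUBE_CONFIG_PATH".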

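2.2 Distribute kubeconfig files and certificates

Distribute the certificates:

sudo ansible master_k8s_vgs -m synchronize -a "src=$CA_DIR/kube-controller-manager* dest=$CA_DIR/ mode=push delete=yes rsync_opts=-avz" -b

Distribute the kubeconfig authentication file:

kube-controller-manager uses a kubeconfig file to connect to the apiserver service. The file provides the apiserver address, the embedded CA certificate, and the kube-controller-manager certificate:

sudo ansible master_k8s_vgs -m synchronize -a "src=$KUBE_CONFIG_PATH/ dest=$KUBE_CONFIG_PATH/ mode=push delete=yes rsync_opts=-avz" -b

Note: if the kubeconfig and certificate files of all components were already synchronized in an earlier section, this step can be skipped.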

2.3 Create the kube-controller-manager systemd service

cat >/usr/lib/systemd/system/$KUBE_NAME.service<<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
[Service]
User=$USER
WorkingDirectory=$K8S_INSTALL_PATH
ExecStart=$K8S_BIN_PATH/$KUBE_NAME \
  --port=10252 \
  --secure-port=10257 \
  --bind-address=$LISTEN_IP \
  --address=127.0.0.1 \
  --kubeconfig=$KUBE_CONFIG_PATH/$KUBE_NAME.kubeconfig \
  --authentication-kubeconfig=$KUBE_CONFIG_PATH/$KUBE_NAME.kubeconfig \
  --authorization-kubeconfig=$KUBE_CONFIG_PATH/$KUBE_NAME.kubeconfig \
  --client-ca-file=$CA_DIR/ca.pem \
  --service-cluster-ip-range=$SERVICE_CIDR \
  --cluster-name=kubernetes \
  --cluster-signing-cert-file=$CA_DIR/ca.pem \
  --cluster-signing-key-file=$CA_DIR/ca-key.pem \
  --root-ca-file=$CA_DIR/ca.pem \
  --service-account-private-key-file=$CA_DIR/ca-key.pem \
  --leader-elect=true \
  --feature-gates=RotateKubeletServerCertificate=true \
  --horizontal-pod-autoscaler-use-rest-clients=true \
  --horizontal-pod-autoscaler-sync-period=10s \
  --concurrent-service-syncs=2 \
  --kube-api-qps=1000 \
  --kube-api-burst=2000 \
  --concurrent-gc-syncs=30 \
  --concurrent-deployment-syncs=10 \
  --terminated-pod-gc-threshold=10000 \
  --controllers=*,bootstrapsigner,tokencleaner \
  --requestheader-allowed-names="" \
  --requestheader-client-ca-file=$CA_DIR/ca.pem \
  --requestheader-extra-headers-prefix="X-Remote-Extra-" \
  --requestheader-group-headers=X-Remote-Group \
  --requestheader-username-headers=X-Remote-User \
  --tls-cert-file=$CA_DIR/kube-controller-manager.pem \
  --tls-private-key-file=$CA_DIR/kube-controller-manager-key.pem \
  --use-service-account-credentials=true \
  --alsologtostderr=true \
  --logtostderr=false \
  --log-dir=$K8S_LOG_DIR/$KUBE_NAME \
  --flex-volume-plugin-dir=$K8S_INSTALL_PATH/libexec/kubernetes \
  --v=2
Restart=on-failure
RestartSec=5
[Install]
WantedBy=multi-user.target
EOF
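
Notes on the flags:

  • --port=10252, --address=127.0.0.1: listen for insecure (http) requests on 127.0.0.1:10252. Setting --port=0 instead disables the insecure port, in which case --address has no effect and only --bind-address applies;
  • --secure-port=10257, --bind-address=$LISTEN_IP: listen for https /metrics requests on port 10257 of the node address;
  • --kubeconfig: path to the kubeconfig file that kube-controller-manager uses to connect to and authenticate with kube-apiserver;
  • --authentication-kubeconfig and --authorization-kubeconfig: kube-controller-manager uses these to connect to the apiserver and to authenticate and authorize client requests. kube-controller-manager no longer uses --tls-ca-file to verify the client certificates of https metrics requests. If these two kubeconfig flags are not set, client requests to the kube-controller-manager https port are rejected (insufficient permissions);
  • --cluster-signing-*-file: used to sign the certificates created by TLS Bootstrap;
  • --experimental-cluster-signing-duration: validity period of TLS Bootstrap certificates (not set in the unit file above);
  • --root-ca-file: the CA certificate placed into each container's ServiceAccount, used to verify the kube-apiserver certificate;
  • --service-account-private-key-file: the private key that signs ServiceAccount tokens; it must pair with the public key file given to kube-apiserver through --service-account-key-file;
  • --service-cluster-ip-range: the Service Cluster IP range; it must match the flag of the same name on kube-apiserver;
  • --leader-elect=true: cluster mode with leader election enabled; the node elected leader does the work while the other nodes block;
  • --controllers=*,bootstrapsigner,tokencleaner: the list of enabled controllers; tokencleaner automatically removes expired Bootstrap tokens;
  • --horizontal-pod-autoscaler-*: custom metrics parameters, supporting autoscaling/v2alpha1;
  • --tls-cert-file, --tls-private-key-file: the server certificate and private key used when serving metrics over https;
  • --use-service-account-credentials=true: each controller inside kube-controller-manager accesses kube-apiserver with its own ServiceAccount.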
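
The guide writes the unit file but does not show the step that loads and starts it. The following is a minimal sketch of that step, assuming a standard systemd host. The DRY_RUN variable and the run and start_service helpers are hypothetical conveniences added for illustration; with DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch: reload systemd and start the service after writing the unit file.
DRY_RUN=${DRY_RUN:-1}
KUBE_NAME=kube-controller-manager

# run: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

start_service() {
    run systemctl daemon-reload         # pick up the new unit file
    run systemctl enable "$KUBE_NAME"   # start automatically at boot
    run systemctl restart "$KUBE_NAME"  # start, or restart with new flags
}

start_service
```

To apply it for real, run the script as root with DRY_RUN=0 on each master node that has the unit file.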

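2.4 Check the service status

kube-controller-manager listens on ports 10252 and 10257. Both ports serve /metrics and /healthz.

  • 10252: accepts http requests; insecure port with no authentication or authorization, so for safety it should listen on 127.0.0.1;
  • 10257: accepts https requests; secure port that requires authentication and authorization, so it may listen on any address.

sudo netstat -ntlp | grep kube-con
tcp  0      0 127.0.0.1:10252         0.0.0.0:*      LISTEN      2450/kube-controlle
tcp  0      0 10.10.10.22:10257       0.0.0.0:*      LISTEN      2450/kube-controlle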
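
The port check can also be scripted. The sketch below defines a hypothetical helper, has_listener, that scans netstat -ntl style text for a LISTEN socket on a given port. It runs here against the two sample lines from the output above, so it does not need a live service.

```shell
#!/bin/sh
# has_listener: hypothetical helper. Succeeds if the netstat text on stdin
# contains a LISTEN socket whose local address ends in :PORT.
has_listener() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Sample lines copied from the netstat output in this section.
sample='tcp  0      0 127.0.0.1:10252         0.0.0.0:*      LISTEN      2450/kube-controlle
tcp  0      0 10.10.10.22:10257       0.0.0.0:*      LISTEN      2450/kube-controlle'

for port in 10252 10257; do
    if echo "$sample" | has_listener "$port"; then
        echo "port $port: listening"
    else
        echo "port $port: NOT listening"
    fi
done
```

On a master node the sample would be replaced by live output, for example sudo netstat -ntlp | has_listener 10257.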

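Note: many installation guides disable the insecure port and change the secure port to 10250, which makes the cluster status check report the errors shown below. When kubectl get cs runs, kube-apiserver sends its requests to 127.0.0.1 by default. If controller-manager or scheduler runs in cluster mode, it may not be on the same machine as kube-apiserver, or it may only be reachable over https. It is then reported as Unhealthy even though it is working normally and the cluster is fine: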

kubectl get componentstatuses
NAME                 STATUS      MESSAGE                                                 ERROR
controller-manager   Unhealthy   dial tcp 127.0.0.1:10252: connect: connection refused
scheduler            Unhealthy   dial tcp 127.0.0.1:10251: connect: connection refused
etcd-0               Healthy     {"health":"true"}
etcd-2               Healthy     {"health":"true"}
etcd-1               Healthy     {"health":"true"}

The normal output should be:

NAME                 STATUS    MESSAGE             ERROR
scheduler            Healthy   ok
controller-manager   Healthy   ok
etcd-2               Healthy   {"health":"true"}
etcd-1               Healthy   {"health":"true"}
etcd-0               Healthy   {"health":"true"}

Check whether the service is running:

systemctl status kube-controller-manager|grep Active

Make sure the state is active (running). Otherwise, check the logs to find the cause:

sudo journalctl -u kube-controller-manager

2.5 View the exported metrics

Note: run the following commands on a kube-controller-manager node.

Access over https:

curl -s --cacert /opt/k8s/work/ca.pem --cert /opt/k8s/work/admin.pem --key /opt/k8s/work/admin-key.pem https://10.10.10.22:10257/metrics | head

Access over http:

curl -s http://127.0.0.1:10252/metrics | head
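
Both ports also serve /healthz, which is easier to script against than /metrics. The sketch below wraps that check in a hypothetical helper named check_healthz. It assumes curl is installed and treats any response other than the body "ok" as unhealthy.

```shell
#!/bin/sh
# check_healthz: hypothetical helper. Succeeds only if the endpoint answers
# with the body "ok" within 5 seconds.
check_healthz() {
    url="$1"
    body=$(curl -s --max-time 5 "$url" 2>/dev/null) || return 1
    [ "$body" = "ok" ]
}

if check_healthz "http://127.0.0.1:10252/healthz"; then
    echo "kube-controller-manager: healthy"
else
    echo "kube-controller-manager: not healthy or not reachable"
fi
```

For the secure port, the same --cacert, --cert and --key options used in the https curl command above would need to be added to the curl call.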

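2.6 kube-controller-manager permissions

The ClusterRole system:kube-controller-manager has very few permissions. It can only create resource objects such as secrets and serviceaccounts; the permissions of the individual controllers are spread across the ClusterRoles system:controller:XXX:

$ kubectl describe clusterrole system:kube-controller-manager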
Name:         system:kube-controller-manager
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources                  Non-Resource URLs  Resource Names  Verbs
  ---------                  -----------------  --------------  -----
  secrets                                   []   []    [create delete get update]
  endpoints                                 []   []    [create get update]
  serviceaccounts                           []   []    [create get update]
  events                                    []   []    [create patch update]
  tokenreviews.authentication.k8s.io        []   []    [create]
  subjectaccessreviews.authorization.k8s.io []   []    [create]
  configmaps                                []   []    [get]
  namespaces                                []   []    [get]
  *.*                                       []   []    [list watch]

The flag --use-service-account-credentials=true must be added to the kube-controller-manager startup parameters. With it, the main controller creates a ServiceAccount XXX-controller for each controller, and the built-in ClusterRoleBinding system:controller:XXX grants each XXX-controller ServiceAccount the permissions of the matching ClusterRole system:controller:XXX.

$ kubectl get clusterrole|grep controller
system:controller:attachdetach-controller                              17d
system:controller:certificate-controller                               17d
system:controller:clusterrole-aggregation-controller                   17d
system:controller:cronjob-controller                                   17d
system:controller:daemon-set-controller                                17d
system:controller:deployment-controller                                17d
system:controller:disruption-controller                                17d
system:controller:endpoint-controller                                  17d
system:controller:expand-controller                                    17d
system:controller:generic-garbage-collector                            17d
system:controller:horizontal-pod-autoscaler                            17d
system:controller:job-controller                                       17d
system:controller:namespace-controller                                 17d
system:controller:node-controller                                      17d
system:controller:persistent-volume-binder                             17d
system:controller:pod-garbage-collector                                17d
system:controller:pv-protection-controller                             17d
system:controller:pvc-protection-controller                            17d
system:controller:replicaset-controller                                17d
system:controller:replication-controller                               17d
system:controller:resourcequota-controller                             17d
system:controller:route-controller                                     17d
system:controller:service-account-controller                           17d
system:controller:service-controller                                   17d
system:controller:statefulset-controller                               17d
system:controller:ttl-controller                                       17d
system:kube-controller-manager                                         17d

Taking the deployment controller as an example:

$ kubectl describe clusterrole system:controller:deployment-controller
Name:         system:controller:deployment-controller
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate=true
PolicyRule:
  Resources                        Non-Resource URLs  Resource Names  Verbs
  ---------                        -----------------  --------------  -----
  replicasets.apps                 []   []  [create delete get list patch update watch]
  replicasets.extensions           []   []  [create delete get list patch update watch]
  events                           []   []  [create patch update]
  pods                             []   []  [get list update watch]
  deployments.apps                 []   []  [get list update watch]
  deployments.extensions           []   []  [get list update watch]
  deployments.apps/finalizers      []   []  [update]
  deployments.apps/status          []   []  [update]
  deployments.extensions/finalizers []  []  [update]
  deployments.extensions/status    []   []  [update]
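
To confirm the ServiceAccount and binding for one controller, the names can be derived from the controller name as described in this section. The sketch below only prints the kubectl commands to run; print_rbac_checks is a hypothetical helper, and the naming pattern is taken from the text above rather than verified against every controller.

```shell
#!/bin/sh
# print_rbac_checks: hypothetical helper that prints the kubectl commands for
# inspecting one controller's ServiceAccount and ClusterRoleBinding.
print_rbac_checks() {
    sa="$1"                             # e.g. deployment-controller
    role="system:controller:$sa"        # matching ClusterRole / binding name
    echo "kubectl -n kube-system get serviceaccount $sa"
    echo "kubectl describe clusterrolebinding $role"
}

print_rbac_checks deployment-controller
```

Running the two printed commands on a machine with cluster-admin access should show the ServiceAccount in kube-system and a binding whose subject is that ServiceAccount.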

2.7 View the current leader

kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml
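
In the YAML returned by this command, the leader is recorded in the control-plane.alpha.kubernetes.io/leader annotation as a JSON string with a holderIdentity field. The sketch below extracts that field with sed. The helper name holder_identity and the sample annotation values are illustrative assumptions, not output from a real cluster.

```shell
#!/bin/sh
# holder_identity: hypothetical helper. Reads the endpoints YAML on stdin and
# prints the holderIdentity value from the leader annotation.
holder_identity() {
    sed -n 's/.*"holderIdentity":"\([^"]*\)".*/\1/p'
}

# Illustrative annotation line; the identity and numbers are made up.
sample=$(cat <<'EOF'
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"master-01_example-id","leaseDurationSeconds":15,"leaderTransitions":3}'
EOF
)
echo "$sample" | holder_identity

# Against a real cluster:
# kubectl get endpoints kube-controller-manager --namespace=kube-system -o yaml | holder_identity
```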

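2.8 Test high availability of the kube-controller-manager cluster

Pick one or two master nodes at random, stop the kube-controller-manager service on them, and check whether another node acquires the leader role.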
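
The failover check can be automated by polling the leader until it changes. The sketch below is one way to do that. wait_for_new_leader and POLL_INTERVAL are hypothetical names, and fake_leader stands in for a real leader lookup so the example runs without a cluster.

```shell
#!/bin/sh
# wait_for_new_leader OLD ATTEMPTS CMD...: hypothetical helper. Runs CMD
# repeatedly until it prints a non-empty leader different from OLD, or until
# ATTEMPTS polls have been made. Prints the new leader on success.
wait_for_new_leader() {
    old="$1"
    attempts="$2"
    shift 2
    i=0
    while [ "$i" -lt "$attempts" ]; do
        current=$("$@")
        if [ -n "$current" ] && [ "$current" != "$old" ]; then
            echo "$current"
            return 0
        fi
        sleep "${POLL_INTERVAL:-1}"
        i=$((i + 1))
    done
    return 1
}

# Stand-in for a real leader lookup (illustrative value only).
fake_leader() { echo "master-02_example-id"; }

wait_for_new_leader "master-01_example-id" 3 fake_leader
```

In a real test, record the current leader, run sudo systemctl stop kube-controller-manager on that node, then call wait_for_new_leader with a command that reads the leader from the kube-controller-manager endpoints object in kube-system. Allow enough attempts for the old lease to expire before concluding that failover did not happen.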

References

On controller permissions and the use-service-account-credentials flag:
https://github.com/kubernetes/kubernetes/issues/48208
kubelet authentication and authorization:
https://kubernetes.io/docs/admin/kubelet-authentication-authorization/#kubelet-authorization
