Rancher Kube API Health Check Failure via Cluster API Build


【Title】Rancher Kube API Health Check Failure via Cluster API Build 【Posted】2019-10-17 16:19:33 【Question】:

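I am trying to create a new cluster using API calls plus Ansible, with AWS as the cloud provider. I have generated the required node templates and started triggering builds.

    When I trigger cluster creation from the UI using the node templates I built, the cluster is created successfully, as expected.

    When I trigger cluster creation through code, most of the cluster is deployed, but the health check fails.

I have tried building through the UI, and it works every time.

I have also tried changing the API call parameters, but none of the changes made a difference.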

      shell: "`curl -s 'https://{{ rancher_server }}/v3/cluster' -H 'content-type: application/json' -H 'Authorization: Bearer {{ racherlogintoken.stdout }}' --data-binary '{\"dockerRootDir\":\"/var/lib/docker\",\"enableNetworkPolicy\":false,\"type\":\"cluster\",\"rancherKubernetesEngineConfig\":{\"addonJobTimeout\":30,\"ignoreDockerVersion\":true,\"kubernetesVersion\":\"v1.11.5-rancher1-1\",\"sshAgentAuth\":false,\"type\":\"rancherKubernetesEngineConfig\",\"authentication\":{\"type\":\"authnConfig\",\"strategy\":\"x509\"},\"network\":{\"type\":\"networkConfig\",\"plugin\":\"calico\"},\"cloudProvider\":{\"awsCloudProvider\":{\"type\":\"/v3/schemas/awsCloudProvider\"},\"name\":\"aws\",\"type\":\"/v3/schemas/cloudProvider\"},\"monitoring\":{\"type\":\"monitoringConfig\",\"provider\":\"metrics-server\"},\"services\":{\"type\":\"rkeConfigServices\",\"kubeApi\":{\"podSecurityPolicy\":false,\"type\":\"kubeAPIService\"},\"etcd\":{\"snapshot\":false,\"type\":\"etcdService\",\"extraArgs\":{\"heartbeat-interva\":500,\"election-timeout\":5000}}}},\"name\":\"{{ mdio_cluster_name }}\"}' --insecure` | jq -r .data[].id"

Errors:

2019/06/01 07:40:28 [ERROR] cluster [c-sgd2w] provisioning: [controlPlane] Failed to bring up Control Plane: Failed to verify healthcheck: Failed to check https://localhost:6443/healthz for service [kube-apiserver] on host [x.x.x.x]: Get https://localhost:6443/healthz: read tcp [::1]:60288->[::1]:6443: read: connection reset by peer, log: I0601 07:40:24.813709       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
2019/06/01 07:40:28 [ERROR] ClusterController c-sgd2w [cluster-provisioner-controller] failed with : [controlPlane] Failed to bring up Control Plane: Failed to verify healthcheck: Failed to check https://localhost:6443/healthz for service [kube-apiserver] on host [x.x.x.x]: Get https://localhost:6443/healthz: read tcp [::1]:60288->[::1]:6443: read: connection reset by peer, log: I0601 07:40:24.813709       1 plugins.go:161] Loaded 6 validating admission controller(s) successfully in the following order: LimitRanger,ServiceAccount,Priority,PersistentVolumeClaimResize,ValidatingAdmissionWebhook,ResourceQuota.
2019/06/01 07:40:30 [INFO] 2019/06/01 07:40:30 http: multiple response.WriteHeader calls
2019/06/01 07:40:40 [INFO] 2019/06/01 07:40:40 http: multiple response.WriteHeader calls
2019/06/01 07:40:50 [INFO] 2019/06/01 07:40:50 http: multiple response.WriteHeader calls


【Answer 1】:

It looks like the "calico" network plugin was causing the problem. After switching to "canal", things got better.
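As a rough illustration of that change, the request body can be built as structured data instead of a hand-escaped string, which makes the plugin swap a one-word edit and avoids quoting mistakes. This is only a sketch: the function name `build_cluster_payload` is hypothetical, the nesting of the fields is an assumption inferred from the `type` values in the question's payload, and the etcd `extraArgs` from the question are left out for brevity.

```python
import json


def build_cluster_payload(name, network_plugin="canal"):
    """Return a JSON string to use as the body of POST /v3/cluster.

    Field names and values are copied from the question; the nesting
    is an assumption inferred from each object's "type" field.
    """
    payload = {
        "dockerRootDir": "/var/lib/docker",
        "enableNetworkPolicy": False,
        "type": "cluster",
        "name": name,
        "rancherKubernetesEngineConfig": {
            "addonJobTimeout": 30,
            "ignoreDockerVersion": True,
            "kubernetesVersion": "v1.11.5-rancher1-1",
            "sshAgentAuth": False,
            "type": "rancherKubernetesEngineConfig",
            "authentication": {"type": "authnConfig", "strategy": "x509"},
            # Plugin swapped from "calico" to "canal", as the answer suggests.
            "network": {"type": "networkConfig", "plugin": network_plugin},
            "cloudProvider": {
                "awsCloudProvider": {"type": "/v3/schemas/awsCloudProvider"},
                "name": "aws",
                "type": "/v3/schemas/cloudProvider",
            },
            "monitoring": {"type": "monitoringConfig", "provider": "metrics-server"},
            "services": {
                "type": "rkeConfigServices",
                "kubeApi": {"podSecurityPolicy": False, "type": "kubeAPIService"},
                "etcd": {"snapshot": False, "type": "etcdService"},
            },
        },
    }
    # json.dumps handles all quoting, so no manual backslash escaping is needed.
    return json.dumps(payload)
```

The returned string could then be passed to `curl --data-binary` in place of the inline JSON, for example after writing it to a file from the playbook. Passing `network_plugin="calico"` reproduces the plugin setting from the question, which may help when comparing the two builds.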

