k8s hpa can't get the cpu information [closed]
【Posted】: 2020-04-03 08:56:48
【Question】: I set up an HPA using the command
sudo kubectl autoscale deployment e7-build-64 --cpu-percent=50 --min=1 --max=2 -n k8s-demo
sudo kubectl get hpa -n k8s-demo
NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
e7-build-64 Deployment/e7-build-64 <unknown>/50% 1 2 1 15m
sudo kubectl describe hpa e7-build-64 -n k8s-demo
Name: e7-build-64
Namespace: k8s-demo
Labels: <none>
Annotations: <none>
CreationTimestamp: Tue, 10 Dec 2019 15:34:24 +0800
Reference: Deployment/e7-build-64
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 50%
Min replicas: 1
Max replicas: 2
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedComputeMetricsReplicas 13m (x12 over 16m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedGetResourceMetric 74s (x61 over 16m) horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
In deployment.yaml I added resource requests and limits:
resources:
  limits:
    memory: "16Gi"
    cpu: "4000m"
  requests:
    memory: "4Gi"
    cpu: "2000m"
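The describe output above reports CPU "as a percentage of request", so the 2000m request is the denominator for the HPA target. A minimal sketch of that arithmetic, assuming the standard scaling rule desired = ceil(current replicas × current utilization / target utilization), clamped to the min/max replica bounds. The function names and the 1500m usage figure are hypothetical, and the controller's tolerance band and stabilization behaviour are deliberately left out:

```python
import math


def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU usage expressed as a percentage of the pod's CPU request."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(current_replicas: int, current_util: float, target_util: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified HPA rule: scale proportionally, round up, clamp to bounds."""
    raw = math.ceil(current_replicas * current_util / target_util)
    return max(min_replicas, min(max_replicas, raw))


# 2000m is the request from deployment.yaml; 1500m of usage is a made-up sample.
util = cpu_utilization_percent(1500, 2000)
print(util)                                # 75.0
print(desired_replicas(1, util, 50, 1, 2)) # 2
```

Without a usage value from the metrics API there is nothing to put in the numerator, which is why the TARGETS column shows `<unknown>` and no replica count can be computed.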
kubectl version
Client Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:18:23Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"16", GitVersion:"v1.16.2", GitCommit:"c97fe5036ef3df2967d086711e6c0c405941e14b", GitTreeState:"clean", BuildDate:"2019-10-15T19:09:08Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}
Then I tried setting up the HPA with YAML:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: hpa-e7-build-64
  namespace: k8s-demo
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: e7-build-64
  minReplicas: 1
  maxReplicas: 2
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 10
It still reports errors:
sudo kubectl describe hpa hpa-e7-build-64 -n k8s-demo
Name: hpa-e7-build-64
Namespace: k8s-demo
Labels: <none>
Annotations: kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"autoscaling/v2beta2","kind":"HorizontalPodAutoscaler","metadata":{"annotations":{},"name":"hpa-e7-build-64","namespace":"k8...
CreationTimestamp: Tue, 10 Dec 2019 14:24:07 +0800
Reference: Deployment/e7-build-64
Metrics: ( current / target )
resource cpu on pods (as a percentage of request): <unknown> / 10%
Min replicas: 1
Max replicas: 2
Deployment pods: 1 current / 0 desired
Conditions:
Type Status Reason Message
---- ------ ------ -------
AbleToScale True SucceededGetScale the HPA controller was able to get the target's current scale
ScalingActive False FailedGetResourceMetric the HPA was unable to compute the replica count: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedGetResourceMetric 59m (x141 over 94m) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
Warning FailedGetResourceMetric 54m (x2 over 54m) horizontal-pod-autoscaler unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server is currently unable to handle the request (get pods.metrics.k8s.io)
Warning FailedComputeMetricsReplicas 39m (x58 over 53m) horizontal-pod-autoscaler invalid metrics (1 invalid out of 1), first error is: failed to get cpu utilization: unable to get metrics for resource cpu: no metrics returned from resource metrics API
Warning FailedGetResourceMetric 4m29s (x197 over 53m) horizontal-pod-autoscaler unable to get metrics for resource cpu: no metrics returned from resource metrics API
I have already run the following commands:
git clone https://github.com/kubernetes-incubator/metrics-server.git
cd metrics-server/deploy
sudo kubectl create -f 1.8+/
Does anyone know how to solve this?
Update:
sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
{"kind":"PodMetricsList","apiVersion":"metrics.k8s.io/v1beta1","metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"},"items":[]}
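The response above is a well-formed PodMetricsList, but its `items` array is empty: the metrics API is registered and answering, yet it has no samples for any pod in the namespace. That matches the HPA message "no metrics returned from resource metrics API". A small sketch of the check, assuming the raw response has been captured as a string; `has_pod_metrics` is a hypothetical helper, not part of kubectl or any client library:

```python
import json


def has_pod_metrics(raw_response: str) -> bool:
    """Return True if a PodMetricsList response contains at least one pod entry."""
    doc = json.loads(raw_response)
    return len(doc.get("items", [])) > 0


# The response from the update above, with its JSON braces intact.
empty = (
    '{"kind":"PodMetricsList","apiVersion":"metrics.k8s.io/v1beta1",'
    '"metadata":{"selfLink":"/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"},'
    '"items":[]}'
)
print(has_pod_metrics(empty))  # False
```

An empty list with a Running metrics-server pod points at the scrape step (metrics-server to kubelet) rather than at the HPA object itself, so the metrics-server logs are the next place to look.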
And the pod information:
sudo kubectl describe pod metrics-server-795b774c76-fs8hw -n kube-system
Name: metrics-server-795b774c76-fs8hw
Namespace: kube-system
Priority: 0
Node: nandoc-95/192.168.33.225
Start Time: Tue, 10 Dec 2019 15:04:14 +0800
Labels: k8s-app=metrics-server
pod-template-hash=795b774c76
Annotations: cni.projectcalico.org/podIP: 10.0.229.135/32
Status: Running
IP: 10.0.229.135
IPs:
IP: 10.0.229.135
Controlled By: ReplicaSet/metrics-server-795b774c76
Containers:
metrics-server:
Container ID: docker://2c6dd8c50938bc9ab536c78b73773aa7a9eedd60a6974805beec58e8ee9fde3c
Image: k8s.gcr.io/metrics-server-amd64:v0.3.6
Image ID: docker-pullable://k8s.gcr.io/metrics-server-amd64@sha256:c9c4e95068b51d6b33a9dccc61875df07dc650abbf4ac1a19d58b4628f89288b
Port: 4443/TCP
Host Port: 0/TCP
Args:
--cert-dir=/tmp
--secure-port=4443
State: Running
Started: Tue, 10 Dec 2019 15:05:13 +0800
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/tmp from tmp-dir (rw)
/var/run/secrets/kubernetes.io/serviceaccount from metrics-server-token-xjgpx (ro)
Conditions:
Type Status
Initialized True
Ready True
ContainersReady True
PodScheduled True
Volumes:
tmp-dir:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
metrics-server-token-xjgpx:
Type: Secret (a volume populated by a Secret)
SecretName: metrics-server-token-xjgpx
Optional: false
QoS Class: BestEffort
Node-Selectors: beta.kubernetes.io/os=linux
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events: <none>
sudo kubectl get pods --all-namespaces -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
k8s-demo k8s-pod-e7-build-32-7bb5bc7c6-s2zsr 1/1 Running 0 32m 10.0.100.198 nandoc-94 <none> <none>
k8s-demo k8s-pod-e7-build-64-d5c659d6b-5hv6m 1/1 Running 0 31m 10.0.229.137 nandoc-95 <none> <none>
kube-system calico-kube-controllers-55754f75c-82np8 1/1 Running 0 5d 10.0.126.1 nandoc-93 <none> <none>
kube-system calico-node-2dxmp 1/1 Running 0 2d5h 192.168.33.225 nandoc-95 <none> <none>
kube-system calico-node-7ms8t 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system calico-node-hdw25 1/1 Running 1 21d 192.168.33.224 nandoc-94 <none> <none>
kube-system calico-node-j4jv4 0/1 Running 0 27d 192.168.37.173 cyuan-k8s-node1 <none> <none>
kube-system calicoctl 1/1 Running 0 6d 192.168.33.224 nandoc-94 <none> <none>
kube-system coredns-5644d7b6d9-n9z5m 1/1 Running 0 5d 10.0.126.2 nandoc-93 <none> <none>
kube-system coredns-5644d7b6d9-txcm4 1/1 Running 0 5d 10.0.100.194 nandoc-94 <none> <none>
kube-system etcd-nandoc-93 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system kube-apiserver-nandoc-93 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system kube-controller-manager-nandoc-93 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system kube-proxy-5jlfc 1/1 Running 0 27d 192.168.37.173 cyuan-k8s-node1 <none> <none>
kube-system kube-proxy-7t7b7 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system kube-proxy-j5b4c 1/1 Running 0 2d5h 192.168.33.225 nandoc-95 <none> <none>
kube-system kube-proxy-jj256 1/1 Running 1 21d 192.168.33.224 nandoc-94 <none> <none>
kube-system kube-scheduler-nandoc-93 1/1 Running 0 28d 192.168.33.223 nandoc-93 <none> <none>
kube-system metrics-server-795b774c76-fs8hw 1/1 Running 0 24h 10.0.229.135 nandoc-95 <none> <none>
kubernetes-dashboard dashboard-metrics-scraper-76585494d8-wqgks 1/1 Running 0 5d 10.0.126.3 nandoc-93 <none> <none>
kubernetes-dashboard kubernetes-dashboard-b65488c4-qh95m 1/1 Running 0 5d 10.0.126.4 nandoc-93 <none> <none>
sudo kubectl get hpa --all-namespaces -o wide
NAMESPACE NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
k8s-demo hpa-e7-build-32 Deployment/k8s-pod-e7-build-32 <unknown>/10% 1 2 1 85s
k8s-demo hpa-e7-build-64 Deployment/k8s-pod-e7-build-64 <unknown>/10% 1 2 1 79s
k8s-demo k8s-pod-e7-build-64 Deployment/k8s-pod-e7-build-64 <unknown>/50% 1 2 1 16s
I updated the pod names today and recreated the HPAs, adding the prefix k8s-pod-, so the output differs from before.
【Comments】:
What does the following return? kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
Check whether your metrics-server pod is running properly, and give us its kubectl describe pod output.
What is the output of kubectl get pods -n k8s-demo?
【Answer 1】:
For Kubernetes 1.18 and Metrics Server v0.3.7, edit the metrics-server Deployment so that the container args include the following:
args:
  - --kubelet-insecure-tls
  - --kubelet-preferred-address-types=InternalIP
  - --cert-dir=/tmp
  - --secure-port=4443
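One way to apply such a flag without hand-editing the manifest is a JSON patch that appends to the container's existing `args` array. The sketch below only builds the patch document and prints a `kubectl patch` command line; it does not touch a cluster. The Deployment name `metrics-server`, the namespace `kube-system`, and container index 0 are assumptions inferred from the pod description in the question, and both helper functions are hypothetical. The `/-` suffix in the path is the JSON Patch convention for "append to array", and it requires that `args` already exists on the container.

```python
import json
import shlex


def append_arg_patch(flag: str, container_index: int = 0) -> list:
    """Build a JSON Patch that appends one flag to a container's args array."""
    path = f"/spec/template/spec/containers/{container_index}/args/-"
    return [{"op": "add", "path": path, "value": flag}]


def kubectl_patch_command(deployment: str, namespace: str, patch: list) -> str:
    """Render a shell-safe kubectl patch command line for the given patch."""
    parts = ["kubectl", "patch", "deployment", deployment,
             "-n", namespace, "--type=json", "-p", json.dumps(patch)]
    return " ".join(shlex.quote(p) for p in parts)


patch = append_arg_patch("--kubelet-insecure-tls")
print(kubectl_patch_command("metrics-server", "kube-system", patch))
```

Note that `--kubelet-insecure-tls` disables verification of the kubelet serving certificates, so it is usually treated as a workaround for test clusters rather than a production setting.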
【Comments】:
Adding --kubelet-insecure-tls=true as a workaround solved the problem.
【Answer 2】:
Thanks, weibeld and EAT_Py. I have solved the problem. The debugging process:
sudo kubectl get --raw "/apis/metrics.k8s.io/v1beta1/namespaces/k8s-demo/pods"
sudo kubectl -n kube-system logs metrics-server-795b774c76-t2rj7
sudo kubectl top node nandoc-94 -->can't get info
sudo kubectl top pod k8s-pod-e7-build-32-7bb5bc7c6-s2zsr -->can't get info
The metrics-server log contains some error messages:
kubelet_summary:nandoc-93: unable to fetch metrics from Kubelet nandoc-93 (nandoc-93): Get https://nandoc-93:10250/stats/summary?only_cpu_and_memory=true: x509: certificate signed by unknown authority]
Then, following https://github.com/kubernetes-sigs/metrics-server/issues/146, I edited metrics-server/deploy/1.8+/metrics-server-deployment.yaml and added the command:
- name: metrics-server
  image: k8s.gcr.io/metrics-server-amd64:v0.3.6
  command:
    - /metrics-server
    - --kubelet-insecure-tls
kubectl apply -f metrics-server-deployment.yaml
After that, kubectl top pod works, and the HPA works now. Thanks again.
sudo kubectl get hpa --all-namespaces
NAMESPACE NAME REFERENCE TARGETS MINPODS MAXPODS REPLICAS AGE
k8s-demo hpa-e7-build-32 Deployment/k8s-pod-e7-build-32 0%/10% 1 2 1 19h
k8s-demo hpa-e7-build-64 Deployment/k8s-pod-e7-build-64 0%/10% 1 2 1 19h
【Comments】:
# deployments "kube-state-metrics" was not valid: # * : Invalid value: "The modified file failed validation": [couldn't get version/kind; json parse error: invalid character 'a' looking for beginning of value, [invalid character 'a' looking for beginning of value, invalid character 'a' looking for beginning of value]] # when I add spec: containers: - image: quay.io/coreos/kube-state-metrics:v1.9.5 command: - /metrics-server - --kubelet-insecure-tls