K8S HPA - Cannot fetch metrics from External metrics API

Posted: 2021-07-12 17:25:32

【Question】:

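I'm trying to get Kafka topic lag into Prometheus and finally into the API server, so that I can use an external-metrics HPA for my application.

I am getting the error "no metrics returned from external metrics API":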

70m         Warning   FailedGetExternalMetric        horizontalpodautoscaler/kafkademo-hpa   unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
66m         Warning   FailedComputeMetricsReplicas   horizontalpodautoscaler/kafkademo-hpa   invalid metrics (1 invalid out of 1), first error is: failed to get external metric kafka_lag_metric_sm0ke: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API

This happens even though I can see the following output when querying the external API:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1 | jq
{
  "kind": "APIResourceList",
  "apiVersion": "v1",
  "groupVersion": "external.metrics.k8s.io/v1beta1",
  "resources": [
    {
      "name": "kafka_lag_metric_sm0ke",
      "singularName": "",
      "namespaced": true,
      "kind": "ExternalMetricValueList",
      "verbs": [
        "get"
      ]
    }
  ]
}


Here's the set-up:

Kafka: v2.7.0, Prometheus: v2.26.0, Prometheus Adapter: v0.8.3

Prometheus Adapter values

rules:
  external:
  - seriesQuery: 'kafka_consumergroup_group_lag{topic="prices"}'
    resources:
      template: <<.Resource>>
    name:
      as: "kafka_lag_metric_sm0ke"
    metricsQuery: 'avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))'

HPA

apiVersion: autoscaling/v2beta1
kind: HorizontalPodAutoscaler
metadata:
  name: kafkademo-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: kafkademo
  minReplicas: 3
  maxReplicas: 12
  metrics:
  - type: External
    external:
      metricName: kafka_lag_metric_sm0ke
      metricSelector:
        matchLabels:
          topic: prices
      targetValue: 5

HPA information

kubectl describe hpa kafkademo-hpa 
Name:                                       kafkademo-hpa
Namespace:                                  default
Labels:                                     <none>
Annotations:                                <none>
CreationTimestamp:                          Sat, 17 Apr 2021 20:01:29 +0300
Reference:                                  Deployment/kafkademo
Metrics:                                    ( current / target )
  "kafka_lag_metric_sm0ke" (target value):  <unknown> / 5
Min replicas:                               3
Max replicas:                               12
Deployment pods:                            3 current / 0 desired
Conditions:
  Type           Status  Reason                   Message
  ----           ------  ------                   -------
  AbleToScale    True    SucceededGetScale        the HPA controller was able to get the target's current scale
  ScalingActive  False   FailedGetExternalMetric  the HPA was unable to compute the replica count: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
Events:
  Type     Reason                        Age                     From                       Message
  ----     ------                        ----                    ----                       -------
  Warning  FailedComputeMetricsReplicas  70m (x335 over 155m)    horizontal-pod-autoscaler  invalid metrics (1 invalid out of 1), first error is: failed to get external metric kafka_lag_metric_sm0ke: unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API
  Warning  FailedGetExternalMetric       2m30s (x366 over 155m)  horizontal-pod-autoscaler  unable to get external metric default/kafka_lag_metric_sm0ke/&LabelSelector{MatchLabels:map[string]string{topic: prices,},MatchExpressions:[]LabelSelectorRequirement{},}: no metrics returned from external metrics API

-- Edit 1

When I query for the default namespace, I get this:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka_lag_metric_sm0ke | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": []
}


I can see that the "items" field is empty. What does this mean?

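What I don't seem to understand is the chain of events that happens behind the scenes.

As far as I know, this is what happens. Is this correct?

- prometheus-adapter queries Prometheus, executes the seriesQuery, computes the metricsQuery and creates "kafka_lag_metric_sm0ke".
- It registers an endpoint with the API server for external metrics.
- The API server periodically updates its stats based on that endpoint.
- The HPA checks "kafka_lag_metric_sm0ke" from the API server and scales according to the supplied values.

I also don't understand the significance of namespaces in all this. I can see that the metric is namespaced. Does that mean there will be one metric per namespace? How does that make sense?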
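
The namespace and label selector that appear in the HPA events (default/kafka_lag_metric_sm0ke/... with topic: prices) correspond to the request sent to the external metrics API. The sketch below builds that request path so the same query can be reproduced with kubectl get --raw. The helper name external_metric_path is hypothetical, and the labelSelector query parameter is assumed from the standard Kubernetes list-options convention.

```python
from urllib.parse import urlencode


def external_metric_path(namespace: str, metric: str, match_labels: dict) -> str:
    """Build the external metrics API path for one namespaced metric.

    Hypothetical helper for illustration; not part of any Kubernetes client.
    """
    base = f"/apis/external.metrics.k8s.io/v1beta1/namespaces/{namespace}/{metric}"
    if not match_labels:
        return base
    # Label selectors are written as comma-separated key=value pairs.
    selector = ",".join(f"{k}={v}" for k, v in sorted(match_labels.items()))
    return f"{base}?{urlencode({'labelSelector': selector})}"


path = external_metric_path("default", "kafka_lag_metric_sm0ke", {"topic": "prices"})
print(path)
# /apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka_lag_metric_sm0ke?labelSelector=topic%3Dprices
```

Passing the printed path to kubectl get --raw should show what the HPA receives for its own namespace and selector, which may differ from an unfiltered query.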

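【Answer 1】:

In the long tradition of answering my own questions right after asking them, here is what was wrong with the configuration above.

The error lies in the prometheus-adapter YAML: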

rules:
  external:
    - seriesQuery: 'kafka_consumergroup_group_lag{topic="prices"}'
      resources:
        template: <<.Resource>>
      name:
        as: "kafka_lag_metric_sm0ke"
      metricsQuery: 'avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))'

I removed <<.LabelMatchers>> and now it works:

kubectl get --raw /apis/external.metrics.k8s.io/v1beta1/namespaces/default/kafka_lag_metric_sm0ke | jq
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "kafka_lag_metric_sm0ke",
      "metricLabels": {
        "topic": "prices"
      },
      "timestamp": "2021-04-21T16:55:18Z",
      "value": "0"
    }
  ]
}
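
The difference between the failing and the working response is whether "items" holds any samples. An empty list most likely means the metric is registered but the query behind it matched no series. A minimal sketch of that check follows. The helper metric_values is hypothetical, and the two JSON bodies are trimmed copies of the responses shown in this thread.

```python
import json


def metric_values(raw: str) -> list:
    """Return (metricName, value) pairs from an ExternalMetricValueList body."""
    body = json.loads(raw)
    return [(item["metricName"], item["value"]) for item in body.get("items", [])]


# Response before the fix: the metric exists but carries no samples.
EMPTY = '{"kind": "ExternalMetricValueList", "metadata": {}, "items": []}'

# Response after the fix, as shown in the answer.
POPULATED = """
{
  "kind": "ExternalMetricValueList",
  "apiVersion": "external.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "metricName": "kafka_lag_metric_sm0ke",
      "metricLabels": {"topic": "prices"},
      "timestamp": "2021-04-21T16:55:18Z",
      "value": "0"
    }
  ]
}
"""

print(metric_values(EMPTY))      # []
print(metric_values(POPULATED))  # [('kafka_lag_metric_sm0ke', '0')]
```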

I am still not sure why it works. I know that in this case <<.LabelMatchers>> gets substituted with something that does not produce a valid query, but I don't know what it is.
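
One plausible explanation, offered as an assumption rather than a confirmed diagnosis: for a namespaced external metric the adapter may fill <<.LabelMatchers>> with the HPA's selector labels plus a namespace matcher for the HPA's namespace. The sketch below only imitates that substitution with plain string replacement. It is not prometheus-adapter's code, render_metrics_query is a hypothetical helper, and the namespace="default" matcher is the assumed part.

```python
def render_metrics_query(template: str, series: str, matchers: dict) -> str:
    """Expand <<.Series>> and <<.LabelMatchers>> in a metricsQuery template.

    Illustration only; the real adapter uses Go templates.
    """
    label_matchers = ",".join(f'{k}="{v}"' for k, v in matchers.items())
    return (
        template
        .replace("<<.Series>>", series)
        .replace("<<.LabelMatchers>>", label_matchers)
    )


TEMPLATE = "avg by (topic) (round(avg_over_time(<<.Series>>{<<.LabelMatchers>>}[1m])))"

# Assumption: the selector from the HPA (topic=prices) plus the HPA's namespace.
query = render_metrics_query(
    TEMPLATE,
    "kafka_consumergroup_group_lag",
    {"topic": "prices", "namespace": "default"},
)
print(query)
# avg by (topic) (round(avg_over_time(kafka_consumergroup_group_lag{topic="prices",namespace="default"}[1m])))
```

If the Kafka lag series has no namespace="default" label, a query of this shape would match nothing, which would be consistent with the empty "items" list seen earlier. Running the rendered query directly in the Prometheus UI is one way to check. The adapter's external-metrics documentation also describes a namespaced: false setting under resources for metrics that are not tied to a namespace; whether it is available in v0.8.3 is not confirmed here.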
