Out of order sample prometheus and cadvisor

Posted: 2021-09-10 00:06:57

Question:

This is the ConfigMap configuration of the Prometheus instance running in my kube cluster.

scrape_configs:
  - job_name: 'kubernetes-apiservers'
    kubernetes_sd_configs:
    - role: endpoints
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    relabel_configs:
    - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
      action: keep
      regex: default;kubernetes;https
  - job_name: 'prometheus'
    static_configs:
    - targets: ['localhost:9090']
  - job_name: 'kubernetes-nodes'
    scheme: https
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    authorization:
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    kubernetes_sd_configs:
    - role: node
    relabel_configs:
    - action: labelmap
      regex: __meta_kubernetes_node_label_(.+)
  - job_name: kube-state-metrics
    honor_timestamps: true
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics
    scheme: http
    follow_redirects: true
    static_configs:
    - targets:
      - kube-state-metrics.kube-system.svc.cluster.local:8080
  - job_name: kubernetes-cadvisor
    honor_timestamps: true
    scrape_interval: 15s
    scrape_timeout: 10s
    metrics_path: /metrics/cadvisor
    scheme: https
    authorization:
      type: Bearer
      credentials_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true
    follow_redirects: true
    relabel_configs:
    - separator: ;
      regex: __meta_kubernetes_node_label_(.+)
      replacement: $1
      action: labelmap
    kubernetes_sd_configs:
    - role: node
      follow_redirects: true
  - job_name: 'kubernetes-pods'
    kubernetes_sd_configs:
    - role: pod
    relabel_configs:
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
      action: keep
      regex: true
    - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
      action: replace
      target_label: __metrics_path__
      regex: (.+)
    - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
      action: replace
      regex: ([^:]+)(?::\d+)?;(\d+)
      replacement: $1:$2
      target_label: __address__
    - action: labelmap
      regex: __meta_kubernetes_pod_label_(.+)
    - source_labels: [__meta_kubernetes_namespace]
      action: replace
      target_label: kubernetes_namespace
    - source_labels: [__meta_kubernetes_pod_name]
      action: replace
      target_label: kubernetes_pod_name

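On top of the cluster, I have a federating Prometheus that federates the in-cluster Prometheus.

Everything works fine, but the in-cluster Prometheus writes the following log entries (debug level enabled).

Excerpt: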

level=debug ts=2021-06-27T11:09:32.130Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/dev/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.130Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.130Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/36edd81cdc0bf2f5213054cf0ee4b6bc86328ec4473b879e3049ee0113a32728/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.130Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/5918a77ba85e2430cc0f434cde296c80f3f21f25739f73a9a7cf4296c0b2ad4d/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.130Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/5fa3d0a4389e13f8a96a8ff74e22172dd6ffb5c92e5e692a17e3a346660b49c5/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/bc60ea15258c46d3c4cca2e9b28ed608ca89b26ce5b14f2bdb6313d87d762e3b/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/bcb7f220797f48216cb0017b0cce1398ef0a9d377f66fa1f8a742241f9133567/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/cc4d8fe8e9051bfddc228dda7d272be7514c9cb93f4d1b2d98c9d632c63dfc8a/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/d9033c81c8cb4f0419b4f5ac7f1c14e0d1bb706820f46dc5a98b9db7944b2b08/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/lock\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/user/0\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/run/user/1001\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/sys/fs/cgroup\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.131Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"/var/lib/kubelet/pods/01a67077-3639-41e4-9708-0a3bf1fe5acf/volumes/kubernetes.io~secret/flannel-token-ts79t\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-48\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-53\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-56\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-62\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-68\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-79\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.133Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_currentcontainer=\"\",device=\"overlay_0-93\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/dev/mapper/debian--vg-root\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/dev/sda1\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/dev/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/36edd81cdc0bf2f5213054cf0ee4b6bc86328ec4473b879e3049ee0113a32728/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/5918a77ba85e2430cc0f434cde296c80f3f21f25739f73a9a7cf4296c0b2ad4d/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.134Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/5fa3d0a4389e13f8a96a8ff74e22172dd6ffb5c92e5e692a17e3a346660b49c5/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/bc60ea15258c46d3c4cca2e9b28ed608ca89b26ce5b14f2bdb6313d87d762e3b/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/bcb7f220797f48216cb0017b0cce1398ef0a9d377f66fa1f8a742241f9133567/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/cc4d8fe8e9051bfddc228dda7d272be7514c9cb93f4d1b2d98c9d632c63dfc8a/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/containerd/io.containerd.grpc.v1.cri/sandboxes/d9033c81c8cb4f0419b4f5ac7f1c14e0d1bb706820f46dc5a98b9db7944b2b08/shm\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/lock\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/user/0\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/run/user/1001\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/sys/fs/cgroup\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.135Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/var/lib/kubelet/pods/01a67077-3639-41e4-9708-0a3bf1fe5acf/volumes/kubernetes.io~secret/flannel-token-ts79t\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/var/lib/kubelet/pods/0c3c9be9-cc89-4e6e-93f4-e87c9356ad42/volumes/kubernetes.io~secret/kube-proxy-token-59vcr\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/var/lib/kubelet/pods/c05ffb01-231a-43d4-9941-e959ba521f52/volumes/kubernetes.io~secret/x509-certificate-exporter-node-token-m6w4w\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"/var/lib/kubelet/pods/d57f59ba-6ef6-4cd5-84cf-c1e3a2f79433/volumes/kubernetes.io~secret/default-token-zwtjf\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-115\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-121\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-145\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-151\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.136Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-157\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-164\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-165\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-188\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-44\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-48\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-53\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=debug ts=2021-06-27T11:09:32.137Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_fs_io_time_seconds_totalcontainer=\"\",device=\"overlay_0-56\",id=\"/\",image=\"\",name=\"\",namespace=\"\",pod=\"\""
level=warn ts=2021-06-27T11:09:32.149Z caller=scrape.go:1467 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Error on ingesting out-of-order samples" num_dropped=303
level=debug ts=2021-06-27T11:09:47.098Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_load_average_10scontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod8e4fae43df4163b63617776dc1321fe0.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"kube-controller-manager-master2\""
level=debug ts=2021-06-27T11:09:47.098Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_load_average_10scontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod97a02ba4a6b5572917c3b834d347981b.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"etcd-master2\""
level=debug ts=2021-06-27T11:09:47.099Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_system_seconds_totalcontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod8e4fae43df4163b63617776dc1321fe0.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"kube-controller-manager-master2\""
level=debug ts=2021-06-27T11:09:47.099Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_system_seconds_totalcontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod97a02ba4a6b5572917c3b834d347981b.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"etcd-master2\""
level=debug ts=2021-06-27T11:09:47.099Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_user_seconds_totalcontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod8e4fae43df4163b63617776dc1321fe0.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"kube-controller-manager-master2\""
level=debug ts=2021-06-27T11:09:47.100Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_cpu_user_seconds_totalcontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod97a02ba4a6b5572917c3b834d347981b.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"etcd-master2\""
level=debug ts=2021-06-27T11:09:47.100Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_file_descriptorscontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod8e4fae43df4163b63617776dc1321fe0.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"kube-controller-manager-master2\""
level=debug ts=2021-06-27T11:09:47.100Z caller=scrape.go:1511 component="scrape manager" scrape_pool=kubernetes-cadvisor target=https://10.10.10.61:10250/metrics/cadvisor msg="Out of order sample" series="container_file_descriptorscontainer=\"\",id=\"/kubepods.slice/kubepods-burstable.slice/kubepods-burstable-pod97a02ba4a6b5572917c3b834d347981b.slice\",image=\"\",name=\"\",namespace=\"kube-system\",pod=\"etcd-master2\""
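
With hundreds of dropped samples per scrape, it can help to group the debug lines by metric name before looking for a cause. The following is a minimal sketch, not part of the original question. It assumes that `container` is the first label of every logged series, as it is in the excerpt above, and the function name `count_out_of_order` is made up for illustration.

```python
import re
from collections import Counter

# Captures the metric name from the series="..." field of an
# "Out of order sample" debug line. The optional "{" tolerates both the
# usual name{labels} form and the excerpt above, where the metric name
# runs straight into the first label (container=...).
# Assumption: "container" is the first label of every logged series.
SERIES_RE = re.compile(
    r'msg="Out of order sample" series="([A-Za-z_:][A-Za-z0-9_:]*?)\{?container='
)


def count_out_of_order(log_lines):
    """Return a Counter mapping metric name -> number of rejected samples."""
    counts = Counter()
    for line in log_lines:
        match = SERIES_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling `count_out_of_order(open("prometheus.log"))` (the file name is hypothetical) and printing `.most_common()` shows at a glance whether only a few metric families are affected or the whole scrape.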

I suspect that cAdvisor is exposing duplicated metrics, but I cannot find any duplicates.
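
One way to test the duplicate-series suspicion is to save the raw `/metrics/cadvisor` output of one node (for example with `kubectl get --raw`) and look for series that occur more than once. The sketch below is an illustration under assumptions, not something from the original post: it treats everything up to the last `}` of a line as the series identity, which is a simplification of the text exposition format, and the function names are hypothetical.

```python
from collections import defaultdict


def series_id(sample_line):
    """Return the metric name plus label set of one exposition-format line."""
    closing = sample_line.rfind("}")
    if closing != -1:
        return sample_line[: closing + 1]
    # No label set: the series is just the metric name before the first space.
    return sample_line.split(None, 1)[0]


def find_duplicates(exposition_text):
    """Map each series that occurs more than once to its value/timestamp parts."""
    seen = defaultdict(list)
    for raw in exposition_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        sid = series_id(line)
        seen[sid].append(line[len(sid):].strip())
    return {sid: rest for sid, rest in seen.items() if len(rest) > 1}
```

An empty result means each series appears only once in that scrape, so the rejected samples would have to come from somewhere else, such as two targets producing identical label sets.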

Kubernetes: 1.20, Prometheus: 2.27.1


Answer 1:

You could try adding an arbitrary label to these metrics so that you can tell which node each series belongs to:

- job_name: kubernetes-cadvisor
  [...]
  relabel_configs:
  - separator: ;
    regex: __meta_kubernetes_node_label_(.+)
    replacement: $1
    action: labelmap
  - action: replace
    source_labels: [__meta_kubernetes_node_name]
    target_label: node_name
  [...]

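I doubt this will fix everything, but it should help with the device-related series, such as /run/lock, /run/user/0, /dev/shm, /dev/sda1, /dev/mapper/debian.+, overlay_0-[0-9]+, ..., since we would most likely find those on all of your nodes.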

Once that is done, let us know which ones still show up.
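
To make the effect of the two relabel rules concrete, here is a small simulation of what they do to the labels of one discovered node target. It is a simplified sketch and not Prometheus code: it only models `labelmap` and a `replace` with the default `regex` and `replacement`, it ignores the `instance` and `job` labels that Prometheus attaches itself, and the discovery label values (`master2`, `amd64`) are made up for illustration.

```python
import re


def labelmap(labels, regex, replacement=r"\1"):
    """Copy every label whose name fully matches `regex` to a new label name."""
    pattern = re.compile(regex)
    out = dict(labels)
    for name, value in labels.items():
        m = pattern.fullmatch(name)
        if m:
            out[m.expand(replacement)] = value
    return out


def replace(labels, source_labels, target_label, separator=";"):
    """Set `target_label` to the joined values of `source_labels`.

    Mirrors the defaults regex=(.*) and replacement=$1, so the joined
    value is copied unchanged.
    """
    out = dict(labels)
    out[target_label] = separator.join(labels.get(n, "") for n in source_labels)
    return out


def relabel_node_target(discovered):
    """Apply the two rules from the answer, then drop "__"-prefixed labels."""
    labels = labelmap(discovered, r"__meta_kubernetes_node_label_(.+)")
    labels = replace(labels, ["__meta_kubernetes_node_name"], "node_name")
    return {k: v for k, v in labels.items() if not k.startswith("__")}


# Hypothetical discovery labels for one node; the values are made up.
discovered = {
    "__address__": "10.10.10.61:10250",
    "__meta_kubernetes_node_name": "master2",
    "__meta_kubernetes_node_label_kubernetes_io_arch": "amd64",
}
print(relabel_node_target(discovered))
```

Every series scraped from that target would then carry `node_name="master2"` in addition to the mapped node labels, which is what lets otherwise identical device series from different nodes be told apart.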
