Prometheus black-box monitoring: HTTP probes

Download blackbox_exporter

As with other kinds of monitoring, black-box monitoring needs an exporter; this time it is blackbox_exporter. Download it from https://prometheus.io/download/#blackbox_exporter

Edit the configuration file

vi blackbox.yml
modules:
  # The module name is only a label: despite "post" in the name, this module sends a GET request.
  http_post_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
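
The module above relies on defaults for everything except the method and timeout. As a rough sketch of what a more explicit configuration might look like, the fragment below defines one GET module and one POST module. The module names http_2xx and http_post_json, the header, and the request body are illustrative assumptions, not part of the original setup; the option names come from the blackbox_exporter configuration format.

```yaml
modules:
  # Hypothetical GET module with the defaults spelled out.
  http_2xx:
    prober: http
    timeout: 5s
    http:
      method: GET
      valid_status_codes: []        # an empty list means "any 2xx"
      preferred_ip_protocol: ip4    # probe over IPv4 first
      fail_if_not_ssl: false        # set to true to require HTTPS
  # Hypothetical POST module; header and body are placeholders.
  http_post_json:
    prober: http
    timeout: 5s
    http:
      method: POST
      headers:
        Content-Type: application/json
      body: '{}'
```

Whichever names are chosen, the value passed as the module parameter in the Prometheus scrape job has to match a key under modules.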

Start blackbox_exporter

./blackbox_exporter --config.file=blackbox.yml

Add the scrape job to Prometheus

  - job_name: blackbox-exporter
    params:
      module:
      - http_post_2xx
      target:
      - www.csdn.net
    metrics_path: /probe
    static_configs:
    - targets:
      - 10.10.10.103:9115

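Here the www.csdn.net target is the host to be probed, while the 10.10.10.103:9115 target is the address and port of the host where blackbox_exporter is running. Because Prometheus scrapes the exporter address, the instance label on the resulting series is 10.10.10.103:9115, not www.csdn.net.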
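
With the job above, each probed host needs its own job, because the target is fixed in params. A common alternative, sketched here, lists the probed URLs as targets and uses relabeling to pass each one to the exporter as the target parameter. The job name blackbox-http and the second URL are assumptions for illustration; the exporter address is the one used above.

```yaml
scrape_configs:
  - job_name: blackbox-http            # hypothetical job name
    metrics_path: /probe
    params:
      module: [http_post_2xx]          # must match a module in blackbox.yml
    static_configs:
      - targets:
          - https://www.csdn.net
          - https://example.org        # illustrative second target
    relabel_configs:
      # Pass the listed target to the exporter as ?target=...
      - source_labels: [__address__]
        target_label: __param_target
      # Keep the probed URL as the instance label.
      - source_labels: [__param_target]
        target_label: instance
      # Send the actual scrape to blackbox_exporter.
      - target_label: __address__
        replacement: 10.10.10.103:9115
```

With this layout the instance label carries the probed URL, so alert annotations that print the instance label identify the failing site rather than the exporter.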

Add alerting rules (in a Prometheus rule file; Alertmanager only routes the alerts that fire)

- name: Blackbox Exporter
  rules:
  - alert: BlackboxProbeFailed
    expr: probe_success == 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Blackbox probe failed (instance {{ $labels.instance }})
      description: "Probe failed\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxConfigurationReloadFailure
    expr: blackbox_exporter_config_last_reload_successful != 1
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Blackbox configuration reload failure (instance {{ $labels.instance }})
      description: "Blackbox configuration reload failure\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxSlowProbe
    expr: avg_over_time(probe_duration_seconds[1m]) > 2
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: Blackbox slow probe (instance {{ $labels.instance }})
      description: "Blackbox probe took more than 2s to complete\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxProbeHttpFailure
    expr: probe_http_status_code <= 199 OR probe_http_status_code >= 400
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Blackbox probe HTTP failure (instance {{ $labels.instance }})
      description: "HTTP status code is not 200-399\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxSslCertificateWillExpireSoon
    expr: 3 <= round((last_over_time(probe_ssl_earliest_cert_expiry[10m]) - time()) / 86400, 0.1) < 20
    for: 0m
    labels:
      severity: warning
    annotations:
      summary: Blackbox SSL certificate will expire soon (instance {{ $labels.instance }})
      description: "SSL certificate expires in less than 20 days\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxSslCertificateWillExpireSoon
    expr: 0 <= round((last_over_time(probe_ssl_earliest_cert_expiry[10m]) - time()) / 86400, 0.1) < 3
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Blackbox SSL certificate will expire soon (instance {{ $labels.instance }})
      description: "SSL certificate expires in less than 3 days\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  # For probe_ssl_earliest_cert_expiry to be exposed after expiration, you
  # need to enable insecure_skip_verify. Note that this will disable
  # certificate validation.
  # See https://github.com/prometheus/blackbox_exporter/blob/master/CONFIGURATION.md#tls_config
  - alert: BlackboxSslCertificateExpired
    expr: round((last_over_time(probe_ssl_earliest_cert_expiry[10m]) - time()) / 86400, 0.1) < 0
    for: 0m
    labels:
      severity: critical
    annotations:
      summary: Blackbox SSL certificate expired (instance {{ $labels.instance }})
      description: "SSL certificate has expired already\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxProbeSlowHttp
    expr: avg_over_time(probe_http_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: Blackbox probe slow HTTP (instance {{ $labels.instance }})
      description: "HTTP request took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
  - alert: BlackboxProbeSlowPing
    expr: avg_over_time(probe_icmp_duration_seconds[1m]) > 1
    for: 1m
    labels:
      severity: warning
    annotations:
      summary: Blackbox probe slow ping (instance {{ $labels.instance }})
      description: "Blackbox ping took more than 1s\n  VALUE = {{ $value }}\n  LABELS = {{ $labels }}"
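
The rule group above is a fragment: in a rule file it sits under a top-level groups key, and Prometheus only loads the file if prometheus.yml points at it. The sketch below shows one way this might be wired up. The file path rules/blackbox.yml and the Alertmanager address localhost:9093 are assumptions, not values from the original setup.

```yaml
# prometheus.yml (fragment)
rule_files:
  - rules/blackbox.yml            # hypothetical path to the rule file
alerting:
  alertmanagers:
    - static_configs:
        - targets:
            - localhost:9093      # assumed Alertmanager address
---
# rules/blackbox.yml (skeleton; the remaining rules follow the same shape)
groups:
  - name: Blackbox Exporter
    rules:
      - alert: BlackboxProbeFailed
        expr: probe_success == 0
        for: 0m
        labels:
          severity: critical
```

The two documents are separated by --- only to fit in one listing; on disk they are two separate files.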

Verify that it works

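The target has been added, the rules have been added, and the alerts have been added. To test the setup, replace the monitored host www.csdn.net with a host that does not exist. The rule then fires in Prometheus, the alert appears in Alertmanager, and the DingTalk alert bot sends a notification.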
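
The original does not show how Alertmanager forwards alerts to DingTalk. One common arrangement, sketched here under stated assumptions, is a generic webhook receiver that posts to a separate DingTalk bridge service. The receiver name, the grouping and timing values, and the bridge URL are all hypothetical; only the Alertmanager option names are standard.

```yaml
# alertmanager.yml (sketch)
route:
  receiver: dingtalk                  # hypothetical receiver name
  group_by: [alertname, instance]
  group_wait: 30s
  repeat_interval: 4h
receivers:
  - name: dingtalk
    webhook_configs:
      # Hypothetical endpoint of a DingTalk webhook bridge.
      - url: http://127.0.0.1:8060/dingtalk/webhook1/send
        send_resolved: true           # also notify when the alert clears
```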
