Writing alertmanager alerts to kafka, and deploying prometheus alertmanager on k8s
Posted 胖胖胖胖胖虎
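Deploying alertmanager
I deployed prometheus and its surrounding components with helm. Everything else deployed normally, but once the alertmanager Chart was deployed, the prometheus server would no longer start.
It reported: field alertmanagers not found in type config.ScrapeConfig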
# kubectl logs prometheus-prometheus-server-59b8b67dfc-c6cbl prometheus-server
level=error ts=2023-01-11T02:32:51.178Z caller=main.go:290 msg="Error loading config (--config.file=/etc/config/prometheus.yml)" err="parsing YAML file /etc/config/prometheus.yml: yaml: unmarshal errors:\\n line 194: cannot unmarshal !!map into string\\n line 195: field alertmanagers not found in type config.ScrapeConfig"
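The message itself narrows down the cause: prometheus met an `alertmanagers` key while decoding a `scrape_configs` entry, which typically happens when the `alerting` block is indented one level too deep in the rendered file. As a minimal sketch, assuming prometheus.yml has already been loaded into a Python dict by a YAML parser of your choice, the hypothetical helper `find_misplaced_keys` below flags top-level keys that ended up inside a scrape config. The set of keys it checks is illustrative, not exhaustive, and the sample dicts are invented.

```python
# Keys that belong at the top level of prometheus.yml (or under `alerting`),
# never inside a scrape_configs entry. Illustrative, not exhaustive.
MISPLACED_CANDIDATES = {"alerting", "alertmanagers", "rule_files", "global"}


def find_misplaced_keys(config: dict) -> list[str]:
    """Return one message per top-level key found inside a scrape config.

    `config` is assumed to be prometheus.yml already parsed into a dict.
    """
    problems = []
    for index, job in enumerate(config.get("scrape_configs") or []):
        if not isinstance(job, dict):
            continue
        name = job.get("job_name", f"#{index}")
        for key in sorted(MISPLACED_CANDIDATES & job.keys()):
            problems.append(f"scrape config {name!r}: unexpected key {key!r}")
    return problems


# Invented example: the alerting content slipped into a scrape job.
broken = {
    "scrape_configs": [
        {"job_name": "node", "alertmanagers": [{"static_configs": []}]},
    ]
}
# The same content with the alerting section back at the top level.
fixed = {
    "scrape_configs": [{"job_name": "node"}],
    "alerting": {"alertmanagers": [{"static_configs": []}]},
}

print(find_misplaced_keys(broken))
print(find_misplaced_keys(fixed))
```

A check like this only points at the symptom; the fix is still to correct the indentation in the template or values that generate the file.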
I chose not to release alertmanager with helm and instead tried deploying the alertmanager pod on its own.
The manifest below is based on alertmanager-deployment.yaml from the helm templates.
prometheus-alertmanager-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: alertmanager
  labels:
    k8s-app: alertmanager
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  replicas: 1
  selector:
    matchLabels:
      k8s-app: alertmanager
  template:
    metadata:
      labels:
        k8s-app: alertmanager
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      priorityClassName: system-cluster-critical
      containers:
        - name: prometheus-alertmanager
          image: "k8s-docker-registry-node:5000/alertmanager:v0.13.0"
          imagePullPolicy: "IfNotPresent"
          args:
            - --config.file=/etc/config/alertmanager.yml
            - --storage.path=/data
            - --web.external-url=/
          ports:
            - containerPort: 9093
          readinessProbe:
            httpGet:
              path: /#/status
              port: 9093
            initialDelaySeconds: 30
            timeoutSeconds: 30
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
            - name: storage-volume
              mountPath: "/data"
          resources:
            limits:
              cpu: 10m
              memory: 50Mi
            requests:
              cpu: 10m
              memory: 50Mi
        # Sidecar: calls alertmanager's reload endpoint when the ConfigMap changes
        - name: prometheus-alertmanager-configmap-reload
          image: "k8s-docker-registry-node:5000/configmap-reload:v0.1"
          imagePullPolicy: "IfNotPresent"
          args:
            - --volume-dir=/etc/config
            - --webhook-url=http://localhost:9093/-/reload
          volumeMounts:
            - name: config-volume
              mountPath: /etc/config
              readOnly: true
          resources:
            limits:
              cpu: 10m
              memory: 10Mi
            requests:
              cpu: 10m
              memory: 10Mi
      volumes:
        - name: config-volume
          configMap:
            name: alertmanager-configmap
        # emptyDir: alert state is lost when the pod is rescheduled
        - name: storage-volume
          emptyDir: {}
---
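apiVersion: v1
kind: Service
metadata:
  name: prometheus-alertmanager-service
  labels:
    app: prometheus-alertmanager-service
spec:
  ports:
    - port: 9093
      # Outside the default NodePort range (30000-32767); this only works if
      # kube-apiserver runs with a wider --service-node-port-range.
      nodePort: 9093
      targetPort: 9093
      name: prometheus-alertmanager-port
  selector:
    k8s-app: alertmanager
  type: NodePort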
alertmanager-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: alertmanager-configmap
  labels:
    app: alertmanager-configmap
data:
  alertmanager.yml: |-
    global:
      resolve_timeout: 1m
    receivers:
      - name: default-receiver
        webhook_configs:
          ### Kafka webhook configuration
          - url: 'http://alertmanager-kafka-forwarder-service:9792/alert'
            send_resolved: true
    route:
      group_wait: 10s
      group_interval: 5m
      receiver: default-receiver
      repeat_interval: 3h
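With this receiver, alertmanager POSTs a JSON document to the webhook URL for each notification, and because `send_resolved` is true it does so for resolved alerts as well. The sketch below shows the general shape of that body with invented sample values, plus a hypothetical helper `split_alerts` that turns one grouped notification into one record per alert, a convenient unit for a kafka message. The field names follow alertmanager's documented webhook format; how the forwarder used later in this post actually lays out its messages is not covered here.

```python
import json

# Sample webhook body. Field names follow alertmanager's webhook format;
# every value is invented for illustration.
SAMPLE_PAYLOAD = {
    "version": "4",
    "status": "firing",
    "receiver": "default-receiver",
    "groupLabels": {},
    "commonLabels": {"alertname": "node-up", "team": "node"},
    "commonAnnotations": {"summary": "Summary"},
    "externalURL": "http://alertmanager:9093",
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "node-up", "instance": "10.0.0.1:9100"},
            "annotations": {"summary": "Summary"},
            "startsAt": "2023-01-11T02:32:51Z",
            "endsAt": "0001-01-01T00:00:00Z",
        }
    ],
}


def split_alerts(payload: dict) -> list[dict]:
    """Flatten one grouped notification into one record per alert."""
    records = []
    for alert in payload.get("alerts", []):
        records.append(
            {
                "receiver": payload.get("receiver"),
                "status": alert.get("status"),
                "labels": alert.get("labels", {}),
                "annotations": alert.get("annotations", {}),
                "startsAt": alert.get("startsAt"),
                "endsAt": alert.get("endsAt"),
            }
        )
    return records


for record in split_alerts(SAMPLE_PAYLOAD):
    print(json.dumps(record, sort_keys=True))
```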
rules-configmap.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: rules-configmap
  labels:
    app: rules-configmap
data:
  node_rule.yml: |-
    groups:
      - name: node-rules
        rules:
          - alert: node-up
            expr: up == 0
            for: 15s
            labels:
              severity: "1"
              team: node
            annotations:
              summary: Summary
              description: description
          - alert: NodeMemoryUsage
            # Metric names as exposed by node_exporter before 0.16; newer
            # versions add a _bytes suffix (e.g. node_memory_MemFree_bytes).
            expr: 100 - (node_memory_MemFree + node_memory_Cached + node_memory_Buffers) / node_memory_MemTotal * 100 > 60
            for: 1m
            labels:
              severity: warning
            annotations:
              summary: "Instance {{ $labels.instance }} memory usage is too high"
              description: "{{ $labels.instance }} memory usage is above 60% (current value: {{ $value }})"
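The NodeMemoryUsage expression is easiest to check with concrete numbers. The sketch below mirrors its arithmetic in Python: the division and multiplication bind tighter than the subtraction, so the rule computes 100 minus the percentage of memory that is free, cached, or buffered. The function names and sample sizes are made up for illustration, and the `for: 1m` condition (the threshold must hold for a full minute before the alert fires) is not modelled.

```python
def memory_usage_percent(free: float, cached: float, buffers: float, total: float) -> float:
    """Mirror the rule: 100 - (free + cached + buffers) / total * 100."""
    return 100 - (free + cached + buffers) / total * 100


def exceeds_threshold(usage_percent: float, threshold: float = 60) -> bool:
    """The rule uses a strict comparison: exactly 60% does not match."""
    return usage_percent > threshold


GIB = 1024 ** 3

# Invented node: 16 GiB total, 2 GiB free, 1 GiB cached, 1 GiB buffers.
usage = memory_usage_percent(free=2 * GIB, cached=1 * GIB, buffers=1 * GIB, total=16 * GIB)
print(usage, exceeds_threshold(usage))
```

For this sample node, 4 of 16 GiB are reclaimable, so the expression evaluates to 75 and the condition matches.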
prometheus.yaml
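Configure alertmanager in prometheus.yml. rule_files and alerting are top-level keys, not part of scrape_configs:
prometheus.yml:
...
rule_files:
  - "/prometheus/rules/node_rule.yml"
alerting:  # Alertmanager settings
  alertmanagers:
    - static_configs:
        - targets: ['prometheus-alertmanager-service:9093']
...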
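Once prometheus has reloaded this config, its HTTP API endpoint `/api/v1/alertmanagers` reports which alertmanager instances it is actually sending to. The sketch below parses that response; the sample JSON and the helper names are assumptions for illustration, and `fetch_active_alertmanagers` needs a reachable prometheus URL of your own (for example a port-forwarded `http://localhost:9090`).

```python
import json
import urllib.request


def active_alertmanagers(api_response: dict) -> list[str]:
    """Extract the URLs of active alertmanagers from the API response."""
    if api_response.get("status") != "success":
        return []
    data = api_response.get("data") or {}
    return [am["url"] for am in data.get("activeAlertmanagers", []) if "url" in am]


def fetch_active_alertmanagers(prometheus_url: str) -> list[str]:
    """Query a running prometheus; `prometheus_url` is supplied by the caller."""
    url = f"{prometheus_url}/api/v1/alertmanagers"
    with urllib.request.urlopen(url, timeout=5) as response:
        return active_alertmanagers(json.load(response))


# Invented response in the shape returned by /api/v1/alertmanagers.
sample = json.loads(
    """
    {
      "status": "success",
      "data": {
        "activeAlertmanagers": [
          {"url": "http://prometheus-alertmanager-service:9093/api/v1/alerts"}
        ],
        "droppedAlertmanagers": []
      }
    }
    """
)

print(active_alertmanagers(sample))
```

An empty list here means prometheus has not picked up the `alerting` section, which is worth checking before debugging the webhook side.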
Writing alertmanager alerts to kafka
This uses the open-source project https://github.com/insani4c/alertmanager-kafka-forwarder, deployed as follows:
apiVersion: v1
kind: ReplicationController
metadata:
  name: alertmanager-kafka-forwarder-deployment
  labels:
    name: alertmanager-kafka-forwarder-deployment
spec:
  replicas: 1
  selector:
    name: alertmanager-kafka-forwarder-deployment
  template:
    metadata:
      labels:
        name: alertmanager-kafka-forwarder-deployment
    spec:
      containers:
        - name: alertmanager-kafka-forwarder-deployment
          image: k8s-docker-registry-node:5000/alertmanager-kafka-forwarder:main
          imagePullPolicy: IfNotPresent
          ports:
            - containerPort: 9792
          env:
            - name: "TZ"
              value: "Asia/Shanghai"
            - name: "BOOTSTRAP_SERVERS"
              value: "kafka-headless:9092"
            - name: "FLASK_SECRET_KEY"
              value: "123456"
            - name: "KAFKA_TOPIC"
              value: "alertmanager-events"
          #command: ["/bin/bash", "-c", " sleep infinity"]
      restartPolicy: Always
---
apiVersion: v1
kind: Service
metadata:
  labels:
    app: alertmanager-kafka-forwarder-service
  name: alertmanager-kafka-forwarder-service
spec:
  type: NodePort
  ports:
    - port: 9792
      targetPort: 9792
      # Outside the default NodePort range (30000-32767); this only works if
      # kube-apiserver runs with a wider --service-node-port-range.
      nodePort: 9792
      name: alertmanager-kafka-forwarder-port
  selector:
    name: alertmanager-kafka-forwarder-deployment
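To make the data flow concrete, here is a minimal sketch of what a webhook-to-kafka forwarder does: accept the POST on `/alert`, split the notification into individual alerts, and hand each one to a producer. This is not the code of the project above. It uses only the Python standard library, and the kafka side is a hypothetical `send` callable that a real deployment would back with a kafka client publishing to the configured topic.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


def make_handler(send: Callable[[bytes], None]):
    """Build a request handler that passes each alert to `send` (hypothetical producer)."""

    class AlertHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            if self.path != "/alert":
                self.send_error(404)
                return
            length = int(self.headers.get("Content-Length", 0))
            try:
                payload = json.loads(self.rfile.read(length))
            except json.JSONDecodeError:
                self.send_error(400, "invalid JSON")
                return
            if not isinstance(payload, dict):
                self.send_error(400, "expected a JSON object")
                return
            # One message per alert in the grouped notification.
            for alert in payload.get("alerts", []):
                send(json.dumps(alert).encode("utf-8"))
            self.send_response(200)
            self.send_header("Content-Length", "0")
            self.end_headers()

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return AlertHandler


def demo() -> list[bytes]:
    """Serve one request on an ephemeral local port and return what was 'sent'."""
    sent: list[bytes] = []
    server = HTTPServer(("127.0.0.1", 0), make_handler(sent.append))
    worker = threading.Thread(target=server.handle_request)  # exactly one request
    worker.start()

    body = json.dumps(
        {"alerts": [{"status": "firing", "labels": {"alertname": "node-up"}}]}
    ).encode("utf-8")
    connection = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    connection.request("POST", "/alert", body=body,
                       headers={"Content-Type": "application/json"})
    status = connection.getresponse().status
    connection.close()

    worker.join()
    server.server_close()
    assert status == 200
    return sent


if __name__ == "__main__":
    for message in demo():
        print(message.decode("utf-8"))
```

Injecting `send` keeps the HTTP handling testable without a broker; in the demo it is simply `list.append`.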