Alert manager in prometheus not starting
Title: Alert manager in prometheus not starting
Posted: 2021-12-24 14:04:24
Question: I configured Prometheus Alertmanager and the installation completed without errors, but systemctl status alertmanager.service gives:
# systemctl status alertmanager.service
● alertmanager.service - Alertmanager for prometheus
Loaded: loaded (/etc/systemd/system/alertmanager.service; enabled; vendor preset: enabled)
Active: failed (Result: exit-code) since Fri 2021-11-12 07:15:08 UTC; 4min 50s ago
Process: 1791 ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data (code=exited, status=1/FAILUR>
Main PID: 1791 (code=exited, status=1/FAILURE)
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Scheduled restart job, restart counter is at 5.
Nov 12 07:15:08 localhost systemd[1]: Stopped Alertmanager for prometheus.
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Start request repeated too quickly.
Nov 12 07:15:08 localhost systemd[1]: alertmanager.service: Failed with result 'exit-code'.
Nov 12 07:15:08 localhost systemd[1]: Failed to start Alertmanager for prometheus.
My systemd service file for alertmanager.service is:
[Unit]
Description=Alertmanager for prometheus
[Service]
Restart=always
User=prometheus
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
[Install]
WantedBy=multi-user.target
The logs say:
Nov 12 13:27:01 localhost alertmanager[1563]: level=warn ts=2021-11-12T13:27:01.483Z caller=cluster.go:177 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"
Nov 12 13:27:01 localhost alertmanager[1563]: level=error ts=2021-11-12T13:27:01.485Z caller=main.go:250 msg="unable to initialize gossip mesh" err="create memberlist: Failed to get final advertise address: No private IP address found, and explicit IP not provided"
Nov 12 13:27:01 localhost systemd[1]: alertmanager.service: Main process exited, code=exited, status=1/FAILURE
Nov 12 13:27:01 localhost systemd[1]: alertmanager.service: Failed with result 'exit-code'.
Any clues on how to solve this?
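Answer 1: Do you want to run Alertmanager in HA mode? It is enabled by default and requires the instance to have a private IP address (as defined in RFC 6890) that it can advertise to its peers. Your host has none, which is what the "no private IP found, explicit advertise addr not provided" error in the logs reports.
You can specify the address explicitly with the flag: alertmanager --cluster.advertise-address=<ip>
Otherwise, disable HA by giving the listen-address flag an empty value: alertmanager --cluster.listen-address=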
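Applied to the unit file from the question, the second option might look like the sketch below. It assumes a single Alertmanager instance with no peers, and it reuses the paths from the question unchanged; only the trailing --cluster.listen-address= flag is new.

```ini
# Sketch only: [Service] section from the question with clustering disabled.
# Assumption: single-instance setup, so HA/gossip is not needed.
[Service]
Restart=always
User=prometheus
# The empty value after --cluster.listen-address= disables HA mode,
# so Alertmanager no longer needs a private IP to advertise.
ExecStart=/opt/alertmanager/alertmanager --config.file=/opt/alertmanager/alertmanager.yml --storage.path=/opt/alertmanager/data --cluster.listen-address=
ExecReload=/bin/kill -HUP $MAINPID
TimeoutStopSec=20s
SendSIGKILL=no
```

After editing the file, run systemctl daemon-reload so systemd picks up the change, then systemctl restart alertmanager.service.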