Nagios DingTalk Alerts

Posted by sleepdragon


Chapter 1 Create a DingTalk App (used to send alerts to an individual user)

Open the DingTalk admin console: https://oa.dingtalk.com


Once the app has been created, record three values: "AgentID", "AppKey", and "AppSecret".


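Chapter 2 Create a DingTalk Robot (used to send alerts to the monitoring group)

2.1 Create a DingTalk group

This step is not covered here.

2.2 Add a group robot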


# Record the webhook shown here; the script calls it.
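As a minimal sketch of how the recorded webhook is used: the robot accepts a JSON body posted to the webhook URL. The function names are illustrative and WEBHOOK_URL is a placeholder, not a real token. The payload shape (msgtype "text" with a text.content field) and the Content-Type header follow the script in Chapter 3. The standard library's urllib is used here only to keep the sketch dependency-free.

```python
import json
import urllib.request

# Placeholder: paste the webhook recorded when the robot was created.
WEBHOOK_URL = "https://oapi.dingtalk.com/robot/send?access_token=YOUR_TOKEN"


def build_text_message(content):
    """Return the JSON-encoded body for a plain-text robot message."""
    payload = {"msgtype": "text", "text": {"content": content}}
    return json.dumps(payload).encode("utf-8")


def send_to_group(content, webhook_url=WEBHOOK_URL):
    """POST the message to the group robot and return the decoded reply."""
    request = urllib.request.Request(
        webhook_url,
        data=build_text_message(content),
        headers={"Content-Type": "application/json;charset=UTF-8"},
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling send_to_group("test message") with a real webhook should make the message appear in the group, which is a quick way to confirm the robot works before wiring it into Nagios.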

Chapter 3 Write the Alert Script (Nagios calls this script when a server has a problem)

This script is written for Python 3. Nagios passes it seven arguments, each a Nagios macro; they are described in the first docstring below.

$ cat /usr/local/nagios/python/dingding.py
# coding: utf-8
import json
import sys
import requests

'''
Argument meanings:
Notification type: $NOTIFICATIONTYPE$
Service name:      $SERVICEDESC$
Host name:         $HOSTALIAS$
IP address:        $HOSTADDRESS$
Service state:     $SERVICESTATE$
Time:              $LONGDATETIME$
Log output:        $SERVICEOUTPUT$
'''

warning_type = str(sys.argv[1])
service_name = str(sys.argv[2])
host_name = str(sys.argv[3])
host_IP = str(sys.argv[4])
service_state = str(sys.argv[5])
warning_time = str(sys.argv[6])
warning_log = str(sys.argv[7])

'''
The userids never change, so they are hard-coded. To look them up:
Get the department IDs:
curl 'https://oapi.dingtalk.com/department/list?access_token=xxx' | jq '.'
Get the userids from a department ID:
curl 'https://oapi.dingtalk.com/user/list?access_token=xx&department_id=xx' | jq '.'
'''

# userids are strings: several start with 0, which an int literal cannot keep
chenning_id = '09386937241216057'
baihe_id = '165726012126376472'
tiantaotao_id = '215023131029727888'
wangfujun_id = '014610392229410999'
maoweijian_id = '014506344727183149'
caie_id = '01461056511094710'
zhaozhibo_id = '121027651935582616'

# IP lists, one per project
ITFIN = ['47.99.98.249', '47.110.157.52', '47.99.88.4', '47.99.203.235', '47.99.201.252', '47.98.240.44', '47.99.201.132', '47.96.89.81', '47.99.106.12', '47.99.204.155', '120.55.49.10']
cdh = ['47.99.122.122', '47.99.134.63', '47.99.82.201', '47.96.22.59', '47.99.53.179']
chess = ['106.14.12.179', '47.101.144.209', '106.14.169.195', '47.101.164.250']
sdk = ['121.40.109.196', '121.40.82.16', '120.26.106.206', '120.26.223.154', '120.26.55.62', '47.97.244.135', '101.37.89.187', '116.62.108.28', '116.62.109.7', '116.62.102.197']

# Body of the message to send
header = {"Content-Type": "application/json;charset=UTF-8"}
content = (
    "** Nagios Alert **\n\n"
    "Notification type: {}\n"
    "Service name: {}\n"
    "Host name: {}\n"
    "IP address: {}\n"
    "Service state: {}\n"
    "Time: {}\n"
    "Log:\n{}"
).format(warning_type, service_name, host_name, host_IP, service_state, warning_time, warning_log)

def get_accessToken(appkey, appsecret):
    '''
    Fetch an accessToken using the app's AppKey and AppSecret.
    '''
    json_token = requests.get(url='https://oapi.dingtalk.com/gettoken', params={'appkey': appkey, 'appsecret': appsecret})
    return json_token.json()['access_token']

def send_group():
    '''
    Send the alert to the DingTalk group through the robot webhook.
    '''
    url = 'https://oapi.dingtalk.com/robot/send?access_token=7df4cff195905e47527602b7bfab6ecc4fc669392da1e446eebeac05049ddcf7'
    data = {
        "msgtype": "text",
        "text": {
            "content": content}
    }
    sendData = json.dumps(data).encode('utf-8')
    result = requests.post(url=url, data=sendData, headers=header)

def send_someone_data(*args):
    '''
    Each business line has different recipients; this helper builds the
    message body once so the code is not repeated.
    '''
    data = {
        "touser": '|'.join(args),
        "agentid": 236353484,
        "msgtype": "text",
        "text": {
            "content": content}
    }
    return data

def send_someone():
    '''
    Send the alert to the people responsible for a business line.
    '''
    access_token = get_accessToken('dingg3bmym6arxwokwee', 'xxx')
    url = "https://oapi.dingtalk.com/message/send?access_token={}".format(access_token)
    if host_IP in ITFIN:
        data = send_someone_data(chenning_id, baihe_id)
    elif host_IP in cdh:
        data = send_someone_data(tiantaotao_id, zhaozhibo_id)
    elif host_IP in chess:
        data = send_someone_data(wangfujun_id)
    elif host_IP in sdk or host_IP.startswith('103.56.139'):
        data = send_someone_data(maoweijian_id, caie_id)
    else:
        return  # no owner is listed for this IP, so only the group is notified
    sendData = json.dumps(data).encode('utf-8')
    result = requests.post(url=url, data=sendData, headers=header)

if __name__ == '__main__':
    send_group()    # every alert goes to the group you created
    send_someone()  # the failing server's IP decides which users are notified
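The if/elif chain in send_someone can also be written as a lookup table, which may be easier to maintain as projects are added. This is a sketch under assumptions, not the author's method: the project names, IPs, and userids are made-up placeholders, and pick_recipients and build_touser are hypothetical helpers.

```python
# Hypothetical routing table: project -> (server IPs, DingTalk userids).
# Every value here is a placeholder for illustration.
ROUTES = {
    "project_a": ({"192.0.2.10", "192.0.2.11"}, ["user_001", "user_002"]),
    "project_b": ({"198.51.100.5"}, ["user_003"]),
}


def pick_recipients(host_ip, routes=ROUTES):
    """Return the userids responsible for host_ip, or [] if none match."""
    for ips, userids in routes.values():
        if host_ip in ips:
            return userids
    return []


def build_touser(userids):
    """Join userids with '|', the separator the script uses for touser."""
    return "|".join(userids)
```

Returning an empty list for an unknown IP lets the caller skip the per-user message instead of failing on an undefined variable.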

Chapter 4 Configure the DingTalk Alert

4.1 Add the alert command in commands.cfg.

$ tail -6 /usr/local/nagios/etc/objects/commands.cfg
### DingTalk alert ###
define command{
command_name dindin-bj
command_line /usr/local/python-3.4/bin/python3.4 /usr/local/nagios/python/dingding.py "$NOTIFICATIONTYPE$" "$SERVICEDESC$" "$HOSTALIAS$" "$HOSTADDRESS$" "$SERVICESTATE$" "$LONGDATETIME$" "$SERVICEOUTPUT$"
register 1
}
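The script reads seven separate values from sys.argv, so each macro in command_line has to arrive as its own quoted, space-separated word. As an illustration of shell-style word splitting (using Python's shlex module, which follows POSIX rules; the sample values are invented), quoted strings with no space between them merge into a single argument:

```python
import shlex

# Invented sample values standing in for expanded Nagios macros.
separated = '"PROBLEM" "Check ad" "AD SERVER01"'
fused = '"PROBLEM""Check ad""AD SERVER01"'

# Space-separated quoted strings become three arguments.
print(shlex.split(separated))

# Quoted strings with no space between them merge into one argument.
print(shlex.split(fused))
```

With the fused form the script would receive one long argument and fail with an IndexError when it reads sys.argv[2].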

4.2 Make a contact call the alert command

$ tail -20 /usr/local/nagios/etc/objects/contacts.cfg
define contact{
contact_name dingding
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands dindin-bj # calls the command defined in commands.cfg
host_notification_commands dindin-bj
register 1
}
define contactgroup{ # add the dingding contact to the groups
contactgroup_name admins
alias Nagios Administrators
members 139mail,dingding,zq-weixin,mao-weixin,baihe-weixin,huazhen-weixin,zhuyuliang-weixin,tiantaotao-weixin
}
define contactgroup{
contactgroup_name paiyou
alias paiyou
members nagiosadmin,dingding,zhanghu-weixin,yujie-weixin,bietao-weixin,louchao-weixin,maxiang-weixin,liujieqing-weixin
}

 

4.3 Check which templates the hosts and services use

$ grep -vE "^$|^#" /usr/local/nagios/etc/aliyun/host.cfg
define host{
use generic_linux_aliyun # name of the template this host uses
host_name ad-server01
alias AD SERVER01
address 120.26.121.119
hostgroups aliyun_linux_ad_group
}
$ grep -vE "^$|^#" /usr/local/nagios/etc/services/check_ad.cfg
define service{
host_name         ad-server01
use generic_service    # name of the template this service uses
name check_ad
service_description Check ad
check_command check_nrpe!check_ad
}

4.4 Edit the templates (so they call this contact)

$ grep -vE "^$|^#" /usr/local/nagios/etc/templates/host_templates.cfg
define host{
    name        generic_linux_aliyun
    use        linux_server
}           # This is the template the host uses, but it has a parent template, so follow it up to the parent and add the contact group there

define host{
    name        linux_server
    use        generic_host
    … (omitted)
    contact_groups admins    # change the contact group to the one we defined
    register 0
}
$ grep -vE "^$|^#" /usr/local/nagios/etc/templates/service_templates.cfg
define service{
    name generic_service
    use services-pnp
    … (omitted)
    contact_groups admins    # change the contact group to the one we defined
}
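To make the parent-template lookup concrete, here is a toy resolver that walks a `use` chain until it finds a directive. It is a simplified sketch, not how Nagios is implemented: it assumes each template names at most one parent, and the empty generic_host entry is a stand-in because that template's contents are not shown here.

```python
# Toy copies of the host templates, reduced to plain dicts.
# generic_host is left empty as an assumption; its real contents are not shown.
TEMPLATES = {
    "generic_linux_aliyun": {"use": "linux_server"},
    "linux_server": {"use": "generic_host", "contact_groups": "admins"},
    "generic_host": {},
}


def resolve(name, directive, templates=TEMPLATES):
    """Walk the `use` chain from name and return the first value found."""
    seen = set()
    while name is not None and name not in seen:
        seen.add(name)  # guard against circular `use` references
        template = templates.get(name, {})
        if directive in template:
            return template[directive]
        name = template.get("use")
    return None
```

For example, resolve("generic_linux_aliyun", "contact_groups") finds nothing on the first template and picks up "admins" from linux_server, which is why the contact group is set on the parent.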

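4.5 The overall alerting logic

Host uses a template -> template references a contact group -> contact group contains contacts -> contact calls the alert command -> alert command runs the script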
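The chain can be traced with a few dictionary lookups. The sketch below copies the object names from this chapter into plain dicts (the member list is shortened, and the host-to-group mapping is shown after template inheritance has been applied); trace_notification is a hypothetical helper for illustration only.

```python
# Simplified copies of the objects from this chapter, as plain dicts.
HOST_CONTACT_GROUPS = {"ad-server01": "admins"}  # after template inheritance
CONTACT_GROUPS = {"admins": ["139mail", "dingding"]}  # members shortened
CONTACT_COMMANDS = {"dingding": "dindin-bj"}
COMMAND_SCRIPTS = {"dindin-bj": "/usr/local/nagios/python/dingding.py"}


def trace_notification(host):
    """Return (contact, command, script) tuples that fire for a host."""
    group = HOST_CONTACT_GROUPS[host]
    fired = []
    for contact in CONTACT_GROUPS[group]:
        command = CONTACT_COMMANDS.get(contact)
        if command is None:
            continue  # contact defined elsewhere; not modeled here
        fired.append((contact, command, COMMAND_SCRIPTS[command]))
    return fired
```

If an alert never reaches DingTalk, walking the same links by hand in the config files shows which one is missing.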

 

4.6 Check the configuration files and restart

#/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg
#/etc/init.d/nagios restart
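The two commands are best run in order, restarting only when the check passes. A small wrapper along these lines could enforce that; it assumes the check exits with a non-zero status when the configuration has errors, and the `run` parameter is a hypothetical hook that exists only so the function can be exercised without touching a real server.

```python
import subprocess

NAGIOS_BIN = "/usr/local/nagios/bin/nagios"
NAGIOS_CFG = "/usr/local/nagios/etc/nagios.cfg"
INIT_SCRIPT = "/etc/init.d/nagios"


def verify_then_restart(run=subprocess.call):
    """Restart Nagios only if the configuration check exits with 0."""
    if run([NAGIOS_BIN, "-v", NAGIOS_CFG]) != 0:
        return False  # leave the running instance untouched
    return run([INIT_SCRIPT, "restart"]) == 0
```

Run as root, verify_then_restart() returns True only when both the check and the restart succeed.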

 
