4. Nagios, a Powerful Monitoring Tool: Step-by-Step Enterprise Practice, Part 3
Posted by 小熊尤里
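1. Graphical monitoring display and the management server for Nagios
Nagios can already display status and send alerts, but in production we also need historical trend graphs.
Nagios ships only the core. Plugins are separate pieces, and graphing works the same way: it comes as a plugin or an integrated add-on. This can look like a lot of moving parts; the approach is called loose coupling.
2. Installing PNP for graphed monitoring curves (server side)
Official site of the PHP graphing software: http://www.php4nagios.org
First use yum to install the base packages that PNP needs. Running the command again for packages that are already installed does no harm.
1) Libraries needed for rendering graphs: yum install cairo pango zlib zlib-devel freetype freetype-devel gd gd-devel -y
2) PNP depends on rrdtool, a round-robin database tool (24 hours). rrdtool in turn requires libart_lgpl, which is built from source here: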
tar xf libart_lgpl-2.3.17.tar.gz
cd libart_lgpl-2.3.17
./configure
make
make install
/bin/cp -r /usr/local/include/libart-2.0 /usr/include
cd ../
3) Install rrdtool, the round-robin database used specifically for drawing graphs
tar xf rrdtool-1.2.14.tar.gz
cd rrdtool-1.2.14
./configure --prefix=/usr/local/rrdtool --disable-python --disable-tcl
#WARNING: The RRDs Perl Modules are not found on your System
#Using RRDs will speedup things in larger Installtions.
#The warning above, printed after configure, can be ignored.
make
make install
cd ../
ls -l /usr/local/rrdtool/bin
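On success, the directory listing shows the installed rrdtool executables.
4) PNP collects the data and tells rrdtool to draw the graphs; PNP is responsible for displaying them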
tar zxf pnp-0.4.14.tar.gz
cd pnp-0.4.14
./configure \
--with-rrdtool=/usr/local/rrdtool/bin/rrdtool --with-perfdata-dir=/usr/local/nagios/share/perfdata/
#################
# WARNING: The RRDs Perl Modules are not found on your System
# Using RRDs will speedup things in larger Installtions.
#####################
make all
make install
make install-config
make install-init
ll /usr/local/nagios/libexec/ | grep process
#Output: -rwxr-xr-x 1 nagios nagios 31827 Dec 14 13:07 process_perfdata.pl (the script that collects the data used for graphing)
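Before moving on to configuration, it can help to confirm that both build steps left an executable where the later steps expect one. The sketch below is only an illustration: check_bin is a hypothetical helper name, and the two paths are the ones used in the install commands above.

```shell
#!/bin/sh
# Hypothetical helper: report whether an executable exists at the given path.
check_bin() {
    if [ -x "$1" ]; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Paths taken from the rrdtool and PNP install steps above.
for f in /usr/local/rrdtool/bin/rrdtool \
         /usr/local/nagios/libexec/process_perfdata.pl; do
    check_bin "$f" || echo "  -> re-check the corresponding install step"
done
```

A "missing" line points back at the step whose make install did not complete.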
4. Configuring graph output:
In the main configuration file nagios.cfg (around lines 833-846 in this installation), set the three lines marked with <========== as shown:
process_performance_data=1 <==========

# HOST AND SERVICE PERFORMANCE DATA PROCESSING COMMANDS
# These commands are run after every host and service check is
# performed. These commands are executed only if the
# enable_performance_data option (above) is set to 1. The command
# argument is the short name of a command definition that you
# define in your host configuration file. Read the html docs for
# more information on performance data.

host_perfdata_command=process-host-perfdata <==========
service_perfdata_command=process-service-perfdata <==========
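With process_performance_data=1, Nagios hands each plugin's performance data to the perfdata commands. By plugin convention, performance data is whatever follows the first | in the plugin output. A minimal sketch of that split, where perfdata_of is a hypothetical helper and the sample line is illustrative rather than captured from a real check:

```shell
#!/bin/sh
# Hypothetical helper: print the performance-data part of a plugin output
# line, i.e. everything after the first "|". Returns 1 if there is none.
perfdata_of() {
    case $1 in
        *'|'*) printf '%s\n' "${1#*|}" ;;
        *)     return 1 ;;
    esac
}

# Illustrative plugin output (not taken from a real host).
perfdata_of 'PING OK - Packet loss = 0%, RTA = 0.80 ms|rta=0.80ms;100;500;0 pl=0%;20;60;0'
```

A plugin that prints no | produces no performance data, so there is nothing for PNP to graph for that check.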
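Run vi /usr/local/nagios/etc/objects/commands.cfg +227 to edit the commands.cfg configuration file. (Deleting the definitions below and pasting in the replacements also works.)
#Edit the commands.cfg configuration file, roughly lines 227-238
#The default configuration is as follows (if it is missing, simply add it)
#-----------------------------------------------------------------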
# 'process-host-perfdata' command definition
define command{
command_name process-host-perfdata
command_line /usr/bin/printf "%b" "$LASTHOSTCHECK$\t$HOSTNAME$\t$HOSTSTATE$\t$HOSTATTEMPT$\t$HOSTSTATETYPE$\t$HOSTEXECUTIONTIME$\t$HOSTOUTPUT$\t$HOSTPERFDATA$\n" >> /usr/local/nagios/var/host-perfdata.out
}
# 'process-service-perfdata' command definition
define command{
command_name process-service-perfdata
command_line /usr/bin/printf "%b" "$LASTSERVICECHECK$\t$HOSTNAME$\t$SERVICEDESC$\t$SERVICESTATE$\t$SERVICEATTEMPT$\t$SERVICESTATETYPE$\t$SERVICEEXECUTIONTIME$\t$SERVICELATENCY$\t$SERVICEOUTPUT$\t$SERVICEPERFDATA$\n" >> /usr/local/nagios/var/service-perfdata.out
}
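The default definitions above append one tab-separated record per check to a flat file. To see what such a record carries, the sketch below reads a service perfdata file back with awk. The function name summarize_service_perfdata is hypothetical; the field positions follow the macro order in the service command_line above (2 = host name, 3 = service description, 10 = performance data).

```shell
#!/bin/sh
# Hypothetical helper: print "host / service: perfdata" for each record.
# Field order matches the default process-service-perfdata command_line:
#   $2 = HOSTNAME, $3 = SERVICEDESC, $10 = SERVICEPERFDATA
summarize_service_perfdata() {
    awk -F'\t' 'NF >= 10 { printf "%s / %s: %s\n", $2, $3, $10 }' "$1"
}

# Usage, assuming the file written by the default command exists:
# summarize_service_perfdata /usr/local/nagios/var/service-perfdata.out
```

Records with fewer than ten fields are skipped, so a truncated line does not produce misleading output.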
Change it to the following configuration:
# 'process-host-perfdata' command definition
define command{
command_name process-host-perfdata
command_line /usr/local/nagios/libexec/process_perfdata.pl
}
define command{
command_name process-service-perfdata
command_line /usr/local/nagios/libexec/process_perfdata.pl
}
The Nagios macro $USER1$ can also be used in place of the path /usr/local/nagios/libexec/.
Run the syntax check with /etc/init.d/nagios checkconfig
Total Warnings: 0
Total Errors: 0
The output shows that the configuration check passed. Restart Nagios.
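Checking and restarting can be tied together so that a broken configuration never reaches a restart. This is a sketch, not part of the original procedure: restart_if_valid is a hypothetical helper, and the paths in the usage line assume the /usr/local/nagios prefix used throughout this article, with nagios -v returning a non-zero exit status when verification fails.

```shell
#!/bin/sh
# Hypothetical helper: run a check command and run the restart command
# only if the check succeeds. Both arguments are command strings that are
# split on whitespace, so they must not rely on quoted arguments.
restart_if_valid() {
    check_cmd=$1
    restart_cmd=$2
    if $check_cmd >/dev/null 2>&1; then
        $restart_cmd
    else
        echo "configuration check failed, not restarting" >&2
        return 1
    fi
}

# Usage (assumed paths for a /usr/local/nagios install):
# restart_if_valid "/usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg" \
#                  "/etc/init.d/nagios restart"
```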
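To link hosts and services to their PNP graphs, add an action_url line to the definitions.
In host definitions:
action_url /nagios/pnp/index.php?host=$HOSTNAME$
In service definitions:
action_url /nagios/pnp/index.php?host=$HOSTNAME$&srv=$SERVICEDESC$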
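To show where the service form of action_url sits inside a complete object, the sketch below prints a service definition from a shell function. Everything except the action_url line is an assumption for illustration: gen_service_def is a hypothetical helper, web01 is a made-up host, and generic-service is the template name from the stock templates.cfg.

```shell
#!/bin/sh
# Hypothetical helper: print a Nagios service definition that carries the
# PNP action_url. The \$ escapes keep the Nagios macros literal so that
# the shell does not try to expand them.
gen_service_def() {
    host=$1
    desc=$2
    check=$3
    cat <<EOF
define service{
        use                     generic-service
        host_name               $host
        service_description     $desc
        check_command           $check
        action_url              /nagios/pnp/index.php?host=\$HOSTNAME\$&srv=\$SERVICEDESC\$
}
EOF
}

# Example with made-up values; review the output before adding it to a cfg file.
gen_service_def web01 PING 'check_ping!100.0,20%!500.0,60%'
```

Nagios expands $HOSTNAME$ and $SERVICEDESC$ itself when it renders the link, which is why they must reach the configuration file unexpanded.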
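1) Email
2) Email forwarded as an SMS alert
3) SMS gateway: the message is handed to the gateway, which delivers it to the phone
4) WeChat instant messaging: send an email to a mailbox that is bound to WeChat on the phone, and WeChat raises the notification
Spending a reasonable amount of money to run the service as well as possible is a normal way of working; paying for alerting is to be expected.
Class A: CPU, disk space, memory
Class B: the website domain cannot be opened, or a host is down. Developers and managers all need to be informed.
If a single outage cannot be left unattended without the service failing, the system is not robust enough.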
[root@oldboy-A libexec]# pwd
/usr/local/nagios/libexec
[root@oldboy-A libexec]# cat sms_send
#!/bin/sh
# $1 = alert title, $2 = recipient phone number
alert_date=$(date '+%y-%m-%d %H:%M')
TITLE=$1 #FORMAT "Host $HOSTSTATE$ alert for $HOSTNAME$"
CONTACT=$2
curl -d cdkey=3RTY-EMY-0980-MTUQ2 -d password=189162 -d phone="$CONTACT" -d message="${TITLE}[${alert_date} oldboysa]" http://sdkhttp.eucp.b2m.cn/sdkproxy/sendsms.action
#wget --quiet "http://s.ccme.cc/qxt/send.jsp?circle=159net_131&pwd=oldboy123&mobile=18911718229&service=f1fb0546-ebb6-0987-8f20-560524c1f88d&msgid=3956724&message=$TITLE[${alert_date} oldboysa n]"
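The message that sms_send submits is the alert title followed by a bracketed timestamp and signature. The sketch below reproduces only that string so the format can be checked without contacting the SMS gateway; format_alert is a hypothetical helper, and the host name web01 is made up.

```shell
#!/bin/sh
# Hypothetical helper: build "<title>[yy-mm-dd HH:MM <signature>]",
# the same layout as the message parameter in sms_send.
format_alert() {
    title=$1
    signature=$2
    printf '%s[%s %s]\n' "$title" "$(date '+%y-%m-%d %H:%M')" "$signature"
}

# Prints the message text only; nothing is sent.
format_alert "Host DOWN alert for web01" "oldboysa"
```

Printing the text first makes it easy to see how long the message is before it reaches a gateway with a length limit.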
1) Add the contact and contact group in contacts.cfg
define contact{
contact_name oldboy-pager
use generic-contact
alias Nagios users
pager 18901398221
}
2) Add the notification commands in commands.cfg
#commands.cfg
# 'notify-host-by-pager' command definition
define command{
command_name notify-host-by-pager
command_line $USER1$/sms_send "Host $HOSTSTATE$ alert for $HOSTNAME$" $CONTACTPAGER$
}
# 'notify-service-by-pager' command definition
define command{
command_name notify-service-by-pager
command_line $USER1$/sms_send "$HOSTALIAS$/$SERVICEDESC$ is $SERVICESTATE$" $CONTACTPAGER$
}
3) Adjust the contact template and add the notification commands (the ones defined in commands.cfg)
Edit the definition in templates.cfg so that it reads:
define contact{
name generic-contact
service_notification_period 24x7
host_notification_period 24x7
service_notification_options w,u,c,r,f,s
host_notification_options d,u,r,f,s
service_notification_commands notify-service-by-email,notify-service-by-fetion,notify-service-by-msn,notify-service-by-pager
host_notification_commands notify-host-by-email,notify-service-by-fetion,notify-host-by-msn,notify-host-by-pager
register 0
}
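Every name listed in service_notification_commands and host_notification_commands has to match a command_name defined in the command configuration, otherwise the configuration check reports an error. The sketch below lists names that have no definition; undefined_commands is a hypothetical helper, and the file path in the usage line is the commands.cfg location used earlier in this article.

```shell
#!/bin/sh
# Hypothetical helper: given a comma-separated list of command names and a
# commands file, print each name that has no "command_name <name>" line.
undefined_commands() {
    list=$1
    cfg=$2
    echo "$list" | tr ',' '\n' | while read -r name; do
        # read strips surrounding whitespace; skip empty entries
        [ -n "$name" ] || continue
        grep -Eq "^[[:space:]]*command_name[[:space:]]+$name[[:space:]]*\$" "$cfg" \
            || echo "$name"
    done
}

# Usage:
# undefined_commands "notify-host-by-email,notify-host-by-pager" \
#     /usr/local/nagios/etc/objects/commands.cfg
```

An empty result means every listed notification command is defined in that file.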
contact_groups admins,groups1,groups2,user01
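Commonly used software: nagios, cacti, zabbix, and others
The strength of nagios is alerting; cacti is used for traffic trends
The biggest difference between cacti and nagios is that cacti has very strong data collection, storage, and display features but is weaker than nagios at alert management. cacti is written in PHP, stores its configuration in a MySQL database, collects data over SNMP, and stores the collected data with RRDTOOL.
In production, nagios+pnp handles data collection and all kinds of monitoring alerts, while cacti is used to monitor network device traffic.
zabbix+cacti
Some companies use zabbix and cacti together for alert monitoring, data collection, and graphing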