Python3 - Monitoring CentOS Disk Space and Sending Alert Emails

Posted by 韩俊强

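When maintaining CentOS servers day to day, repeatedly logging in and running commands to check server status is tedious. To remove this repetitive work, we usually have a scheduled task run a script that sends an alert email.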

Taking CentOS 7.9 as an example:
Check disk usage every 4 hours, and send an alert email when usage exceeds 80%.

Some other commonly used scripts are also listed for reference.

1. Script to monitor CentOS disk space

import subprocess


# Run a shell command and return its standard output as text
def run_cmd(cmd):
    process = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # communicate() drains both pipes, so a large output cannot block the child process
    result, errors = process.communicate()
    if errors:
        # Errors are ignored here; log them if you need to debug the command
        pass
    return result.decode()


# Disk check
def disk_check():
    subject = ''
    # cmd is the module-level variable assigned before disk_check() is called
    result = run_cmd(cmd)
    content = '[harry@iZ8vbxxxo60rxxxuZ ~]$' + cmd + '\n' + result
    result = result.split('\n')
    for line in result:
        if 'G ' in line or 'M ' in line:
            line = line.split()
            for i in line:
                if '%' in i and int(i.strip('%')) > 80:
                    subject = '[WARNING] SERVER FILESYSTEM USE% OVER ' + i + ', PLEASE CHECK!'
    if subject:
        send_email(subject, content)
    else:
        # Disk space is sufficient, so no email is sent
        print('Everything is ok, keep on monitor.')
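As an alternative to parsing the text printed by `df -h`, the usage percentage can be computed directly in Python. The following is a minimal sketch, not the article's method: the function names `usage_percent` and `over_threshold` are made up for illustration, and the result can differ slightly from the Use% column of `df`, which leaves out blocks reserved for root.

```python
import shutil


def usage_percent(path="/"):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total * 100


def over_threshold(mount_points, threshold=80.0):
    """Return {mount_point: percent} for every mount point above `threshold`."""
    report = {}
    for mount in mount_points:
        percent = usage_percent(mount)
        if percent > threshold:
            report[mount] = round(percent, 1)
    return report


if __name__ == "__main__":
    # "/" is only an example; list the mount points that matter on your server.
    print(over_threshold(["/"]))
```

A non-empty dictionary would then be the trigger for sending the alert email.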

2. Sending the email

Using the 163 mail service as an example:

import smtplib  
from email.mime.text import MIMEText

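# Send an email
def send_email(title, content):
    # Sender
    sender = 'xxx@163.com'
    pwd = 'xxxx' # Password; for most mail providers this is an authorization code
    target = 'xxx@qq.com' # Recipient
    # Message: sender, recipient, subject, body
    msg = MIMEText(content, 'plain')  # Plain text keeps the line breaks of the df output ('html' would collapse them)
    msg['subject'] = title
    msg['from'] = sender
    msg['to'] = target   # Multiple recipients are possible; not shown here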

    try:
        # 163 SMTP server: smtp.163.com, port: 465
        smtp = smtplib.SMTP_SSL('smtp.163.com', 465)
        smtp.login(sender, pwd)
        smtp.sendmail(sender, target, msg.as_string())
        smtp.close()
        print('Email sent.')

    except Exception as error:
        print(error)


# Check disk usage and send the email
cmd = 'df -h'
disk_check()
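The script above sends to a single recipient. One way to address several recipients is sketched below; `build_message` is a hypothetical helper and the addresses are placeholders. The key points are that the `To` header is a comma-separated string, while `smtplib`'s `sendmail` accepts a list of addresses.

```python
from email.mime.text import MIMEText


def build_message(title, content, sender, recipients):
    """Build a plain-text alert mail addressed to several recipients."""
    msg = MIMEText(content, "plain", "utf-8")
    msg["Subject"] = title
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)  # the header lists all recipients
    return msg


if __name__ == "__main__":
    # Placeholder addresses, in the same spirit as the article's examples.
    targets = ["xxx@qq.com", "yyy@qq.com"]
    msg = build_message("[WARNING] disk", "df -h output here", "xxx@163.com", targets)
    print(msg["To"])
    # To actually send, pass the list itself as the recipient argument:
    # smtp.sendmail("xxx@163.com", targets, msg.as_string())
```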

3. Scheduled check and email alert

crontab is installed by default on CentOS 7; if it is missing, install it with yum.

3.1 Installing and using crontab

yum install crontabs

3.1.2 crontab scheduling syntax

The /etc/crontab file contains the following lines:

cat /etc/crontab

SHELL=/bin/bash

PATH=/sbin:/bin:/usr/sbin:/usr/bin

MAILTO=root

HOME=/

# run-parts

51 * * * * root run-parts /etc/cron.hourly

24 7 * * * root run-parts /etc/cron.daily

22 4 * * 0 root run-parts /etc/cron.weekly

42 4 1 * * root run-parts /etc/cron.monthly

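The first four lines set the environment variables for tasks run by crond. The first line, SHELL, specifies which shell the system uses (bash here). The second line, PATH, specifies the directories searched for commands. The third line, MAILTO, specifies that the output of crond tasks is mailed to the root user; if MAILTO is empty, no output is mailed. The fourth line, HOME, specifies the home directory used when running commands or scripts.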

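3.1.3 crontab syntax illustrated

* matches every value in the field's range
/ means "every" (a step)
- means a range from one number to another
, separates discrete values (a list of values)
* * * * * //runs every minute
0 */4 * * * //runs every 4 hours, on the hour (with * in the minute field it would run every minute of those hours)
0 4 * * * //runs every day at 4:00
0 12 */2 * * //runs every 2 days, at 12:00
* * * * 0 //runs every minute on Sundays
* * * * 6,0 //runs every minute on Saturdays and Sundays
5 * * * * //runs at minute 5 of every hour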
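To make the meaning of the four symbols concrete, here is a small sketch that tests whether a value satisfies a single time field. `field_matches` is a hypothetical helper written for this illustration, not part of cron; it covers only the symbols listed above and does not handle names such as `sun` or the alias 7 for Sunday.

```python
def field_matches(field, value, low, high):
    """Return True if `value` satisfies one crontab time field.

    Supports '*', 'a-b', 'a,b,c' and a '/step' suffix.
    `low` and `high` are the bounds of the field (e.g. 0 and 23 for hours).
    """
    for part in field.split(","):
        rng, _, step = part.partition("/")
        step = int(step) if step else 1
        if rng == "*":
            start, end = low, high
        elif "-" in rng:
            start, end = (int(x) for x in rng.split("-"))
        else:
            start = end = int(rng)
        # The value must lie in the range and sit on a step boundary.
        if start <= value <= end and (value - start) % step == 0:
            return True
    return False


if __name__ == "__main__":
    # Hour field "*/4": hour 8 qualifies, hour 9 does not.
    print(field_matches("*/4", 8, 0, 23), field_matches("*/4", 9, 0, 23))
```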

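Meaning of the crontab file:

In a user's crontab file, each line represents one task and each field of a line represents one setting. A line has six fields: the first five set the time, and the sixth is the command to run. The format is:

minute   hour   day   month   week   command

Where:

minute: the minute, any integer from 0 to 59.

hour: the hour, any integer from 0 to 23.

day: the day of the month, any integer from 1 to 31.

month: the month, any integer from 1 to 12.

week: the day of the week, any integer from 0 to 7, where both 0 and 7 mean Sunday.

command: the command to run, either a system command or a script you wrote yourself.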
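The six-field layout can be checked programmatically. The sketch below splits one user-crontab line into named fields; `parse_cron_line` and `FIELD_NAMES` are names invented for this example. Note that lines in the system-wide /etc/crontab carry an extra user field (root in the listing earlier), which this sketch does not handle.

```python
FIELD_NAMES = ("minute", "hour", "day", "month", "week")


def parse_cron_line(line):
    """Split one user-crontab line into its five time fields and the command."""
    parts = line.split(None, 5)  # at most 6 pieces; the command keeps its spaces
    if len(parts) < 6:
        raise ValueError("expected 5 time fields followed by a command")
    entry = dict(zip(FIELD_NAMES, parts[:5]))
    entry["command"] = parts[5]
    return entry


if __name__ == "__main__":
    print(parse_cron_line("0 4 * * * /home/harry/task/manager.py"))
```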

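3.2 Configuring the Python script to run on a schedule

View the current list of scheduled tasks:

crontab -l   

Edit the crontab:

crontab -e 

Note:

Python3 is used here; adjust the line to your own Python environment and script path. Absolute paths are recommended, since they make problems easier to locate and inspect. The command name is case-sensitive (python3, not Python3), and the minute field is 0 so that the job runs once every 4 hours rather than every minute of those hours.

0 */4 * * * python3 /home/harry/task/manager.py

After saving, restart crond: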

service crond restart
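Since cron runs jobs with a minimal environment, one way to avoid path mistakes is to let Python print the crontab line itself. This is only a sketch: `cron_line` is a hypothetical helper, and using `sys.executable` assumes you run it with the same interpreter that should execute the job.

```python
import os
import sys


def cron_line(script, schedule="0 */4 * * *", interpreter=sys.executable):
    """Return a crontab line that runs `script` using absolute paths."""
    return f"{schedule} {interpreter} {os.path.abspath(script)}"


if __name__ == "__main__":
    # Paste the printed line into the editor opened by `crontab -e`.
    print(cron_line("/home/harry/task/manager.py"))
```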

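At this point, the requirement stated at the beginning of the article is met. What follows is supplementary material for reference, not the focus of this article.


4. Other commonly used check scripts

4.1 CPU check script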

[harry@iZ8vbxxxo60rxxxuZ ~]$ cat cpu.sh   
#!/bin/bash
cpuname=$(cat /proc/cpuinfo | grep name | cut -f2 -d: | uniq -c)
physical=$(cat /proc/cpuinfo | grep "physical id" | sort -u | wc -l)
processor=$(cat /proc/cpuinfo | grep "processor" | wc -l)
cpucores=$(cat /proc/cpuinfo  | grep "cpu cores" | uniq)
siblings=$(cat /proc/cpuinfo  | grep "siblings"  | uniq)

echo "* * * * * CPU Information * * * * *"
echo "(CPU model) cpu name : $cpuname"
echo "(Number of physical CPUs) physical id is : $physical"
echo "(Number of logical CPUs) processor is : $processor"
echo "(Cores per CPU) cpu cores is : $cpucores"
echo "(Logical CPUs per physical CPU) siblings is : $siblings"
[harry@iZ8vbxxxo60rxxxuZ ~]$ 
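The same counts can be derived in Python, which fits better if the result should feed the alert email. The sketch below takes the text of /proc/cpuinfo as an argument so that it can be tried on any sample; `summarize_cpuinfo` is a name chosen for this example.

```python
def summarize_cpuinfo(text):
    """Count physical and logical CPUs from /proc/cpuinfo-style text."""
    physical_ids = set()
    processors = 0
    model = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue  # blank separator lines between processors
        key, value = key.strip(), value.strip()
        if key == "processor":
            processors += 1
        elif key == "physical id":
            physical_ids.add(value)
        elif key == "model name" and model is None:
            model = value
    return {"model": model, "physical": len(physical_ids), "logical": processors}
```

On a Linux host it could be called as `summarize_cpuinfo(open("/proc/cpuinfo").read())`.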

4.2 CentOS environment check script

#YUM
ls /etc/yum.repos.d/
echo ""
echo "############################"
read -p "clean yum,your answer(y|*):" ans
if [ "$ans" == "y" ];then
    mkdir /etc/yum.repos.d/bak
    mv /etc/yum.repos.d/CentOS* /etc/yum.repos.d/bak/
    wget -O /etc/yum.repos.d/CentOS-Base.repo http://mirrors.cloud.tencent.com/repo/centos7_base.repo
    wget -O /etc/yum.repos.d/epel.repo http://mirrors.cloud.tencent.com/repo/epel-7.repo
    yum clean all
    yum makecache
fi
echo ""
echo "############################"
# Disable the firewall
if systemctl status firewalld &>/dev/null;then
    systemctl stop firewalld &>/dev/null
    systemctl disable firewalld &>/dev/null
else
    echo "The firewall is already stopped"
fi
echo ""
echo "############################"
# Disable NetworkManager
read -p "check net device,if continue:" answ
if [ "$answ" == "y" ];then
    if systemctl status NetworkManager &>/dev/null;then
        systemctl stop NetworkManager &>/dev/null
        systemctl disable NetworkManager &>/dev/null
    else
        echo "NetworkManager is already stopped"
    fi
fi
echo ""
echo "############################"
# Disable SELinux
se_stat=`getenforce`
if [ "$se_stat" == "Enforcing" ];then
    setenforce 0
    sed -i 's/^SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
else
    echo "SELinux status is $se_stat"
fi
echo ""
echo "############################"
# Install rsync if it is missing
if ! which rsync &>/dev/null;then
    yum install rsync -y
    which rsync
fi
# Raise the maximum number of open files
files_num=`ulimit -n`
if [ "$files_num" != 102400 ];then
    echo "root soft nofile 102400" >>/etc/security/limits.conf
    echo "root hard nofile 102400" >>/etc/security/limits.conf
else
    echo "The open-file limit is $files_num"
fi
echo ""
echo "############################"
# Check the HTTP proxy settings
echo "-------check_http_proxy------"
echo "$http_proxy" "$https_proxy"
echo ""
echo "-----------------------------"

echo ""
echo "############################"
# Create the data directory
[ -d "/data" ] || mkdir /data && ls -ld /data/
echo ""
echo "############################"

# Confirm that the system time is correct
echo "please check date, you can use these cmds to change it!"
echo "-------------------------------------------------"
echo ""
echo "date -s 10:20:00;hwclock --systohc"
echo "cp /usr/share/zoneinfo/Asia/Shanghai /etc/localtime"

4.3 Script to clear the nginx cache

#!/bin/bash

# Delete the cache file that corresponds to one URL
cache_purge() {
    PURGE_URL=$1
    URL_NAME=$(echo -n "$PURGE_URL" | md5sum | awk '{print $1}')
    FILE_NAME=$(echo "$URL_NAME" | awk '{print "/data/cdn_cache/proxy_cache_dir/"substr($0,length($0),1)"/"substr($0,length($0)-2,2)"/"$0}')
    rm -rf "$FILE_NAME"
}

# Purge every URL listed in a file, one URL per line
purge_file() {
    PURGE_FILE=$1
    for url in $(cat "$PURGE_FILE");do
        cache_purge "$url"
    done
}

# Purge a single URL
purge_url() {
    PURGE_URL=$1
    cache_purge "$PURGE_URL"
}

usage() {
    echo $"Usage: $0 <url_file | 'url'>"
}

main () {
    if [ "$#" -ne 1 ];then
        usage;
    else
        if [ -f "$1" ];then
            purge_file "$1";
        else
            purge_url "$1";
        fi
    fi
}

main $1
nginx_cache_clear.sh
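The path arithmetic in `cache_purge` is easier to follow in Python. The sketch below mirrors it: the last hex character of the MD5 digest is the first directory level and the two characters before it are the second. `cache_file_for` is a name made up for this example, and the sketch assumes, as the script does, that the string hashed by nginx is exactly the URL passed in (that depends on the `proxy_cache_key` setting).

```python
import hashlib
import os

CACHE_ROOT = "/data/cdn_cache/proxy_cache_dir"  # path taken from the script above


def cache_file_for(url, root=CACHE_ROOT):
    """Mirror the shell logic: md5 the URL, then build <root>/<c>/<bb>/<md5>."""
    digest = hashlib.md5(url.encode()).hexdigest()
    level1 = digest[-1]     # last hex character
    level2 = digest[-3:-1]  # the two characters before it
    return os.path.join(root, level1, level2, digest)


if __name__ == "__main__":
    # Example URL only; print the path instead of deleting anything.
    print(cache_file_for("http://example.com/a.png"))
```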

4.4 Script to restart Tomcat

#!/bin/bash
TOMCAT_PATH=/usr/local/tomcat

usage() {
   echo "Usage: $0 [start|stop|status|restart]"
}

status_tomcat() {
    ps aux | grep java | grep tomcat | grep -v 'grep'
}

start_tomcat() {
    /usr/local/tomcat/bin/startup.sh
}

stop_tomcat() {
    TPID=$(ps aux | grep java | grep tomcat | grep -v 'grep' | awk '{print $2}')
    kill -9 $TPID
    sleep 5;

    # Kill again if a process is still alive after the wait
    TSTAT=$(ps aux | grep java | grep tomcat | grep -v 'grep' | awk '{print $2}')
    if [ -z "$TSTAT" ];then
      echo "tomcat stop"
    else
      kill -9 $TSTAT
    fi

    # Clear the temp and work directories; skip the cleanup if cd fails
    cd "$TOMCAT_PATH" && rm -rf temp/* work/*
}

main() {
    case $1 in
       start)
          start_tomcat;;
       stop)
          stop_tomcat;;
       status)
          status_tomcat;;
       restart)
          stop_tomcat && start_tomcat;;
       *)
          usage;;
    esac
}

main $1

cat tomcat.sh

4.5 Custom script for Zabbix to collect nginx status data

#!/bin/bash
NGINX_PORT=$1
nginx_active() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| grep 'Active' | awk '{print $NF}'
}

nginx_reading() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| grep 'Reading' | awk '{print $2}'
}

nginx_writing() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| grep 'Writing' | awk '{print $4}'
}

nginx_waiting() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| grep 'Waiting' | awk '{print $6}'
}

nginx_accepts() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| awk NR==3 | awk '{print $1}'
}

nginx_handled() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| awk NR==3 | awk '{print $2}'
}

nginx_requests() {
        /usr/bin/curl "http://127.0.0.1:"$NGINX_PORT"/nginx_status/" 2>/dev/null| awk NR==3 | awk '{print $3}'
}

# $1 is the nginx port, $2 is the metric name
main() {
    case $2 in
        active)
            nginx_active;
            ;;
        reading)
            nginx_reading;
            ;;
        writing)
            nginx_writing;
            ;;
        waiting)
            nginx_waiting;
            ;;
        accepts)
            nginx_accepts;
            ;;
        handled)
            nginx_handled;
            ;;
        requests)
            nginx_requests;
            ;;
    esac
}

main $1 $2

zabbix_nginx_plugin.sh
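Each function above issues its own curl request. If all seven values are wanted at once, the status page can be fetched a single time and parsed in one pass. The following sketch shows only the parsing step, on the usual four-line stub_status text; `parse_stub_status` is a name invented for this example, and fetching the page is left out.

```python
import re


def parse_stub_status(text):
    """Parse the text served by nginx's stub_status page into a dict of ints."""
    stats = {}
    active = re.search(r"Active connections:\s*(\d+)", text)
    if active:
        stats["active"] = int(active.group(1))
    # The third line holds three counters: accepts, handled, requests.
    counters = re.search(r"^\s*(\d+)\s+(\d+)\s+(\d+)\s*$", text, re.MULTILINE)
    if counters:
        stats["accepts"], stats["handled"], stats["requests"] = map(int, counters.groups())
    for name in ("Reading", "Writing", "Waiting"):
        match = re.search(rf"{name}:\s*(\d+)", text)
        if match:
            stats[name.lower()] = int(match.group(1))
    return stats
```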

4.6 CentOS [memory]/[CPU]/[disk]/[logged-in users]

Remember to install the psutil module first.

#!/usr/bin/env python
# coding=utf-8
import sys
import psutil
import time
import os
time_str =  time.strftime( "%Y-%m-%d", time.localtime( ) )
file_name = "./" + time_str + ".log"
 
# Append mode creates the log file if it does not exist yet
handle = open ( file_name , "a" )
 
if len( sys.argv ) == 1 :
   print_type = 1
else :
   print_type = 2
 
def isset ( list_arr , name ) :
    if name in list_arr :
       return True
    else :
       return False
 
print_str = ""
if ( print_type == 1 ) or isset( sys.argv,"mem" )  :
 memory_convent = 1024 * 1024
 mem = psutil.virtual_memory()
 print_str +=  " Memory status:\n" 
 print_str = print_str + "   Total memory: "+str( mem.total/( memory_convent ) ) + " MB\n" 
 print_str = print_str + "   Used memory: "+str( mem.used/( memory_convent ) ) + " MB\n" 
 print_str = print_str + "   Available memory: "+str( mem.total/( memory_convent ) - mem.used/( memory_convent )) + " MB\n"
 print_str = print_str + "   Buffers: "+str( mem.buffers/( memory_convent ) ) + " MB\n" 
 print_str = print_str + "   Cache: " +str( mem.cached/( memory_convent ) ) + " MB\n"
 
if ( print_type == 1 ) or isset( sys.argv,"cpu" ) :
 print_str += " CPU status:\n"
 cpu_status = psutil.cpu_times()
 print_str = print_str + "   user = " + str( cpu_status.user ) + "\n" 
 print_str = print_str + "   nice = " + str( cpu_status.nice ) + "\n"
 print_str = print_str + "   system = " + str( cpu_status.system ) + "\n"
 print_str = print_str + "   idle = " + str ( cpu_status.idle ) + "\n"
 print_str = print_str + "   iowait = " + str ( cpu_status.iowait ) + "\n"
 print_str = print_str + "   irq = " + str( cpu_status.irq ) + "\n"
 print_str = print_str + "   softirq = " + str ( cpu_status.softirq ) + "\n" 
 print_str = print_str + "   steal = " + str ( cpu_status.steal ) + "\n"
 print_str = print_str + "   guest = " + str ( cpu_status.guest ) + "\n"
 
if ( print_type == 1 ) or isset ( sys.argv,"disk" ) :
 print_str +=  " Disk information:\n" 
 disk_status = psutil.disk_partitions()
 for item in disk_status :
     print_str = print_str + "   "+ str( item ) + "\n"
 
if ( print_type == 1 ) or isset ( sys.argv,"user" ) :
 print_str +=  " Logged-in users:\n" 
 user_status = psutil.users()
 for item in  user_status :
     print_str = print_str + "   "+ str( item ) + "\n"
 
print_str += "---------------------------------------------------------------\n"
print ( print_str )
handle.write( print_str )
handle.close()

4.7 MySQL backup script

#!/bin/bash
# 
# Backup directory
BAK_DIR=/data/mysqldatabases/mysql_backup
# Backup date
DATE_TIME=`date +%Y%m%d`
/usr/local/mysql/bin/mysqldump --single-transaction --all-databases --master-data=2 > $BAK_DIR/mysql_backup-$DATE_TIME.sql
# Compress and archive the dump, then delete the original file
cd $BAK_DIR && tar Jcf mysql_backup-$DATE_TIME.tar.xz mysql_backup-$DATE_TIME.sql --remove-files
# Keep the backups of the last 15 days
find $BAK_DIR -type f -name "mysql_backup*.tar.xz" -mtime +15 -exec rm -f {} \;
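The retention step can also be written in Python, which makes it easy to test against a temporary directory before pointing it at real backups. This is a sketch under assumptions: `prune_backups` is a hypothetical helper, it only looks at the top level of the directory (find also descends into subdirectories), and it deletes anything older than exactly `keep_days` days, which is slightly stricter than the whole-day rounding of `-mtime +15`.

```python
import time
from pathlib import Path


def prune_backups(directory, pattern="mysql_backup*.tar.xz", keep_days=15, now=None):
    """Delete files matching `pattern` whose mtime is older than `keep_days` days.

    Returns the sorted names of the deleted files.
    """
    now = time.time() if now is None else now
    cutoff = now - keep_days * 86400
    removed = []
    for path in Path(directory).glob(pattern):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```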

4.8 Nginx access.log analysis script

vim /data/access.sh
#!/bin/bash
BASE_PATH='/data/nginx/logs'
LOG_PATH=$(date -d yesterday +"%Y%m")
DAY=$(date -d yesterday +"%d")
SJ=`date -d "yesterday" "+%d/%b/%Y"`
date -d yesterday >> /home/shuju/shuju_$DAY.txt
# Daily UV (unique visitors)
awk '{print $1}' $BASE_PATH/$LOG_PATH/access_$DAY.log|sort | uniq -c |wc -l >> /home/shuju/shuju_$DAY.txt
# Daily PV (page views)
awk '{print $1}' $BASE_PATH/$LOG_PATH/access_$DAY.log|wc -l >> /home/shuju/shuju_$DAY.txt
# Peak hours
awk '{print $4}' $BASE_PATH/$LOG_PATH/access_$DAY.log| grep "$SJ" |cut -c 14-15|sort|uniq -c|sort -nr|head -n 24 >>  /home/shuju/shuju_$DAY.txt
# Top 5 most requested URLs
awk '{print $7}' $BASE_PATH/$LOG_PATH/access_$DAY.log| sort | uniq -c | sort -nr | head -n 5 >>  /home/shuju/shuju_$DAY.txt  
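The same four figures can be computed in a single pass in Python. The sketch below assumes the default nginx log layout that the awk columns imply (client address in field 1, timestamp in field 4, request path in field 7); `summarize_access_log` is a name made up for this example, and unlike the shell version it skips lines with fewer than seven fields.

```python
from collections import Counter


def summarize_access_log(lines, top_n=5):
    """Compute daily UV, PV, hits per hour and the most requested URLs."""
    ips, hours, urls = set(), Counter(), Counter()
    pv = 0
    for line in lines:
        fields = line.split()
        if len(fields) < 7:
            continue  # skip lines that do not look like the default format
        pv += 1
        ips.add(fields[0])            # $1: client address
        hours[fields[3][13:15]] += 1  # $4: "[dd/Mon/yyyy:HH:MM:SS", characters 14-15
        urls[fields[6]] += 1          # $7: request path
    return {
        "uv": len(ips),
        "pv": pv,
        "by_hour": hours.most_common(),
        "top_urls": urls.most_common(top_n),
    }
```

Passing an open file object works as well, since it yields one line per iteration.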

4.9 Nginx log rotation

#!/bin/bash
BASE_PATH='/data/nginx/logs' ## Your nginx log directory
LOG_PATH=$(date -d yesterday +"%Y%m")
DAY=$(date -d yesterday +"%d")
mkdir -p $BASE_PATH/$LOG_PATH
mv $BASE_PATH/access.log $BASE_PATH/$LOG_PATH/access_$DAY.log
kill -USR1 `cat /data/nginx/logs/nginx.pid` ## Location of your nginx.pid file; USR1 makes nginx reopen its log files
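The naming scheme used by the rotation script (a year-month directory plus a per-day file, both based on yesterday's date) can be sketched in Python as follows. `rotated_log_path` is a hypothetical helper; it only computes the destination and does not move files or signal nginx.

```python
from datetime import date, timedelta
from pathlib import Path


def rotated_log_path(base="/data/nginx/logs", today=None):
    """Return <base>/<YYYYMM>/access_<DD>.log for the day before `today`."""
    today = today or date.today()
    yesterday = today - timedelta(days=1)
    return Path(base) / yesterday.strftime("%Y%m") / f"access_{yesterday:%d}.log"


if __name__ == "__main__":
    print(rotated_log_path())
```

Using yesterday's date for both parts keeps the log of the last day of a month in that month's directory.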

4.10 System configuration script

#!/bin/bash
 
currentTime=$(date +"%Y-%m-%d_%H:%M:%S")
 
echo "blacklist i2c_piix4"  >> /etc/modprobe.d/blacklist.conf
echo "black
