python 分析设备日志统计在线时间

Posted

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了python 分析设备日志统计在线时间相关的知识,希望对你有一定的参考价值。

需求:领导要求分析服务日志,计算出各个设备的在线时长,在线率,设备使用率等。

服务日志格式为:

2018-03-01 00:13:52,815 [protocol.handler.1][INFO] - cn.testin.trans.controller.req.script.HeartBeat.deviceLog---{"sdcardWriteable":0,"deviceAction":1,"sdcardAvail":10113417216,"wifiSsid":"testin03_default","rootEnable":0,"deviceid":"863563023966636","networkState":1,"errorMsg":"","minicapEnable":1,"deviceState":1,"romAvail":1655619584,"rpiid":"localhost","networkType":1,"ramAvail":1488954688,"debugMode":1,"errorCount":0,"batteryLevel":100}

以上标红部分:

deviceid: 为设备id, networkState:为网络状态,该值有1、0两个值  deviceState:为设备状态,该值有0、1、2三个值

device_report.py

#!/usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import json
from collections import Counter
from mysql import connector
import datetime
today = datetime.datetime.now().strftime("%Y%m%d")

def execute_sql(insert_sql) :
        try:
                # 配置信息
                config = {
                        'host': '127.0.0.1',
                        'port': 3306,
                        'user': 'analy',
                        'password': '****',
                        'database': 'analy',
                        'charset': 'utf8'
                }
                # 连接
                con = connector.connect(**config)
                cursor = con.cursor()

                # 利用字符串方式查询
                cursor.execute(insert_sql)
                con.commit()                            
   # 关闭
                cursor.close()
                con.close()
        except connector.Error, e:
                print(e)
                print(e.message)

#cur = conn.cursor()
records = None
def device():
    with open(sys.argv[1]) as f:
        for line in f:
#取出日志里的json字符串
            json_origin = line.split('---')[1]
#将json字符串格式化为字典
            json_final = json.loads(json_origin)
            DeviceId = json_final["deviceid"]
            NetworkState = json_final["networkState"]
            DeviceState = json_final["deviceState"]

            yield [DeviceId,NetworkState,DeviceState]

DeviceTotal,NetworkState_0,NetworkState_1,DeviceState_0,DeviceState_1,DeviceState_2 = Counter(), Counter(), Counter(), Counter(), Counter(), Counter()

header = ['deviceid','NetworkState','DeviceState']

for line in device():
    datadict = dict(zip(header,line))
    DeviceTotal[datadict['deviceid']] +=1
    if datadict['NetworkState'] == 0:
        NetworkState_0[datadict['deviceid']] +=1
    elif datadict['NetworkState'] == 1:
        NetworkState_1[datadict['deviceid']] +=1
    if datadict['DeviceState'] == 0:
        DeviceState_0[datadict['deviceid']] +=1
    elif datadict['DeviceState'] == 1:
        DeviceState_1[datadict['deviceid']] +=1
    elif datadict['DeviceState'] == 2:
        DeviceState_2[datadict['deviceid']] +=1

if DeviceTotal:
    for k,v in DeviceTotal.most_common(records):
        data = (today,k, (NetworkState_0[k]/float(v))*100, (NetworkState_1[k]/float(v))*100, (DeviceState_0[k]/float(v))*100, (DeviceState_1[k]/float(v))*100, (DeviceState_2[k]/float(v))*100,((DeviceState_1[k] + DeviceState_0[k])*10))
        query = ("INSERT INTO device_data_report""(date,deviceid,netstate_0_rate,netstate_1_rate,devicestate_0_rate,devicestate_1_rate,devicestate_2_rate,Online_time)" "VALUES('%s','%s','%s','%s','%s','%s','%s','%s')") % data
        execute_sql(query)

此时执行完成,需要不到两分钟。

此前shell脚本:

#!/bin/bash
#show all deviceid counts
Today=`date "+%Y%m%d"`
#FileName='test.txt'
FileName='devicereport'
TempFile='count1.txt'
mysql_sql='mysql -uanaly -pTestin.cn_123 '
cat ${FileName} | awk -F'---' '{print $2}' | jq '.deviceid'|sort|uniq -c >> ${TempFile}
for deviceid in `cat ${TempFile}|awk '{print $2}'`
do
   #find networkstate
   netstate_0=`grep $deviceid ${FileName}| awk -F'---' '{print $2}' | jq '.networkState'|grep 0 |wc -l`
   netstate_1=`grep $deviceid ${FileName}| awk -F'---' '{print $2}' | jq '.networkState'|grep 1 |wc -l`
   devicestate_0=`grep $deviceid ${FileName}| awk -F'---' '{print $2}' | jq '.deviceState'|grep 0 |wc -l`
   devicestate_1=`grep $deviceid ${FileName}| awk -F'---' '{print $2}' | jq '.deviceState'|grep 1 |wc -l`
   devicestate_2=`grep $deviceid ${FileName}| awk -F'---' '{print $2}' | jq '.deviceState'|grep 2 |wc -l`
   sumdevice=`grep $deviceid $TempFile|awk '{print $1}'`
#compute rate
   netstate_0_rate=`echo "scale=2;a=(${netstate_0}/${sumdevice})*100;if(length(a)==scale(a)) print 0;print a"|bc`
   netstate_1_rate=`echo "scale=2;a=(${netstate_1}/${sumdevice})*100;if(length(a)==scale(a)) print 0;print a"|bc`
   devicestate_0_rate=`echo "scale=2;a=(${devicestate_0}/${sumdevice})*100;if(length(a)==scale(a)) print 0;print a"|bc`
   devicestate_1_rate=`echo "scale=2;a=(${devicestate_1}/${sumdevice})*100;if(length(a)==scale(a)) print 0;print a"|bc`
   devicestate_2_rate=`echo "scale=2;a=(${devicestate_2}/${sumdevice})*100;if(length(a)==scale(a)) print 0;print a"|bc`
   Online_time=$(((${devicestate_0} + ${devicestate_1})/360))
#   ${mysql_sql} -e 'insert into analy.device_data_report (date,deviceid,netstate_0_rate,netstate_1_rate,devicestate_0_rate,devicestate_1_rate,devicestate_2_rate,Online_time) values ('$Today','$deviceid','$netstate_0_rate','$netstate_1_rate','$devicestate_0_rate','$devicestate_1_rate','$devicestate_2_rate','$Online_time')'
   echo $Today $deviceid $netstate_0_rate $netstate_1_rate $devicestate_0_rate $devicestate_1_rate $devicestate_2_rate $Online_time
done

此时执行完需要两个多小时。

脚本执行完入库的情况如下图:

技术分享图片

以上是关于python 分析设备日志统计在线时间的主要内容,如果未能解决你的问题,请参考以下文章

Oracle数据库从RMAN备份集片段还原指定单个归档日志进行日志挖掘分析

scala实战之spark用户在线时长和登录次数统计实例

常用python日期日志获取内容循环的代码片段

scala实战之spark用户在线时长和登录次数统计实例

python分析apache和nginx日志文件输出访客ip列表的代码

c_cpp 快速代码片段,用于在统计(阻止)/ dev / rdsk中的设备时验证fstat64和stat64的行为。