Storm Runtime Monitoring

Posted wenxuechaozhe

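As is well known, Storm provides a UI for checking how topologies are running. While Storm is running, however, nobody can watch the Storm UI around the clock for failures or processing delays. Instead, use the API that Storm provides to fetch the runtime status and write it to a log file, monitor that file, and raise an alert when a failure occurs.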

Because the monitoring is based on log files, this example is written in Python. The sample code is as follows:

#!/usr/bin/env python3
# -*- coding: UTF-8 -*-

import json
import urllib.request

# Base URL of the Storm UI REST API
STORM_API = 'http://127.0.0.1:8080/api/v1/topology/'


def fetch_json(url):
    # Fetch a URL and parse the response body as JSON (no eval needed)
    with urllib.request.urlopen(url) as response:
        return json.loads(response.read().decode('utf-8'))


def storm_topology():
    # Return (cbe_id, out_id): the IDs of the two running topologies
    cbe_id = ''
    out_id = ''
    for topo in fetch_json(STORM_API + 'summary')['topologies']:
        if topo['name'] == 'otsp_storm-out':
            out_id = topo['id']
        else:
            cbe_id = topo['id']
    return cbe_id, out_id


def failed_line(topology_id):
    # topologyStats holds one entry per time window; only the first is used
    stats = fetch_json(STORM_API + topology_id)['topologyStats'][0]
    failed = stats['failed']
    if failed is None or failed == 0:
        return '0\n'
    # Non-zero counts are written with a leading '0' (for example 5 -> '05')
    return '0' + str(failed) + '\n'


def append_line(path, line):
    # Append one result line; the with block closes the file on exit
    with open(path, 'a') as result_file:
        result_file.write(line)


cbe_id, out_id = storm_topology()
append_line('/home/wenxuechao/storm_cbe.result', failed_line(cbe_id))
append_line('/home/wenxuechao/storm_out.result', failed_line(out_id))
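
The script above only appends failure counts to the result files; the file monitoring and alerting step is not shown. A minimal sketch of that side might look like the following. It assumes each line of a result file holds one count, and the names last_failed_count, check, and print_alert are hypothetical helpers, not part of Storm or of the original script. The alert hook simply prints; a real deployment would swap in mail, SMS, or a webhook.

```python
import os
import tempfile


def last_failed_count(path):
    """Return the failure count from the last non-empty line of a result file."""
    with open(path) as result_file:
        lines = [line.strip() for line in result_file if line.strip()]
    if not lines:
        return 0
    # int() accepts a leading zero in a string, so '05' parses as 5
    return int(lines[-1])


def check(path, alert):
    """Call alert(path, count) when the latest sample reports failures."""
    count = last_failed_count(path)
    if count > 0:
        alert(path, count)
    return count


def print_alert(path, count):
    # Hypothetical alert hook; replace with mail, SMS, or a webhook call
    print('ALERT: %d failed tuples reported in %s' % (count, path))


if __name__ == '__main__':
    # Demo against a temporary file instead of the real result files
    with tempfile.NamedTemporaryFile('w', suffix='.result', delete=False) as demo:
        demo.write('0\n0\n05\n')
    check(demo.name, print_alert)
    os.remove(demo.name)
```

Running a check like this from cron right after the collector script keeps the two steps decoupled: the collector only writes, and the checker only reads.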

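Because this case runs two topologies, the code monitors both of them. The example only captures the failure count for the last 10 minutes. To monitor capacity, latency, or other metrics as well, read the corresponding fields from the same response arrays.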
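
As a rough illustration of reading other metrics from the same response, the sketch below pulls the complete latency of the first window and the capacity of each bolt. The field names (topologyStats, windowPretty, completeLatencyMs, bolts, boltId, capacity) follow the Storm UI REST API as I understand it, so verify them against the Storm version in use. The 0.8 threshold and the sample data are made up for the demo, and extract_metrics and busy_bolts are hypothetical helper names.

```python
import json
import urllib.request


def fetch_topology(topology_id, base='http://127.0.0.1:8080/api/v1/topology/'):
    """Fetch the topology detail JSON from the Storm UI REST API."""
    with urllib.request.urlopen(base + topology_id) as response:
        return json.loads(response.read().decode('utf-8'))


def to_float(value):
    """Storm reports many numbers as strings; treat null or junk as 0.0."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return 0.0


def extract_metrics(topology):
    """Return complete latency of the first window and capacity per bolt."""
    window = topology['topologyStats'][0]
    return {
        'window': window.get('windowPretty'),
        'completeLatencyMs': to_float(window.get('completeLatencyMs')),
        'capacity': {b['boltId']: to_float(b.get('capacity'))
                     for b in topology.get('bolts', [])},
    }


def busy_bolts(metrics, threshold=0.8):
    """List bolts whose capacity is at or above the (assumed) threshold."""
    return sorted(name for name, cap in metrics['capacity'].items()
                  if cap >= threshold)


if __name__ == '__main__':
    # Made-up sample shaped like the API response, so the sketch runs offline
    sample = {
        'topologyStats': [{'windowPretty': '10m 0s',
                           'completeLatencyMs': '12.500', 'failed': None}],
        'bolts': [{'boltId': 'parse', 'capacity': '0.910'},
                  {'boltId': 'store', 'capacity': '0.020'}],
    }
    metrics = extract_metrics(sample)
    print(metrics)
    print(busy_bolts(metrics))
```

Against a live cluster, extract_metrics(fetch_topology(topology_id)) would replace the sample, and the resulting values could be appended to a result file in the same way as the failure counts.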
