Monitoring applications with Prometheus + JMX
Posted ohyeimjia
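I. Choosing an approach
Application metrics can be collected invasively with actuator, or non-invasively with Prometheus's jmx_exporter.
With actuator, monitoring endpoints are easy to discover automatically through the service registry. However, the invasive approach requires code changes in each project, a development cost that makes rollout harder, and the exposed endpoints also need security and compliance measures. This option is therefore fairly difficult to push through.
The second option, non-invasive collection with Prometheus's jmx_exporter, is easy to roll out and is the recommended approach.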
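II. Metric dimensions to collect
Category | Metric
Requests | Application inbound/outbound traffic
         | QPS
         | Average request time
         | Slowest request time
Memory   | JVM heap usage
         | Non-heap usage
         | Old generation usage
         | Young generation usage
Threads  | Number of deadlocked threads
         | Thread states
GC       | GC count
         | GC time
NMT      | Memory leak detection
Overview | Application liveness
         | Number of loaded classes
Together, these metrics give fairly comprehensive monitoring of an application.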
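QPS and average request time are not exposed directly; they are usually derived from two cumulative counters (total requests and total processing time) sampled at two points in time. The sketch below shows that arithmetic. The `CounterSample` structure and its field names are hypothetical, and it assumes the processing-time counter is in milliseconds and that the counters were not reset between the two samples.

```python
from dataclasses import dataclass


@dataclass
class CounterSample:
    """One scrape of two cumulative counters (hypothetical structure)."""
    timestamp: float           # scrape time, in seconds
    request_count: float       # total requests handled so far
    processing_time_ms: float  # total processing time so far (assumed ms)


def qps(prev: CounterSample, curr: CounterSample) -> float:
    """Requests per second between two scrapes."""
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    return (curr.request_count - prev.request_count) / elapsed


def avg_request_ms(prev: CounterSample, curr: CounterSample) -> float:
    """Average processing time per request between two scrapes."""
    handled = curr.request_count - prev.request_count
    if handled <= 0:
        return 0.0  # no traffic in the window
    return (curr.processing_time_ms - prev.processing_time_ms) / handled


if __name__ == "__main__":
    a = CounterSample(timestamp=0.0, request_count=1000, processing_time_ms=50000)
    b = CounterSample(timestamp=15.0, request_count=1300, processing_time_ms=59000)
    print(qps(a, b))             # 300 requests over 15 s
    print(avg_request_ms(a, b))  # 9000 ms over 300 requests
```

In Prometheus itself the same calculation is normally done with `rate()` over the counters, which also copes with counter resets.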
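III. Procedure
1. Developers enable collection of the Tomcat MBeans (needed for Tomcat metrics such as QPS).
In the application's application.yaml file, developers enable the relevant parameters.
2. Modify the application startup script
Example: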
java -XX:NativeMemoryTracking=summary -javaagent:/xx/jmx_prometheus_javaagent-0.16.1.jar=19013:/xx/config.yaml -XX:+UnlockDiagnosticVMOptions -XX:+PrintNMTStatistics
The contents of config.yaml:
---
lowercaseOutputLabelNames: true
lowercaseOutputName: true
# "Tomcat:*" is included so that the Tomcat MBeans matched by the rules below are actually queried.
whitelistObjectNames: ["java.lang:type=OperatingSystem", "Tomcat:*"]
blacklistObjectNames: []
rules:
  - pattern: 'java.lang<type=OperatingSystem><>(committed_virtual_memory|free_physical_memory|free_swap_space|total_physical_memory|total_swap_space)_size:'
    name: os_$1_bytes
    type: GAUGE
    attrNameSnakeCase: true
  - pattern: 'java.lang<type=OperatingSystem><>((?!process_cpu_time)\w+):'
    name: os_$1
    type: GAUGE
    attrNameSnakeCase: true
  - pattern: 'Tomcat<type=GlobalRequestProcessor, name=\"(\w+-\w+)-(\d+)\"><>(\w+):'
    name: tomcat_$3_total
    labels:
      port: "$2"
      protocol: "$1"
    help: Tomcat global $3
    type: COUNTER
  - pattern: 'Tomcat<j2eeType=Servlet, WebModule=//([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), name=([-a-zA-Z0-9+/$%~_-|!.]*), J2EEApplication=none, J2EEServer=none><>(requestCount|maxTime|processingTime|errorCount):'
    name: tomcat_servlet_$3_total
    labels:
      module: "$1"
      servlet: "$2"
    help: Tomcat servlet $3 total
    type: COUNTER
  - pattern: 'Tomcat<type=ThreadPool, name="(\w+-\w+)-(\d+)"><>(currentThreadCount|currentThreadsBusy|keepAliveCount|pollerThreadCount|connectionCount):'
    name: tomcat_threadpool_$3
    labels:
      port: "$2"
      protocol: "$1"
    help: Tomcat threadpool $3
    type: GAUGE
  - pattern: 'Tomcat<type=Manager, host=([-a-zA-Z0-9+&@#/%?=~_|!:.,;]*[-a-zA-Z0-9+&@#/%=~_|]), context=([-a-zA-Z0-9+/$%~_-|!.]*)><>(processingTime|sessionCounter|rejectedSessions|expiredSessions):'
    name: tomcat_session_$3_total
    labels:
      context: "$2"
      host: "$1"
    help: Tomcat session $3 total
    type: COUNTER
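To see what a rule does, it can help to run its regular expression by hand. The sketch below approximates in Python how the GlobalRequestProcessor rule turns one MBean attribute into a metric name and labels. It is an illustration, not the exporter's actual code: the sample input line, the connector name `http-nio-8080` and the `apply_rule` helper are assumptions, and the pattern is written with single backslashes.

```python
import re

# The GlobalRequestProcessor pattern from config.yaml (single backslashes).
PATTERN = re.compile(
    r'Tomcat<type=GlobalRequestProcessor, name=\"(\w+-\w+)-(\d+)\"><>(\w+):'
)


def apply_rule(bean_line: str):
    """Return (metric_name, labels) if the rule matches, else None.

    Hypothetical helper that mimics the rule's name/labels templates.
    """
    match = PATTERN.search(bean_line)  # rule patterns are not anchored
    if match is None:
        return None
    protocol, port, attribute = match.groups()
    # name: tomcat_$3_total, lowercased because lowercaseOutputName is true
    name = f"tomcat_{attribute}_total".lower()
    # labels: port is "$2", protocol is "$1"
    return name, {"port": port, "protocol": protocol}


if __name__ == "__main__":
    line = 'Tomcat<type=GlobalRequestProcessor, name="http-nio-8080"><>requestCount: 42'
    print(apply_rule(line))
    print(apply_rule("java.lang<type=Memory><HeapMemoryUsage>used: 123"))
```

The first call yields the metric name `tomcat_requestcount_total` with labels `port="8080"` and `protocol="http-nio"`; the second line belongs to another MBean, so the rule does not apply.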
3. Once the changes are made, register the JMX endpoint with Prometheus. Example:
- job_name: "jmx"
  static_configs:
    - targets: ["ip:port"]
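The dashboard queries later in this post filter on `servicetype` and `agenttype="jmx-agent"` labels, which the scrape config above does not attach. One possible way to attach them is Prometheus file-based service discovery, where the job reads its targets from a JSON file through `file_sd_configs` instead of `static_configs`. The sketch below generates such a file's content; the address `10.0.0.5:19013` and the service name `order-service` are made-up placeholders.

```python
import json


def build_file_sd(targets, servicetype, agenttype="jmx-agent"):
    """Build one file_sd target group carrying the labels the dashboard expects."""
    return [
        {
            "targets": list(targets),
            "labels": {"servicetype": servicetype, "agenttype": agenttype},
        }
    ]


if __name__ == "__main__":
    # Hypothetical target and service name, for illustration only.
    groups = build_file_sd(["10.0.0.5:19013"], servicetype="order-service")
    print(json.dumps(groups, indent=2))
```

The same labels could also be set with a `labels:` block under `static_configs`; the file-based variant is mainly convenient when the target list changes often.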
Start the application and reload Prometheus; the metrics should then be visible.
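If nothing shows up in Prometheus, it can be useful to check the agent endpoint directly. The sketch below fetches the metrics page and parses the text exposition format into a dictionary. It assumes the agent listens on port 19013 as in the startup command and is reachable as `localhost`; the parser is deliberately minimal and ignores timestamps and comment metadata.

```python
from urllib.request import urlopen


def parse_metrics(text: str) -> dict:
    """Parse Prometheus text exposition lines into {series: value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


def fetch_metrics(url: str = "http://localhost:19013/metrics") -> dict:
    """Fetch and parse the agent's metrics page (host is an assumption)."""
    with urlopen(url, timeout=5) as response:
        return parse_metrics(response.read().decode("utf-8"))


if __name__ == "__main__":
    sample = (
        "# TYPE jvm_classes_loaded gauge\n"
        "jvm_classes_loaded 8123.0\n"
        'jvm_memory_bytes_used{area="heap",} 1.5E8\n'
    )
    for series, value in parse_metrics(sample).items():
        print(series, value)
```

Calling `fetch_metrics()` against a running application should return series such as `jvm_classes_loaded`, which the dashboard below also queries.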
A reference Grafana dashboard for the application:
{
  "annotations": {
    "list": [
      {
        "builtIn": 1,
        "datasource": "-- Grafana --",
        "enable": true,
        "hide": true,
        "iconColor": "rgba(0, 211, 255, 1)",
        "name": "Annotations & Alerts",
        "target": {
          "limit": 100,
          "matchAny": false,
          "tags": [],
          "type": "dashboard"
        },
        "type": "dashboard"
      }
    ]
  },
  "description": "Complete dashboard using metrics from prometheus JMX exporter, with drill down per job > instance",
  "editable": true,
  "gnetId": 8563,
  "graphTooltip": 0,
  "id": 45,
  "iteration": 1645770479707,
  "links": [],
  "panels": [
    {
      "collapsed": false,
      "datasource": null,
      "gridPos": { "h": 1, "w": 24, "x": 0, "y": 0 },
      "id": 119,
      "panels": [],
      "title": "Application overview",
      "type": "row"
    },
    {
      "cacheTimeout": null,
      "datasource": "$datasource",
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "thresholds" },
          "mappings": [
            {
              "options": {
                "0": { "text": "DOWN" },
                "1": { "text": "UP" }
              },
              "type": "value"
            },
            {
              "options": {
                "match": "null",
                "result": { "text": "DOWN" }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "#d44a3a", "value": null },
              { "color": "#e24d42", "value": 0 },
              { "color": "#299c46", "value": 1 }
            ]
          },
          "unit": "none"
        },
        "overrides": []
      },
      "gridPos": { "h": 4, "w": 4, "x": 0, "y": 1 },
      "hideTimeOverride": false,
      "id": 21,
      "interval": null,
      "links": [
        {
          "targetBlank": true,
          "title": "Tomcat dashboard",
          "url": "/d/chanjarster-tomcat-dashboard/tomcat-dashboard?$__url_time_range&$__all_variables"
        }
      ],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "value",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": ["lastNotNull"],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.1.5",
      "targets": [
        {
          "exemplar": true,
          "expr": "up{job=\"$job\",instance=\"$instance\",servicetype=\"$service\"}",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "timeShift": null,
      "title": "Status",
      "type": "stat"
    },
    {
      "cacheTimeout": null,
      "datasource": "$datasource",
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "thresholds" },
          "decimals": 0,
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": { "text": "N/A" }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 80 }
            ]
          },
          "unit": "s"
        },
        "overrides": []
      },
      "gridPos": { "h": 4, "w": 5, "x": 4, "y": 1 },
      "id": 14,
      "interval": null,
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": ["lastNotNull"],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.1.5",
      "targets": [
        {
          "exemplar": true,
          "expr": "time() - process_start_time_seconds{job=\"$job\",instance=\"$instance\",servicetype=\"$service\"}",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Uptime",
      "type": "stat"
    },
    {
      "cacheTimeout": null,
      "datasource": "$datasource",
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "thresholds" },
          "mappings": [
            {
              "options": {
                "match": "null",
                "result": { "text": "N/A" }
              },
              "type": "special"
            }
          ],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "#299c46", "value": null },
              { "color": "rgba(237, 129, 40, 0.89)", "value": 35 },
              { "color": "#d44a3a", "value": 50 }
            ]
          },
          "unit": "dateTimeAsIso"
        },
        "overrides": []
      },
      "gridPos": { "h": 4, "w": 5, "x": 9, "y": 1 },
      "id": 15,
      "interval": "",
      "links": [],
      "maxDataPoints": 100,
      "options": {
        "colorMode": "none",
        "graphMode": "none",
        "justifyMode": "auto",
        "orientation": "horizontal",
        "reduceOptions": {
          "calcs": ["lastNotNull"],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.1.5",
      "targets": [
        {
          "exemplar": true,
          "expr": "process_start_time_seconds{job=\"$job\",instance=\"$instance\",servicetype=\"$service\"}*1000",
          "format": "time_series",
          "instant": true,
          "interval": "",
          "intervalFactor": 1,
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Start time",
      "type": "stat"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": { "fixedColor": "blue", "mode": "fixed" },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 80 }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": { "h": 4, "w": 5, "x": 14, "y": 1 },
      "id": 103,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": ["lastNotNull"],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.1.5",
      "targets": [
        {
          "exemplar": true,
          "expr": "((jvm_memory_bytes_used{agenttype=\"jmx-agent\", area=\"heap\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"} / 1024 / 1024 ) / (jvm_memory_bytes_max{agenttype=\"jmx-agent\", area=\"heap\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"} / 1024 /1024)) * 100",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Heap memory usage (%)",
      "type": "stat"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": { "fixedColor": "blue", "mode": "fixed" },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 80 }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": { "h": 4, "w": 5, "x": 19, "y": 1 },
      "id": 104,
      "options": {
        "colorMode": "value",
        "graphMode": "area",
        "justifyMode": "auto",
        "orientation": "auto",
        "reduceOptions": {
          "calcs": ["lastNotNull"],
          "fields": "",
          "values": false
        },
        "text": {},
        "textMode": "auto"
      },
      "pluginVersion": "8.1.5",
      "targets": [
        {
          "exemplar": true,
          "expr": "((jvm_memory_bytes_used{agenttype=\"jmx-agent\", area=\"nonheap\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"} / 1024 / 1024 ) / (jvm_memory_bytes_max{agenttype=\"jmx-agent\", area=\"nonheap\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"} / 1024 /1024)) * 100",
          "interval": "",
          "legendFormat": "",
          "refId": "A"
        }
      ],
      "title": "Non-heap memory usage (%)",
      "type": "stat"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "palette-classic" },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": { "legend": false, "tooltip": false, "viz": false },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": { "type": "linear" },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": { "group": "A", "mode": "none" },
            "thresholdsStyle": { "mode": "off" }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 80 }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": { "h": 8, "w": 12, "x": 0, "y": 5 },
      "id": 101,
      "options": {
        "legend": {
          "calcs": ["lastNotNull", "max"],
          "displayMode": "table",
          "placement": "bottom"
        },
        "tooltip": { "mode": "multi" }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "process_open_fds{agenttype=\"jmx-agent\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"}",
          "interval": "",
          "legendFormat": "Currently open",
          "refId": "A"
        },
        {
          "exemplar": true,
          "expr": "process_max_fds{agenttype=\"jmx-agent\", instance=\"$instance\", job=\"jmx\", servicetype=\"$service\"}",
          "hide": false,
          "interval": "",
          "legendFormat": "Maximum open",
          "refId": "B"
        }
      ],
      "title": "Open file descriptors",
      "type": "timeseries"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "palette-classic" },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": { "legend": false, "tooltip": false, "viz": false },
            "lineInterpolation": "linear",
            "lineWidth": 1,
            "pointSize": 5,
            "scaleDistribution": { "type": "linear" },
            "showPoints": "never",
            "spanNulls": false,
            "stacking": { "group": "A", "mode": "none" },
            "thresholdsStyle": { "mode": "off" }
          },
          "mappings": [],
          "thresholds": {
            "mode": "absolute",
            "steps": [
              { "color": "green", "value": null },
              { "color": "red", "value": 80 }
            ]
          }
        },
        "overrides": []
      },
      "gridPos": { "h": 8, "w": 12, "x": 12, "y": 5 },
      "id": 111,
      "options": {
        "legend": {
          "calcs": ["lastNotNull", "max", "min"],
          "displayMode": "table",
          "placement": "bottom"
        },
        "tooltip": { "mode": "multi" }
      },
      "targets": [
        {
          "exemplar": true,
          "expr": "jvm_classes_loaded{job=\"$job\", instance=\"$instance\"}",
          "interval": "",
          "legendFormat": "Currently loaded",
          "refId": "A"
        }
      ],
      "title": "Loaded classes",
      "type": "timeseries"
    },
    {
      "collapsed": false,
      "datasource": null,
      "gridPos": { "h": 1, "w": 24, "x": 0, "y": 13 },
      "id": 84,
      "panels": [],
      "title": "Request information",
      "type": "row"
    },
    {
      "datasource": null,
      "fieldConfig": {
        "defaults": {
          "color": { "mode": "palette-classic" },
          "custom": {
            "axisLabel": "",
            "axisPlacement": "auto",
            "barAlignment": 0,
            "drawStyle": "line",
            "fillOpacity": 0,
            "gradientMode": "none",
            "hideFrom": { "legend": false }
          }
        }
      }
    }
  ]
}