Cluster Deployment Plan

Posted zhangyunfei-blog


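Chapter 1 EZSonar High-Availability Cluster

1.1 Basic Cluster Concepts

Simply put, a cluster is a group of computers that together provide a set of network resources to users as a single whole. Each individual computer system is a node of the cluster. In an ideal cluster, users are never aware of the underlying nodes: to them the cluster is one system rather than many, and the cluster administrator can add or remove nodes at will.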

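1.2 Cluster Characteristics

1. High availability: when one node in the cluster fails, its tasks can be handed to other nodes, which effectively prevents a single point of failure.

2. High performance: a load-balanced cluster lets the system serve more concurrent users.

3. Cost-effectiveness: a high-performance system can be built from inexpensive, industry-standard hardware.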

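1.3 Implementing Cluster Technology

Depending on how clustering is applied, two configurations are in common use today: the N-node configuration and the N+1-node configuration.

N-node configuration: the cluster consists of N computer nodes (N is at least 2), and under normal conditions every node has its own users and workload. The resources of a failed node can be moved to another node through failover, but performance drops while the remaining servers carry the extra load.

N+1-node configuration: the cluster consists of N+1 computer nodes (N is at least 2), one of which is a hot standby that stays idle while the other nodes run normally. When a running node fails, the idle node takes over its work, so overall system performance does not drop. Because the standby node provides no service under normal conditions, this configuration costs more.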

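1.4 Requirements and Background for the EZSonar High-Availability Cluster

EZSonar captures network data out of band (bypass capture) and then decodes the traffic in order to monitor business systems. As more business systems are monitored, concurrent capture traffic and transaction volume both grow, and the original product architecture runs into performance bottlenecks. In addition, every process in the original architecture is deployed as a single instance, which makes single points of failure likely.

To address these problems in the transaction monitoring components, this document proposes a cluster deployment plan that removes the single points of failure, relieves the performance bottlenecks, and improves system stability.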

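Chapter 2 ES Cluster

2.1 ES Cluster Fundamentals

2.1.1 Basic Concepts

1. cluster

A cluster consists of one or more nodes, one of which is the master node. The master is chosen by election, and the distinction between master and non-master nodes matters only inside the cluster.

A cluster has the following characteristics:

1) Nodes in a cluster work together, share data, and split the workload among themselves.

2) Because nodes belong to the cluster, the cluster reorganizes itself to distribute data evenly across them.

3) cluster.name is important: a node can be part of only one cluster, and it joins a cluster automatically when it is configured with that cluster's name.

4) The cluster elects one master node, which manages cluster-wide changes such as creating or deleting an index and adding or removing nodes. The master does not need to take part in document-level changes or searches, so having a single master does not become a bottleneck as traffic grows. Any node can become the master. A cluster with only one node has that node act as master.

5) As users, we can talk to any node in the cluster, including the master. Every node knows where each document lives and can forward a request directly to the nodes that hold the data. Whichever node we contact coordinates gathering the responses from those nodes and returns the final result to the client.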

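2. node

A node is a logically independent service that stores data and takes part in the cluster's indexing and search. Each node has a unique name, which the cluster uses for management and communication.

3. index

An index is comparable to a database in a relational system. It is only a logical namespace that points to one or more shards; internally, Apache Lucene handles reading and writing the index data.

4. Type (document type)

Comparable to a table in a database. Every document in ElasticSearch must have a type. When one index stores documents with different structures, the type alone is enough to locate the matching mapping, which simplifies storing and retrieving documents.

5. Document

Comparable to a row in a database, and the basic unit that can be indexed. An index can hold many documents (in JSON format). Although an index holds many documents, their structure is consistent and is fixed the first time one is stored. Each document belongs to one type, and many types can coexist in one index.

6. shard and replica

A shard is a slice of an index. ES can split a complete index into several shards, so that a large index is divided up and spread over different nodes to form a distributed search. The number of primary shards must be set when the index is created and cannot be changed afterwards; in practice this number caps how much data the index can hold (the actual amount depends on your data, hardware, and usage). A replica is a copy of a primary shard, used for data redundancy and to improve search performance. The number of replicas can be adjusted dynamically on a running cluster, so capacity can be scaled up or down as needed.

7. recovery

Data recovery, also called data redistribution. When a node joins or leaves, ES reallocates index shards according to machine load; a failed node also recovers its data when it restarts.

8. discovery.zen

The automatic node discovery mechanism of ES. ES is a peer-to-peer style system: it first locates existing nodes by multicast (or, when multicast is disabled, through a configured unicast host list), and the nodes then communicate with each other directly, point to point.

9. Transport

How ES nodes talk to each other and how clients talk to the cluster. Internally TCP is used by default; HTTP (JSON format), thrift, servlet, memcached, ZeroMQ, and other transports are also supported through plugins.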
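The shard item above says the number of primary shards cannot change once the index exists. A minimal sketch of why, assuming the general routing rule shard = hash(routing) % number_of_primary_shards: if the modulus changed, existing documents would be looked up on the wrong shard. The function names are hypothetical, and CRC32 is only a stand-in for the hash Elasticsearch uses internally.

```python
import zlib


def pick_shard(routing_key: str, number_of_primary_shards: int) -> int:
    """Map a routing key (by default the document ID) to a primary shard.

    Follows the rule shard = hash(routing) % number_of_primary_shards.
    CRC32 is an assumption standing in for Elasticsearch's internal hash.
    """
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    return zlib.crc32(routing_key.encode("utf-8")) % number_of_primary_shards


def moved_documents(doc_ids, old_count: int, new_count: int) -> int:
    """Count documents that would resolve to a different shard after a change."""
    return sum(
        1
        for doc_id in doc_ids
        if pick_shard(doc_id, old_count) != pick_shard(doc_id, new_count)
    )


if __name__ == "__main__":
    ids = [f"txn-{i}" for i in range(1000)]
    print("misplaced when going from 3 to 4 primary shards:",
          moved_documents(ids, 3, 4))
```

Replicas do not enter the formula at all, which is consistent with the replica count being adjustable at runtime.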

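2.1.2 How an ES Cluster Starts

1. Single-node cluster

[Figure: single-node cluster]

Startup sequence:

(1) Read the hostname and ES version information, load the data directory, and so on;

(2) Start the node;

(3) Check whether a cluster and its master already exist;

(4) If there is no master, elect one;

(5) Recover data. The startup log is shown in the following figure:

[Figure: startup log]

Summary: a single node is a serious risk. If that node fails, all of the data is lost with it, so a single-node ES cluster always carries the risk of data loss.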

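2. Two-node cluster

[Figure: two-node cluster]

Startup sequence:

(1) Node ES2 starts, in the same way as ES1;

(2) The cluster detects ES2 and adds it, so the original cluster becomes a two-node cluster (ES2 must be configured with the same cluster.name as ES1);

(3) ES1 allocates the replica shards to ES2 (during replication, data is first stored on the primary shard and then copied in parallel to the associated replica shards).

New log entries on ES1:

[Figure: ES1 log]

Log on ES2:

[Figure: ES2 log]

Summary: in a two-node cluster, data can be retrieved from both the primary and the replica copies, and the data stays complete even if one node is lost. Two nodes tolerate the failure of a single node, but this only guarantees data integrity; it is not yet high availability.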

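3. Three-node cluster

[Figure: three-node cluster]

Startup sequence:

(1) Node ES3 starts, in the same way as ES1;

(2) The cluster detects ES3 and adds it, so the original cluster becomes a three-node cluster (ES3 must be configured with the same cluster.name as ES1 and ES2);

(3) Shards are relocated to ES3 from ES1 and ES2, so each of the three nodes ends up holding 2 shards instead of the previous 3 per node.

New log entries on ES1:

[Figure: ES1 log]

New log entries on ES2:

[Figure: ES2 log]

Log on ES3:

[Figure: ES3 log]

Summary: with three nodes and 6 shards, each node holds 2 shards. The hardware resources of each node (CPU, RAM, I/O) are shared by fewer shards, so each shard performs better. Performance improves over the two-node cluster, and the cluster reaches high availability.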
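The shard counts in the two-node and three-node summaries can be reproduced with a small placement sketch. It assumes 3 primary shards with 1 replica each and a simple round-robin rule that never puts two copies of the same shard on one node. The rule and the function name are illustrative assumptions; the real allocator also weighs load and disk usage.

```python
from collections import defaultdict


def allocate(nodes, primaries=3, replicas=1):
    """Spread shard copies over nodes so that no node holds a shard twice."""
    if replicas >= len(nodes):
        raise ValueError("every copy of a shard needs a node of its own")
    layout = defaultdict(list)
    for shard in range(primaries):
        # Copy 0 is the primary (P); copies 1..replicas are its replicas (R).
        for copy in range(replicas + 1):
            node = nodes[(shard + copy) % len(nodes)]
            label = f"P{shard}" if copy == 0 else f"R{shard}"
            layout[node].append(label)
    return dict(layout)


if __name__ == "__main__":
    for cluster in (["ES1", "ES2"], ["ES1", "ES2", "ES3"]):
        print(len(cluster), "nodes:", allocate(cluster))
```

With two nodes each node carries 3 shard copies; with three nodes each carries 2, which is the redistribution described in step (3).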

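2.1.3 How ES Cluster Failover Works

What does the ES cluster do when one of its nodes goes down?

1. Master node failover

When the master node fails, a new master must be elected first. The election is based on the order in which nodes joined the cluster.

(1) The master node fails (ES1, for example). Primary shards 1 and 2, which lived on ES1, are lost, and an index cannot work while primary shards are missing;

(2) A new master is elected (ES3, for example);

(3) Replicas of the lost primary shards exist on the other nodes, so the new master (ES3) promotes the corresponding replicas on ES2 and ES3 to primary shards. At this point the cluster health is yellow;

(4) The cluster still needs three primary shards, each with its replica, and two copies of the same data cannot be stored on the same node. The master therefore reallocates the missing replicas across the remaining nodes until the cluster's requirements are met again. The cluster health then turns green;

(5) If the new master (ES3) fails as well, the cluster elects yet another master (ES2). Because ES2 holds a copy of every shard, the application does not lose any data.

As shown in the following figures:

[Figure: master node failover]

2. Ordinary node failover

When a non-master node fails, the process is the same as for a master failure minus the election. The master reallocates the remaining shards, following the rule that two copies of the same data cannot be stored on the same node, until the data is back to normal.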
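The failover steps above can be traced with a toy model. This is a sketch under stated assumptions, not Elasticsearch's allocator: each shard is tracked as one primary plus a list of replica nodes, health is red when a primary is missing, yellow when a replica is missing, and green otherwise. The starting layout and all function names are hypothetical.

```python
def fail_node(shards, dead):
    """Drop every copy on the dead node; promote a replica if a primary was lost."""
    for copy in shards.values():
        copy["replicas"] = [node for node in copy["replicas"] if node != dead]
        if copy["primary"] == dead:
            copy["primary"] = copy["replicas"].pop(0) if copy["replicas"] else None


def reallocate(shards, live_nodes, wanted_replicas=1):
    """Recreate missing replicas on nodes that do not already hold that shard."""
    for copy in shards.values():
        if copy["primary"] is None:
            continue  # nothing left to copy from
        for node in live_nodes:
            if len(copy["replicas"]) >= wanted_replicas:
                break
            if node != copy["primary"] and node not in copy["replicas"]:
                copy["replicas"].append(node)


def health(shards, wanted_replicas=1):
    """Return red, yellow, or green in the sense used by the text above."""
    if any(copy["primary"] is None for copy in shards.values()):
        return "red"
    if any(len(copy["replicas"]) < wanted_replicas for copy in shards.values()):
        return "yellow"
    return "green"


if __name__ == "__main__":
    # Assumed layout: ES1 holds primaries 1 and 2, as in step (1).
    shards = {
        0: {"primary": "ES3", "replicas": ["ES2"]},
        1: {"primary": "ES1", "replicas": ["ES2"]},
        2: {"primary": "ES1", "replicas": ["ES3"]},
    }
    fail_node(shards, "ES1")
    print("after ES1 fails:", health(shards))
    reallocate(shards, ["ES2", "ES3"])
    print("after reallocation:", health(shards))
    fail_node(shards, "ES3")
    print("after ES3 also fails:", health(shards))
```

In this model the second failure leaves every primary on ES2, so no shard is lost, but health stays yellow because a single node cannot host the replicas.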

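2.1.4 How ES Cluster Node Recovery Works

When a failed node comes back, the ES cluster detects it as a new node and adds it to the cluster. The master then moves some shards from the other two nodes to the recovered node, which restores data balance and high availability.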

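2.2 ES Cluster Architecture Planning

2.2.1 Planning Shards and Replicas

An ES cluster achieves high performance, scalability, and availability through shards and replicas. Sharding supports large-scale parallel indexing and search, which greatly improves indexing and search performance as well as horizontal scalability. Replication provides data redundancy, so the failure of some machines does not affect normal use and the system stays highly available.

The number of primary shards is fixed when the index is created. In practice, this number defines the maximum amount of data the index can store (the actual amount depends on your data, hardware, and usage). Read requests, whether searches or document retrievals, can be served by either a primary shard or a replica, so the more redundant copies of the data there are, the more search throughput the cluster can handle. The number of replicas can be changed dynamically on a running cluster, which lets us scale up or down as demand requires.

The number of primary shards depends on the volume of data. Replicas let the cluster scale out horizontally, but more is not always better: with too many replicas, most requests concentrate on the nodes that hold fewer shards, a single node's throughput becomes too high, and performance drops.

A cluster should have at least 3 primary shards and 1 replica.

For example:

index.number_of_replicas: 1
index.number_of_shards: 3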

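2.2.2 Planning Node Roles

Whether a node may be elected master and whether it may store data are both configurable. Different combinations have different effects.

1. The node can be master and also stores data (the default):

node.master: true
node.data: true

2. The node is a "worker" of the cluster: set it to store data but not to be master-eligible:

node.master: false
node.data: true

3. The node is a "coordinator" of the cluster: set it to be master-eligible only:

node.master: true
node.data: false

4. The node is a "search load balancer" of the cluster: set it to be neither master nor data:

node.master: false
node.data: false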
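The four combinations above form a small truth table. The sketch below encodes it so a planned node list can be checked before writing elasticsearch.yml; the function names and role labels are illustrative choices, not Elasticsearch terminology.

```python
def node_role(master: bool, data: bool) -> str:
    """Name the role implied by the node.master / node.data pair."""
    if master and data:
        return "master-eligible data node (default)"
    if data:
        return "data-only node (worker)"
    if master:
        return "dedicated master node (coordinator)"
    return "client node (search load balancer)"


def role_settings(master: bool, data: bool) -> str:
    """Render the two settings as they would appear in elasticsearch.yml."""
    return f"node.master: {str(master).lower()}\nnode.data: {str(data).lower()}"


if __name__ == "__main__":
    for master in (True, False):
        for data in (True, False):
            print(node_role(master, data))
            print(role_settings(master, data))
            print()
```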

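2.2.3 Hardware Planning

Nodes in the cluster fall into four types:

1. Combined master and data node: maintains cluster state and also stores data. Hardware: the higher the specification the better; allocate 24 GB of memory to the ES process, preferably with SSD storage.

2. Master node: maintains cluster state and stores no index data. Hardware: a machine of average performance is enough; allocate 16 GB of memory to the ES process.

3. Data-only node: stores index data only and is never elected master. Hardware: the higher the specification the better; use large disks, SSD where possible.

4. Client node: mainly used for load balancing; it is never elected master and stores no index data. Hardware: 4 CPU cores and 64 GB of memory or more.

A sound cluster should contain three master nodes, one or more data nodes, and at least one client node, ideally with each node deployed on its own physical server.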

2.3 Installing ES

2.3.1 Installing the JDK

ES depends on Java, so make sure a JDK is installed before installing ES. This guide uses the jdk-8u65-linux-x64.rpm package. If the system already has a JDK older than version 8, uninstall it first and then install the specified version.

Steps:

- Check whether a JDK is already installed

# rpm -qa | grep jdk

If one is found, remove it, for example:

# rpm -e jdk1.7.0_65-1.7.0_65-fcs.x86_64

- Install the JDK

# rpm -ivh jdk-8u65-linux-x64.rpm

- Add the environment variables by appending the following to /etc/profile

export JAVA_HOME=/usr/java/default
export JRE_HOME=/usr/java/default/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:/sbin:/bin:/usr/sbin:/usr/bin:/usr/X11R6/bin:$JAVA_HOME/bin

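2.3.2 Installing and Configuring ntp

The EZSonar decoding engine and analysis engine must have synchronized clocks; otherwise the data will be seriously inaccurate.

If the customer environment has an NTP server, configure both the decoding engine and the analysis engine to synchronize with it. If there is no NTP server, configure the analysis engine as the NTP server and the decoding engine as an NTP client, so that the two synchronize their clocks with each other.

To set up ntp:

- Enable ntpd at boot

# chkconfig ntpd on

- Edit the ntp configuration

# vi /etc/ntp.conf

- Change the server addresses; this example uses public NTP servers

server 202.112.10.36
server 202.118.1.81
server 202.118.1.130

- Start the ntpd service

# service ntpd start

- Check the ntp synchronization status

# ntpq -np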

2.3.3 System Parameter Configuration

- Append the following to the end of /etc/profile

# vi /etc/profile

ulimit -s unlimited
unset i

- Append the following to the end of limits.conf

# vi /etc/security/limits.conf

* soft nofile 64000
* hard nofile 64000
* soft nproc unlimited
* hard nproc unlimited
* soft memlock unlimited
* hard memlock unlimited

- In 90-nproc.conf, change the nproc value 1024 to unlimited, so that the file reads as follows

# vi /etc/security/limits.d/90-nproc.conf

* soft nproc unlimited
root soft nproc unlimited
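After logging in again, the limits configured above can be checked from the account that runs ES. This is a minimal sketch using Python's standard resource module (Unix only); the thresholds mirror the limits.conf values above, and the function names are hypothetical.

```python
import resource


def at_least(limit: int, required: int) -> bool:
    """True if a limit is unlimited or no lower than the required value."""
    return limit == resource.RLIM_INFINITY or limit >= required


def is_unlimited(limit: int) -> bool:
    """True if a limit is reported as unlimited."""
    return limit == resource.RLIM_INFINITY


def check_limits(min_nofile: int = 64000) -> dict:
    """Compare the current soft limits with the values set in limits.conf."""
    nofile_soft, _ = resource.getrlimit(resource.RLIMIT_NOFILE)
    nproc_soft, _ = resource.getrlimit(resource.RLIMIT_NPROC)
    memlock_soft, _ = resource.getrlimit(resource.RLIMIT_MEMLOCK)
    return {
        "nofile": at_least(nofile_soft, min_nofile),
        "nproc": is_unlimited(nproc_soft),
        "memlock": is_unlimited(memlock_soft),
    }


if __name__ == "__main__":
    for name, ok in check_limits().items():
        print(f"{name}: {'ok' if ok else 'below the configured value'}")
```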

2.3.4 Installing ES

Install elasticsearch as follows:

- Extract elasticsearch.tar.gz into the installation path

# tar -xf elasticsearch.tar.gz -C /usr/local/ezsonar/

2.4 ES Cluster Configuration

2.4.1 Memory Settings

The default is 24G; adjust it to the memory of the virtual machine. ES_MIN_MEM and ES_MAX_MEM must be set to the same value.

# vi /usr/local/ezsonar/elasticsearch/bin/elasticsearch.in.sh

if [ "x$ES_MIN_MEM" = "x" ]; then
    ES_MIN_MEM=1g
fi
if [ "x$ES_MAX_MEM" = "x" ]; then
    ES_MAX_MEM=1g
fi

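2.4.2 ES Node Configuration

1. Configuration file reference

(1) Cluster name

cluster.name: fusionskye

Sets the ES cluster name; the default is elasticsearch. ES automatically discovers other ES nodes on the same network segment, so when several clusters share a segment this property tells them apart.

(2) Node name

node.name: "ES1"

(3) Whether the node is eligible to be elected master; default true.

node.master: true

(4) Whether the node stores index data; default true.

node.data: true

(5) Default number of shards per index; default 5.

index.number_of_shards: 5

(6) Default number of replicas per index; default 1.

index.number_of_replicas: 1

(7) Storage path for index data; the default is the data directory under the ES root. Several paths can be given, separated by commas.

path.data: /ezdata/es/

(8) Storage path for log files

path.logs: /var/log/ezsonar/es

(9) IP address to bind to, IPv4 or IPv6; default 0.0.0.0.

network.bind_host: 192.168.0.1

(10) IP address that other nodes use to reach this node.

network.publish_host: 192.168.0.1

(11) Sets both bind_host and publish_host above at once.

network.host: 192.168.0.1

(12) TCP port for communication between nodes; default 9300.

transport.tcp.port: 9300

(13) Whether to compress data sent over TCP; default false (no compression).

transport.tcp.compress: true

(14) HTTP port for external service; default 9200.

http.port: 9200

(15) Maximum size of HTTP content; default 100mb.

http.max_content_length: 100mb

(16) Start data recovery once N nodes in the cluster have started; default 1.

gateway.recover_after_nodes: 1

(17) Timeout before the initial data recovery process starts; default 5 minutes.

gateway.recover_after_time: 5m

(18) Number of nodes expected in the cluster; default 2.

gateway.expected_nodes: 2

(19) Ensures that a node can see N other master-eligible nodes in the cluster. The default is 1; for larger clusters the rule is N/2+1, where N is the number of master-eligible nodes.

discovery.zen.minimum_master_nodes: 1

(20) Ping timeout when automatically discovering other nodes; default 3 seconds. On a poor network, use a higher value to avoid discovery errors.

discovery.zen.ping.timeout: 3s

(21) Whether multicast discovery is enabled; default true.

discovery.zen.ping.multicast.enabled: false

(22) Initial list of master nodes in the cluster, through which nodes newly joining the cluster are discovered.

discovery.zen.ping.unicast.hosts: ["host1", "host2:port", "host3[portX-portY]"]

(23) Slow log and GC logging thresholds

index.search.slowlog.threshold.query.warn: 8s
index.search.slowlog.threshold.query.info: 3s
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
monitor.jvm.gc.old.warn: 10s
monitor.jvm.gc.old.info: 5s
monitor.jvm.gc.old.debug: 2s
monitor.jvm.gc.ConcurrentMarkSweep.warn: 10s
monitor.jvm.gc.ConcurrentMarkSweep.info: 5s
monitor.jvm.gc.ConcurrentMarkSweep.debug: 2s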
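Item (19) gives the rule N/2+1 for discovery.zen.minimum_master_nodes. A short sketch of that arithmetic, assuming N counts only master-eligible nodes and the division is integer division (the function name is hypothetical):

```python
def minimum_master_nodes(master_eligible: int) -> int:
    """Quorum of master-eligible nodes: N // 2 + 1."""
    if master_eligible < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible // 2 + 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(f"{n} master-eligible node(s) -> minimum_master_nodes = "
              f"{minimum_master_nodes(n)}")
```

For three master-eligible nodes the rule yields 2. A value of 1 lets a node that is cut off from the others elect itself, which is the split-brain case the rule is meant to prevent.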

2. Parameters to configure

(1) Set the cluster name

cluster.name: fusionskye

(2) Set the node name

node.name: ES1

(3) Set the master and data parameters

node.master: true
node.data: true

(4) Set the cluster discovery parameters

discovery.zen.ping.multicast.enabled: false
discovery.zen.ping.unicast.hosts: ["192.168.137.8","192.168.137.9:9300","192.168.137.10:9300"]
discovery.zen.minimum_master_nodes: 1

(5) Set the primary shards and replicas

index.number_of_replicas: 1
index.number_of_shards: 3
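Only node.name differs between nodes in the settings above, so the per-node fragment can be generated rather than edited by hand. A minimal sketch, assuming the three hosts and values listed in this section; the function name and the ES1 to ES3 naming are illustrative.

```python
import json

HOSTS = ["192.168.137.8", "192.168.137.9:9300", "192.168.137.10:9300"]


def node_config(name, hosts=HOSTS, cluster="fusionskye",
                shards=3, replicas=1, min_masters=1):
    """Return the elasticsearch.yml lines from this section for one node."""
    lines = [
        f"cluster.name: {cluster}",
        f"node.name: {name}",
        "node.master: true",
        "node.data: true",
        "discovery.zen.ping.multicast.enabled: false",
        # json.dumps renders the list as a YAML-compatible flow sequence.
        f"discovery.zen.ping.unicast.hosts: {json.dumps(hosts)}",
        f"discovery.zen.minimum_master_nodes: {min_masters}",
        f"index.number_of_replicas: {replicas}",
        f"index.number_of_shards: {shards}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    for node in ("ES1", "ES2", "ES3"):
        print(f"# ---- {node} ----")
        print(node_config(node))
        print()
```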

2.4.3 tomcat Configuration

Edit the indexer.properties file under /usr/local/ezsonar/tomcat7/conf/ezsonar.

Change the indexer.host entry so that tomcat connects to every ES node.

indexer.host=192.168.137.8:9300,192.168.137.9:9300,192.168.137.10:9300

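2.4.4 collector Configuration

Edit the collector's es-source.properties file. The number of channels must match the number of sinks, and each ES node needs at least two sinks for the detail indexes and one sink for the summary index. The configuration file looks like this: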
agentes.sources = source1 memSrc aggegation
agentes.sinks = sink1 sink2 sink3 sink4 sink5 sink6 sink7 sink8 sink9 sink10 sink11 sink12
agentes.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12

# Describe/configure source1
agentes.sources.source1.type = avro
agentes.sources.source1.bind = 0.0.0.0
agentes.sources.source1.port = 44444
agentes.sources.source1.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.source1.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.source1.selector.summaryIndex = channel10 channel11 channel12
agentes.sources.source1.interceptors = interceptor1
agentes.sources.source1.interceptors.interceptor1.type = com.fusionskye.ezsonar.collector.interceptor.EZSonarSourceInterceptor$Builder
agentes.sources.source1.interceptors.interceptor1.mongoHost = 127.0.0.1
agentes.sources.source1.interceptors.interceptor1.mongoPort = 27017
agentes.sources.source1.interceptors.interceptor1.mongoDatabase = ezsonar
agentes.sources.source1.interceptors.interceptor1.mongoUser = ezsonaruser
agentes.sources.source1.interceptors.interceptor1.mongoPassword = 123
agentes.sources.source1.interceptors.interceptor1.host = ezsonar_host
agentes.sources.source1.interceptors.interceptor1.streamFrequency = 60
agentes.sources.source1.interceptors.interceptor1.hostLocationFrequency = 60
agentes.sources.source1.interceptors.interceptor1.metricFrequency = 20
agentes.sources.source1.interceptors.interceptor1.cacheMaximumSize = 1000
agentes.sources.source1.interceptors.interceptor1.cacheExpire = 10
agentes.sources.source1.interceptors.interceptor1.filters = geo_ip, add_field, busi_filter
agentes.sources.source1.interceptors.interceptor1.databaseFile = /usr/local/ezsonar/collector/GeoLite2-City.mmdb
#agentes.sources.source1.interceptors.interceptor1.statsFile = /var/log/ezsonar/collector/collector_source_benchmark.log
#agentes.sources.source1.interceptors.interceptor1.statsMetricFilter = .*[t|T]ime.*
#agentes.sources.source1.interceptors.interceptor1.statsMetricFilter = aaaaa
agentes.sources.source1.interceptors.interceptor1.amountIsYuan = false
agentes.sources.source1.interceptors.interceptor1.debugStatus = false
agentes.sources.source1.interceptors.interceptor1.debugMessage = false
agentes.sources.source1.interceptors.interceptor1.timerStatus = true
agentes.sources.source1.interceptors.interceptor1.ttmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{DATA:_trans_id}[\|]%{DATA:_trans_ref}[\|]%{DATA:_ret_code}[\|]%{DATA:_ret_code_x}[\|]%{INT:_in_pkts}[\|]%{INT:_in_bytes}[\|]%{INT:_out_pkts}[\|]%{INT:_out_bytes}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_in_ooo}[\|]%{INT:_out_ooo}[\|]%{INT:_latency_msec}[\|]%{INT:_tot_syn}[\|]%{INT:_tot_synack}[\|]%{INT:_tot_fin}[\|]%{INT:_tot_fin_s}[\|]%{INT:_tot_rst}[\|]%{INT:_tot_rst_s}[\|]%{INT:_tot_zero_server}[\|]%{INT:_tot_zero_client}[\|]%{INT:_rtt}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{DATA:_start_at_s}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{INT:_protocol}[\|]%{INT:_trans_transfer_ms}
#agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{INT:_protocol}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{INT:_in_bytes}[\|]%{INT:_out_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_out_pkts}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_nw_delay_c2p_s}[\|]%{INT:_nw_delay_c2p_us}[\|]%{INT:_nw_delay_p2s_s}[\|]%{INT:_nw_delay_p2s_us}[\|]%{DATA:_flow_state}
#agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_input_snmp}[\|]%{INT:_output_snmp}[\|]%{INT:_in_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_protocol}[\|]%{DATA:_protocol_map}[\|]%{INT:_src_tos}[\|]%{INT:_tcp_flags}[\|]%{INT:_l4_src_port}[\|]%{DATA:_l4_src_port_map}[\|]%{DATA:_ipv4_src_addr}[\|]%{INT:_l4_dst_port}[\|]%{DATA:_l4_dst_port_map}[\|]%{DATA:_ipv4_dst_addr}[\|]%{INT:_src_as}[\|]%{INT:_dst_as}[\|]%{INT:_last_switched}[\|]%{INT:_start_at}[\|]%{INT:_out_bytes}[\|]%{INT:_out_pkts}[\|]%{DATA:_ipv6_src_addr}[\|]%{DATA:_ipv6_dst_addr}[\|]%{INT:_icmp_type}[\|]%{DATA:_in_src_mac}[\|]%{DATA:_out_dst_mac}[\|]%{INT:_src_vlan}[\|]%{INT:_dst_vlan}[\|]%{INT:_ip_protocol_version}[\|]%{INT:_direction}[\|]%{INT:_fragments}[\|]%{DOUBLE:_total_nw_latency_ms}[\|]%{INT:_num_pkts_up_to_128_bytes}[\|]%{INT:_num_pkts_128_to_256_bytes}[\|]%{INT:_num_pkts_256_to_512_bytes}[\|]%{INT:_num_pkts_512_to_1024_bytes}[\|]%{INT:_num_pkts_1024_to_1514_bytes}[\|]%{INT:_num_pkts_over_1514_bytes}[\|]%{INT:_retransmitted_in_pkts}[\|]%{INT:_retransmitted_out_pkts}[\|]%{INT:_ooorder_in_pkts}[\|]%{INT:_ooorder_out_pkts}[\|]%{INT:_tcp_win_zero_in}[\|]%{INT:_tcp_win_zero_out}[\|]%{INT:_tcp_est_latency_ms}[\|]%{INT:_tcp_flow_state}[\|]%{INT:_num_pkts_ttl_eq_1}[\|]%{INT:_num_pkts_ttl_2_5}[\|]%{INT:_num_pkts_ttl_5_32}[\|]%{INT:_num_pkts_ttl_32_64}[\|]%{INT:_num_pkts_ttl_64_96}[\|]%{INT:_num_pkts_ttl_96_128}[\|]%{INT:_num_pkts_ttl_128_160}[\|]%{INT:_num_pkts_ttl_160_192}[\|]%{INT:_num_pkts_ttl_192_224}[\|]%{INT:_num_pkts_ttl_224_255}[\|]%{INT:_duration_in}[\|]%{INT:_duration_out}[\|]%{INT:_tcp_win_min_in}[\|]%{INT:_tcp_win_max_in}[\|]%{INT:_tcp_win_mss_in}[\|]%{INT:_tcp_win_scale_in}[\|]%{INT:_tcp_win_min_out}[\|]%{INT:_tcp_win_max_out}[\|]%{INT:_tcp_win_mss_out}[\|]%{INT:_tcp_win_scale_out}[\|]%{DOUBLE:_appl_latency_ms}[\|]%{INT:_total_keepalive}[\|]%{INT:_cip}[\|]%{INT:_sip}
agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_input_snmp}[\|]%{INT:_output_snmp}[\|]%{INT:_in_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_protocol}[\|]%{DATA:_protocol_map}[\|]%{INT:_src_tos}[\|]%{INT:_tcp_flags}[\|]%{INT:_l4_src_port}[\|]%{DATA:_l4_src_port_map}[\|]%{DATA:_ipv4_src_addr}[\|]%{INT:_l4_dst_port}[\|]%{DATA:_l4_dst_port_map}[\|]%{DATA:_ipv4_dst_addr}[\|]%{INT:_src_as}[\|]%{INT:_dst_as}[\|]%{INT:_last_switched}[\|]%{INT:_start_at}[\|]%{INT:_out_bytes}[\|]%{INT:_out_pkts}[\|]%{DATA:_ipv6_src_addr}[\|]%{DATA:_ipv6_dst_addr}[\|]%{INT:_icmp_type}[\|]%{DATA:_in_src_mac}[\|]%{DATA:_out_dst_mac}[\|]%{INT:_src_vlan}[\|]%{INT:_dst_vlan}[\|]%{INT:_ip_protocol_version}[\|]%{INT:_direction}[\|]%{INT:_fragments}[\|]%{DOUBLE:_total_nw_latency_ms}[\|]%{INT:_num_pkts_up_to_128_bytes}[\|]%{INT:_num_pkts_128_to_256_bytes}[\|]%{INT:_num_pkts_256_to_512_bytes}[\|]%{INT:_num_pkts_512_to_1024_bytes}[\|]%{INT:_num_pkts_1024_to_1514_bytes}[\|]%{INT:_num_pkts_over_1514_bytes}[\|]%{INT:_retransmitted_in_pkts}[\|]%{INT:_retransmitted_out_pkts}[\|]%{INT:_ooorder_in_pkts}[\|]%{INT:_ooorder_out_pkts}[\|]%{INT:_tcp_win_zero_in}[\|]%{INT:_tcp_win_zero_out}[\|]%{INT:_tcp_est_latency_ms}[\|]%{INT:_tcp_flow_state}[\|]%{INT:_num_pkts_ttl_eq_1}[\|]%{INT:_num_pkts_ttl_2_5}[\|]%{INT:_num_pkts_ttl_5_32}[\|]%{INT:_num_pkts_ttl_32_64}[\|]%{INT:_num_pkts_ttl_64_96}[\|]%{INT:_num_pkts_ttl_96_128}[\|]%{INT:_num_pkts_ttl_128_160}[\|]%{INT:_num_pkts_ttl_160_192}[\|]%{INT:_num_pkts_ttl_192_224}[\|]%{INT:_num_pkts_ttl_224_255}[\|]%{INT:_duration_in}[\|]%{INT:_duration_out}[\|]%{INT:_tcp_win_min_in}[\|]%{INT:_tcp_win_max_in}[\|]%{INT:_tcp_win_mss_in}[\|]%{INT:_tcp_win_scale_in}[\|]%{INT:_tcp_win_min_out}[\|]%{INT:_tcp_win_max_out}[\|]%{INT:_tcp_win_mss_out}[\|]%{INT:_tcp_win_scale_out}[\|]%{DOUBLE:_appl_latency_ms}[\|]%{INT:_total_keepalive}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{DOUBLE:_client_nw_latency_ms}[\|]%{DOUBLE:_server_nw_latency_ms}[\|]%{INT:_appl_req_transfer_us}[\|]%{INT:_appl_resp_transfer_us}
#agentes.sources.source1.interceptors.interceptor1.dtmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{INT:_start_at}[\|]%{DATA:_dtype}[\|]%{DATA:_pay_load}
agentes.sources.source1.interceptors.interceptor1.addField.ttm._src_tcp = %{_src_ip}:%{_sport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._src_tcp = %{_src_ip}:%{_sport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._dport_protocol = %{_dport}:%{_protocol}
#agentes.sources.source1.interceptors.interceptor1.addField.dtm._pair = %{_src_ip}:%{_dst_ip}:%{_dport}:%{_dtype}
#agentes.sources.source1.interceptors.interceptor1.addField.ntm._pair = %{_src_ip}:%{_dst_ip}:%{_dport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._pair = %{_ipv4_src_addr}:%{_ipv4_dst_addr}:%{_l4_dst_port}:%{_l4_dst_port_map}
agentes.sources.source1.interceptors.interceptor1.serverValueFrequency = 60
agentes.sources.source1.interceptors.interceptor1.jmsDelayed = 60000
agentes.sources.source1.interceptors.interceptor1.jmsClearTime = 10000
#agentes.sources.source1.interceptors.interceptor1.bcStream = 54b4cbc014eae6d51b32203f
agentes.sources.source1.interceptors.interceptor1.splitChar = |
agentes.sources.source1.interceptors.interceptor1.ipFile = /usr/local/ezsonar/collector/china.ez
agentes.sources.source1.interceptors.interceptor1.busi_filter.keys = _trans_ref.BusiID
agentes.sources.source1.interceptors.interceptor1.busi_filter.matchKey = _trans_ref.Cid


# Describe sink1
agentes.sinks.sink1.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink1.hostNames = 172.16.1.20:9300
agentes.sinks.sink1.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink1.indexType = message
agentes.sinks.sink1.clusterName = fusionskye
agentes.sinks.sink1.batchSize = 8000
agentes.sinks.sink1.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink1.serializer.hasMessage = false
agentes.sinks.sink1.timerStatus = true
agentes.sinks.sink1.channel = channel1
agentes.sinks.sink1.refreshInterval = 10
agentes.sinks.sink1.replics = 1
agentes.sinks.sink1.shards = 3

agentes.sinks.sink2.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink2.hostNames = 172.16.1.20:9300
agentes.sinks.sink2.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink2.indexType = message
agentes.sinks.sink2.clusterName = fusionskye
agentes.sinks.sink2.batchSize = 8000
agentes.sinks.sink2.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink2.serializer.hasMessage = false
agentes.sinks.sink2.timerStatus = true
agentes.sinks.sink2.channel = channel2
agentes.sinks.sink2.refreshInterval = 10
agentes.sinks.sink2.replics = 1
agentes.sinks.sink2.shards = 3

agentes.sinks.sink3.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink3.hostNames = 172.16.1.20:9300
agentes.sinks.sink3.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink3.indexType = message
agentes.sinks.sink3.clusterName = fusionskye
agentes.sinks.sink3.batchSize = 8000
agentes.sinks.sink3.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink3.serializer.hasMessage = false
agentes.sinks.sink3.timerStatus = true
agentes.sinks.sink3.channel = channel3
agentes.sinks.sink3.refreshInterval = 10
agentes.sinks.sink3.replics = 1
agentes.sinks.sink3.shards = 3

agentes.sinks.sink4.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink4.hostNames = 172.16.1.30:9300
agentes.sinks.sink4.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink4.indexType = message
agentes.sinks.sink4.clusterName = fusionskye
agentes.sinks.sink4.batchSize = 8000
agentes.sinks.sink4.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink4.serializer.hasMessage = false
agentes.sinks.sink4.timerStatus = true
agentes.sinks.sink4.channel = channel4
agentes.sinks.sink4.refreshInterval = 10
agentes.sinks.sink4.replics = 1
agentes.sinks.sink4.shards = 3

agentes.sinks.sink5.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink5.hostNames = 172.16.1.30:9300
agentes.sinks.sink5.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink5.indexType = message
agentes.sinks.sink5.clusterName = fusionskye
agentes.sinks.sink5.batchSize = 8000
agentes.sinks.sink5.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink5.serializer.hasMessage = false
agentes.sinks.sink5.timerStatus = true
agentes.sinks.sink5.channel = channel5
agentes.sinks.sink5.refreshInterval = 10
agentes.sinks.sink5.replics = 1
agentes.sinks.sink5.shards = 3

agentes.sinks.sink6.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink6.hostNames = 172.16.1.30:9300
agentes.sinks.sink6.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink6.indexType = message
agentes.sinks.sink6.clusterName = fusionskye
agentes.sinks.sink6.batchSize = 8000
agentes.sinks.sink6.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink6.serializer.hasMessage = false
agentes.sinks.sink6.timerStatus = true
agentes.sinks.sink6.channel = channel6
agentes.sinks.sink6.refreshInterval = 10
agentes.sinks.sink6.replics = 1
agentes.sinks.sink6.shards = 3

agentes.sinks.sink7.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink7.hostNames = 172.16.1.40:9300
agentes.sinks.sink7.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink7.indexType = message
agentes.sinks.sink7.clusterName = fusionskye
agentes.sinks.sink7.batchSize = 8000
agentes.sinks.sink7.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink7.serializer.hasMessage = false
agentes.sinks.sink7.timerStatus = true
agentes.sinks.sink7.channel = channel7
agentes.sinks.sink7.refreshInterval = 10
agentes.sinks.sink7.replics = 1
agentes.sinks.sink7.shards = 3

agentes.sinks.sink8.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink8.hostNames = 172.16.1.40:9300
agentes.sinks.sink8.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink8.indexType = message
agentes.sinks.sink8.clusterName = fusionskye
agentes.sinks.sink8.batchSize = 8000
agentes.sinks.sink8.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink8.serializer.hasMessage = false
agentes.sinks.sink8.timerStatus = true
agentes.sinks.sink8.channel = channel8
agentes.sinks.sink8.refreshInterval = 10
agentes.sinks.sink8.replics = 1
agentes.sinks.sink8.shards = 3

agentes.sinks.sink9.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink9.hostNames = 172.16.1.40:9300
agentes.sinks.sink9.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink9.indexType = message
agentes.sinks.sink9.clusterName = fusionskye
agentes.sinks.sink9.batchSize = 8000
agentes.sinks.sink9.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink9.serializer.hasMessage = false
agentes.sinks.sink9.timerStatus = true
agentes.sinks.sink9.channel = channel9
agentes.sinks.sink9.refreshInterval = 10
agentes.sinks.sink9.replics = 1
agentes.sinks.sink9.shards = 3

agentes.sinks.sink10.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink10.hostNames = 172.16.1.20:9300
agentes.sinks.sink10.indexName = analyzier,heatmap_summary
agentes.sinks.sink10.indexType = message
agentes.sinks.sink10.clusterName = fusionskye
agentes.sinks.sink10.batchSize = 8000
agentes.sinks.sink10.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink10.channel = channel10
agentes.sinks.sink10.refreshInterval = 10
agentes.sinks.sink10.summaryTimeout = 50
agentes.sinks.sink10.replics = 1
agentes.sinks.sink10.shards = 3

agentes.sinks.sink11.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink11.hostNames = 172.16.1.30:9300
agentes.sinks.sink11.indexName = analyzier,heatmap_summary
agentes.sinks.sink11.indexType = message
agentes.sinks.sink11.clusterName = fusionskye
agentes.sinks.sink11.batchSize = 8000
agentes.sinks.sink11.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink11.channel = channel11
agentes.sinks.sink11.refreshInterval = 10
agentes.sinks.sink11.summaryTimeout = 50
agentes.sinks.sink11.replics = 1
agentes.sinks.sink11.shards = 3

agentes.sinks.sink12.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink12.hostNames = 172.16.1.40:9300
agentes.sinks.sink12.indexName = analyzier,heatmap_summary
agentes.sinks.sink12.indexType = message
agentes.sinks.sink12.clusterName = fusionskye
agentes.sinks.sink12.batchSize = 8000
agentes.sinks.sink12.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink12.channel = channel12
agentes.sinks.sink12.refreshInterval = 10
agentes.sinks.sink12.summaryTimeout = 50
agentes.sinks.sink12.replics = 1
agentes.sinks.sink12.shards = 3

# Use a channel which buffers events in memory
agentes.channels.channel1.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel1.capacity = 30000
agentes.channels.channel1.transactionCapacity = 8000
agentes.channels.channel1.keep-alive = 5

agentes.channels.channel2.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel2.capacity = 30000
agentes.channels.channel2.transactionCapacity = 8000
agentes.channels.channel2.keep-alive = 5

agentes.channels.channel3.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel3.capacity = 30000
agentes.channels.channel3.transactionCapacity = 8000
agentes.channels.channel3.keep-alive = 5

agentes.channels.channel4.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel4.capacity = 30000
agentes.channels.channel4.transactionCapacity = 8000
agentes.channels.channel4.keep-alive = 5

agentes.channels.channel5.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel5.capacity = 30000
agentes.channels.channel5.transactionCapacity = 8000
agentes.channels.channel5.keep-alive = 5

agentes.channels.channel6.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel6.capacity = 30000
agentes.channels.channel6.transactionCapacity = 8000
agentes.channels.channel6.keep-alive = 5

agentes.channels.channel7.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel7.capacity = 30000
agentes.channels.channel7.transactionCapacity = 8000
agentes.channels.channel7.keep-alive = 5

agentes.channels.channel8.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel8.capacity = 30000
agentes.channels.channel8.transactionCapacity = 8000
agentes.channels.channel8.keep-alive = 5

agentes.channels.channel9.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel9.capacity = 30000
agentes.channels.channel9.transactionCapacity = 8000
agentes.channels.channel9.keep-alive = 5

agentes.channels.channel10.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel10.capacity = 30000
agentes.channels.channel10.transactionCapacity = 8000
agentes.channels.channel10.keep-alive = 5

agentes.channels.channel11.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel11.capacity = 30000
agentes.channels.channel11.transactionCapacity = 8000
agentes.channels.channel11.keep-alive = 5

agentes.channels.channel12.type = com.fusionskye.ezsonar.collector.channel.BalanceMemoryChannel
agentes.channels.channel12.capacity = 30000
agentes.channels.channel12.transactionCapacity = 8000
agentes.channels.channel12.keep-alive = 5

agentes.sources.memSrc.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.memSrc.memKey = JMS
agentes.sources.memSrc.capacity = 50000
agentes.sources.memSrc.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.memSrc.batchSize = 3000
agentes.sources.memSrc.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.memSrc.selector.summaryIndex = channel10 channel11 channel12

agentes.sources.aggegation.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.aggegation.memKey = aggegation
agentes.sources.aggegation.capacity = 100000
agentes.sources.aggegation.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.aggegation.batchSize = 3000
agentes.sources.aggegation.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.aggegation.selector.summaryIndex = channel10 channel11 channel12
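The sink and channel entries in the file above follow a regular pattern: three detail sinks per ES node, then one summary sink per node, each bound to the channel with the same number. A minimal sketch that generates the host-dependent lines for any list of ES nodes, so sink and channel counts cannot drift apart. The function name is hypothetical, and only a subset of the keys is emitted; the remaining keys would be copied from the file above.

```python
DETAIL_SINK = "com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink"
SUMMARY_SINK = "com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink"


def build_sink_lines(es_hosts, detail_per_host=3, agent="agentes"):
    """Return the sink/channel property lines that depend on the ES host list."""
    sinks = []  # (number, host, kind)
    number = 0
    for host in es_hosts:  # detail sinks first, grouped by host
        for _ in range(detail_per_host):
            number += 1
            sinks.append((number, host, "detail"))
    for host in es_hosts:  # then one summary sink per host
        number += 1
        sinks.append((number, host, "summary"))

    summary_channels = [f"channel{n}" for n, _, kind in sinks if kind == "summary"]
    lines = [
        f"{agent}.sinks = " + " ".join(f"sink{n}" for n, _, _ in sinks),
        f"{agent}.channels = " + " ".join(f"channel{n}" for n, _, _ in sinks),
        f"{agent}.sources.source1.selector.summaryIndex = "
        + " ".join(summary_channels),
    ]
    for n, host, kind in sinks:
        prefix = f"{agent}.sinks.sink{n}"
        detail = kind == "detail"
        lines += [
            f"{prefix}.type = {DETAIL_SINK if detail else SUMMARY_SINK}",
            f"{prefix}.hostNames = {host}",
            f"{prefix}.indexName = "
            + ("ezsonar,ezsonarnpm" if detail else "analyzier,heatmap_summary"),
            f"{prefix}.channel = channel{n}",
        ]
    return lines


if __name__ == "__main__":
    hosts = ["172.16.1.20:9300", "172.16.1.30:9300", "172.16.1.40:9300"]
    print("\n".join(build_sink_lines(hosts)))
```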

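2.5 ES Cluster Verification

The cluster can be verified as follows:

1. Start each node and check its log: confirm that the log shows no errors and that each newly added node reports joining the cluster.

2. After startup, query the cluster health status.

3. Simulate a node failure, then check that data is still displayed correctly and review the cluster health and shard status.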
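The checks in steps 2 and 3 can be sketched in Python. This is only a minimal illustration, not part of the EZSonar tooling: the function names and the expected_nodes parameter are assumptions, and the fields read (status, number_of_nodes, unassigned_shards) are those returned by the _cluster/health endpoint.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url, timeout=5.0):
    """GET <base_url>/_cluster/health and return the decoded JSON body."""
    url = base_url.rstrip("/") + "/_cluster/health"
    with urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def health_problems(health, expected_nodes):
    """Return a list of problems found; an empty list means the cluster looks healthy."""
    problems = []
    if health.get("status") != "green":
        problems.append("status is %s" % health.get("status"))
    nodes = health.get("number_of_nodes", 0)
    if nodes < expected_nodes:
        problems.append("only %d of %d nodes joined" % (nodes, expected_nodes))
    unassigned = health.get("unassigned_shards", 0)
    if unassigned > 0:
        problems.append("%d unassigned shards" % unassigned)
    return problems
```

For example, health_problems(fetch_health("http://192.168.137.8:9200"), 3) could be run once after startup and again after stopping one node to compare the two results.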

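2.6 ES Cluster Maintenance

1. Get the ES version in a browser

Make sure the browser can reach port 9200 on the ES server, then open:

http://192.168.127.8:9200/?pretty

The browser displays the following. The version is the "number" field of the "version" object (line 6 of the output), here "number" : "1.7.0".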
{
  "status" : 200,
  "name" : "Yellow Claw",
  "cluster_name" : "fusionskye",
  "version" : {
    "number" : "1.7.0",
    "build_hash" : "929b9739cae115e73c346cb5f9a6f24ba735a743",
    "build_timestamp" : "2015-07-16T14:31:07Z",
    "build_snapshot" : false,
    "lucene_version" : "4.10.4"
  },
  "tagline" : "You Know, for Search"
}

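2. Get the ES version and health status from the command line

(1) Get the ES version

curl 'http://192.168.127.8:9200'

(2) Get the ES cluster health status

curl -XGET 'http://192.168.137.8:9200/_cluster/health?pretty'

3. View the ES cluster health in a browser

Open the head plugin page:

http://192.168.127.9:9200/_plugin/head/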

4. Stopping and starting the ES cluster in production

If the ES cluster already holds a large amount of data, restarting a node directly can trigger relocation of many shards. The restart is then slow and puts heavy load on the system. The direct effect is that transactions cannot be written to ES in time, data backs up, and the UI shows data late.

The fix is to set cluster.routing.allocation.disable_allocation to true, restart the node, and then set cluster.routing.allocation.disable_allocation back to false.

(1) Check the current value of cluster.routing.allocation.disable_allocation by opening http://172.16.1.20:9200/_cluster/settings in a browser, where the IP is that of an ES node.

(2) Disable allocation: on the ES server, run one of the following commands to set cluster.routing.allocation.disable_allocation to true.
# Transient: set cluster.routing.allocation.disable_allocation to true; lost after a full cluster restart
curl -XPUT "http://172.16.1.20:9200/_cluster/settings" -d '{ "transient" : { "cluster.routing.allocation.disable_allocation" : true } }'
# Persistent: set cluster.routing.allocation.disable_allocation to true; still in effect after a full cluster restart
curl -XPUT "http://172.16.1.20:9200/_cluster/settings" -d '{ "persistent" : { "cluster.routing.allocation.disable_allocation" : true } }'

(3) Restart the ES node.

(4) On the ES server, run the matching command to set cluster.routing.allocation.disable_allocation back to false.
# Transient: set cluster.routing.allocation.disable_allocation to false; lost after a full cluster restart
curl -XPUT "http://172.16.1.20:9200/_cluster/settings" -d '{ "transient" : { "cluster.routing.allocation.disable_allocation" : false } }'
# Persistent: set cluster.routing.allocation.disable_allocation to false; still in effect after a full cluster restart
curl -XPUT "http://172.16.1.20:9200/_cluster/settings" -d '{ "persistent" : { "cluster.routing.allocation.disable_allocation" : false } }'
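The disable, restart, re-enable sequence in steps (2) to (4) could be scripted roughly as below. This is a sketch under assumptions: the function names are illustrative, restart_node stands for whatever callable restarts the ES process in your environment, and base_url should point at a node that stays up during the restart.

```python
import json
from urllib.request import Request, urlopen

SETTING = "cluster.routing.allocation.disable_allocation"


def allocation_body(disabled, persistent=False):
    """Build the JSON body used with PUT /_cluster/settings."""
    scope = "persistent" if persistent else "transient"
    return json.dumps({scope: {SETTING: disabled}})


def put_cluster_setting(base_url, body, timeout=10.0):
    """Send the settings body to the cluster and return the decoded response."""
    url = base_url.rstrip("/") + "/_cluster/settings"
    req = Request(url, data=body.encode("utf-8"), method="PUT")
    with urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def restart_node_safely(base_url, restart_node):
    """Disable allocation, run the caller-supplied restart, then re-enable allocation."""
    put_cluster_setting(base_url, allocation_body(True))
    try:
        restart_node()
    finally:
        # Re-enable allocation even if the restart step raised an error.
        put_cluster_setting(base_url, allocation_body(False))
```

The try/finally is a deliberate choice: leaving allocation disabled after a failed restart would keep shards unassigned indefinitely.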

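Chapter 3 Collector Cluster

3.1 How the Collector Cluster Works

1. Data is load-balanced evenly across multiple Collectors and buffered in a FileChannel to keep it intact. When one Collector is backed up or fails, data is automatically routed to the other Collectors.

2. Because the load is spread evenly over multiple Collectors, each node handles less data and overall performance improves.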
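The balancing behaviour described above can be illustrated with a toy round-robin distributor that skips Collectors which are down or backed up. This is not the actual BalanceChannelSelector implementation, whose internals are not shown in this document; the Collector class, max_backlog threshold, and distribute function are hypothetical names used only for illustration.

```python
class Collector:
    """Hypothetical stand-in for one Collector endpoint."""

    def __init__(self, name, max_backlog):
        self.name = name
        self.max_backlog = max_backlog
        self.backlog = 0
        self.healthy = True

    def accepts(self):
        # A collector takes new events only if it is up and not backed up.
        return self.healthy and self.backlog < self.max_backlog


def distribute(events, collectors):
    """Round-robin events over collectors, skipping any that cannot accept."""
    assignment = {c.name: [] for c in collectors}
    cursor = 0
    for event in events:
        for attempt in range(len(collectors)):
            candidate = collectors[(cursor + attempt) % len(collectors)]
            if candidate.accepts():
                assignment[candidate.name].append(event)
                candidate.backlog += 1
                cursor = (cursor + attempt + 1) % len(collectors)
                break
        else:
            raise RuntimeError("no collector can accept events")
    return assignment
```

With two healthy collectors the events alternate between them; once one is marked unhealthy, every event goes to the other.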

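3.2 Collector Cluster Architecture Planning

3.2.1 Current State

[Figure: current Collector architecture]

Drawbacks of the current design:

(1) Single node: there is only one process, so there is no failover if it dies and no relief if data backs up.

(2) Performance bottleneck: when the data volume is large, the single node carries the whole processing load, which limits performance.

(3) Data loss on restart: data was held mainly in memory, so a restart loses it.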

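3.2.2 Cluster Plan

[Figure: planned Collector cluster architecture]

1. The number of balance (distribution) nodes is not fixed at two, but two are generally recommended for redundancy and to relieve single-node load. Two are also sufficient for the current design.

2. Use two or more data-processing nodes, sized to the actual data volume. For 1 billion records per day, two data-processing nodes are enough (figure taken from a training session given by Tao).

3. Nodes may be deployed on separate servers or on the same server, but on the same server each must start on a different port. The advantage of separate servers is that a hardware failure affects only one node and does not affect the cluster's data. The drawback is a higher server purchasing cost.

4. Balancing cost against cluster safety, one server can host one balance node plus one data-processing node, so a 2+2 layout (2 balance nodes + 2 data-processing nodes) fits on two physical servers.

5. If conditions allow, deploy the collector cluster separately from the ES cluster.

6. Give each collector instance 8-16 GB of memory (16 GB recommended). Balance nodes and data nodes get the same allocation.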
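As a rough check on the sizing guidance in item 2, the daily volume can be converted into an average per-node rate. The helper below is only illustrative arithmetic; real traffic peaks well above the daily average, so this is a lower bound rather than a capacity target.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_events_per_second(events_per_day, data_nodes):
    """Mean per-node event rate if the daily volume were spread evenly over the day."""
    return events_per_day / SECONDS_PER_DAY / data_nodes
```

For 1 billion records per day on two data-processing nodes this works out to about 5,787 records per second per node on average.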

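3.3 Collector Installation and Configuration

3.3.1 Flume Side

1) How Flume works

As brief background: Flume is a distributed log collection system. It gathers data from individual servers and delivers it to a designated destination. In short, Flume collects logs.

At its core, Flume collects data from a source and then delivers it to a destination (sink). To make sure delivery succeeds, the data is first buffered in a channel before it is sent to the sink, and Flume deletes its buffered copy only after the data has actually reached the sink.

2) Basic Flume architecture

[Figure: Flume architecture]

Flume has three core components, source --> channel --> sink, which resemble a producer, a warehouse, and a consumer.

(1) source: the component dedicated to collecting data. It can handle log data of many types and formats, including avro and exec.

(2) channel: after the source collects data, the data is held temporarily in the channel. Within an agent, the channel exists to hold this temporary data and acts as a simple buffer. It can be backed by memory, jdbc, file, and so on.

(3) sink: the component that sends data to its destination. Destinations include avro, file, and so on.

After the source receives data, it passes the data to the channel, which buffers it. The sink then sends the channel's data to the configured destination, and the channel deletes the buffered data only after the sink has sent it successfully. This mechanism makes data transfer reliable and safe.

Note: in this design, channel data is stored in files.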
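The delete-only-after-delivery rule can be shown with a toy channel. This is a simplified in-memory sketch of the idea, not Flume's transaction API; BufferChannel and its drain method are hypothetical names.

```python
from collections import deque


class BufferChannel:
    """Toy channel: events stay buffered until the sink confirms delivery."""

    def __init__(self):
        self._events = deque()

    def put(self, event):
        self._events.append(event)

    def __len__(self):
        return len(self._events)

    def drain(self, send, batch_size):
        """Try to deliver up to batch_size events; return how many were removed."""
        batch = list(self._events)[:batch_size]
        if not batch:
            return 0
        try:
            send(batch)
        except Exception:
            # Rollback: nothing was removed, so the same batch is retried later.
            return 0
        for _ in batch:
            # Commit: delete only after the send succeeded.
            self._events.popleft()
        return len(batch)
```

If send raises, the channel still holds every event, which is why a file-backed channel can survive both a failing sink and a process restart.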

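3) Installing Flume

rpm -ivh jdk-8u65-linux-x64.rpm

tar -xf apache-flume-1.6.0-bin.tar.gz -C /usr/local/ezsonar/

4) Flume deployment architecture

[Figure: Flume deployment architecture]

5) Flume configuration

a) Declare the sources, sinks, and channels used by the agent.

In this design the collector cluster is 2+2, that is, 2 balance nodes + 2 data nodes. The configuration below therefore defines two channels and two sinks, one pair per collector balance node, so that the data is load-balanced between them.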
# Declare the source
agent.sources = ttmSrc
# Declare the channels
agent.channels = memoryChannel1 memoryChannel2
# Declare the sinks
agent.sinks = avroSink1 avroSink2
b) Define each source, sink, and channel of the agent in detail.
# Configure the source
agent.sources.ttmSrc.type = spooldir
agent.sources.ttmSrc.channels = memoryChannel1 memoryChannel2
agent.sources.ttmSrc.spoolDir = /ezdata/dump/eth2/app
agent.sources.ttmSrc.batchSize = 500
agent.sources.ttmSrc.deletePolicy = immediate
agent.sources.ttmSrc.ignorePattern = ^.*[temp|failover]$
agent.sources.ttmSrc.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agent.sources.ttmSrc.consumeOrder = youngest
# Configure sink1
agent.sinks.avroSink1.type = avro
agent.sinks.avroSink1.hostname = 192.168.137.8
agent.sinks.avroSink1.port = 44444
agent.sinks.avroSink1.channel = memoryChannel1
agent.sinks.avroSink1.batch-size = 1000
# Configure sink2
agent.sinks.avroSink2.type = avro
agent.sinks.avroSink2.hostname = 192.168.137.9
agent.sinks.avroSink2.port = 44444
agent.sinks.avroSink2.channel = memoryChannel2
agent.sinks.avroSink2.batch-size = 1000
# Configure channel1
agent.channels.memoryChannel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agent.channels.memoryChannel1.capacity = 20000
agent.channels.memoryChannel1.transactionCapacity = 5000
agent.channels.memoryChannel1.checkpointDir = /ezdata/channel1/checkpoint
agent.channels.memoryChannel1.dataDirs = /ezdata/channel1/data
agent.channels.memoryChannel1.maxFileSize = 1048576

# Configure channel2
agent.channels.memoryChannel2.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agent.channels.memoryChannel2.capacity = 20000
agent.channels.memoryChannel2.transactionCapacity = 5000
agent.channels.memoryChannel2.checkpointDir = /ezdata/channel2/checkpoint
agent.channels.memoryChannel2.dataDirs = /ezdata/channel2/data
agent.channels.memoryChannel2.maxFileSize = 1048576

Note: the channels and sinks are doubled for load balancing (for example, if there was previously one sink, change it to two; if there were already two, increase to four).
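Because the channel and sink wiring repeats once per balance node, it can be generated rather than typed. The sketch below is an illustration only: flume_agent_config is a hypothetical helper, it reuses the names from the configuration above (agent, ttmSrc, memoryChannelN, avroSinkN), and it renders only the wiring lines, not the channel or source tuning parameters.

```python
def flume_agent_config(balance_hosts, port=44444, agent="agent", source="ttmSrc"):
    """Render the wiring for one channel/sink pair per balance node."""
    count = len(balance_hosts)
    channels = ["memoryChannel%d" % i for i in range(1, count + 1)]
    sinks = ["avroSink%d" % i for i in range(1, count + 1)]
    lines = [
        "%s.sources = %s" % (agent, source),
        "%s.channels = %s" % (agent, " ".join(channels)),
        "%s.sinks = %s" % (agent, " ".join(sinks)),
        "%s.sources.%s.channels = %s" % (agent, source, " ".join(channels)),
    ]
    for host, channel, sink in zip(balance_hosts, channels, sinks):
        prefix = "%s.sinks.%s" % (agent, sink)
        lines += [
            prefix + ".type = avro",
            prefix + ".hostname = " + host,
            prefix + ".port = %d" % port,
            prefix + ".channel = " + channel,
        ]
    return "\n".join(lines)
```

Calling flume_agent_config(["192.168.137.8", "192.168.137.9"]) yields the two-sink layout used in this section; adding a third host adds a third channel/sink pair.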

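6) Installing the Flume-side plugin

Flume was originally set up to buffer channel data in memory, whereas the high-availability cluster stores it in files. This approach eases memory and I/O usage, keeps data for a long time, and does not lose data when a process is interrupted. The way Flume connects to the collector must be changed accordingly.

c) Add the jar for the collector cluster

For example, the collector cluster currently uses version r1380.

Place the jar in the following directory:

/usr/local/ezsonar/flume/lib

d) Add the selector type to the source

agent.sources.ttmSrc.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector

e) Change the channel type to the balance file channel

agent.channels.memoryChannel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel

f) Startup command (replace <flume-config-name> with the name of your Flume configuration file)

bin/flume-ng agent -c conf -f conf/<flume-config-name>.conf -n agent -Dflume.monitoring.type=http -Dflume.monitoring.port=34546 &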

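3.3.2 Collector-Balance Side

1. How the Balance side works

Balance is the collector's distribution node. It takes the data received from Flume and distributes it downstream. This lets data be processed efficiently and allocated according to each node's performance, which provides load balancing and high availability.

2. Installing the Balance side

(1) Install the JDK

rpm -ivh jdk-8u65-linux-x64.rpm

(2) Add environment variables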
export JAVA_HOME=/usr/java/jdk1.8.0_65
export JRE_HOME=/usr/java/jdk1.8.0_65/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin
export fusionskye_home=/usr/local/ezsonar/common
export FLUME_CLASSPATH=/usr/local/ezsonar/collector/plugins.d

(3) Install common

Note: fusionskye.jar must be the encrypted build, which is 237 KB. If the existing file is 220 KB, update it.

tar -xf common.tar.gz -C /usr/local/ezsonar/

(4) Install the library package (libEZSonar_L.so and libcryptopp.so)

unzip lib.zip -d /usr/lib64

(5) Install the collector

tar -xf collector.tar.gz -C /usr/local/ezsonar/

(6) Rename the collector directory

To tell the collector directories of balance nodes and data nodes apart, balance nodes use the directory name collector-balance and data nodes use collector-data.

mv /usr/local/ezsonar/collector /usr/local/ezsonar/collector-balance1

(7) Update the collector version

If the installation media contains version r1201, update it to r1380.
## Copy the r1380 build to /usr/local/ezsonar/collector/plugins.d/EZSonar/lib
cp /home/ezsonar/ezsonar-collector-r1380-es1.7.0-jar-with-dependencies.jar /usr/local/ezsonar/collector/plugins.d/EZSonar/lib
## Remove the old symlink and recreate it (run from the lib directory above)
ln -sf ezsonar-collector-r1380-es1.7.0-jar-with-dependencies.jar ezsonar-collector-jar-with-dependencies.jar

(8) Edit the runtime configuration file

New parameters:

1) modelType: all means standalone operation, data means data processing, and loadBalance means load balancing. If it is not set, all is used.

agentes.sources.source1.interceptors.interceptor1.modelType = loadBalance

Modified parameters:

1) Sinks come in groups of 10, and each group sends to one data node. The number of channels equals the number of sinks, paired one to one.

For example, with two data nodes:

agentes.sinks = sink1 sink2 sink3 sink4 sink5 sink6 sink7 sink8 sink9 sink10 sink11 sink12 sink13 sink14 sink15 sink16 sink17 sink18 sink19 sink20
agentes.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12 channel13 channel14 channel15 channel16 channel17 channel18 channel19 channel20
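The grouping rule (10 sinks per data node, one channel per sink) can be expressed as a small helper. The names below are illustrative assumptions, not part of the collector; only the group size of 10 comes from the text.

```python
SINKS_PER_DATA_NODE = 10  # group size stated in the text


def names(kind, data_nodes):
    """List sink or channel names for the given number of data nodes."""
    total = data_nodes * SINKS_PER_DATA_NODE
    return ["%s%d" % (kind, i) for i in range(1, total + 1)]


def data_node_for(index):
    """Map a 1-based sink (or channel) index to its 1-based data node index."""
    return (index - 1) // SINKS_PER_DATA_NODE + 1
```

With two data nodes, " ".join(names("sink", 2)) reproduces the agentes.sinks value above, and sinks 1 to 10 map to the first data node while sinks 11 to 20 map to the second.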

2) The source type is avro.

agentes.sources.source1.type = avro

3) Set agentes.sources.source1.interceptors.interceptor1.host to the node name (for example coll-balance1). Every node in the cluster must have a different name.

agentes.sources.source1.interceptors.interceptor1.host = coll-balance1

4) Channels use file mode, for example:

agentes.channels.channel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel

5) New channel parameters (replace <channel-name> with the name of the channel):

agentes.channels.<channel-name>.checkpointDir = /ezdata/b-<channel-name>/checkpoint

agentes.channels.<channel-name>.dataDirs = /ezdata/b-<channel-name>/data

agentes.channels.<channel-name>.maxFileSize = 1048576 (unit: KB; by default 2 GB of data is kept, and the data is cleared once the limit is exceeded)

For example:
agentes.channels.channel1.checkpointDir = /ezdata/b-channel1/checkpoint
agentes.channels.channel1.dataDirs = /ezdata/b-channel1/data
agentes.channels.channel1.maxFileSize = 1048576
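Since every file channel needs its own checkpoint and data directory, the per-channel lines follow a fixed pattern. A minimal sketch, assuming the /ezdata/<prefix>-<channel-name> layout shown above (prefix b on the Balance side); file_channel_lines is a hypothetical helper name.

```python
CHANNEL_TYPE = "com.fusionskye.ezsonar.collector.channel.BalanceFileChannel"


def file_channel_lines(name, prefix="b", agent="agentes", max_file_size=1048576):
    """Render the type, directory, and size settings for one file channel."""
    base = "%s.channels.%s" % (agent, name)
    root = "/ezdata/%s-%s" % (prefix, name)
    return [
        base + ".type = " + CHANNEL_TYPE,
        base + ".checkpointDir = " + root + "/checkpoint",
        base + ".dataDirs = " + root + "/data",
        base + ".maxFileSize = %d" % max_file_size,
    ]
```

Generating the lines this way helps keep two channels from accidentally sharing a directory, which is easy to do when copying blocks by hand.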

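Notes:

If the collector Balance and Data processes run on the same server, they must not use the same port. Flume connects to the collector Balance node, so take care to configure the correct port.
# Source type, bind address, and port of the collector-balance node
agentes.sources.source1.type = avro
agentes.sources.source1.bind = 0.0.0.0
agentes.sources.source1.port = 44444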
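A quick way to catch the port clash described in the note is to list every listening process with its server and port and look for duplicates. The function and the tuple layout below are assumptions made for illustration.

```python
from collections import defaultdict


def port_conflicts(listeners):
    """listeners: iterable of (process_name, server, port).

    Returns {(server, port): [process names]} for every endpoint used more than once.
    """
    by_endpoint = defaultdict(list)
    for name, server, port in listeners:
        by_endpoint[(server, port)].append(name)
    return {ep: procs for ep, procs in by_endpoint.items() if len(procs) > 1}
```

An empty result means no two processes on the same server share a port.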
(9) Complete reference configuration:
agentes.sources = source1 memSrc aggegation
agentes.sinks = sink1 sink2 sink3 sink4 sink5 sink6 sink7 sink8 sink9 sink10 sink11 sink12 sink13 sink14 sink15 sink16 sink17 sink18 sink19 sink20
agentes.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12 channel13 channel14 channel15 channel16 channel17 channel18 channel19 channel20

# Describe/configure source1
agentes.sources.source1.type = avro
agentes.sources.source1.bind = 0.0.0.0
agentes.sources.source1.port = 44444
agentes.sources.source1.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12 channel13 channel14 channel15 channel16 channel17 channel18 channel19 channel20
agentes.sources.source1.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.source1.interceptors = interceptor1
agentes.sources.source1.interceptors.interceptor1.type = com.fusionskye.ezsonar.collector.interceptor.EZSonarSourceInterceptor$Builder
agentes.sources.source1.interceptors.interceptor1.mongoHost = 127.0.0.1
agentes.sources.source1.interceptors.interceptor1.mongoPort = 27017
agentes.sources.source1.interceptors.interceptor1.mongoDatabase = ezsonar
agentes.sources.source1.interceptors.interceptor1.mongoUser = ezsonaruser
agentes.sources.source1.interceptors.interceptor1.mongoPassword = 123
agentes.sources.source1.interceptors.interceptor1.host = coll-balance1
agentes.sources.source1.interceptors.interceptor1.streamFrequency = 60
agentes.sources.source1.interceptors.interceptor1.hostLocationFrequency = 60
agentes.sources.source1.interceptors.interceptor1.metricFrequency = 20
agentes.sources.source1.interceptors.interceptor1.cacheMaximumSize = 1000
agentes.sources.source1.interceptors.interceptor1.cacheExpire = 10
agentes.sources.source1.interceptors.interceptor1.filters = geo_ip, add_field,app
agentes.sources.source1.interceptors.interceptor1.databaseFile = /usr/local/ezsonar/collector/GeoLite2-City.mmdb
agentes.sources.source1.interceptors.interceptor1.debugStatus = false
agentes.sources.source1.interceptors.interceptor1.debugMessage = false
agentes.sources.source1.interceptors.interceptor1.timerStatus = false
agentes.sources.source1.interceptors.interceptor1.ttmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{DATA:_trans_id}[\|]%{DATA:_trans_ref}[\|]%{DATA:_ret_code}[\|]%{DATA:_ret_code_x}[\|]%{INT:_in_pkts}[\|]%{INT:_in_bytes}[\|]%{INT:_out_pkts}[\|]%{INT:_out_bytes}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_in_ooo}[\|]%{INT:_out_ooo}[\|]%{INT:_latency_msec}[\|]%{INT:_tot_syn}[\|]%{INT:_tot_synack}[\|]%{INT:_tot_fin}[\|]%{INT:_tot_fin_s}[\|]%{INT:_tot_rst}[\|]%{INT:_tot_rst_s}[\|]%{INT:_tot_zero_server}[\|]%{INT:_tot_zero_client}[\|]%{INT:_rtt}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{DATA:_start_at_s}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{INT:_protocol}[\|]%{INT:_trans_transfer_ms}
agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_input_snmp}[\|]%{INT:_output_snmp}[\|]%{INT:_in_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_protocol}[\|]%{DATA:_protocol_map}[\|]%{INT:_src_tos}[\|]%{INT:_tcp_flags}[\|]%{INT:_l4_src_port}[\|]%{DATA:_l4_src_port_map}[\|]%{DATA:_ipv4_src_addr}[\|]%{INT:_l4_dst_port}[\|]%{DATA:_l4_dst_port_map}[\|]%{DATA:_ipv4_dst_addr}[\|]%{INT:_src_as}[\|]%{INT:_dst_as}[\|]%{INT:_last_switched}[\|]%{INT:_start_at}[\|]%{INT:_out_bytes}[\|]%{INT:_out_pkts}[\|]%{DATA:_ipv6_src_addr}[\|]%{DATA:_ipv6_dst_addr}[\|]%{INT:_icmp_type}[\|]%{DATA:_in_src_mac}[\|]%{DATA:_out_dst_mac}[\|]%{INT:_src_vlan}[\|]%{INT:_dst_vlan}[\|]%{INT:_ip_protocol_version}[\|]%{INT:_direction}[\|]%{INT:_fragments}[\|]%{DOUBLE:_total_nw_latency_ms}[\|]%{INT:_num_pkts_up_to_128_bytes}[\|]%{INT:_num_pkts_128_to_256_bytes}[\|]%{INT:_num_pkts_256_to_512_bytes}[\|]%{INT:_num_pkts_512_to_1024_bytes}[\|]%{INT:_num_pkts_1024_to_1514_bytes}[\|]%{INT:_num_pkts_over_1514_bytes}[\|]%{INT:_retransmitted_in_pkts}[\|]%{INT:_retransmitted_out_pkts}[\|]%{INT:_ooorder_in_pkts}[\|]%{INT:_ooorder_out_pkts}[\|]%{INT:_tcp_win_zero_in}[\|]%{INT:_tcp_win_zero_out}[\|]%{INT:_tcp_est_latency_ms}[\|]%{INT:_tcp_flow_state}[\|]%{INT:_num_pkts_ttl_eq_1}[\|]%{INT:_num_pkts_ttl_2_5}[\|]%{INT:_num_pkts_ttl_5_32}[\|]%{INT:_num_pkts_ttl_32_64}[\|]%{INT:_num_pkts_ttl_64_96}[\|]%{INT:_num_pkts_ttl_96_128}[\|]%{INT:_num_pkts_ttl_128_160}[\|]%{INT:_num_pkts_ttl_160_192}[\|]%{INT:_num_pkts_ttl_192_224}[\|]%{INT:_num_pkts_ttl_224_255}[\|]%{INT:_duration_in}[\|]%{INT:_duration_out}[\|]%{INT:_tcp_win_min_in}[\|]%{INT:_tcp_win_max_in}[\|]%{INT:_tcp_win_mss_in}[\|]%{INT:_tcp_win_scale_in}[\|]%{INT:_tcp_win_min_out}[\|]%{INT:_tcp_win_max_out}[\|]%{INT:_tcp_win_mss_out}[\|]%{INT:_tcp_win_scale_out}[\|]%{DOUBLE:_appl_latency_ms}[\|]%{INT:_total_keepalive}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{DOUBLE:_client_nw_latency_ms}[\|]%{DOUBLE:_server_nw_latency_ms}

agentes.sources.source1.interceptors.interceptor1.dtmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{INT:_start_at}[\|]%{DATA:_dtype}[\|]%{DATA:_pay_load}

agentes.sources.source1.interceptors.interceptor1.rtmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_start_at_us}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{DATA:_trans_ref}[\|]%{DATA:sn}[\|]%{DATA:rsp_header}[\|]%{DATA:req_header}[\|]%{INT:_in_pkts}[\|]%{INT:_in_bytes}[\|]%{INT:_out_pkts}[\|]%{INT:_out_bytes}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_in_ooo}[\|]%{INT:_out_ooo}[\|]%{INT:_latency_msec}[\|]%{INT:_tot_syn}[\|]%{INT:_tot_synack}[\|]%{INT:_tot_fin}[\|]%{INT:_tot_fin_s}[\|]%{INT:_tot_rst}[\|]%{INT:_tot_rst_s}[\|]%{INT:_tot_zero_server}[\|]%{INT:_tot_zero_client}[\|]%{INT:_rtt}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{DATA:_start_at_s}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{INT:_protocol}[\|]%{INT:_trans_transfer_ms}

agentes.sources.source1.interceptors.interceptor1.stmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_stat}


agentes.sources.source1.interceptors.interceptor1.addField.ttm._src_tcp = %{_src_ip}:%{_sport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._pair = %{_ipv4_src_addr}:%{_ipv4_dst_addr}:%{_l4_dst_port}:%{_l4_dst_port_map}
agentes.sources.source1.interceptors.interceptor1.addField.dtm._pair = %{_src_ip}:%{_dst_ip}:%{_dport}:%{_dtype}
agentes.sources.source1.interceptors.interceptor1.serverValueFrequency = 60
agentes.sources.source1.interceptors.interceptor1.jmsDelayed = 60000
agentes.sources.source1.interceptors.interceptor1.jmsClearTime = 10000
agentes.sources.source1.interceptors.interceptor1.splitChar = |
agentes.sources.source1.interceptors.interceptor1.ipFile = /usr/local/ezsonar/collector/china.ez
agentes.sources.source1.interceptors.interceptor1.amountIsYuan = false
# all = standalone, data = data processing, loadBalance = load balancing; defaults to all if unset
agentes.sources.source1.interceptors.interceptor1.modelType = loadBalance

# Describe the sinks
agentes.sinks.sink1.type = avro
agentes.sinks.sink1.hostname = 192.168.137.8
agentes.sinks.sink1.port = 44446
agentes.sinks.sink1.channel = channel1
agentes.sinks.sink1.batchSize = 1000

agentes.sinks.sink2.type = avro
agentes.sinks.sink2.hostname = 192.168.137.8
agentes.sinks.sink2.port = 44446
agentes.sinks.sink2.channel = channel2
agentes.sinks.sink2.batchSize = 1000

agentes.sinks.sink3.type = avro
agentes.sinks.sink3.hostname = 192.168.137.8
agentes.sinks.sink3.port = 44446
agentes.sinks.sink3.channel = channel3
agentes.sinks.sink3.batchSize = 1000

agentes.sinks.sink4.type = avro
agentes.sinks.sink4.hostname = 192.168.137.8
agentes.sinks.sink4.port = 44446
agentes.sinks.sink4.channel = channel4
agentes.sinks.sink4.batchSize = 1000

agentes.sinks.sink5.type = avro
agentes.sinks.sink5.hostname = 192.168.137.8
agentes.sinks.sink5.port = 44446
agentes.sinks.sink5.channel = channel5
agentes.sinks.sink5.batchSize = 1000

agentes.sinks.sink6.type = avro
agentes.sinks.sink6.hostname = 192.168.137.8
agentes.sinks.sink6.port = 44446
agentes.sinks.sink6.channel = channel6
agentes.sinks.sink6.batchSize = 1000

agentes.sinks.sink7.type = avro
agentes.sinks.sink7.hostname = 192.168.137.8
agentes.sinks.sink7.port = 44446
agentes.sinks.sink7.channel = channel7
agentes.sinks.sink7.batchSize = 1000

agentes.sinks.sink8.type = avro
agentes.sinks.sink8.hostname = 192.168.137.8
agentes.sinks.sink8.port = 44446
agentes.sinks.sink8.channel = channel8
agentes.sinks.sink8.batchSize = 1000

agentes.sinks.sink9.type = avro
agentes.sinks.sink9.hostname = 192.168.137.8
agentes.sinks.sink9.port = 44446
agentes.sinks.sink9.channel = channel9
agentes.sinks.sink9.batchSize = 1000


agentes.sinks.sink10.type = avro
agentes.sinks.sink10.hostname = 192.168.137.8
agentes.sinks.sink10.port = 44446
agentes.sinks.sink10.channel = channel10
agentes.sinks.sink10.batchSize = 1000

agentes.sinks.sink11.type = avro
agentes.sinks.sink11.hostname = 192.168.137.9
agentes.sinks.sink11.port = 44447
agentes.sinks.sink11.channel = channel11
agentes.sinks.sink11.batchSize = 1000

agentes.sinks.sink12.type = avro
agentes.sinks.sink12.hostname = 192.168.137.9
agentes.sinks.sink12.port = 44447
agentes.sinks.sink12.channel = channel12
agentes.sinks.sink12.batchSize = 1000

agentes.sinks.sink13.type = avro
agentes.sinks.sink13.hostname = 192.168.137.9
agentes.sinks.sink13.port = 44447
agentes.sinks.sink13.channel = channel13
agentes.sinks.sink13.batchSize = 1000

agentes.sinks.sink14.type = avro
agentes.sinks.sink14.hostname = 192.168.137.9
agentes.sinks.sink14.port = 44447
agentes.sinks.sink14.channel = channel14
agentes.sinks.sink14.batchSize = 1000

agentes.sinks.sink15.type = avro
agentes.sinks.sink15.hostname = 192.168.137.9
agentes.sinks.sink15.port = 44447
agentes.sinks.sink15.channel = channel15
agentes.sinks.sink15.batchSize = 1000

agentes.sinks.sink16.type = avro
agentes.sinks.sink16.hostname = 192.168.137.9
agentes.sinks.sink16.port = 44447
agentes.sinks.sink16.channel = channel16
agentes.sinks.sink16.batchSize = 1000

agentes.sinks.sink17.type = avro
agentes.sinks.sink17.hostname = 192.168.137.9
agentes.sinks.sink17.port = 44447
agentes.sinks.sink17.channel = channel17
agentes.sinks.sink17.batchSize = 1000

agentes.sinks.sink18.type = avro
agentes.sinks.sink18.hostname = 192.168.137.9
agentes.sinks.sink18.port = 44447
agentes.sinks.sink18.channel = channel18
agentes.sinks.sink18.batchSize = 1000

agentes.sinks.sink19.type = avro
agentes.sinks.sink19.hostname = 192.168.137.9
agentes.sinks.sink19.port = 44447
agentes.sinks.sink19.channel = channel19
agentes.sinks.sink19.batchSize = 1000

agentes.sinks.sink20.type = avro
agentes.sinks.sink20.hostname = 192.168.137.9
agentes.sinks.sink20.port = 44447
agentes.sinks.sink20.channel = channel20
agentes.sinks.sink20.batchSize = 1000

# Use channels that buffer events in files
agentes.channels.channel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel1.capacity = 50000
agentes.channels.channel1.transactionCapacity = 1000
agentes.channels.channel1.checkpointDir = /ezdata/b-channel1/checkpoint
agentes.channels.channel1.dataDirs = /ezdata/b-channel1/data
agentes.channels.channel1.maxFileSize = 1048576

agentes.channels.channel2.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel2.capacity = 50000
agentes.channels.channel2.transactionCapacity = 1000
agentes.channels.channel2.checkpointDir = /ezdata/b-channel2/checkpoint
agentes.channels.channel2.dataDirs = /ezdata/b-channel2/data
agentes.channels.channel2.maxFileSize = 1048576

agentes.channels.channel3.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel3.capacity = 50000
agentes.channels.channel3.transactionCapacity = 1000
agentes.channels.channel3.checkpointDir = /ezdata/b-channel3/checkpoint
agentes.channels.channel3.dataDirs = /ezdata/b-channel3/data
agentes.channels.channel3.maxFileSize = 1048576

agentes.channels.channel4.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel4.capacity = 50000
agentes.channels.channel4.transactionCapacity = 1000
agentes.channels.channel4.checkpointDir = /ezdata/b-channel4/checkpoint
agentes.channels.channel4.dataDirs = /ezdata/b-channel4/data
agentes.channels.channel4.maxFileSize = 1048576

agentes.channels.channel5.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel5.capacity = 50000
agentes.channels.channel5.transactionCapacity = 1000
agentes.channels.channel5.checkpointDir = /ezdata/b-channel5/checkpoint
agentes.channels.channel5.dataDirs = /ezdata/b-channel5/data
agentes.channels.channel5.maxFileSize = 1048576

agentes.channels.channel6.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel6.capacity = 50000
agentes.channels.channel6.transactionCapacity = 1000
agentes.channels.channel6.checkpointDir = /ezdata/b-channel6/checkpoint
agentes.channels.channel6.dataDirs = /ezdata/b-channel6/data
agentes.channels.channel6.maxFileSize = 1048576

agentes.channels.channel7.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel7.capacity = 50000
agentes.channels.channel7.transactionCapacity = 1000
agentes.channels.channel7.checkpointDir = /ezdata/b-channel7/checkpoint
agentes.channels.channel7.dataDirs = /ezdata/b-channel7/data
agentes.channels.channel7.maxFileSize = 1048576

agentes.channels.channel8.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel8.capacity = 50000
agentes.channels.channel8.transactionCapacity = 1000
agentes.channels.channel8.checkpointDir = /ezdata/b-channel8/checkpoint
agentes.channels.channel8.dataDirs = /ezdata/b-channel8/data
agentes.channels.channel8.maxFileSize = 1048576

agentes.channels.channel9.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel9.capacity = 50000
agentes.channels.channel9.transactionCapacity = 1000
agentes.channels.channel9.checkpointDir = /ezdata/b-channel9/checkpoint
agentes.channels.channel9.dataDirs = /ezdata/b-channel9/data
agentes.channels.channel9.maxFileSize = 1048576

agentes.channels.channel10.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel10.capacity = 50000
agentes.channels.channel10.transactionCapacity = 1000
agentes.channels.channel10.checkpointDir = /ezdata/b-channel10/checkpoint
agentes.channels.channel10.dataDirs = /ezdata/b-channel10/data
agentes.channels.channel10.maxFileSize = 1048576

agentes.channels.channel11.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel11.capacity = 50000
agentes.channels.channel11.transactionCapacity = 1000
agentes.channels.channel11.checkpointDir = /ezdata/b-channel11/checkpoint
agentes.channels.channel11.dataDirs = /ezdata/b-channel11/data
agentes.channels.channel11.maxFileSize = 1048576

agentes.channels.channel12.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel12.capacity = 50000
agentes.channels.channel12.transactionCapacity = 1000
agentes.channels.channel12.checkpointDir = /ezdata/b-channel12/checkpoint
agentes.channels.channel12.dataDirs = /ezdata/b-channel12/data
agentes.channels.channel12.maxFileSize = 1048576

agentes.channels.channel13.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel13.capacity = 50000
agentes.channels.channel13.transactionCapacity = 1000
agentes.channels.channel13.checkpointDir = /ezdata/b-channel13/checkpoint
agentes.channels.channel13.dataDirs = /ezdata/b-channel13/data
agentes.channels.channel13.maxFileSize = 1048576

agentes.channels.channel14.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel14.capacity = 50000
agentes.channels.channel14.transactionCapacity = 1000
agentes.channels.channel14.checkpointDir = /ezdata/b-channel14/checkpoint
agentes.channels.channel14.dataDirs = /ezdata/b-channel14/data
agentes.channels.channel14.maxFileSize = 1048576

agentes.channels.channel15.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel15.capacity = 50000
agentes.channels.channel15.transactionCapacity = 1000
agentes.channels.channel15.checkpointDir = /ezdata/b-channel15/checkpoint
agentes.channels.channel15.dataDirs = /ezdata/b-channel15/data
agentes.channels.channel15.maxFileSize = 1048576

agentes.channels.channel16.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel16.capacity = 50000
agentes.channels.channel16.transactionCapacity = 1000
agentes.channels.channel16.checkpointDir = /ezdata/b-channel16/checkpoint
agentes.channels.channel16.dataDirs = /ezdata/b-channel16/data
agentes.channels.channel16.maxFileSize = 1048576

agentes.channels.channel17.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel17.capacity = 50000
agentes.channels.channel17.transactionCapacity = 1000
agentes.channels.channel17.checkpointDir = /ezdata/b-channel17/checkpoint
agentes.channels.channel17.dataDirs = /ezdata/b-channel17/data
agentes.channels.channel17.maxFileSize = 1048576

agentes.channels.channel18.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel18.capacity = 50000
agentes.channels.channel18.transactionCapacity = 1000
agentes.channels.channel18.checkpointDir = /ezdata/b-channel18/checkpoint
agentes.channels.channel18.dataDirs = /ezdata/b-channel18/data
agentes.channels.channel18.maxFileSize = 1048576

agentes.channels.channel19.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel19.capacity = 50000
agentes.channels.channel19.transactionCapacity = 1000
agentes.channels.channel19.checkpointDir = /ezdata/b-channel19/checkpoint
agentes.channels.channel19.dataDirs = /ezdata/b-channel19/data
agentes.channels.channel19.maxFileSize = 1048576

agentes.channels.channel20.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel20.capacity = 50000
agentes.channels.channel20.transactionCapacity = 1000
agentes.channels.channel20.checkpointDir = /ezdata/b-channel20/checkpoint
agentes.channels.channel20.dataDirs = /ezdata/b-channel20/data
agentes.channels.channel20.maxFileSize = 1048576

agentes.sources.memSrc.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.memSrc.memKey = JMS
agentes.sources.memSrc.capacity = 50000
agentes.sources.memSrc.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12 channel13 channel14 channel15 channel16 channel17 channel18 channel19 channel20
agentes.sources.memSrc.batchSize = 1000
agentes.sources.memSrc.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector

agentes.sources.aggegation.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.aggegation.memKey = aggegation
agentes.sources.aggegation.capacity = 50000
agentes.sources.aggegation.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12 channel13 channel14 channel15 channel16 channel17 channel18 channel19 channel20
agentes.sources.aggegation.batchSize = 1000
agentes.sources.aggegation.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector

(10) Startup script

export FLUME_CLASSPATH=/usr/local/ezsonar/collector/plugins.d
bin/flume-ng agent -c conf -f conf/es-source-balance.properties -n agentes -Dflume.monitoring.type=http -Dflume.monitoring.port=34546 -Dflume.root.logger=ERROR,LOGFILE -Dflume.log.file=collector-balance.log &
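With -Dflume.monitoring.type=http, Flume serves its counters as JSON at /metrics on the monitoring port, which gives a simple way to spot channels that are backing up. The sketch below assumes the custom EZSonar channel classes report the standard ChannelSize and ChannelCapacity counters, which this document does not confirm; the function names and the 0.8 threshold are illustrative.

```python
import json
from urllib.request import urlopen


def fetch_metrics(host, port=34546, timeout=5.0):
    """GET http://<host>:<port>/metrics and return the decoded JSON body."""
    with urlopen("http://%s:%d/metrics" % (host, port), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def backed_up_channels(metrics, threshold=0.8):
    """Return {channel name: fill ratio} for channels at or above the threshold."""
    result = {}
    for component, values in metrics.items():
        if not component.startswith("CHANNEL."):
            continue
        # Flume reports counter values as strings, so convert before dividing.
        capacity = float(values.get("ChannelCapacity", 0))
        size = float(values.get("ChannelSize", 0))
        if capacity > 0 and size / capacity >= threshold:
            result[component[len("CHANNEL."):]] = size / capacity
    return result
```

For example, backed_up_channels(fetch_metrics("192.168.137.8")) could be polled periodically on each Balance node.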

3.3.3 Collector-Data Side

(1) Install the JDK

rpm -ivh jdk-8u65-linux-x64.rpm

(2) Add environment variables

export JAVA_HOME=/usr/java/jdk1.8.0_65
export JRE_HOME=/usr/java/jdk1.8.0_65/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:$JAVA_HOME/bin
export fusionskye_home=/usr/local/ezsonar/common
export FLUME_CLASSPATH=/usr/local/ezsonar/collector/plugins.d

(3) 安装common

注意:fusionskye.jar的版本为加密版本,大小为237k,如果原来的版本大小为220k的,请更新版本。

[[email protected]]# tar -xf common.tar.gz -C /usr/local/ezsonar/

(4) 安装Lib包(libEZSonar_L.so和libcryptopp.so)

[[email protected]]# unzip lib.zip –r /usr/lib64

(5) 安装collector

[[email protected]]#tar -xf collector.tar.gz -C /usr/local/ezsonar/

(6) 重命名collector目录名

为了更好区分分配节点和数据节点的collector目录,分配节点的目录采用collector-balance,数据节点目录采用collector-data

[[email protected]]#mv /usr/local/ezsonar/collector /usr/local/ezsonar/collector-data1

(7) Update the collector version

If the installation media contains version r1201, update it to r1380.

## Copy the r1380 build into /usr/local/ezsonar/collector/plugins.d/EZSonar/lib
cp /home/ezsonar/ezsonar-collector-r1380-es1.7.0-jar-with-dependencies.jar /usr/local/ezsonar/collector/plugins.d/EZSonar/lib
## Remove the old symlink and re-create it (-f replaces the existing link)
cd /usr/local/ezsonar/collector/plugins.d/EZSonar/lib
ln -sf ezsonar-collector-r1380-es1.7.0-jar-with-dependencies.jar ezsonar-collector-jar-with-dependencies.jar

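(8) Modify the runtime configuration file

New parameters:

1) modelType: all means standalone operation, data means data processing, and loadBalance means load balancing. If it is not configured, all is used.

agentes.sources.source1.interceptors.interceptor1.modelType = data

Modified parameters: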

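1) The number of sinks depends on the size of the ES cluster. As a rule: 3 sinks for the summary index + 3*n sinks for the detail index (n is the number of data nodes) + 1 sink for data export. The number of channels equals the number of sinks, mapped one to one. For example, with 3 detail-index sinks and 1 summary-index sink per ES node and an ES cluster of 3 nodes, 12 sinks and 12 channels are needed (this example has no data-export sink):

agentes.sinks = sink1 sink2 sink3 sink4 sink5 sink6 sink7 sink8 sink9 sink10 sink11 sink12
agentes.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12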
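The counting rule above can be sketched as a small shell helper that prints the two declaration lines. This is only an illustration: gen_names and sink_count are hypothetical names, not part of EZSonar or Flume, and the rule coded here is the one stated in the text (3*n detail sinks plus the summary and export sinks you pass in).

```shell
# Hypothetical helper: print "<prefix>1 <prefix>2 ... <prefix>N" on one line.
gen_names() {
    prefix=$1
    count=$2
    i=1
    names=""
    while [ "$i" -le "$count" ]; do
        names="${names}${names:+ }${prefix}${i}"
        i=$((i + 1))
    done
    printf '%s\n' "$names"
}

# Hypothetical helper: total sinks = 3 detail sinks per data node
# + summary sinks + data-export sinks (both default to 0).
sink_count() {
    nodes=$1
    summary=${2:-0}
    export_sinks=${3:-0}
    echo $((3 * nodes + summary + export_sinks))
}

# Three data nodes, three summary sinks, no data-export sink.
n=$(sink_count 3 3 0)
echo "agentes.sinks = $(gen_names sink "$n")"
echo "agentes.channels = $(gen_names channel "$n")"
```

With three data nodes, three summary sinks and no export sink this yields 12 names per line, the same layout as the example above. Passing 1 as the third argument to sink_count adds the data-export sink.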

2) The source type is avro:

agentes.sources.source1.type = avro

3) agentes.sources.source1.interceptors.interceptor1.host is the node name (for example coll_data1). Every node in the cluster must have a different name:

agentes.sources.source1.interceptors.interceptor1.host = coll_data1

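4) Channels for the detail index use File mode. Under the current design, channels for the summary index can only use memory mode. For example:

Detail index:

agentes.channels.channel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel

Summary index:

agentes.channels.channel10.type = memory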

5) New parameters

agentes.channels.<channel name>.checkpointDir = /ezdata/d-<channel name>/checkpoint
agentes.channels.<channel name>.dataDirs = /ezdata/d-<channel name>/data
agentes.channels.<channel name>.maxFileSize = 1048576 (the unit is KB; by default 2 GB of data is kept, and once the limit is exceeded the data is cleared first)

For example:

agentes.channels.channel1.checkpointDir = /ezdata/d-channel1/checkpoint
agentes.channels.channel1.dataDirs = /ezdata/d-channel1/data
agentes.channels.channel1.maxFileSize = 1048576
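Because the same three settings repeat for every file channel, they can be generated rather than typed by hand. The sketch below is a hypothetical helper (file_channel_props is not an EZSonar or Flume tool). It prints the settings for channels 1 to N and also pre-creates the directories. Pre-creating them is an assumption made here for convenience, not something the original procedure requires.

```shell
# Hypothetical helper: emit checkpointDir/dataDirs/maxFileSize for file
# channels 1..N and create the matching directories under a base directory.
file_channel_props() {
    base=$1      # base directory, e.g. /ezdata
    prefix=$2    # "d" on the data node, "b" on the balance node
    count=$3     # number of file channels
    i=1
    while [ "$i" -le "$count" ]; do
        dir="$base/$prefix-channel$i"
        mkdir -p "$dir/checkpoint" "$dir/data" || return 1
        echo "agentes.channels.channel$i.checkpointDir = $dir/checkpoint"
        echo "agentes.channels.channel$i.dataDirs = $dir/data"
        echo "agentes.channels.channel$i.maxFileSize = 1048576"
        i=$((i + 1))
    done
}
```

A call such as file_channel_props /ezdata d 9 would print the settings for the nine detail-index channels of a data node. Review the output before pasting it into the properties file.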

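Notes:

a) If the Balance and Data processes of the collector run on the same server, they must not use the same port. The port configured on the Data side must be the same port that the Balance-side sink connects to.

# Define the node type, bind IP and port (the port must match the one the collector-balance sink connects to)
agentes.sources.source1.type = avro
agentes.sources.source1.bind = 0.0.0.0
agentes.sources.source1.port = 44446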

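b) A channel capacity of 20,000 to 30,000 works best. Too large a value tends to cause GC problems, and too small a value makes load balancing noticeably less effective.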

agentes.channels.channel1.capacity = 30000

(9) Detailed reference configuration:
# Three sources, fixed
# The number of sinks depends on the size of the ES cluster: typically 3 summary-index sinks + 3*n detail-index sinks (n is the number of data nodes) + 1 data-export sink
# If there is no summary index, remove the summary-index sink configuration
# If there is no data-export sink, remove the data-export sink configuration
# The number of channels equals the number of sinks, mapped one to one
agentes.sources = source1 memSrc aggegation
agentes.sinks = sink1 sink2 sink3 sink4 sink5 sink6 sink7 sink8 sink9 sink10 sink11 sink12
agentes.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
# Node definition
# Describe/configure source1
agentes.sources.source1.type = avro
agentes.sources.source1.bind = 0.0.0.0
agentes.sources.source1.port = 44446
agentes.sources.source1.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.source1.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.source1.selector.summaryIndex = channel10 channel11 channel12
agentes.sources.source1.interceptors = interceptor1
agentes.sources.source1.interceptors.interceptor1.type = com.fusionskye.ezsonar.collector.interceptor.EZSonarSourceInterceptor$Builder
## MongoDB connection
agentes.sources.source1.interceptors.interceptor1.mongoHost = 192.168.137.8
agentes.sources.source1.interceptors.interceptor1.mongoPort = 27017
agentes.sources.source1.interceptors.interceptor1.mongoDatabase = ezsonar
agentes.sources.source1.interceptors.interceptor1.mongoUser = ezsonaruser
agentes.sources.source1.interceptors.interceptor1.mongoPassword = 123
## collector node name
agentes.sources.source1.interceptors.interceptor1.host = coll_data1
## Interval for fetching the stream configuration
agentes.sources.source1.interceptors.interceptor1.streamFrequency = 60
## Interval for fetching custom geolocations
agentes.sources.source1.interceptors.interceptor1.hostLocationFrequency = 60
## Interval for generating performance metrics
agentes.sources.source1.interceptors.interceptor1.metricFrequency = 20
## Maximum size of the geolocation cache
agentes.sources.source1.interceptors.interceptor1.cacheMaximumSize = 1000
## Expiry time of the geolocation cache
agentes.sources.source1.interceptors.interceptor1.cacheExpire = 10
## Module configuration: geo_ip is geolocation and can be removed if only NPM is used; add_field appends fields and is kept by default
agentes.sources.source1.interceptors.interceptor1.filters = geo_ip, add_field, app
## Geolocation database file
agentes.sources.source1.interceptors.interceptor1.databaseFile = /usr/local/ezsonar/collector/GeoLite2-City.mmdb
#agentes.sources.source1.interceptors.interceptor1.statsFile = /var/log/ezsonar/collector/collector_source_benchmark.log
#agentes.sources.source1.interceptors.interceptor1.statsMetricFilter = .*[t|T]ime.*
#agentes.sources.source1.interceptors.interceptor1.statsMetricFilter = aaaaa
agentes.sources.source1.interceptors.interceptor1.debugStatus = false
agentes.sources.source1.interceptors.interceptor1.debugMessage = false
agentes.sources.source1.interceptors.interceptor1.timerStatus = false
## The next three are the data format definitions for TTM, NTM and DTM; do not modify them
agentes.sources.source1.interceptors.interceptor1.ttmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{DATA:_trans_id}[\|]%{DATA:_trans_ref}[\|]%{DATA:_ret_code}[\|]%{DATA:_ret_code_x}[\|]%{INT:_in_pkts}[\|]%{INT:_in_bytes}[\|]%{INT:_out_pkts}[\|]%{INT:_out_bytes}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_in_ooo}[\|]%{INT:_out_ooo}[\|]%{INT:_latency_msec}[\|]%{INT:_tot_syn}[\|]%{INT:_tot_synack}[\|]%{INT:_tot_fin}[\|]%{INT:_tot_fin_s}[\|]%{INT:_tot_rst}[\|]%{INT:_tot_rst_s}[\|]%{INT:_tot_zero_server}[\|]%{INT:_tot_zero_client}[\|]%{INT:_rtt}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{DATA:_start_at_s}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{INT:_protocol}[\|]%{INT:_trans_transfer_ms}
#agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{INT:_protocol}[\|]%{INT:_start_at}[\|]%{INT:_start_at_ms}[\|]%{INT:_in_bytes}[\|]%{INT:_out_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_out_pkts}[\|]%{INT:_in_retran}[\|]%{INT:_out_retran}[\|]%{INT:_nw_delay_c2p_s}[\|]%{INT:_nw_delay_c2p_us}[\|]%{INT:_nw_delay_p2s_s}[\|]%{INT:_nw_delay_p2s_us}[\|]%{DATA:_flow_state}
#agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_input_snmp}[\|]%{INT:_output_snmp}[\|]%{INT:_in_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_protocol}[\|]%{DATA:_protocol_map}[\|]%{INT:_src_tos}[\|]%{INT:_tcp_flags}[\|]%{INT:_l4_src_port}[\|]%{DATA:_l4_src_port_map}[\|]%{DATA:_ipv4_src_addr}[\|]%{INT:_l4_dst_port}[\|]%{DATA:_l4_dst_port_map}[\|]%{DATA:_ipv4_dst_addr}[\|]%{INT:_src_as}[\|]%{INT:_dst_as}[\|]%{INT:_last_switched}[\|]%{INT:_start_at}[\|]%{INT:_out_bytes}[\|]%{INT:_out_pkts}[\|]%{DATA:_ipv6_src_addr}[\|]%{DATA:_ipv6_dst_addr}[\|]%{INT:_icmp_type}[\|]%{DATA:_in_src_mac}[\|]%{DATA:_out_dst_mac}[\|]%{INT:_src_vlan}[\|]%{INT:_dst_vlan}[\|]%{INT:_ip_protocol_version}[\|]%{INT:_direction}[\|]%{INT:_fragments}[\|]%{DOUBLE:_total_nw_latency_ms}[\|]%{INT:_num_pkts_up_to_128_bytes}[\|]%{INT:_num_pkts_128_to_256_bytes}[\|]%{INT:_num_pkts_256_to_512_bytes}[\|]%{INT:_num_pkts_512_to_1024_bytes}[\|]%{INT:_num_pkts_1024_to_1514_bytes}[\|]%{INT:_num_pkts_over_1514_bytes}[\|]%{INT:_retransmitted_in_pkts}[\|]%{INT:_retransmitted_out_pkts}[\|]%{INT:_ooorder_in_pkts}[\|]%{INT:_ooorder_out_pkts}[\|]%{INT:_tcp_win_zero_in}[\|]%{INT:_tcp_win_zero_out}[\|]%{INT:_tcp_est_latency_ms}[\|]%{INT:_tcp_flow_state}[\|]%{INT:_num_pkts_ttl_eq_1}[\|]%{INT:_num_pkts_ttl_2_5}[\|]%{INT:_num_pkts_ttl_5_32}[\|]%{INT:_num_pkts_ttl_32_64}[\|]%{INT:_num_pkts_ttl_64_96}[\|]%{INT:_num_pkts_ttl_96_128}[\|]%{INT:_num_pkts_ttl_128_160}[\|]%{INT:_num_pkts_ttl_160_192}[\|]%{INT:_num_pkts_ttl_192_224}[\|]%{INT:_num_pkts_ttl_224_255}[\|]%{INT:_duration_in}[\|]%{INT:_duration_out}[\|]%{INT:_tcp_win_min_in}[\|]%{INT:_tcp_win_max_in}[\|]%{INT:_tcp_win_mss_in}[\|]%{INT:_tcp_win_scale_in}[\|]%{INT:_tcp_win_min_out}[\|]%{INT:_tcp_win_max_out}[\|]%{INT:_tcp_win_mss_out}[\|]%{INT:_tcp_win_scale_out}[\|]%{DOUBLE:_appl_latency_ms}[\|]%{INT:_total_keepalive}[\|]%{INT:_cip}[\|]%{INT:_sip}
agentes.sources.source1.interceptors.interceptor1.ntmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{INT:_input_snmp}[\|]%{INT:_output_snmp}[\|]%{INT:_in_bytes}[\|]%{INT:_in_pkts}[\|]%{INT:_protocol}[\|]%{DATA:_protocol_map}[\|]%{INT:_src_tos}[\|]%{INT:_tcp_flags}[\|]%{INT:_l4_src_port}[\|]%{DATA:_l4_src_port_map}[\|]%{DATA:_ipv4_src_addr}[\|]%{INT:_l4_dst_port}[\|]%{DATA:_l4_dst_port_map}[\|]%{DATA:_ipv4_dst_addr}[\|]%{INT:_src_as}[\|]%{INT:_dst_as}[\|]%{INT:_last_switched}[\|]%{INT:_start_at}[\|]%{INT:_out_bytes}[\|]%{INT:_out_pkts}[\|]%{DATA:_ipv6_src_addr}[\|]%{DATA:_ipv6_dst_addr}[\|]%{INT:_icmp_type}[\|]%{DATA:_in_src_mac}[\|]%{DATA:_out_dst_mac}[\|]%{INT:_src_vlan}[\|]%{INT:_dst_vlan}[\|]%{INT:_ip_protocol_version}[\|]%{INT:_direction}[\|]%{INT:_fragments}[\|]%{DOUBLE:_total_nw_latency_ms}[\|]%{INT:_num_pkts_up_to_128_bytes}[\|]%{INT:_num_pkts_128_to_256_bytes}[\|]%{INT:_num_pkts_256_to_512_bytes}[\|]%{INT:_num_pkts_512_to_1024_bytes}[\|]%{INT:_num_pkts_1024_to_1514_bytes}[\|]%{INT:_num_pkts_over_1514_bytes}[\|]%{INT:_retransmitted_in_pkts}[\|]%{INT:_retransmitted_out_pkts}[\|]%{INT:_ooorder_in_pkts}[\|]%{INT:_ooorder_out_pkts}[\|]%{INT:_tcp_win_zero_in}[\|]%{INT:_tcp_win_zero_out}[\|]%{INT:_tcp_est_latency_ms}[\|]%{INT:_tcp_flow_state}[\|]%{INT:_num_pkts_ttl_eq_1}[\|]%{INT:_num_pkts_ttl_2_5}[\|]%{INT:_num_pkts_ttl_5_32}[\|]%{INT:_num_pkts_ttl_32_64}[\|]%{INT:_num_pkts_ttl_64_96}[\|]%{INT:_num_pkts_ttl_96_128}[\|]%{INT:_num_pkts_ttl_128_160}[\|]%{INT:_num_pkts_ttl_160_192}[\|]%{INT:_num_pkts_ttl_192_224}[\|]%{INT:_num_pkts_ttl_224_255}[\|]%{INT:_duration_in}[\|]%{INT:_duration_out}[\|]%{INT:_tcp_win_min_in}[\|]%{INT:_tcp_win_max_in}[\|]%{INT:_tcp_win_mss_in}[\|]%{INT:_tcp_win_scale_in}[\|]%{INT:_tcp_win_min_out}[\|]%{INT:_tcp_win_max_out}[\|]%{INT:_tcp_win_mss_out}[\|]%{INT:_tcp_win_scale_out}[\|]%{DOUBLE:_appl_latency_ms}[\|]%{INT:_total_keepalive}[\|]%{INT:_cip}[\|]%{INT:_sip}[\|]%{DOUBLE:_client_nw_latency_ms}[\|]%{DOUBLE:_server_nw_latency_ms}[\|]%{INT:_appl_req_transfer_us}[\|]%{INT:_appl_resp_transfer_us}
#agentes.sources.source1.interceptors.interceptor1.dtmGShortMessage = %{INT:facility}[\|]%{DATA:_probe_name}[\|]%{IP:_src_ip}[\|]%{INT:_sport}[\|]%{IP:_dst_ip}[\|]%{INT:_dport}[\|]%{INT:_start_at}[\|]%{DATA:_dtype}[\|]%{DATA:_pay_load}
## addField configures appended fields; for example addField.ttm is appended to ttm (transaction data). More can be added on site, but do not modify these two
## serverValueFrequency does not need to be changed
## jmsDelayed is the timeout for asynchronous transactions (ms); jmsClearTime is the interval for clearing timed-out asynchronous transactions (ms)
## splitChar is the field separator for ttm/ntm and so on; it does not need to be changed
## amountIsYuan indicates whether transaction amounts are in yuan: true for yuan, false for fen
agentes.sources.source1.interceptors.interceptor1.addField.ttm._src_tcp = %{_src_ip}:%{_sport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._src_tcp = %{_src_ip}:%{_sport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._dport_protocol = %{_dport}:%{_protocol}
#agentes.sources.source1.interceptors.interceptor1.addField.dtm._pair = %{_src_ip}:%{_dst_ip}:%{_dport}:%{_dtype}
#agentes.sources.source1.interceptors.interceptor1.addField.ntm._pair = %{_src_ip}:%{_dst_ip}:%{_dport}
agentes.sources.source1.interceptors.interceptor1.addField.ntm._pair = %{_ipv4_src_addr}:%{_ipv4_dst_addr}:%{_l4_dst_port}:%{_l4_dst_port_map}
agentes.sources.source1.interceptors.interceptor1.serverValueFrequency = 60
agentes.sources.source1.interceptors.interceptor1.jmsDelayed = 60000
agentes.sources.source1.interceptors.interceptor1.jmsClearTime = 10000
#agentes.sources.source1.interceptors.interceptor1.bcStream = 54b4cbc014eae6d51b32203f
##agentes.sources.source1.interceptors.interceptor1.splitChar = |
agentes.sources.source1.interceptors.interceptor1.ipFile = /usr/local/ezsonar/collector/china.ez
##agentes.sources.source1.interceptors.interceptor1.busi_filter.keys = _trans_ref.BusiID
##agentes.sources.source1.interceptors.interceptor1.busi_filter.matchKey = _trans_ref.Cid
## Set the model type to data
agentes.sources.source1.interceptors.interceptor1.modelType = data

# Describe sink1
## Do not modify type, indexType, clusterName, serializer or timerStatus
## hostNames is the ES address
## indexName: if there is no ntm, it can be set to ezsonar
## batchSize must match the channel's transactionCapacity; the default is 3000
## shards is the number of shards; in a cluster allow 1-2 per data node, for example 3 or 6 for three data nodes
## refreshInterval: set it to 10 if the customer needs near-real-time data; otherwise 30 s is enough
## replics is the number of replicas
## sink1-9 are the detail-index sinks
agentes.sinks.sink1.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink1.hostNames = 192.168.137.8:9300
agentes.sinks.sink1.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink1.indexType = message
agentes.sinks.sink1.clusterName = fusionskye
agentes.sinks.sink1.batchSize = 8000
agentes.sinks.sink1.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink1.serializer.hasMessage = false
#agentes.sinks.sink1.timerStatus = true
agentes.sinks.sink1.timerStatus = false
agentes.sinks.sink1.channel = channel1
agentes.sinks.sink1.refreshInterval = 10
agentes.sinks.sink1.replics = 1
agentes.sinks.sink1.shards = 3
agentes.sinks.sink1.disableAllocationHour = -1
agentes.sinks.sink1.enableAllocationHour = -1

agentes.sinks.sink2.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink2.hostNames = 192.168.137.8:9300
agentes.sinks.sink2.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink2.indexType = message
agentes.sinks.sink2.clusterName = fusionskye
agentes.sinks.sink2.batchSize = 8000
agentes.sinks.sink2.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink2.serializer.hasMessage = false
agentes.sinks.sink2.timerStatus = false
agentes.sinks.sink2.channel = channel2
agentes.sinks.sink2.refreshInterval = 10
agentes.sinks.sink2.replics = 1
agentes.sinks.sink2.shards = 3
agentes.sinks.sink2.disableAllocationHour = -1
agentes.sinks.sink2.enableAllocationHour = -1

agentes.sinks.sink3.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink3.hostNames = 192.168.137.8:9300
agentes.sinks.sink3.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink3.indexType = message
agentes.sinks.sink3.clusterName = fusionskye
agentes.sinks.sink3.batchSize = 8000
agentes.sinks.sink3.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink3.serializer.hasMessage = false
agentes.sinks.sink3.timerStatus = false
agentes.sinks.sink3.channel = channel3
agentes.sinks.sink3.refreshInterval = 10
agentes.sinks.sink3.replics = 1
agentes.sinks.sink3.shards = 3
agentes.sinks.sink3.disableAllocationHour = -1
agentes.sinks.sink3.enableAllocationHour = -1

agentes.sinks.sink4.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink4.hostNames = 192.168.137.9:9300
agentes.sinks.sink4.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink4.indexType = message
agentes.sinks.sink4.clusterName = fusionskye
agentes.sinks.sink4.batchSize = 8000
agentes.sinks.sink4.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink4.serializer.hasMessage = false
agentes.sinks.sink4.timerStatus = false
agentes.sinks.sink4.channel = channel4
agentes.sinks.sink4.refreshInterval = 10
agentes.sinks.sink4.replics = 1
agentes.sinks.sink4.shards = 3
agentes.sinks.sink4.disableAllocationHour = -1
agentes.sinks.sink4.enableAllocationHour = -1

agentes.sinks.sink5.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink5.hostNames = 192.168.137.9:9300
agentes.sinks.sink5.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink5.indexType = message
agentes.sinks.sink5.clusterName = fusionskye
agentes.sinks.sink5.batchSize = 8000
agentes.sinks.sink5.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink5.serializer.hasMessage = false
agentes.sinks.sink5.timerStatus = false
agentes.sinks.sink5.channel = channel5
agentes.sinks.sink5.refreshInterval = 10
agentes.sinks.sink5.replics = 1
agentes.sinks.sink5.shards = 3
agentes.sinks.sink5.disableAllocationHour = -1
agentes.sinks.sink5.enableAllocationHour = -1

agentes.sinks.sink6.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink6.hostNames = 192.168.137.9:9300
agentes.sinks.sink6.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink6.indexType = message
agentes.sinks.sink6.clusterName = fusionskye
agentes.sinks.sink6.batchSize = 8000
agentes.sinks.sink6.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink6.serializer.hasMessage = false
agentes.sinks.sink6.timerStatus = false
agentes.sinks.sink6.channel = channel6
agentes.sinks.sink6.refreshInterval = 10
agentes.sinks.sink6.replics = 1
agentes.sinks.sink6.shards = 3
agentes.sinks.sink6.disableAllocationHour = -1
agentes.sinks.sink6.enableAllocationHour = -1

agentes.sinks.sink7.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink7.hostNames = 192.168.137.10:9300
agentes.sinks.sink7.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink7.indexType = message
agentes.sinks.sink7.clusterName = fusionskye
agentes.sinks.sink7.batchSize = 8000
agentes.sinks.sink7.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink7.serializer.hasMessage = false
agentes.sinks.sink7.timerStatus = false
agentes.sinks.sink7.channel = channel7
agentes.sinks.sink7.refreshInterval = 10
agentes.sinks.sink7.replics = 1
agentes.sinks.sink7.shards = 3
agentes.sinks.sink7.disableAllocationHour = -1
agentes.sinks.sink7.enableAllocationHour = -1

agentes.sinks.sink8.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink8.hostNames = 192.168.137.10:9300
agentes.sinks.sink8.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink8.indexType = message
agentes.sinks.sink8.clusterName = fusionskye
agentes.sinks.sink8.batchSize = 8000
agentes.sinks.sink8.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink8.serializer.hasMessage = false
agentes.sinks.sink8.timerStatus = false
agentes.sinks.sink8.channel = channel8
agentes.sinks.sink8.refreshInterval = 10
agentes.sinks.sink8.replics = 1
agentes.sinks.sink8.shards = 3
agentes.sinks.sink8.disableAllocationHour = -1
agentes.sinks.sink8.enableAllocationHour = -1

agentes.sinks.sink9.type = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchSink
agentes.sinks.sink9.hostNames = 192.168.137.10:9300
agentes.sinks.sink9.indexName = ezsonar,ezsonarnpm
agentes.sinks.sink9.indexType = message
agentes.sinks.sink9.clusterName = fusionskye
agentes.sinks.sink9.batchSize = 8000
agentes.sinks.sink9.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink9.serializer.hasMessage = false
agentes.sinks.sink9.timerStatus = false
agentes.sinks.sink9.channel = channel9
agentes.sinks.sink9.refreshInterval = 10
agentes.sinks.sink9.replics = 1
agentes.sinks.sink9.shards = 3
agentes.sinks.sink9.disableAllocationHour = -1
agentes.sinks.sink9.enableAllocationHour = -1
## sink10-12 are the summary-index sinks
## Do not modify type, indexType or clusterName
## hostNames is the ES address
## indexName: if there is no heat map, it can be set to analyzier
## batchSize must match the channel's transactionCapacity; the default is 3000
## shards is the number of shards; in a cluster allow 1-2 per data node, for example 3 or 6 for three data nodes
## refreshInterval: set it to 10 if the customer needs near-real-time data; otherwise 30 s is enough
## replics is the number of replicas
## summaryTimeout is the timeout for summary data, that is, how long before the data is written to ES automatically
agentes.sinks.sink10.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink10.hostNames = 192.168.137.8:9300
agentes.sinks.sink10.indexName = analyzier,heatmap_summary
agentes.sinks.sink10.indexType = message
agentes.sinks.sink10.clusterName = fusionskye
agentes.sinks.sink10.batchSize = 8000
agentes.sinks.sink10.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink10.channel = channel10
agentes.sinks.sink10.refreshInterval = 10
agentes.sinks.sink10.summaryTimeout = 50
agentes.sinks.sink10.replics = 1
agentes.sinks.sink10.shards = 3
agentes.sinks.sink10.clearInterval = 3
agentes.sinks.sink10.summaryInterval = 15
agentes.sinks.sink10.disableAllocationHour = -1
agentes.sinks.sink10.enableAllocationHour = -1

agentes.sinks.sink11.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink11.hostNames = 192.168.137.9:9300
agentes.sinks.sink11.indexName = analyzier,heatmap_summary
agentes.sinks.sink11.indexType = message
agentes.sinks.sink11.clusterName = fusionskye
agentes.sinks.sink11.batchSize = 8000
agentes.sinks.sink11.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink11.channel = channel11
agentes.sinks.sink11.refreshInterval = 10
agentes.sinks.sink11.summaryTimeout = 50
agentes.sinks.sink11.replics = 1
agentes.sinks.sink11.shards = 3
agentes.sinks.sink11.clearInterval = 3
agentes.sinks.sink11.summaryInterval = 15
agentes.sinks.sink11.disableAllocationHour = -1
agentes.sinks.sink11.enableAllocationHour = -1

agentes.sinks.sink12.type = com.fusionskye.ezsonar.collector.sink.EZSonarSummaryIndexSink
agentes.sinks.sink12.hostNames = 192.168.137.10:9300
agentes.sinks.sink12.indexName = analyzier,heatmap_summary
agentes.sinks.sink12.indexType = message
agentes.sinks.sink12.clusterName = fusionskye
agentes.sinks.sink12.batchSize = 8000
agentes.sinks.sink12.serializer = com.fusionskye.ezsonar.collector.sink.EZSonarElasticSearchIndexRequestBuilderFactory
agentes.sinks.sink12.channel = channel12
agentes.sinks.sink12.refreshInterval = 10
agentes.sinks.sink12.summaryTimeout = 50
agentes.sinks.sink12.replics = 1
agentes.sinks.sink12.shards = 3
agentes.sinks.sink12.clearInterval = 3
agentes.sinks.sink12.summaryInterval = 15
agentes.sinks.sink12.disableAllocationHour = -1
agentes.sinks.sink12.enableAllocationHour = -1

# Use a channel which buffers events in memory
## capacity is the channel size; too large a value causes GC performance problems
## transactionCapacity is the batch commit size; keep it equal to the sink's batchSize
agentes.channels.channel1.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel1.capacity = 30000
agentes.channels.channel1.transactionCapacity = 8000
agentes.channels.channel1.checkpointDir = /ezdata/d-channel1/checkpoint
agentes.channels.channel1.dataDirs = /ezdata/d-channel1/data
agentes.channels.channel1.maxFileSize = 1048576

agentes.channels.channel2.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel2.capacity = 30000
agentes.channels.channel2.transactionCapacity = 8000
agentes.channels.channel2.checkpointDir = /ezdata/d-channel2/checkpoint
agentes.channels.channel2.dataDirs = /ezdata/d-channel2/data
agentes.channels.channel2.maxFileSize = 1048576

agentes.channels.channel3.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel3.capacity = 30000
agentes.channels.channel3.transactionCapacity = 8000
agentes.channels.channel3.checkpointDir = /ezdata/d-channel3/checkpoint
agentes.channels.channel3.dataDirs = /ezdata/d-channel3/data
agentes.channels.channel3.maxFileSize = 1048576

agentes.channels.channel4.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel4.capacity = 30000
agentes.channels.channel4.transactionCapacity = 8000
agentes.channels.channel4.checkpointDir = /ezdata/d-channel4/checkpoint
agentes.channels.channel4.dataDirs = /ezdata/d-channel4/data
agentes.channels.channel4.maxFileSize = 1048576

agentes.channels.channel5.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel5.capacity = 30000
agentes.channels.channel5.transactionCapacity = 8000
agentes.channels.channel5.checkpointDir = /ezdata/d-channel5/checkpoint
agentes.channels.channel5.dataDirs = /ezdata/d-channel5/data
agentes.channels.channel5.maxFileSize = 1048576

agentes.channels.channel6.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel6.capacity = 30000
agentes.channels.channel6.transactionCapacity = 8000
agentes.channels.channel6.checkpointDir = /ezdata/d-channel6/checkpoint
agentes.channels.channel6.dataDirs = /ezdata/d-channel6/data
agentes.channels.channel6.maxFileSize = 1048576

agentes.channels.channel7.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel7.capacity = 30000
agentes.channels.channel7.transactionCapacity = 8000
agentes.channels.channel7.checkpointDir = /ezdata/d-channel7/checkpoint
agentes.channels.channel7.dataDirs = /ezdata/d-channel7/data
agentes.channels.channel7.maxFileSize = 1048576

agentes.channels.channel8.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel8.capacity = 30000
agentes.channels.channel8.transactionCapacity = 8000
agentes.channels.channel8.checkpointDir = /ezdata/d-channel8/checkpoint
agentes.channels.channel8.dataDirs = /ezdata/d-channel8/data
agentes.channels.channel8.maxFileSize = 1048576

agentes.channels.channel9.type = com.fusionskye.ezsonar.collector.channel.BalanceFileChannel
agentes.channels.channel9.capacity = 30000
agentes.channels.channel9.transactionCapacity = 8000
agentes.channels.channel9.checkpointDir = /ezdata/d-channel9/checkpoint
agentes.channels.channel9.dataDirs = /ezdata/d-channel9/data
agentes.channels.channel9.maxFileSize = 1048576

agentes.channels.channel10.type = memory
agentes.channels.channel10.capacity = 30000
agentes.channels.channel10.transactionCapacity = 8000
agentes.channels.channel10.checkpointDir = /ezdata/d-channel10/checkpoint
agentes.channels.channel10.dataDirs = /ezdata/d-channel10/data
agentes.channels.channel10.maxFileSize = 1048576

agentes.channels.channel11.type = memory
agentes.channels.channel11.capacity = 30000
agentes.channels.channel11.transactionCapacity = 8000
agentes.channels.channel11.checkpointDir = /ezdata/d-channel11/checkpoint
agentes.channels.channel11.dataDirs = /ezdata/d-channel11/data
agentes.channels.channel11.maxFileSize = 1048576

agentes.channels.channel12.type = memory
agentes.channels.channel12.capacity = 30000
agentes.channels.channel12.transactionCapacity = 8000
agentes.channels.channel12.checkpointDir = /ezdata/d-channel12/checkpoint
agentes.channels.channel12.dataDirs = /ezdata/d-channel12/data
agentes.channels.channel12.maxFileSize = 1048576

agentes.sources.memSrc.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.memSrc.memKey = JMS
agentes.sources.memSrc.capacity = 50000
agentes.sources.memSrc.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.memSrc.batchSize = 3000
agentes.sources.memSrc.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.memSrc.selector.summaryIndex = channel10 channel11 channel12

agentes.sources.aggegation.type = com.fusionskye.ezsonar.collector.source.MemorySource
agentes.sources.aggegation.memKey = aggegation
agentes.sources.aggegation.capacity = 50000
agentes.sources.aggegation.channels = channel1 channel2 channel3 channel4 channel5 channel6 channel7 channel8 channel9 channel10 channel11 channel12
agentes.sources.aggegation.batchSize = 3000
agentes.sources.aggegation.selector.type = com.fusionskye.ezsonar.collector.channel.BalanceChannelSelector
agentes.sources.aggegation.selector.summaryIndex = channel10 channel11 channel12

(10) Startup script configuration

export FLUME_CLASSPATH=/usr/local/ezsonar/collector/plugins.d
bin/flume-ng agent -c conf -f conf/es-source-data.properties -n agentes -Dflume.monitoring.type=http -Dflume.monitoring.port=34547 -Dflume.root.logger=ERROR,LOGFILE -Dflume.log.file=collector-data.log &

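3.4 Collector cluster verification

The purpose of the cluster is automatic failover and immediate data recovery when a fault occurs, plus the ability to scale in or out (manually) as business grows or shrinks, all without affecting online services. Cluster verification therefore consists of taking nodes down and checking that data processing and the line charts in the UI show no anomalies.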

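3.5 Collector cluster maintenance

1. Startup checks

a) After starting each node, check the log output to see the connection state of the cluster. If a node is unhealthy, the healthy nodes log errors saying that the node cannot be found. For example, with A connected to B (172.16.1.30:44446), once B goes down A logs that it cannot connect to 172.16.1.30:44446.

b) After starting a node, first run ps -ef | grep coll to check that the process is up. A running process does not by itself mean that the node is healthy: also run netstat -tnlp to confirm that the node's port is open. During configuration we found that with a wrong configuration file or a wrong collector version the process still starts, but the port is never opened. Finally, check the log output to track down the problem.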
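The port check in b) can be scripted. The sketch below assumes the usual Linux `netstat -tnlp` column layout (local address in column 4, state in column 6). The function name port_listening is hypothetical.

```shell
# Hypothetical helper: read `netstat -tnlp` output on stdin and succeed only
# if a TCP socket is in LISTEN state on the given port.
port_listening() {
    port=$1
    awk -v p="$port" '
        $1 ~ /^tcp/ {
            # The local address looks like 0.0.0.0:44446 or :::44446;
            # the port is the part after the last colon.
            n = split($4, a, ":")
            if (a[n] == p && $6 == "LISTEN") found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}
```

For example, `netstat -tnlp | port_listening 44446 || echo "port 44446 is not open"` flags a data node whose process is running but whose avro source never bound its port.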

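2. Troubleshooting

a) When Flume starts, the directories /ezdata/channel1/checkpoint and /ezdata/channel1/data are not created under /ezdata/. Instead, a file-channel/data directory and its files are created under the root directory, and the Flume log prints the following:

[Image: Flume log output]

The cause is a misspelled key in the Flume configuration file: the key that defines the checkpoint or data directory was written with an s missing, so the data directory fell back to the default location under the root directory.

Take care not to misspell the keys in the Flume configuration file, for example:

xxx.checkpointDir = /xxx/xxx/xxx (no s after Dir)
xxx.dataDirs = /xxx/xxx/xxx (s after Dir)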
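A quick lint can catch this spelling mistake before the agent is started. The sketch below is a hypothetical check (check_dir_keys is not a Flume command). It flags the two wrong spellings, checkpointDirs and dataDir, on lines that are not commented out.

```shell
# Hypothetical lint: print any misspelled channel directory keys
# (checkpointDirs or dataDir) with line numbers; fail if one is found.
check_dir_keys() {
    conf=$1
    if grep -nE '^[^#]*\.(checkpointDirs|dataDir)[[:space:]]*=' "$conf"; then
        return 1
    fi
    return 0
}
```

Running `check_dir_keys conf/es-source-data.properties` before starting Flume prints the offending lines, if any, and returns a non-zero status.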

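b) When collector starts, errors from common are reported, as shown below:

[Image: common error output at collector startup]

The cause is that fusionskye.jar under common does not match the collector: the existing fusionskye.jar is the unencrypted build and must be replaced with the encrypted one.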

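c) After the collector cluster starts, no data is written to the ES summary index

Troubleshooting steps: (1) Check whether the summary index is enabled;

(2) Check whether the sinks and channels on the collector data node are misconfigured, especially the summary-index channels;

(3) Check the capacity in the data node's configuration file: 20,000 to 30,000 works best. Too large a value tends to cause GC problems, and too small a value makes load balancing noticeably less effective;

(4) Under the current design, the type of a summary-index channel can only be memory.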
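Checks (2) and (4) can be partly automated. The sketch below is a hypothetical helper (check_summary_channels is not part of EZSonar). It reads the channel names from every selector.summaryIndex line and verifies that each one is declared with type = memory. The agent name defaults to agentes, as in the configurations in this document.

```shell
# Hypothetical check: every channel listed in a selector.summaryIndex line
# must be declared as "<agent>.channels.<name>.type = memory".
check_summary_channels() {
    conf=$1
    agent=${2:-agentes}
    status=0
    for ch in $(grep -E '^[^#]*\.selector\.summaryIndex[[:space:]]*=' "$conf" \
                | sed 's/^[^=]*=//' | tr ' ' '\n' | sort -u); do
        ch_type=$(grep -E "^$agent\.channels\.$ch\.type[[:space:]]*=" "$conf" \
                  | sed 's/^[^=]*=[[:space:]]*//; s/[[:space:]]*$//' | tail -n 1)
        if [ "$ch_type" != "memory" ]; then
            echo "channel $ch: type is '${ch_type:-unset}', expected memory"
            status=1
        fi
    done
    return $status
}
```

For the reference data-node configuration, `check_summary_channels conf/es-source-data.properties` should print nothing, because channel10, channel11 and channel12 are all memory channels.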

Chapter 4 EZSonar Cluster Application Example

[Image: EZSonar cluster application example]
