07_Flume_regex interceptor in practice

Posted by shayzhang


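Practice 1: regex filter interceptor

1. Target scenario

What the regex filter interceptor does:

1) The event body is matched against the regular expression given in the configuration.
2) With excludeEvents = true, as configured here, an event whose body matches is dropped.
3) An event whose body does not match is passed through. With the default excludeEvents = false the behavior is reversed: only matching events are kept.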
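
The three rules above can be sketched in plain Java. This is only an illustration under stated assumptions, not Flume's own code: the class RegexFilterSketch and its filter method are hypothetical names, and the sketch assumes the interceptor applies a find-style match to the body text. It reuses the regex ^[0-9]*$ from the configuration in this post.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper that mimics the filtering rules; not part of Flume.
class RegexFilterSketch {
    // Returns the bodies that survive filtering.
    // excludeEvents = true drops matching bodies; false keeps only matching bodies.
    static List<String> filter(List<String> bodies, String regex, boolean excludeEvents) {
        Pattern pattern = Pattern.compile(regex);
        List<String> kept = new ArrayList<>();
        for (String body : bodies) {
            // Assumption: a find-style match against the whole body text.
            boolean matched = pattern.matcher(body).find();
            // Keep the event when "matched" and "excludeEvents" disagree.
            if (matched != excludeEvents) {
                kept.add(body);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> bodies = Arrays.asList("1234", "abc", "12ab", "");
        // Prints [abc, 12ab]
        System.out.println(filter(bodies, "^[0-9]*$", true));
    }
}
```

Note that the empty body is dropped as well: because of the * quantifier, ^[0-9]*$ also matches an empty string. Writing ^[0-9]+$ instead would restrict the filter to bodies that contain at least one digit.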

 

2. Flume agent configuration file

# 01 define agent name, source/sink/channel 
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 02 source,http,jsonhandler
a1.sources.r1.type = http
a1.sources.r1.bind = master
a1.sources.r1.port = 6666
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

# 03 regex filter interceptor, match event body for filter
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_filter
a1.sources.r1.interceptors.i1.regex = ^[0-9]*$
# drop events whose body matches the regex
a1.sources.r1.interceptors.i1.excludeEvents = true

# 04 logger sink
a1.sinks.k1.type = logger

# 05 channel,memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# 06 bind source,sink to channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

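3. Verifying the regex filter interceptor

1) Use curl -X POST -d 'JSON data' to send HTTP requests with different bodies, one of which matches the regex. JSONHandler expects a JSON array of events, for example: curl -X POST -d '[{"headers":{},"body":"1234"}]' http://master:6666

2) Watch the events printed to the terminal: the event whose body is 1234 has been filtered out and does not appear.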

 

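4. Official documentation for the regex filter interceptor

See the "Regex Filtering Interceptor" section of the Flume User Guide. The documented properties are type (regex_filter), regex (default ".*"), and excludeEvents (default false).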

 

 

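Practice 2: regex extractor interceptor

1. Target scenario

What the regex extractor interceptor does:
1) The event body is matched against the regular expression given in the configuration.
2) If the body matches, the text captured by each group is paired with the key configured for that group and added to the event header as key:value.
3) The event body itself is not changed.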
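
The extraction step can be sketched in plain Java. This is an illustration under stated assumptions rather than Flume's implementation: RegexExtractorSketch and extract are hypothetical names, and the sketch assumes capture group 1 is paired with the first configured key, group 2 with the second, and so on. The regex and the key names word and digital come from the configuration in this post.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that mimics header extraction; not part of Flume.
class RegexExtractorSketch {
    // Pairs capture group i + 1 with names.get(i).
    // Returns an empty map when the body does not match.
    static Map<String, String> extract(String body, String regex, List<String> names) {
        Map<String, String> headers = new LinkedHashMap<>();
        Matcher matcher = Pattern.compile(regex).matcher(body);
        if (matcher.find()) {
            int groups = Math.min(matcher.groupCount(), names.size());
            for (int i = 0; i < groups; i++) {
                headers.put(names.get(i), matcher.group(i + 1));
            }
        }
        return headers;
    }

    public static void main(String[] args) {
        // In Java source the regex \s is written as \\s.
        String regex = "(^[a-zA-Z]*)\\s([0-9]*$)";
        // Prints {word=shayzhang, digital=1234}
        System.out.println(extract("shayzhang 1234", regex, Arrays.asList("word", "digital")));
    }
}
```

The body string is only read, never modified, which mirrors point 3 above. A body such as "1234" has no whitespace between a word part and a number part, so it does not match and no headers are added.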

 

2. Flume agent configuration file

# 01 define agent name, source/sink/channel 
a1.sources = r1
a1.sinks = k1
a1.channels = c1

# 02 source,http,jsonhandler
a1.sources.r1.type = http
a1.sources.r1.bind = master
a1.sources.r1.port = 6666
a1.sources.r1.handler = org.apache.flume.source.http.JSONHandler

# 03 regex extractor interceptor, match event body to extract a word and a number
a1.sources.r1.interceptors = i1
a1.sources.r1.interceptors.i1.type = regex_extractor
# the regex has two capture groups, so a match yields two parts;
# the whitespace class \s must be escaped as \\s in a properties file
a1.sources.r1.interceptors.i1.regex = (^[a-zA-Z]*)\\s([0-9]*$)
# one serializer per capture group
a1.sources.r1.interceptors.i1.serializers = s1 s2
# header key names
a1.sources.r1.interceptors.i1.serializers.s1.name = word
a1.sources.r1.interceptors.i1.serializers.s2.name = digital

# 04 logger sink
a1.sinks.k1.type = logger

# 05 channel,memory
a1.channels.c1.type = memory
a1.channels.c1.capacity = 1000
a1.channels.c1.transactionCapacity = 100

# 06 bind source,sink to channel
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1
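
The escaping of \s in the regex line can be checked with a small Java sketch. It assumes the agent file is parsed with java.util.Properties rules; PropertiesEscapeSketch and loadRegex are hypothetical names used only for this illustration. It also shows that a # placed after a value on the same line is not treated as a comment, which is why comments belong on their own lines.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper for inspecting how a properties line is parsed; not part of Flume.
class PropertiesEscapeSketch {
    // Parses properties-file text and returns the value stored under the key "regex".
    static String loadRegex(String fileText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(fileText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        return props.getProperty("regex");
    }

    public static void main(String[] args) {
        // File text: regex = (^[a-zA-Z]*)\\s([0-9]*$)
        // The doubled backslash survives as a single one: (^[a-zA-Z]*)\s([0-9]*$)
        System.out.println(loadRegex("regex = (^[a-zA-Z]*)\\\\s([0-9]*$)\n"));

        // File text: regex = (^[a-zA-Z]*)\s([0-9]*$)
        // The lone backslash is silently dropped: (^[a-zA-Z]*)s([0-9]*$)
        System.out.println(loadRegex("regex = (^[a-zA-Z]*)\\s([0-9]*$)\n"));

        // File text: regex = ^[0-9]*$ # note
        // The trailing text stays in the value: ^[0-9]*$ # note
        System.out.println(loadRegex("regex = ^[0-9]*$ # note\n"));
    }
}
```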

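3. Verifying the regex extractor interceptor

1) Use curl -X POST -d 'JSON data' to send an HTTP request whose body is "shayzhang 1234"; the regular expression matches shayzhang and 1234 as its two groups.

2) Watch the event that the logger sink prints to the terminal: the header now contains two additional entries, word:shayzhang and digital:1234.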

 
