geoip lookup failure elastic stack logstash

Posted: 2017-06-15 05:27:03

Question: I am sending apache logs from a Windows system with filebeat to my Logstash server on a Linux EC2 instance, and from there to Elasticsearch and Kibana.

Elasticsearch and Kibana - 5.3. Logstash and filebeat - 5.3.
filebeat.yml:
filebeat.prospectors:
- input_type: log
  # Paths that should be crawled and fetched. Glob based paths.
  paths:
    #- /var/log/*.log
    #- c:\programdata\elasticsearch\logs\*
    - C:\Users\Sagar\Desktop\elastic_test4\data\log\*

output.logstash:
  # The Logstash hosts
  hosts: ["10.101.00.11:5044"]
  template.name: "filebeat-poc"
  template.path: "filebeat.template.json"
  template.overwrite: false
logstash.conf on the Ubuntu Linux EC2 instance:
input {
  beats {
    port => 5044
  }
}
filter {
  grok {
    match => {
      "message" => "%{COMBINEDAPACHELOG}"
    }
  }
  geoip {
    source => "clientip"
    target => "geoip"
    add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ]
    add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ]
  }
  mutate {
    convert => [ "[geoip][coordinates]", "float" ]
  }
}
output {
  elasticsearch {
    hosts => ["elastic-instance-1.es.amazonaws.com:80"]
    index => "apache-%{+YYYY.MM.dd}"
    document_type => "apache_logs"
  }
  stdout { codec => rubydebug }
}
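As an illustration only (this is not Logstash internals), the two add_field lines plus the mutate convert in the config above effectively build a [lon, lat] float pair under [geoip][coordinates], which Elasticsearch can map as a geo_point. A Python sketch of that transformation, assuming the geoip filter has already resolved longitude and latitude:

```python
# Illustration of what the add_field + mutate convert steps produce.
# Field names mirror the Logstash config; the input dict is hypothetical.

def build_coordinates(geoip):
    # The two add_field lines append longitude first, then latitude,
    # as strings, under [geoip][coordinates].
    coords = [str(geoip["longitude"]), str(geoip["latitude"])]
    # mutate { convert => [ "[geoip][coordinates]", "float" ] }
    return [float(c) for c in coords]

print(build_coordinates({"longitude": 116.3883, "latitude": 39.9289}))
# [116.3883, 39.9289]
```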
My dummy log file:
64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] "GET /twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846
64.242.88.10 - - [07/Mar/2004:16:06:51 -0800] "GET /twiki/bin/rdiff/TWiki/NewUserTemplate?rev1=1.3&rev2=1.2 HTTP/1.1" 200 4523
64.242.88.10 - - [07/Mar/2004:16:10:02 -0800] "GET /mailman/listinfo/hsdivision HTTP/1.1" 200 6291
64.242.88.10 - - [07/Mar/2004:16:11:58 -0800] "GET /twiki/bin/view/TWiki/WikiSyntax HTTP/1.1" 200 7352
64.242.88.10 - - [07/Mar/2004:16:20:55 -0800] "GET /twiki/bin/view/Main/DCCAndPostFix HTTP/1.1" 200 5253
64.242.88.10 - - [07/Mar/2004:16:23:12 -0800] "GET /twiki/bin/oops/TWiki/AppendixFileSystem?template=oopsmore&param1=1.12&param2=1.12 HTTP/1.1" 200 11382
64.242.88.10 - - [07/Mar/2004:16:24:16 -0800] "GET /twiki/bin/view/Main/PeterThoeny HTTP/1.1" 200 4924
64.242.88.10 - - [07/Mar/2004:16:29:16 -0800] "GET /twiki/bin/edit/Main/Header_checks?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12851
64.242.88.10 - - [07/Mar/2004:16:30:29 -0800] "GET /twiki/bin/attach/Main/OfficeLocations HTTP/1.1" 401 12851
64.242.88.10 - - [07/Mar/2004:16:31:48 -0800] "GET /twiki/bin/view/TWiki/WebTopicEditTemplate HTTP/1.1" 200 3732
64.242.88.10 - - [07/Mar/2004:16:32:50 -0800] "GET /twiki/bin/view/Main/WebChanges HTTP/1.1" 200 40520
64.242.88.10 - - [07/Mar/2004:16:33:53 -0800] "GET /twiki/bin/edit/Main/Smtpd_etrn_restrictions?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12851
I am able to send these logs to Elasticsearch and the Kibana dashboard. The pipeline is set up and working, but geoip is not working.
This is the Kibana output when I search:
{
  "_index": "apache-2017.06.15",
  "_type": "apache_logs",
  "_id": "AVyqJhi6ItD-cRj2_AW6",
  "_score": 1,
  "_source": {
    "@timestamp": "2017-06-15T05:06:48.038Z",
    "offset": 154,
    "@version": "1",
    "input_type": "log",
    "beat": {
      "hostname": "sagar-machine",
      "name": "sagar-machine",
      "version": "5.3.2"
    },
    "host": "by-df164",
    "source": """C:\Users\Sagar\Desktop\elastic_test4\data\log\apache-log.log""",
    "message": """64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] "GET /twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846""",
    "type": "log",
    "tags": [
      "beats_input_codec_plain_applied",
      "_grokparsefailure",
      "_geoip_lookup_failure"
    ]
  }
}
Any idea why I am facing this issue?

Comments:
Answer 1:

You have a _grokparsefailure, so the clientip field does not exist. That in turn causes the _geoip_lookup_failure, because the geoip filter is being fed a clientip field that is not there.

Your logs match the %{COMMONAPACHELOG} pattern, not the one you are using. So your config should look like this:
filter {
  grok {
    match => {
      "message" => "%{COMMONAPACHELOG}"
    }
  }
  ...
}
Once you use the correct pattern, you should see that the clientip field exists, and then, hopefully, the geoip filter will work. :)
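The mismatch is easy to reproduce outside Logstash. The sketch below uses simplified regex stand-ins for the two grok patterns (an approximation, not the exact grok definitions) against one of the question's log lines: the combined pattern fails because the line has no trailing referrer/agent fields, while the common pattern matches and yields clientip.

```python
import re

# Simplified stand-ins for COMMONAPACHELOG and COMBINEDAPACHELOG.
COMMON = (r'^(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
          r'\[(?P<timestamp>[^\]]+)\] "(?P<verb>\S+) (?P<request>\S+)'
          r'(?: HTTP/(?P<httpversion>\S+))?" (?P<response>\d+) '
          r'(?P<bytes>\d+|-)')
COMBINED = COMMON + r' "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'

line = ('64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] '
        '"GET /twiki/bin/edit/Main/Double_bounce_sender'
        '?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846')

print(re.match(COMBINED, line))                   # None -> _grokparsefailure
print(re.match(COMMON, line).group("clientip"))   # 64.242.88.10
```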
Comments:
Answer 2:

I am not sure your log format is correct for apache, because your logs look like this:
64.242.88.10 - - [07/Mar/2004:16:05:49 -0800] "GET /twiki/bin/edit/Main/Double_bounce_sender?topicparent=Main.ConfigurationVariables HTTP/1.1" 401 12846
while a standard apache log line looks like this:
149.148.126.144 - - [10/Sep/2017:06:30:44 -0700] "GET /apps/cart.jsp?appID=6944 HTTP/1.0" 200 4981 "http://hernandez.net/app/main/search/homepage.php" "Mozilla/5.0 (X11; Linux i686) AppleWebKit/5322 (Khtml, like Gecko) Chrome/13.0.896.0 Safari/5322"
I suggest you standardize the format of the incoming apache logs. Otherwise the default grok patterns will not fit, and you will have to write your own grok pattern to parse your custom log lines.

Apart from that, there are several possible reasons for this kind of error:

You have not commented out the 'filebeat-template' settings in your filebeat config. The filebeat template is only used when shipping logs directly from filebeat to Elasticsearch.

Change the filebeat configuration:
filebeat.prospectors:
- input_type: log
  paths:
    - C:\Users\Sagar\Desktop\elastic_test4\data\log\*.log

output.logstash:
  hosts: ["10.101.00.11:5043"]
You have to install the 'ingest-geoip' plugin into Elasticsearch if you are not using any external database or service. You can install the plugin with:

elasticsearch-plugin install ingest-geoip

I am not sure about your Elasticsearch instance, because by default it listens on port 9200, not port 80.

You have to change your Logstash configuration script, as below:
input {
  beats {
    host => "10.101.00.11"
    port => "5044"
  }
}
filter {
  grok { match => { "message" => "%{COMBINEDAPACHELOG}" } }
  geoip { source => "clientip" }
}
output {
  elasticsearch {
    #hosts => ["elastic-instance-1.es.amazonaws.com:80"]
    hosts => ["elastic-instance-1.es.amazonaws.com:9200"]
    index => "apache-%{+YYYY.MM.dd}"
  }
  stdout { codec => rubydebug }
}
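For reference, the %{+YYYY.MM.dd} part of the index name is a sprintf date reference that Logstash expands from the event's @timestamp (in UTC), using Joda-style tokens. Roughly, in Python terms:

```python
from datetime import datetime, timezone

# Joda-style +YYYY.MM.dd in Logstash corresponds roughly to
# strftime %Y.%m.%d here. The timestamp is the one from the
# example output below.
event_time = datetime(2017, 9, 21, 13, 39, 55, tzinfo=timezone.utc)
index_name = "apache-" + event_time.strftime("%Y.%m.%d")
print(index_name)  # apache-2017.09.21
```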
After applying these configurations, your output will look like this:
{
  "_index": "apache-2017.09.21",
  "_type": "log",
  "_id": "AV6kqsr3A-YOTHfOm2US",
  "_version": 1,
  "_score": null,
  "_source": {
    "request": "/apps/cart.jsp?appID=9421",
    "agent": "\"Mozilla/5.0 (Windows 95; sl-SI; rv:1.9.2.20) Gecko/2017-08-19 13:55:15 Firefox/12.0\"",
    "geoip": {
      "city_name": "Beijing",
      "timezone": "Asia/Shanghai",
      "ip": "106.121.102.198",
      "latitude": 39.9289,
      "country_name": "China",
      "country_code2": "CN",
      "continent_code": "AS",
      "country_code3": "CN",
      "region_name": "Beijing",
      "location": {
        "lon": 116.3883,
        "lat": 39.9289
      },
      "region_code": "11",
      "longitude": 116.3883
    },
    "offset": 11050275,
    "auth": "-",
    "ident": "-",
    "input_type": "log",
    "verb": "POST",
    "source": "C:\\Users\\admin\\Desktop\\experiment\\Elastic\\access_log_20170915-005134.log",
    "message": "106.121.102.198 - - [19/Dec/2017:05:54:29 -0700] \"POST /apps/cart.jsp?appID=9421 HTTP/1.0\" 200 4984 \"http://cross.com/login/\" \"Mozilla/5.0 (Windows 95; sl-SI; rv:1.9.2.20) Gecko/2017-08-19 13:55:15 Firefox/12.0\"",
    "type": "log",
    "tags": [
      "beats_input_codec_plain_applied"
    ],
    "referrer": "\"http://cross.com/login/\"",
    "@timestamp": "2017-09-21T13:39:55.047Z",
    "response": "200",
    "bytes": "4984",
    "clientip": "106.121.102.198",
    "@version": "1",
    "beat": {
      "hostname": "DESKTOP-16QDF02",
      "name": "DESKTOP-16QDF02",
      "version": "5.5.2"
    },
    "host": "DESKTOP-16QDF02",
    "httpversion": "1.0",
    "timestamp": "19/Dec/2017:05:54:29 -0700"
  },
  "fields": {
    "@timestamp": [
      1506001195047
    ]
  },
  "sort": [
    1506001195047
  ]
}
I hope this is the solution you are looking for.
Comments:
Answer 3:

You may have to make sure the grok patterns for your apache logs are correct:
SYSLOGBASE %{SYSLOGTIMESTAMP:timestamp} (?:%{SYSLOGFACILITY} )?%{SYSLOGHOST:logsource} %{SYSLOGPROG}:
COMMONAPACHELOG %{IPORHOST:clientip} %{USER:ident} %{USER:auth} \[%{HTTPDATE:timestamp}\] "(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})" %{NUMBER:response} (?:%{NUMBER:bytes}|-)
COMBINEDAPACHELOG %{COMMONAPACHELOG} %{QS:referrer} %{QS:agent}
The grok match patterns are documented in detail at https://github.com/elastic/logstash/blob/v1.4.2/patterns/grok-patterns.
Apart from that, you can also take a look at https://www.ip2location.com/tutorials/how-to-use-ip2location-filter-plugin-with-elk.
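As a quick sanity check of the COMBINEDAPACHELOG shape above, a loose regex stand-in (an illustration, not the exact grok definition) extracts clientip, referrer, and agent from a standard combined-format line like the one shown in Answer 2:

```python
import re

# Loose regex approximation of COMBINEDAPACHELOG; the two trailing
# quoted strings are the %{QS:referrer} and %{QS:agent} fields.
COMBINED = (r'^(?P<clientip>\S+) (?P<ident>\S+) (?P<auth>\S+) '
            r'\[(?P<timestamp>[^\]]+)\] "(?P<verb>\S+) (?P<request>\S+)'
            r'(?: HTTP/(?P<httpversion>\S+))?" (?P<response>\d+) '
            r'(?P<bytes>\d+|-) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"')

line = ('149.148.126.144 - - [10/Sep/2017:06:30:44 -0700] '
        '"GET /apps/cart.jsp?appID=6944 HTTP/1.0" 200 4981 '
        '"http://hernandez.net/app/main/search/homepage.php" '
        '"Mozilla/5.0 (X11; Linux i686) AppleWebKit/5322 '
        '(Khtml, like Gecko) Chrome/13.0.896.0 Safari/5322"')

m = re.match(COMBINED, line)
print(m.group("clientip"), m.group("referrer"))
```

Once clientip is populated this way, the geoip filter has an address to look up and the _geoip_lookup_failure tag should disappear.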
Comments:
While the links are helpful, it would be better to copy the relevant parts into formatted code blocks. Not everyone can access external sites, and the links may break over time.