[elk] Recovering logs offline after a power outage, and building an ELK environment quickly

Posted 毛台


Manually importing data into es

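The power was out for two days over the weekend, and two days of logs were never imported:
Causes: 1. ELK was not set to start at boot
        2. Logs are indexed under the current time, not under the log's own time field
- Import steps
1. Pull the logs down first
2. Delete the incomplete index beforehand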
curl 'http://192.168.x.x:9200/_cat/indices?v'
curl -XDELETE 'http://192.168.x.x:9200/app-2017.12.24'

3. Import the offline logs with logstash
   Name the index after the date of the day being restored
input {
    beats {
        port => "5043"
    }
    file{
        path => "/root/logs/*.log"
        start_position => 'beginning'
        codec => "json"
        sincedb_path => "/root/logs/maotai.txt"  
        sincedb_write_interval => 1
    }
}

output {
    elasticsearch{
        hosts => ["192.168.x.x:9200"]
        index => "app-2017.12.25"
    }
#        stdout { codec => rubydebug }
}
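
The index name in the config above is typed by hand for the day being restored. As a minimal sketch, a small shell helper could derive it from a date instead. The helper name index_for_day and its default prefix are illustrative assumptions, and date -d assumes GNU date:

```shell
#!/bin/sh
# Hypothetical helper: build a daily index name such as app-2017.12.25.
# Assumes GNU date (the -d option) is available.
index_for_day() {
    day=$1              # e.g. 2017-12-25
    prefix=${2:-app}    # index prefix; "app" mirrors the config above
    date -d "$day" "+${prefix}-%Y.%m.%d"
}

index_for_day 2017-12-25
```

The result can be pasted into the index => line before running the import, which avoids a mistyped date when several days have to be restored one after another.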

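Problem found: after the import finished, Kibana could not find that day's logs. Why? The root cause is that Kibana filters on the ingest time, and logstash stamps each event with the current system time when it is imported.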

Fix:
1. Change the system time (if there are crontab jobs, stop them first and remember to re-enable them afterwards)
date -s "2017/12/25 09:00"

2. Once the import has finished, set the time back
ntpdate  ntp1.aliyun.com

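Verified: after the import everything was fine (these logs do not need precise time ordering, so it is enough for the date to fall on the right day).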
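
The set-the-clock, import, restore-the-clock sequence is easy to leave half done. A minimal sketch of a wrapper that always runs the restore step is shown here. The function name with_clock_at and the RUN dry-run hook are hypothetical; the date and ntpdate commands are the ones used above:

```shell
#!/bin/sh
# Sketch: run one command with the clock set back, then resync with NTP.
# RUN is a hypothetical dry-run hook: RUN=echo prints the clock commands
# instead of executing them. Leave it empty (and run as root) for real use.
RUN=${RUN:-}

with_clock_at() {
    fake_time=$1
    shift
    $RUN date -s "$fake_time" || return 1
    "$@"                          # the import command
    status=$?
    $RUN ntpdate ntp1.aliyun.com  # restore the clock even if the import failed
    return "$status"
}

# Dry run: show what would happen around a placeholder import command.
RUN=echo with_clock_at "2017/12/25 09:00" echo "import runs here"
```

In real use the placeholder would be replaced by the logstash command for the import, and crontab jobs would still need to be paused by hand as noted above.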

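A thorough fix for the ingest time problem

Coordinate a change to the log format so that every log entry carries an added "@timestamp" : "2017-12-06T09:23:51.244Z" field, which es uses as the ingest time.

Change the program's log output format: add one field (the time must be in UTC, which is 8 hours behind local time here)
    "@timestamp" : "2017-12-06T09:23:51.244Z"  for ELK; it serves as the es ingest time.
    Keep the original time field as well, to make troubleshooting easier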
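
The 8-hour offset is the part most likely to go wrong. As a rough sketch, the conversion from a local UTC+8 time to the UTC form shown above could be checked from the shell. The helper name to_utc_timestamp is hypothetical, and the command assumes GNU date:

```shell
#!/bin/sh
# Sketch: convert a local UTC+8 time to a UTC @timestamp string.
# to_utc_timestamp is a hypothetical helper; it assumes GNU date.
to_utc_timestamp() {
    local_time=$1   # e.g. "2017-12-06 17:23:51", in UTC+8
    date -u -d "$local_time +0800" "+%Y-%m-%dT%H:%M:%S.%3NZ"
}

to_utc_timestamp "2017-12-06 17:23:51"
```

A local time of 17:23:51 should come out as 09:23:51 with a trailing Z, which is the shape the program needs to emit in its "@timestamp" field.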

How the @timestamp override works

/usr/local/logstash/bin/logstash -e 'input {stdin{ codec => "json" }} output {stdout{ codec => rubydebug }}'

By default, each log entry gains 3 extra fields: @version, host, and @timestamp

{
      "@version" => "1",
          "host" => "n1.ma.com",
    "@timestamp" => 2017-12-26T09:59:42.401Z,
       "message" => "sdf"
}

When a log entry already carries its own @timestamp field, that value overrides the one the system would add automatically.
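
One way to see the override for yourself is to pipe a single JSON event that already contains @timestamp through the same stdin pipeline. This is only a sketch: the event is a made-up sample, and the LOGSTASH variable is an assumption that defaults to the install path used in this post:

```shell
#!/bin/sh
# Sketch: feed one JSON event that carries its own @timestamp to logstash.
# The event content is a made-up sample; LOGSTASH defaults to the install
# path used in this post and can be overridden from the environment.
LOGSTASH=${LOGSTASH:-/usr/local/logstash/bin/logstash}
sample_event='{"@timestamp":"2017-12-06T09:23:51.244Z","message":"sdf"}'

if [ -x "$LOGSTASH" ]; then
    printf '%s\n' "$sample_event" |
        "$LOGSTASH" -e 'input {stdin{ codec => "json" }} output {stdout{ codec => rubydebug }}'
else
    echo "logstash not found at $LOGSTASH; sample event: $sample_event"
fi
```

Per the note above, the rubydebug output should then show the timestamp from the event rather than the time of the run.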

Set ELK to start at boot (see below)

Building an ELK test environment

Quickly building an ELK test environment

Directory conventions:
logstash configs   : /root/es/
es data            : /data/es
startup logs       : /tmp/

- Kernel settings
sysctl -w vm.max_map_count=262144
vim /etc/security/limits.conf
*               soft    nproc           65536
*               hard    nproc           65536
*               soft    nofile          65536
*               hard    nofile          65536
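
After applying these settings it is worth confirming that they actually took effect. A minimal sketch follows; meets_minimum is a hypothetical helper, and the thresholds are the values set above:

```shell
#!/bin/sh
# Sketch: check that current values meet the minimums set above.
# meets_minimum is a hypothetical helper; non-numeric input counts as a failure.
meets_minimum() {
    [ "$1" -ge "$2" ] 2>/dev/null
}

map_count=$(sysctl -n vm.max_map_count 2>/dev/null || echo 0)
if meets_minimum "$map_count" 262144; then
    echo "vm.max_map_count ok: $map_count"
else
    echo "vm.max_map_count too low: $map_count"
fi

open_files=$(ulimit -n)
if meets_minimum "$open_files" 65536; then
    echo "nofile ok: $open_files"
else
    echo "nofile too low for this shell: $open_files"
fi
```

Changes to limits.conf only apply to new login sessions, so the nofile check is most meaningful when run from a fresh login as the user that will start es.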

- Install es, kibana, and logstash
useradd elk
cd /usr/local/src/
tar xf elasticsearch-6.0.0.tar.gz -C /usr/local/
tar xf kibana-6.0.0-linux-x86_64.tar.gz -C /usr/local/
tar xf logstash-6.0.0.tar.gz -C /usr/local/
ln -s /usr/local/elasticsearch-6.0.0 /usr/local/elasticsearch
ln -s /usr/local/kibana-6.0.0-linux-x86_64 /usr/local/kibana
ln -s /usr/local/logstash-6.0.0 /usr/local/logstash
cd

chown -R elk. /usr/local/elasticsearch
chown -R elk. /usr/local/elasticsearch/
chown -R elk. /usr/local/kibana
chown -R elk. /usr/local/kibana/
chown -R elk. /usr/local/logstash
chown -R elk. /usr/local/logstash/

sed -i 's#\#network.host: 192.168.0.1#network.host: 0.0.0.0#g' /usr/local/elasticsearch/config/elasticsearch.yml
echo 'http.cors.enabled: true' >> /usr/local/elasticsearch/config/elasticsearch.yml
echo 'http.cors.allow-origin: "*"' >> /usr/local/elasticsearch/config/elasticsearch.yml
sed -i 's#\#server.host: "localhost"#server.host: "0.0.0.0"#g' /usr/local/kibana/config/kibana.yml 
mkdir -p /data/es/{data,logs}
chown -R elk. /data/es
sed -i 's#\#path.data: /path/to/data#path.data: /data/es/data#g' /usr/local/elasticsearch/config/elasticsearch.yml
sed -i 's#\#path.logs: /path/to/logs#path.logs: /data/es/logs#g' /usr/local/elasticsearch/config/elasticsearch.yml
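
Because sed -i reports nothing when a pattern fails to match, it can help to rehearse the path edits on a scratch file first. The sketch below assumes the stock elasticsearch.yml ships the commented defaults #path.data: /path/to/data and #path.logs: /path/to/logs; check your copy if it differs:

```shell
#!/bin/sh
# Sketch: rehearse the path edits on a scratch file before touching the real one.
# The two commented defaults below are assumed to match the stock elasticsearch.yml.
scratch=$(mktemp)
printf '%s\n' '#path.data: /path/to/data' '#path.logs: /path/to/logs' > "$scratch"

sed -i \
    -e 's|^#path.data: .*|path.data: /data/es/data|' \
    -e 's|^#path.logs: .*|path.logs: /data/es/logs|' \
    "$scratch"

# Show the active (uncommented) settings that resulted.
grep -v '^#' "$scratch"
```

Running the same grep against /usr/local/elasticsearch/config/elasticsearch.yml afterwards shows whether both path.data and path.logs are really active.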


- Start es and kibana on the fly
su - elk -c "/usr/local/elasticsearch/bin/elasticsearch"
su - elk -c "/usr/local/kibana/bin/kibana"
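
Both services take a while to come up, so a small polling loop can confirm that they are answering before moving on. This is a sketch only: wait_for_url is a hypothetical helper, it assumes curl is installed, and the ports are the defaults (9200 for es, 5601 for kibana):

```shell
#!/bin/sh
# Sketch: poll a URL until it answers or the attempts run out.
# wait_for_url is a hypothetical helper; it assumes curl is installed.
wait_for_url() {
    url=$1
    attempts=${2:-30}   # how many times to try
    delay=${3:-2}       # seconds to sleep between tries
    while [ "$attempts" -gt 0 ]; do
        if curl -s -o /dev/null --max-time 5 "$url"; then
            return 0
        fi
        attempts=$((attempts - 1))
        sleep "$delay"
    done
    return 1
}

# Example: wait for es and then kibana on this host.
# wait_for_url http://127.0.0.1:9200 && wait_for_url http://127.0.0.1:5601
```

The function only checks that something responds on the port, not that the cluster is healthy.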

Start the head plugin (optional)

docker run -d -v /etc/localtime:/etc/localtime --restart=always -p 9100:9100 mobz/elasticsearch-head:5

Start logstash

- Output to stdout

/usr/local/logstash/bin/logstash -e 'input {stdin{ codec => "json" }} output {stdout{ codec => rubydebug }}'


mkdir -p /root/es
echo "/usr/local/logstash/bin/logstash -e 'input {stdin{ codec => \"json\" }} output {stdout{ codec => rubydebug }}'" > /root/es/pipeline.sh
sh /root/es/pipeline.sh


- Output to es and stdout (default logstash index)
/usr/local/logstash/bin/logstash -e 'input {stdin{ codec => "json" }} output {stdout{ codec => rubydebug } elasticsearch {}}'

mkdir -p /root/es/
cat > /root/es/pipeline-file.conf<<EOF
input {
    beats {
        port => "5043"
    }
    file{
        path => "/root/logs/*.log"
        start_position => 'beginning'
        codec => "json"
        sincedb_path => "/root/logs/maotai.txt"  
        sincedb_write_interval => 1
    }
}

output {
    elasticsearch{
        hosts => ["192.168.x.x:9200"]
        #index => "syslog-%{+YYYY.MM.dd}"
        index => "x.x-2017.12.25"
    }
    stdout { codec => rubydebug }
}
EOF

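Troubleshooting ELK failing to start at boot: fixing rc.local

1. Logs redirected from rc.local must go under /tmp. Otherwise startup may fail, even if the elk user has been given permissions on the relevant directory
2. max_map_count must also be set in rc.local. A one-off sysctl -w does not survive a reboot, and in my repeated tests setting it once did not fix the problem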

vim /etc/rc.local
sysctl -w vm.max_map_count=262144
/usr/bin/nohup /bin/su - elk -c "/usr/local/elasticsearch/bin/elasticsearch" > /tmp/es-start.log 2>&1 &
/usr/bin/nohup /bin/su - elk -c "/usr/local/kibana/bin/kibana" > /tmp/kibana-start.log 2>&1 &
/usr/local/logstash/bin/logstash -f /root/es/pipeline.conf > /tmp/logstash-start.log 2>&1 &
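
One more thing that may be worth checking, although the post does not name the OS: on some distributions (CentOS 7 is one example) rc.local is only run at boot when the file is executable. A minimal sketch of that check, with ensure_executable as a hypothetical helper:

```shell
#!/bin/sh
# Sketch: make sure an rc.local file has its execute bit set.
# ensure_executable is a hypothetical helper; the rc.local path varies by distribution.
ensure_executable() {
    file=$1
    [ -e "$file" ] || return 1           # nothing to fix if the file is missing
    [ -x "$file" ] || chmod +x "$file"   # add the execute bit only when absent
}

# Example (as root): ensure_executable /etc/rc.d/rc.local
```

After the next reboot, the start logs under /tmp show whether the three services were actually launched.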

head plugin

docker run -d -v /etc/localtime:/etc/localtime --restart=always -p 9100:9100 mobz/elasticsearch-head:5

Starting ELK in docker mode

References: http://elk-docker.readthedocs.io/#installation
https://github.com/gregbkr/elk-dashboard-v5-docker

Note: the elk container needs about 2 GB of memory, so give the VM at least 2 GB
sysctl -w vm.max_map_count=262144

docker run -d -v /etc/localtime:/etc/localtime --restart=always -p 5601:5601 -p 9200:9200 -p 5044:5044 -it --name elk sebp/elk

docker run -d -v /etc/localtime:/etc/localtime --restart=always -p 9100:9100 mobz/elasticsearch-head:5
or docker-compose up -d
