ELK Cluster Deployment

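Introduction to ELK

1. Elasticsearch (ES) is a real-time distributed search and analytics engine that supports full-text search, structured search, and analytics. It is built on the Apache Lucene full-text search library and written in Java.

2. Logstash is a data collection engine with real-time pipelining capability. It collects data (for example, by reading text files), parses and filters it, and sends it to ES.

3. Kibana provides a web platform for analysis and visualization on top of Elasticsearch. It can search and interact with the data in Elasticsearch indices and generate tables and charts across various dimensions.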

Environment preparation
cat /etc/redhat-release
CentOS Linux release 7.7.1908 (Core)

Role assignment
NODE IP (set your own) Node type

elk-node1 192.168.1.123 data and master node (installs elasticsearch, logstash, kibana, filebeat)

elk-node2 192.168.1.124 data node (installs elasticsearch, filebeat)

elk-node3 192.168.1.125 data node (installs elasticsearch, filebeat)

Install JDK 11 (two installation methods)

------------------------------Binary install------------------------------

Download the package

cd /home/tools 
wget https://download.java.net/java/GA/jdk11/13/GPL/openjdk-11.0.1_linux-x64_bin.tar.gz

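Extract to the target directory (create it first, since tar -C requires an existing directory)

mkdir -p /usr/local/jdk
tar -xvf openjdk-11.0.1_linux-x64_bin.tar.gz -C /usr/local/jdk

Configure the environment variables by appending the following to /etc/profile (set java environment)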

JAVA_HOME=/usr/local/jdk/jdk-11.0.1
CLASSPATH=$JAVA_HOME/lib/
PATH=$PATH:$JAVA_HOME/bin
export PATH JAVA_HOME CLASSPATH

Apply the environment variables

source  /etc/profile

---------------------------------yum install------------------------------

yum -y install java
Check the version
java -version

Adjust the kernel parameters: raise the maximum number of virtual memory map areas

sysctl -w vm.max_map_count=262144
echo "vm.max_map_count=262144" >> /etc/sysctl.conf

sudo vi /etc/security/limits.conf
* soft nofile  1000000
* hard nofile 1000000
* soft nproc  1000000
* hard nproc 1000000
* soft memlock unlimited
* hard memlock unlimited
sysctl -p
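
To confirm that the kernel setting above actually took effect, a quick check along the following lines may help. This is only a minimal sketch: `meets_minimum` is a hypothetical helper name, and the values in limits.conf generally apply only to sessions started after the change, so the open-files line should be read from a fresh login.

```shell
#!/usr/bin/env bash
# meets_minimum <current> <required>: hypothetical helper, succeeds when current >= required.
meets_minimum() {
    [ "$1" -ge "$2" ]
}

# Read the live kernel value; fall back to 0 if sysctl is missing or the key cannot be read.
current="$(sysctl -n vm.max_map_count 2>/dev/null || echo 0)"

if meets_minimum "$current" 262144; then
    echo "vm.max_map_count OK ($current)"
else
    echo "vm.max_map_count too low ($current), expected at least 262144"
fi

# Soft limit on open files for the current shell session.
echo "open files (soft limit): $(ulimit -n)"
```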

Install the dependency packages and configure the repo source

yum install -y yum-utils device-mapper-persistent-data lvm2 net-tools vim lrzsz tree screen lsof tcpdump wget ntpdate
vi /etc/yum.repos.d/elastic.repo    

[elasticsearch-7.x]
name=Elasticsearch repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1   
autorefresh=1
type=rpm-md

[kibana-7.x]
name=Kibana repository for 7.x packages
baseurl=https://artifacts.elastic.co/packages/7.x/yum
gpgcheck=1
gpgkey=https://artifacts.elastic.co/GPG-KEY-elasticsearch
enabled=1
autorefresh=1
type=rpm-md
yum repolist

Edit the hosts file (each entry is the IP address followed by the hostname)

vi /etc/hosts
192.168.1.123   elk-node1
192.168.1.124   elk-node2
192.168.1.125   elk-node3
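
A small check can confirm that every node name maps to an address before the cluster is started. The sketch below assumes the standard "IP hostname" layout of /etc/hosts; `lookup_ip` is a hypothetical helper, not part of any tool used in this guide.

```shell
#!/usr/bin/env bash
# lookup_ip <hosts-file> <hostname>: hypothetical helper that prints the IP
# mapped to a hostname, assuming the standard "IP hostname [aliases...]" layout.
lookup_ip() {
    awk -v name="$2" '
        /^[[:space:]]*#/ { next }
        { for (i = 2; i <= NF; i++) if ($i == name) { print $1; exit } }
    ' "$1"
}

# Report the mapping for each node named in the role table.
for node in elk-node1 elk-node2 elk-node3; do
    ip="$(lookup_ip /etc/hosts "$node")"
    echo "$node -> ${ip:-NOT FOUND}"
done
```

A "NOT FOUND" line points at a missing or malformed entry on the host where the check was run.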

Deploy the Elasticsearch cluster (run on all nodes)

yum -y install elasticsearch
mv /etc/elasticsearch/elasticsearch.yml /etc/elasticsearch/elasticsearch.bak
vi /etc/elasticsearch/elasticsearch.yml
cluster.name: my-elk
node.name: elk-node1    # must match this node's hostname
path.data: /var/lib/elasticsearch
path.logs: /var/log/elasticsearch
transport.tcp.compress: true
network.host: 0.0.0.0
http.port: 9200
transport.tcp.port: 9300
discovery.seed_hosts: ["192.168.1.123","192.168.1.124","192.168.1.125"]
cluster.initial_master_nodes: ["192.168.1.123","192.168.1.124","192.168.1.125"]
network.publish_host: 192.168.1.123 ## IP of this node (important)
node.master: true
node.data: true
xpack.security.enabled: true
http.cors.enabled: true
http.cors.allow-origin: "*" ## allow cross-origin access so the head plugin can reach ES

-------------------Optional content (skip only if xpack.security.enabled is false)-------------------

xpack.security.transport.ssl.enabled: true
xpack.security.transport.ssl.verification_mode: certificate
xpack.security.transport.ssl.keystore.path: /etc/elasticsearch/elastic-certificates.p12
xpack.security.transport.ssl.truststore.path: /etc/elasticsearch/elastic-certificates.p12

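Elasticsearch is very memory-hungry in production, so the initial JVM heap should be raised; the default is 1 GB.

vi /etc/elasticsearch/jvm.options 

Change these two lines

# set the minimum heap size to 4 GB
-Xms4g
# set the maximum heap size to 4 GB
-Xmx4g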
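
The 4 GB figure fits a host with about 8 GB of RAM. Elastic's general guidance is to give the heap no more than roughly half of physical memory and to stay below about 32 GB so that compressed object pointers remain usable. The sketch below derives a value from /proc/meminfo under those assumptions; `recommended_heap_gb` and the cap of 31 are illustrative choices, not settings taken from this guide.

```shell
#!/usr/bin/env bash
# recommended_heap_gb <total-ram-gb>: hypothetical helper that prints half of
# the RAM (integer division), at least 1 and capped at 31 GB (assumed ceiling).
recommended_heap_gb() {
    local half=$(( $1 / 2 ))
    if [ "$half" -lt 1 ]; then half=1; fi
    if [ "$half" -gt 31 ]; then half=31; fi
    echo "$half"
}

# Total RAM in whole GB from /proc/meminfo (Linux only); treated as 0 if unreadable.
total_kb="$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo 2>/dev/null)"
total_gb=$(( ${total_kb:-0} / 1024 / 1024 ))

# Print the two lines in the form jvm.options expects.
heap="$(recommended_heap_gb "$total_gb")"
echo "-Xms${heap}g"
echo "-Xmx${heap}g"
```

Keeping -Xms and -Xmx equal, as the guide does, avoids heap resizing at runtime.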

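By default Elasticsearch rolls its log file once a day and only starts removing the excess when the logs reach 2 GB,
so when each file is only a few dozen KB, the files keep piling up.

vi /etc/elasticsearch/log4j2.properties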

appender.rolling.strategy.action.condition.nested_condition.type = IfLastModified
appender.rolling.strategy.action.condition.nested_condition.age = 30D
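This limits cluster log growth; here only the last 30 days of logs are kept.

Reference: https://blog.51cto.com/huanghai/2430038 (limiting Elasticsearch cluster log growth)
Configure TLS and authentication
Configure TLS on the Elasticsearch master node.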
cd /usr/share/elasticsearch/
./bin/elasticsearch-certutil ca ## press Enter at every prompt
./bin/elasticsearch-certutil cert --ca elastic-stack-ca.p12
ll
total 540
drwxr-xr-x 2 root root 4096 Jun 28 10:42 bin
-rw------- 1 root root 3443 Jun 28 16:46 elastic-certificates.p12
-rw------- 1 root root 2527 Jun 28 16:43 elastic-stack-ca.p12
drwxr-xr-x 8 root root 96 Jun 28 10:42 jdk
drwxr-xr-x 3 root root 4096 Jun 28 10:42 lib
-rw-r--r-- 1 root root 13675 Jun 20 23:50 LICENSE.txt
drwxr-xr-x 30 root root 4096 Jun 28 10:42 modules
-rw-rw-r-- 1 root root 502598 Jun 20 23:56 NOTICE.txt
drwxr-xr-x 2 root root 6 Jun 21 00:04 plugins
-rw-r--r-- 1 root root 8478 Jun 20 23:50 README.textile

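Assign the generated files to the elasticsearch group
chgrp elasticsearch /usr/share/elasticsearch/elastic-certificates.p12 /usr/share/elasticsearch/elastic-stack-ca.p12
Set the permissions of both files to 640
chmod 640 /usr/share/elasticsearch/elastic-certificates.p12 /usr/share/elasticsearch/elastic-stack-ca.p12
Move both files into the Elasticsearch configuration directory
mv /usr/share/elasticsearch/elastic-* /etc/elasticsearch/
Copy the TLS files into the configuration directory of the other nodes
scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.124:/etc/elasticsearch/
scp /etc/elasticsearch/elastic-stack-ca.p12 root@192.168.1.124:/etc/elasticsearch/
scp /etc/elasticsearch/elastic-certificates.p12 root@192.168.1.125:/etc/elasticsearch/
scp /etc/elasticsearch/elastic-stack-ca.p12 root@192.168.1.125:/etc/elasticsearch/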

>----------------------------------------------------------------------

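**Start the service and verify the cluster (note: check the cloud server security group configuration, and start the nodes in order: the master node first, then the other nodes)**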

systemctl start elasticsearch


**Set the passwords**

/usr/share/elasticsearch/bin/elasticsearch-setup-passwords interactive

> ## set every password to 123456

**Verify the cluster**

http://192.168.1.123:9200/_cluster/health?pretty ## open in a browser and log in as the elastic user
{
"cluster_name" : "my-elk",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3, ## number of nodes
"number_of_data_nodes" : 3, ## number of data nodes
"active_primary_shards" : 4,
"active_shards" : 8,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
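
The same check can be scripted instead of done in a browser. The following is a minimal sketch, assuming curl is installed and using the elastic user set up earlier; `health_status` and `cluster_is_green` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# health_status: hypothetical helper; reads the health JSON from stdin
# and prints the value of the "status" field.
health_status() {
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# cluster_is_green <base-url> <user> <password>: prints the reported status
# and succeeds only when it is green.
cluster_is_green() {
    local status
    status="$(curl -s --max-time 5 -u "$2:$3" "$1/_cluster/health?pretty" | health_status)"
    echo "cluster status: ${status:-unknown}"
    [ "$status" = "green" ]
}
```

For example, `cluster_is_green http://192.168.1.123:9200 elastic 123456` could be run from any host that can reach the node; a yellow or red status makes it return a non-zero exit code.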


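>----------------------------Troubleshooting----------------------------------
> Problem: a cluster node goes down unexpectedly and cannot rejoin the cluster after a restart
> Answer:
> When the ES service is first started, it can easily bootstrap a cluster on its own in
> which the current node is the only master. A cluster_uuid is generated at that point,
> and it does not change when cluster_name changes. For the node to join the cluster,
> the previous node information must be deleted so that the node can join afresh.
> Do the following on each affected node (this also deletes the data stored on that node):
> systemctl stop elasticsearch  
> cd /var/lib/elasticsearch/nodes
> rm -fr 0
> systemctl start elasticsearch
> This resolves the problem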

**Deploy Kibana**, **install from the yum repo # on any node**

yum -y install kibana

**Edit the Kibana configuration file**

vi /etc/kibana/kibana.yml

server.port: 5601
server.host: "0.0.0.0"
server.name: "elk-node2"
elasticsearch.hosts: ["http://192.168.1.123:9200","http://192.168.1.124:9200","http://192.168.1.125:9200"]
elasticsearch.username: "elastic"
elasticsearch.password: "123456"
i18n.locale: "en"


**Start the service**

systemctl start kibana

**Open in a browser**

http://192.168.1.123:5601/

**Install Logstash**, **deployed on the master node**

>--------------------------------YUM install------------------------------

yum -y install logstash

>-------------------------------Binary install----------------------------

wget https://artifacts.elastic.co/downloads/logstash/logstash-7.4.1.tar.gz
tar -zvxf logstash-7.4.1.tar.gz -C /home/elk
mkdir -p /data/logstash/{logs,data}

**Edit the configuration file**

vi /etc/logstash/logstash.conf

input {
  beats {
    port => 5044
  }
}

filter {
  # Capture everything between "logBegin " and "logEnd" into temMsg.
  grok {
    match => {
      "message" => "(?<temMsg>(?<=logBegin ).*?(?=logEnd))"
    }
    overwrite => ["temMsg"]
  }
  # Each grok below pulls one field out of temMsg: the text between "<key>:" and the next key.
  grok {
    match => {
      "temMsg" => "(?<reqId>(?<=reqId:).*?(?=,operatName))"
    }
    overwrite => ["reqId"]
  }
  grok {
    match => {
      "temMsg" => "(?<operatName>(?<=operatName:).*?(?=,operatUser))"
    }
    overwrite => ["operatName"]
  }
  grok {
    match => {
      "temMsg" => "(?<operatUser>(?<=operatUser:).*?(?=,userType))"
    }
    overwrite => ["operatUser"]
  }
  grok {
    match => {
      "temMsg" => "(?<userType>(?<=userType:).*?(?=,requestTime))"
    }
    overwrite => ["userType"]
  }
  grok {
    match => {
      "temMsg" => "(?<requestTime>(?<=requestTime:).*?(?=,method))"
    }
    overwrite => ["requestTime"]
  }
  grok {
    match => {
      "temMsg" => "(?<method>(?<=method:).*?(?=,params))"
    }
    overwrite => ["method"]
  }
  grok {
    match => {
      "temMsg" => "(?<params>(?<=params:).*?(?=,operatIp))"
    }
    overwrite => ["params"]
  }
  grok {
    match => {
      "temMsg" => "(?<operatIp>(?<=operatIp:).*?(?=,executionTime))"
    }
    overwrite => ["operatIp"]
  }
  grok {
    match => {
      "temMsg" => "(?<executionTime>(?<=executionTime:).*?(?=,operatDesc))"
    }
    overwrite => ["executionTime"]
  }
  grok {
    match => {
      "temMsg" => "(?<operatDesc>(?<=operatDesc:).*?(?=result))"
    }
    overwrite => ["operatDesc"]
  }
  grok {
    match => {
      "temMsg" => "(?<result>(?<=result:).*?(?=,siteCode))"
    }
    overwrite => ["result"]
  }
  grok {
    match => {
      "temMsg" => "(?<siteCode>(?<=siteCode:).*?(?=,module))"
    }
    overwrite => ["siteCode"]
  }
  grok {
    match => {
      "temMsg" => "(?<module>(?<=module:).*?(?= ))"
    }
    overwrite => ["module"]
  }
  grok {
    match => [
      "message", "%{NOTSPACE:temMsg}"
    ]
  }
  # field_split and value_split are options of the kv filter, so kv is used here.
  kv {
    source => "temMsg"
    field_split => ","
    value_split => ":"
    remove_field => [ "@timestamp", "message", "path", "@version", "host" ]
  }
  urldecode {
    all_fields => true
  }
  mutate {
    rename => {"temMsg" => "message"}
    remove_field => [ "message" ]
  }
}

output {
  elasticsearch {
    hosts => ["192.168.1.123:9200","192.168.1.124:9200","192.168.1.125:9200"]
    user => "elastic"
    password => "123456"
    flush_size => 20000      ## send once 20000 events have accumulated
    idle_flush_time => 10    ## if 20000 events have not accumulated within 10 seconds, send whatever has been collected
    # The defaults are flush_size 500 and idle_flush_time 1 second.
    # Recent versions of the elasticsearch output plugin may reject these two options as obsolete;
    # batching is then governed by pipeline.batch.size and pipeline.batch.delay in logstash.yml.
    index => "logstash-%{+YYYY.MM.dd}"
  }
}
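
The field patterns in the filter section rely on a lookbehind for "<key>:" and a lookahead for the next delimiter. They can be tried outside Logstash with GNU grep's PCRE mode, as in the sketch below. The sample line is hypothetical (real log lines may differ), `extract_field` is an illustrative helper, and `grep -P` is assumed to be available.

```shell
#!/usr/bin/env bash
# Hypothetical sample line in the "logBegin key:value,... logEnd" shape the filters assume.
sample='logBegin reqId:42,operatName:login,operatUser:alice,userType:admin logEnd'

# extract_field <line> <key> <next-delimiter>: prints the text between
# "<key>:" and the next delimiter, using the same lookaround idea as the grok patterns.
extract_field() {
    printf '%s\n' "$1" | grep -oP "(?<=$2:).*?(?=$3)"
}

extract_field "$sample" reqId ",operatName"      # prints 42
extract_field "$sample" operatUser ",userType"   # prints alice
```

Because the match is lazy, each field stops at the first occurrence of the delimiter that follows it, which is why the order of the keys in the log line matters.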


**View the effective configuration (comments and blank lines filtered out)**

egrep -v "#|^$" /etc/logstash/conf.d/logstash_debug.conf

vi /etc/logstash/logstash.yml

http.host: "elk-master"
path.data: /data/logstash/data
path.logs: /data/logstash/logs
xpack.monitoring.enabled: true # lets the Kibana monitoring plugin monitor Logstash
xpack.monitoring.elasticsearch.hosts: ["192.168.1.123:9200","192.168.1.124:9200","192.168.1.125:9200"]


**Optimization ## Logstash keeps accumulating GC files and rolling log files and never deletes them**

vi /etc/logstash/log4j2.properties

appender.rolling.strategy.type = DefaultRolloverStrategy
appender.rolling.strategy.action.type = Delete
appender.rolling.strategy.action.basepath = ${sys:ls.logs}
appender.rolling.strategy.action.condition.type = IfFileName
appender.rolling.strategy.action.condition.glob = ${sys:ls.logs}/logstash-${sys:ls.log.format}
appender.rolling.strategy.action.condition.nested_condition.type = IfLastModified
appender.rolling.strategy.action.condition.nested_condition.age = 15D
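
Before relying on the Delete action, it can be useful to see which files an age-based rule of this kind would cover. The sketch below only lists candidates and deletes nothing. `list_expired_logs` is a hypothetical helper, and /var/log/logstash is an assumed location that should be replaced with the configured path.logs.

```shell
#!/usr/bin/env bash
# list_expired_logs <log-dir> <days>: hypothetical helper that prints logstash-*
# files in <log-dir> last modified more than <days> days ago (nothing is deleted).
list_expired_logs() {
    find "$1" -maxdepth 1 -type f -name 'logstash-*' -mtime +"$2" | sort
}

# Assumed log directory; adjust to the path.logs value in logstash.yml.
log_dir="/var/log/logstash"
if [ -d "$log_dir" ]; then
    list_expired_logs "$log_dir" 15
fi
```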


**Start the Logstash service**

systemctl start logstash


**Starting from the binary**

/usr/share/logstash/bin/logstash -f /etc/logstash/logstash.conf


**Once it starts successfully, append & to run it in the background**

/usr/share/logstash/bin/logstash -f /etc/logstash/logstash.conf &


**Reload the configuration automatically**

./bin/logstash -f configfile.conf --config.reload.automatic > /dev/null 2>&1 &


**Deploy Filebeat**

**Install Filebeat**

yum -y install filebeat

**Edit the configuration file**

vi /etc/filebeat/filebeat.yml

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /srv/docker/produce/*/*/cloud.log   # path of the logs to ship
  include_lines: [".*logBegin.*", ".*logEnd.*"]
  multiline.pattern: ^\[
  multiline.negate: true
  multiline.match: after

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 1

setup.kibana:
  host: "192.168.1.123:5601"   # IP of the ES master server

output.logstash:
  hosts: ["192.168.1.123:5044"]   # IP of the ES master server

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

> ----------------------------An alternative configuration---------------------------

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /var/log/nginx/access.log   # path of the logs to ship
  fields:
    document_type: nginx
  tags: ["nginx_log", "sj_access_log"]
- type: log
  enabled: true
  paths:
    - /var/log/nginx/error.log   # path of the logs to ship
  tags: ["nginx_log", "sj_error_log"]

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

setup.template.settings:
  index.number_of_shards: 3

setup.kibana:
  host: "192.168.1.123:5601"   # IP of the ES master server

output.logstash:
  hosts: ["192.168.1.123:5044"]   # IP of the ES master server

processors:
  - add_host_metadata: ~
  - add_cloud_metadata: ~

**Start Filebeat**

systemctl start filebeat
