ELK Cluster Setup Summary

Posted 2023-05-16


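Developers at our company had trouble querying production logs, so we planned to build an ELK system to solve this. A single-machine, single-node ELK had been set up before, but it was shut down and abandoned because its load and memory usage were too high. This time we prepared three well-specced servers and set out to build an ELK cluster.
The process was long and difficult, so it is recorded here for future reference.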

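With this scheme, every logstash instance needs a large amount of memory, which puts more pressure on the production application servers that collect the logs than they can bear.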

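filebeat is a lightweight log shipper that is simple to deploy and uses little memory. This scheme is better overall, except that the single logstash node comes under heavy load. It turned out that filebeat can load-balance its output across several logstash instances, so we then decided to install a logstash on each of the three ELK servers, which gives the scheme below.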
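As a rough illustration of the load-balanced output described above, the sketch below writes a minimal Filebeat configuration fragment to a temporary file. `output.logstash.hosts` and `loadbalance` are Filebeat's own settings; the host names `elk1` to `elk3`, port 5044, the log path and the output file name are assumptions made for this example, not values from the original setup.

```shell
#!/bin/sh
# Sketch only: host names, port, log path and file location are assumed values.
conf_file="${TMPDIR:-/tmp}/filebeat-loadbalance-example.yml"

cat > "$conf_file" <<'EOF'
filebeat.inputs:
  - type: log
    paths:
      - /var/log/app/*.log      # assumed application log path

output.logstash:
  # One logstash per ELK server (assumed names elk1..elk3).
  hosts: ["elk1:5044", "elk2:5044", "elk3:5044"]
  # Without this, filebeat picks one host and only fails over on error.
  loadbalance: true
EOF

echo "wrote $conf_file"
```

The fragment would be merged into `/etc/filebeat/filebeat.yml` on each application server; each logstash would then need a beats input listening on the same port.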

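The scheme above is already enough for the logging needs of a typical company, but with very large log volumes it may run into problems such as garbled or missing data and split-brain between nodes. Addressing these as far as possible takes a good deal more work; some suggestions collected from the web are recorded below: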

On each server, install directly and quickly with yum install -y ***.rpm
After installation the programs are all under /usr/share/
The configuration files are all under /etc/

Managing the services with ansible is recommended
Start: elasticsearch -> logstash -> filebeat -> kibana
Stop: kibana -> filebeat -> logstash -> elasticsearch
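The start and stop orders above are mirror images of each other, which can be sketched in a small script. This is a simplified single-host illustration, not the ansible setup recommended above: it assumes the systemd unit names installed by the RPM packages (`elasticsearch`, `logstash`, `filebeat`, `kibana`), and `reverse_words`, `run_in_order` and `DRY_RUN` are names invented for this example. By default it only prints the commands.

```shell
#!/bin/sh
# Services in start order, as listed above.
START_ORDER="elasticsearch logstash filebeat kibana"

# Print the arguments in reverse order; this yields the stop order.
reverse_words() {
    out=""
    for w in "$@"; do
        out="$w${out:+ $out}"
    done
    printf '%s\n' "$out"
}

STOP_ORDER=$(reverse_words $START_ORDER)

# Run "systemctl <action> <service>" for each service in the given order.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
run_in_order() {
    action=$1
    shift
    for svc in "$@"; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "systemctl $action $svc"
        else
            systemctl "$action" "$svc" || return 1
        fi
    done
}

run_in_order start $START_ORDER
run_in_order stop $STOP_ORDER
```

In a real deployment filebeat runs on the application servers rather than on the ELK servers, so the same ordering would be spread across hosts.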

ELK Cluster Setup Notes: Version 7.6.2

Setup process

1. Elasticsearch cluster setup

2. Cerebro plugin installation

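tar -zxvf cerebro-0.9.1.tgz    Extract the cerebro archive

Edit application.conf in the cerebro folder to set the address of the cluster

./cerebro -Dhttp.port=8080    Start cerebro on the given port

nohup ./cerebro -Dhttp.port=8080 &    Run it in the background so that Ctrl+C or closing the terminal does not stop it

http://127.0.0.1:8080/#/connect    Address for opening cerebro in a browser

http://127.0.0.1:9200    Address of the cluster to connect to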
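For the application.conf step, the sketch below writes a `hosts` fragment in the format cerebro's `conf/application.conf` uses for predefined clusters. The cluster address is the one given above; the display name `elk-cluster` and the temporary output file are assumptions for this example, and the fragment is meant to be merged into the real file by hand.

```shell
#!/bin/sh
# Sketch only: the display name and output path are assumed values.
fragment="${TMPDIR:-/tmp}/cerebro-hosts-example.conf"

cat > "$fragment" <<'EOF'
hosts = [
  {
    host = "http://127.0.0.1:9200"
    name = "elk-cluster"
  }
]
EOF

# Show the fragment so it can be copied into conf/application.conf.
cat "$fragment"
```

With a host predefined this way, the cluster appears as a selectable entry on the connect page instead of having to be typed in each time.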

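3. Kibana installation

4. Managing Elasticsearch with Kibana

Problems encountered during setup (being updated)

1. Cerebro cannot be accessed

2. ES cluster split-brain problem

The ES cluster was built with 3 nodes: node-1, node-2 and node-3. After setup, cerebro showed two clusters: node-1 had formed a cluster on its own, and node-2 and node-3 had formed another.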

org.elasticsearch.transport.RemoteTransportException: [node-2][127.0.0.1:9300][internal:cluster/coordination/join/validate]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: join validation on cluster state with a different cluster uuid E4a-eGNOSCG7_VDmzQPs8w than local cluster uuid RluWhBm3RA-nOJzKIP-NSw, rejecting

org.elasticsearch.transport.RemoteTransportException: [node-2][127.0.0.1:9300][internal:cluster/coordination/join]
Caused by: org.elasticsearch.cluster.coordination.CoordinationStateRejectedException: incoming term 44 does not match current term 45

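Attempted fixes: restarting, and changing parameters in elasticsearch.yml. Neither solved it.

Final fix: delete the data folder. This also removes any index data stored on that node.
Cause: the node had already run ES before, so a data folder had been created. It still held the cluster UUID of the earlier cluster, which conflicted with the cluster the node was trying to join; this is the "different cluster uuid" reported in the log.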
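One way to confirm this kind of mismatch before deleting anything is to compare the `cluster_uuid` each node reports on its root endpoint (`GET /`). The sketch below is a minimal illustration: `extract_cluster_uuid` and `show_cluster_uuids` are helper names invented here, the sample JSON only mimics the shape of a 7.x root response (its `cluster_name` is made up, the UUID is copied from the log above), and the node URLs in the commented usage line are assumptions.

```shell
#!/bin/sh
# Pull the cluster_uuid value out of the JSON returned by GET / on a node.
extract_cluster_uuid() {
    sed -n 's/.*"cluster_uuid"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Print "<uuid> <url>" for every node URL given.
# Nodes that belong to the same cluster report the same uuid.
show_cluster_uuids() {
    for url in "$@"; do
        uuid=$(curl -s --max-time 5 "$url" | extract_cluster_uuid)
        echo "${uuid:-unreachable} $url"
    done
}

# Offline demonstration on a sample response (cluster_name is hypothetical).
sample='{ "cluster_name" : "example", "cluster_uuid" : "RluWhBm3RA-nOJzKIP-NSw" }'
printf '%s\n' "$sample" | extract_cluster_uuid

# Against live nodes (assumed addresses):
# show_cluster_uuids http://node-1:9200 http://node-2:9200 http://node-3:9200
```

A node whose UUID differs from the others is the one carrying stale state; stopping it, clearing its data folder and restarting it lets it join the existing cluster.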

3. Kibana reports errors on startup

    log   [09:21:17.452] [warning][savedobjects-service] Unable to connect to Elasticsearch. Error: [resource_already_exists_exception] index [.kibana_1/-eb6AR3SQ3usPaLJN76t5w] already exists, with { index_uuid="-eb6AR3SQ3usPaLJN76t5w" & index=".kibana_1" }
  log   [09:21:17.453] [warning][savedobjects-service] Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_1 and restarting Kibana.
  log   [09:21:17.493] [info][savedobjects-service] Creating index .kibana_task_manager_1.
  log   [09:21:17.496] [warning][savedobjects-service] Unable to connect to Elasticsearch. Error: [resource_already_exists_exception] index [.kibana_task_manager_1/whkoKPqJSYGAzgXMGxQXbQ] already exists, with { index_uuid="whkoKPqJSYGAzgXMGxQXbQ" & index=".kibana_task_manager_1" }
  log   [09:21:17.497] [warning][savedobjects-service] Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_task_manager_1 and restarting Kibana.
 Generating a random key for xpack.encryptedSavedObjects.encryptionKey. To be able to decrypt encrypted saved objects attributes after restart, please set xpack.encryptedSavedObjects.encryptionKey in kibana.yml

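Solution: delete the Kibana indices in Elasticsearch. Note that this also discards any saved objects (dashboards, index patterns and so on) stored in them.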

cd /u02/tomcat/elasticsearch-7.6.2
curl -XDELETE 'http://127.0.0.1:9200/.kibana*'

Cause: the index already exists

4. How to manage ES indices in Kibana and associate them accordingly
