Elasticsearch Advanced Configuration

Posted 2023-04-28


title: Elasticsearch Advanced Configuration
date: 2020-10-16 09:00:39
categories: elk
tags:
- configuration
- elasticsearch

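Most settings can be changed on a running cluster using the cluster update settings API.

Elasticsearch has three main configuration files, and all configuration goes through them: elasticsearch.yml (Elasticsearch itself), jvm.options (JVM settings) and log4j2.properties (logging).

You should rarely need to change Java Virtual Machine (JVM) options. If you do, the most likely change is setting the heap size.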

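10. Heap dump path

By default, the JVM is configured to dump the heap on out-of-memory errors into the default data directory. If that directory is not suitable for receiving heap dumps, change the -XX:HeapDumpPath entry in jvm.options.

11. GC logging

By default, Elasticsearch enables GC logs. They are configured in jvm.options and are written to the same default location as the Elasticsearch logs. The default configuration rotates the log every 64 MB and can use up to 2 GB of disk space.

Some settings are sensitive, and relying on file system permissions to protect their values is not enough. For this use case, Elasticsearch provides a keystore and the elasticsearch-keystore tool for managing the settings in it.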

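Before going to production, the following settings must be considered:

1. Path settings
Mainly the data and log paths (path.data and path.logs).

For data, more than one path can be configured; path.data accepts a list of directories.

Directory layout created under the data path: elasticsearch/nodes/0/indices/uuid/shard/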

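2. Cluster name
A node can only join a cluster when it shares the cluster's cluster.name. The default is elasticsearch.

3. Node name
The node name helps tell nodes apart. It can be taken from an environment variable, for example node.name: ${HOSTNAME}

4. Network host (network.host)
Usually set to the machine's own IP address. Once this is a non-loopback address, the node is treated as running in production mode, so the system configuration needs attention.
As soon as you configure a network setting such as network.host, Elasticsearch assumes that you are moving to production and upgrades the bootstrap check warnings to exceptions. These exceptions prevent the node from starting. This is an important safety measure to ensure that you do not lose data because of a misconfigured server.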
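5. Discovery settings

①: If no port is specified, transport.tcp.port is used.
②: If a host name resolves to multiple IP addresses, all resolved addresses are tried.
discovery.zen.minimum_master_nodes
Set this to a quorum of the master-eligible nodes, (master_eligible_nodes / 2) + 1. With 3 master-eligible nodes, set it to 2.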
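The quorum rule above is plain integer arithmetic. A minimal sketch, assuming a hypothetical helper class MasterQuorum (it is not part of Elasticsearch and only illustrates the formula):

```java
// Hypothetical helper for illustration only; not an Elasticsearch API.
class MasterQuorum {
    /** Quorum of master-eligible nodes: (n / 2) + 1, using integer division. */
    static int minimumMasterNodes(int masterEligibleNodes) {
        if (masterEligibleNodes < 1) {
            throw new IllegalArgumentException("need at least one master-eligible node");
        }
        return masterEligibleNodes / 2 + 1;
    }
}
```

For 3 master-eligible nodes, minimumMasterNodes(3) returns 2, matching the example above; for 4 nodes it returns 3.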

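1. Resource limits:
Set the maximum number of open file handles (ulimit -n) to 65,536, for example in /etc/security/limits.conf.

2. Disable swapping

Swapping is very bad for performance and node stability and should be avoided at all costs. It can cause garbage collections to last for minutes instead of milliseconds, and it can cause nodes to respond slowly or even to disconnect from the cluster. In a resilient distributed system, it is more effective to let the operating system kill the node.
There are three approaches: disable all swap (swapoff -a), reduce vm.swappiness to 1, or lock the process memory with bootstrap.memory_lock: true.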

3. Increase file descriptors

4. Ensure sufficient virtual memory

5. Ensure sufficient threads

6. JVM DNS cache settings

7. Temporary directory not mounted with noexec

8. Temporary directory

Generate a private key and X.509 certificate.
Generate the node certificates.

Conclusion: the open-source basic edition cannot use the security features.

Working with Elasticsearch through elasticsearch-rest-high-level-client

Summary

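I was bored, so I wrote this feel-good post. You can use the code as it is; bugs are very unlikely, and if you do find one, pretend I never said that (doge).
Q&A: Why don't I use the wrapper dependency that Spring Boot provides?
In my view Spring Boot wraps the client classes too heavily. That is fine for simple operations but hard to work with for more complex ones, especially because the API changes often between Spring Boot versions. The official Elasticsearch API changes less frequently, and I find it more flexible and powerful than the Spring Boot wrapper. At work I ran into requirements where the Spring Boot wrapper produced baffling errors and could not do what I needed, so I switched to the official API.
Elasticsearch version: 7.4
Installation guide: https://blog.csdn.net/UnicornRe/article/details/121747039?spm=1001.2014.3001.5501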

Dependencies

Keep the dependency versions in line with your Elasticsearch version. If the dependencies further down report errors, add the following properties block to the Maven pom.xml as a sibling of the <parent> tag:

<properties>
    <java.version>1.8</java.version>
    <!-- <spring-cloud.version>2020.0.2</spring-cloud.version> -->
    <!-- Pins the Elasticsearch version to avoid version conflicts -->
    <elasticsearch.version>7.4.0</elasticsearch.version>
</properties>

<!-- elasticsearch -->
<dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>7.4.0</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>7.4.0</version>
</dependency>

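YAML configuration

To use several Elasticsearch nodes, adjust the configuration (and the code if needed) and list the nodes in address, separated by commas.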

elasticsearch:
  schema: http
  address: 192.168.52.43:9200
  connectTimeout: 5000
  socketTimeout: 5000
  connectionRequestTimeout: 5000
  maxConnectNum: 100
  maxConnectPerRoute: 100

Connection configuration

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestClientBuilder;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.beans.factory.annotation.Value;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

@Configuration
public class EsHighLevalConfigure {
    // Protocol
    @Value("${elasticsearch.schema:http}")
    private String schema = "http";
    // Cluster addresses; separate multiple entries with ","
    @Value("${elasticsearch.address}")
    private String address;
    // Connect timeout (ms)
    @Value("${elasticsearch.connectTimeout:5000}")
    private int connectTimeout;
    // Socket timeout (ms)
    @Value("${elasticsearch.socketTimeout:10000}")
    private int socketTimeout;
    // Timeout for obtaining a connection from the pool (ms)
    @Value("${elasticsearch.connectionRequestTimeout:5000}")
    private int connectionRequestTimeout;
    // Maximum total connections
    @Value("${elasticsearch.maxConnectNum:100}")
    private int maxConnectNum;
    // Maximum connections per route
    @Value("${elasticsearch.maxConnectPerRoute:100}")
    private int maxConnectPerRoute;

    // Not static: the method reads the injected instance fields above
    @Bean
    public RestHighLevelClient restHighLevelClient() {
        List<HttpHost> hostLists = new ArrayList<>();
        String[] hostList = address.split(",");
        for (String addr : hostList) {
            String[] parts = addr.trim().split(":");
            String host = parts[0];
            String port = parts[1];
            hostLists.add(new HttpHost(host, Integer.parseInt(port), schema));
        }
        HttpHost[] httpHost = hostLists.toArray(new HttpHost[0]);
        // Build the client builder
        RestClientBuilder builder = RestClient.builder(httpHost);
        // Timeout settings for the async HTTP client
        builder.setRequestConfigCallback(requestConfigBuilder -> {
            requestConfigBuilder.setConnectTimeout(connectTimeout);
            requestConfigBuilder.setSocketTimeout(socketTimeout);
            requestConfigBuilder.setConnectionRequestTimeout(connectionRequestTimeout);
            return requestConfigBuilder;
        });
        // Connection pool settings for the async HTTP client
        builder.setHttpClientConfigCallback(httpClientBuilder -> {
            httpClientBuilder.setMaxConnTotal(maxConnectNum);
            httpClientBuilder.setMaxConnPerRoute(maxConnectPerRoute);
            httpClientBuilder.setKeepAliveStrategy((response, context) -> Duration.ofMinutes(5).toMillis());
            return httpClientBuilder;
        });
        return new RestHighLevelClient(builder);
    }
}
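The address property is a comma-separated list of host:port pairs, and the bean above splits it by hand. The sketch below isolates that parsing step using only the JDK, so it can be checked without a cluster. EsAddressParser and HostPort are hypothetical names introduced for illustration; in the real bean each pair would be turned into an HttpHost.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the Elasticsearch client.
class EsAddressParser {
    /** One parsed "host:port" entry. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    /** Parses "host1:port1,host2:port2" and rejects entries without a port. */
    static List<HostPort> parse(String address) {
        List<HostPort> result = new ArrayList<>();
        for (String entry : address.split(",")) {
            String trimmed = entry.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas and blanks
            }
            int colon = trimmed.lastIndexOf(':');
            if (colon <= 0 || colon == trimmed.length() - 1) {
                throw new IllegalArgumentException("expected host:port but got '" + trimmed + "'");
            }
            String host = trimmed.substring(0, colon);
            int port = Integer.parseInt(trimmed.substring(colon + 1));
            result.add(new HostPort(host, port));
        }
        return result;
    }
}
```

Trimming each entry means a value written with a space after the comma still parses, and an entry without a port fails with a clear message rather than an ArrayIndexOutOfBoundsException.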

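Index structure

Your index structure will certainly differ from mine, but the code structure will not need major surgery.
A quick word on this structure: one intellectual-property record contains n attached documents (annex) and n applicants or inventors (applicant),
so both use "type": "nested". If you do not know how that differs from "type": "object", please look it up; I will not go into it here.
For tuning, installation, data migration and cold backup, see my other article (there is a lot of material, so not all of it is written up): https://blog.csdn.net/UnicornRe/article/details/121747039?spm=1001.2014.3001.5501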

PUT /intellectual
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 1
  }
}

PUT /intellectual/_mapping
{
  "properties": {
    "id": { "type": "long" },
    "name": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    },
    "type": { "type": "keyword" },
    "keycode": {
      "type": "text",
      "analyzer": "ik_max_word",
      "search_analyzer": "ik_smart"
    },
    "officeId": { "type": "keyword" },
    "officeName": { "type": "keyword" },
    "titular": { "type": "keyword" },
    "applyTime": { "type": "long" },
    "endTime": { "type": "long" },
    "status": { "type": "keyword" },
    "agentName": {
      "type": "text",
      "analyzer": "ik_smart",
      "search_analyzer": "ik_smart"
    },
    "annex": {
      "type": "nested",
      "properties": {
        "id": { "type": "long" },
        "name": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        },
        "content": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word"
        },
        "createTime": { "type": "long" }
      }
    },
    "applicant": {
      "type": "nested",
      "properties": {
        "id": { "type": "long" },
        "applicantId": { "type": "long" },
        "isOffice": { "type": "integer" },
        "userName": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        },
        "outUsername": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}

CRUD on ordinary (non-nested) fields

Leave the "type": "nested" objects aside for now and work only with the ordinary fields.
I first define an entity class IntellectualEntity whose fields match the mapping above.
Every operation uses an injected RestHighLevelClient restHighLevelClient.

Insert

public void insertIntel(IntellectualEntity intellectualEntity) throws IOException {
        // "intellectual" is the index name
        IndexRequest indexRequest = new IndexRequest("intellectual")
        .source(JSON.toJSONString(intellectualEntity), XContentType.JSON)
        .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
        .id(intellectualEntity.getId() + ""); // set the ES document id explicitly
        IndexResponse out = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        log.info("Status: {}", out.status());
}

Update (by id)

Only the non-null fields of the entity are updated, like the default update in MyBatis-Plus.
Because a document id is unique within the index, this method updates at most one document.

public void updateIntel(IntellectualEntity entity) throws IOException {
        // Update the document whose id equals the id of the IntellectualEntity
        UpdateRequest updateRequest = new UpdateRequest("intellectual", entity.getId() + "");
        byte[] json = JSON.toJSONBytes(entity);
        updateRequest.doc(json, XContentType.JSON);
        UpdateResponse response = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
        log.info("Status: {}", response.status());
}
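Why only the non-null fields change: the update request sends a partial document, which Elasticsearch merges into the stored one. The sketch below is a simplified in-memory model of that merge for flat fields, assuming the JSON serializer leaves out null fields (fastjson's default behaviour). PartialUpdateModel is a hypothetical name; the code uses only JDK maps and does not talk to Elasticsearch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model for illustration only: flat fields, no nested objects.
class PartialUpdateModel {
    /** Returns a copy of stored with every non-null entry of partial written over it. */
    static Map<String, Object> merge(Map<String, Object> stored, Map<String, Object> partial) {
        Map<String, Object> merged = new LinkedHashMap<>(stored);
        for (Map.Entry<String, Object> entry : partial.entrySet()) {
            if (entry.getValue() != null) {
                // Null values model fields the serializer would have omitted.
                merged.put(entry.getKey(), entry.getValue());
            }
        }
        return merged;
    }
}
```

With stored = {name: "old", status: "1"} and partial = {name: "new", status: null}, the result keeps status "1" and changes only name.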

Update (advanced: update by query, using a Painless script)

Painless scripts suit many cases with complex business logic, for example the one below, which sets document fields from the values in a params map.

private void updateByQuery(IntellectualEntity entity) throws IOException {
        UpdateByQueryRequest updateByQueryRequest = new UpdateByQueryRequest();
        updateByQueryRequest.indices("intellectual");
        // The query matches on id (the document id was set to the entity id on insert, so the match is unique)
        // Use with care if the query can match many documents
        updateByQueryRequest.setQuery(new TermQueryBuilder("id", entity.getId()));
        // The map holds the script parameters
        Map<String,Object> map = new HashMap<>();
        map.put("intelName", entity.getName());
        map.put("intelStatus", entity.getStatus());
        map.put("intelApplyTime", entity.getApplyTime());
        map.put("intelKeyCode", entity.getKeycode());
        map.put("intelEndTime", entity.getEndTime());
        map.put("intelType", entity.getType());
        map.put("intelTitular", entity.getTitular());
        // Fields to update: ctx._source.xxx is the field in the ES document (named as in the mapping above),
        // params.xxx is the value taken from the map
        updateByQueryRequest.setScript(new Script(ScriptType.INLINE,
                "painless",
                "ctx._source.name=params.intelName;" +
                        "ctx._source.status=params.intelStatus;" +
                        "ctx._source.applyTime=params.intelApplyTime;" +
                        "ctx._source.keycode=params.intelKeyCode;" +
                        "ctx._source.endTime=params.intelEndTime;" +
                        "ctx._source.type=params.intelType;" +
                        "ctx._source.titular=params.intelTitular;"
                , map));
        BulkByScrollResponse bulkByScrollResponse = restHighLevelClient.updateByQuery(updateByQueryRequest, RequestOptions.DEFAULT);
        log.info("Status: {}", bulkByScrollResponse.getStatus());
}

Delete

public void deleteIntel(IntellectualEntity entity) throws IOException {
        DeleteRequest deleteRequest = new DeleteRequest("intellectual", entity.getId() + "");
        DeleteResponse deleteResponse = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
        log.info("Status: {}", deleteResponse.status());
}

Delete (by query)

This works like update by query: take the delete operation and replace DeleteRequest with DeleteByQueryRequest. I trust you have already worked it out.

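Search with highlighting (plain highlighting, space-separated multi-term search)

This code does not yet cover highlighting inside nested fields.
When combining conditions, should = OR and must = AND.
Steps: set up the highlight builder -> run the search -> replace the plain values with the highlighted ones -> return the result.
Let's write the highlight builder first.
Highlight builder:

private static HighlightBuilder highlightBuilder;
    static {
        highlightBuilder = new HighlightBuilder();
        highlightBuilder.numOfFragments(0); // 0 = highlight and return the whole field value instead of fragments
        highlightBuilder.preTags("<font color='#e75213'>"); // custom highlight tags
        highlightBuilder.postTags("</font>");
        highlightBuilder.highlighterType("unified"); // highlighter type
        highlightBuilder
                .field("name") // fields to highlight
                .field("keycode")
        ;
        highlightBuilder.requireFieldMatch(false);
    }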
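The third step listed earlier, replacing the plain values with the highlighted ones, can be sketched with plain JDK collections. This is a model under stated assumptions, not the author's implementation: the source document is represented as a Map, the highlight fragments as a Map from field name to a list of strings, and HighlightMerge is a hypothetical name. With the real client these two maps would come from each search hit.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; uses JDK collections instead of client types.
class HighlightMerge {
    /** Returns a copy of source in which highlighted fields carry their highlighted text. */
    static Map<String, Object> applyHighlights(Map<String, Object> source,
                                               Map<String, List<String>> highlights) {
        Map<String, Object> result = new LinkedHashMap<>(source);
        for (Map.Entry<String, List<String>> entry : highlights.entrySet()) {
            List<String> fragments = entry.getValue();
            boolean hasFragments = fragments != null && !fragments.isEmpty();
            if (hasFragments && result.containsKey(entry.getKey())) {
                // Join the fragments; when fragmenting is disabled there is a single one.
                result.put(entry.getKey(), String.join("", fragments));
            }
        }
        return result;
    }
}
```

Fields without a highlight entry keep their original value, so the result can be returned to the caller as one map per hit.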

Search steps:

public List<Map<String,Object>> queryByContent(String content, Integer pageCurrent, Date startTimeApply, Date endTimeApply, Date startTimeEnd, Date endTimeEnd) throws IOException {
        // Split on whitespace: several search terms can be given separated by spaces, and they are combined with AND
        String[] manyStr = content.split("\\s+");
        // The list of maps returned as the result
        List<Map<String,Object>> list = new LinkedList<>();
        // Build the query first
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        if (manyStr.length > 1) {
                for (int i = 0; i < manyStr.length; i++) {
                    BoolQueryBuilder innerBoolQueryBuilder = QueryBuilders.boolQuery();
                    // nestedQuery: condition on the nested annex documents
                    innerBoolQueryBuilder.should(QueryBuilders.nestedQuery("annex", QueryBuilders.matchQuery("annex.content", manyStr[i]), ScoreMode.Max).boost(2));
                    innerBoolQueryBuilder
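The whitespace split at the top of queryByContent and the should = OR, must = AND rule can be tried out without a cluster. The sketch below is an in-memory model, not an Elasticsearch query: MultiTermSearchModel is a hypothetical name, and substring containment stands in for a match query. It also shows one detail worth handling: String.split keeps an empty first element when the input starts with whitespace, so the input is trimmed first.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// In-memory model for illustration only; it does not build Elasticsearch queries.
class MultiTermSearchModel {
    /** Splits user input on runs of whitespace, ignoring leading and trailing blanks. */
    static List<String> splitTerms(String content) {
        String trimmed = content == null ? "" : content.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    /** Every term must match (AND); a term matches if any field contains it (OR). */
    static boolean matches(Map<String, String> doc, List<String> terms) {
        for (String term : terms) {
            boolean anyField = false;
            for (String value : doc.values()) {
                if (value != null && value.contains(term)) {
                    anyField = true;
                    break;
                }
            }
            if (!anyField) {
                return false;
            }
        }
        return true;
    }
}
```

splitTerms("  solar   panel ") yields [solar, panel]. A document matches only if each of those terms is found in at least one of its fields.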

   
 (c)2006-2024 SYSTEM All Rights Reserved  IT常识