Using Elasticsearch reindex


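When upgrading within the same major version (e.g. 6.1.x -> 6.8.x or 7.1.x -> 7.8.x), indices stay read/write compatible and do not need to be rebuilt.
When upgrading across major versions (e.g. 6.1.x -> 7.1.x), indices are not guaranteed to be read/write compatible and may have to be rebuilt (Elasticsearch can read indices created in the previous major version, but not older ones).

Cluster migration: the index service stays online while the data is migrated ahead of time.

Changing the number of shards, either increasing or decreasing it.

Changing field types or field attributes, or changing the document object structure.

An index that is updated frequently and has accumulated a lot of fragmented garbage.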

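Reindex copies documents into a new index and leaves the original index in place. "_source" must be enabled on the original index.
URL parameters:
refresh: whether to refresh the destination index once the operation finishes
wait_for_active_shards: how many shard copies must be active before the operation proceeds
scroll: how long the scroll search context (the query snapshot) is kept alive
slices: number of parallel slices the task is split into (recommended to match the number of shards)
max_docs: maximum number of documents to process
requests_per_second: throttle on the reindex rate; the default is -1 (no throttling). For reindexing in production, a value of 500-1000 is suggested to limit the speed and avoid a sudden spike in cluster I/O.
Request body parameters:
conflicts: how version conflicts are handled: abort (the default) or proceed (continue and only count the conflicts)
source: configuration of the source index
dest: configuration of the destination index
script: a script that transforms documents from the source index as they are written to the destination index
routing: route documents to a specific shard
Multi index: reindex from several source indices at once
Source field: restrict which fields are reindexed
Field rename: rename index fields
remote: reindex from a remote cluster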
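
The parameters above can be combined roughly as in the following Python sketch. It only builds the URL path and the JSON body for a POST _reindex call and sends nothing. The helper name build_reindex_request and the index names old_index and new_index are made up for illustration; the field names are borrowed from the mappings used elsewhere in this post.

```python
import json
from urllib.parse import urlencode


def build_reindex_request(source_index, dest_index, *, slices=None,
                          requests_per_second=None, conflicts="abort",
                          source_fields=None, script=None):
    """Return (path, body) for a POST _reindex call; nothing is sent."""
    # URL parameters: run in the background, optionally slice and throttle
    params = {"wait_for_completion": "false"}
    if slices is not None:
        params["slices"] = slices
    if requests_per_second is not None:
        params["requests_per_second"] = requests_per_second

    # Request body: source, dest, conflict handling, optional script
    source = {"index": source_index}
    if source_fields:
        source["_source"] = list(source_fields)  # restrict copied fields
    body = {
        "conflicts": conflicts,
        "source": source,
        "dest": {"index": dest_index},
    }
    if script:
        body["script"] = {"lang": "painless", "source": script}
    return "/_reindex?" + urlencode(params), body


path, body = build_reindex_request(
    "old_index", "new_index",  # hypothetical index names
    slices=12, requests_per_second=500, conflicts="proceed",
    source_fields=["servCode", "callTime"],
    # rename the field callTime to call_time while copying
    script="ctx._source.call_time = ctx._source.remove('callTime')",
)
print("POST", path)
print(json.dumps(body, indent=2))
```

Keeping the builder free of network calls makes it easy to inspect the request before pointing it at a real cluster.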

Some common Elasticsearch operations


Cluster setup (version 7.4.1) and configuration

Three machines form one cluster: a, b, and c
a:
Edit a's config/elasticsearch.yml; after the changes it looks like this

# ======================== Elasticsearch Configuration =========================
#
# NOTE: Elasticsearch comes with reasonable defaults for most settings.
#       Before you set out to tweak and tune the configuration, make sure you
#       understand what are you trying to accomplish and the consequences.
#
# The primary way of configuring a node is via this file. This template lists
# the most important settings you may want to configure for a production cluster.
#
# Please consult the documentation for further information on configuration options:
# https://www.elastic.co/guide/en/elasticsearch/reference/index.html
#
# ---------------------------------- Cluster -----------------------------------
#
# Use a descriptive name for your cluster:
# Cluster name
cluster.name: my-application
#
# ------------------------------------ Node ------------------------------------
#
# Use a descriptive name for the node:
# Mark this node as master-eligible
node.master: true
# Node name
node.name: node-1
#
#discovery.zen.minimum_master_nodes: 3
# Add custom attributes to the node:
#
#node.attr.rack: r1
#
# ----------------------------------- Paths ------------------------------------
#
# Path to directory where to store the data (separate multiple locations by comma):
# Elasticsearch data directory; create it manually and grant permissions
path.data: /opt/soft/data
#
# Path to log files:
#
# Elasticsearch log directory; create it manually and grant permissions
path.logs: /opt/soft/log
#
# ----------------------------------- Memory -----------------------------------
#
# Lock the memory on startup:
#
#bootstrap.memory_lock: true
#
# Make sure that the heap size is set to about half the memory available
# on the system and that the owner of the process is allowed to use this
# limit.
#
# Elasticsearch performs poorly when the system is swapping the memory.
#
# ---------------------------------- Network -----------------------------------
#
# Set the bind address to a specific IP (IPv4 or IPv6):
# Bind to all of this machine's IP addresses
network.host: 0.0.0.0
#
# Set a custom port for HTTP:
# Port for external (HTTP) access
http.port: 9200
#
# For more information, consult the network module documentation.
#
# --------------------------------- Discovery ----------------------------------
#
# Pass an initial list of hosts to perform discovery when this node is started:
# The default list of hosts is ["127.0.0.1", "[::1]"]
# All nodes in the cluster; 9300 is the port used for communication between cluster nodes
discovery.seed_hosts:  ["10.209.5.87:9300","10.209.5.88:9300","10.209.5.89:9300"]
#discovery.zen.ping.unicast.hosts: ["10.209.5.79","10.209.5.80","10.209.5.78"]
#
# Bootstrap the cluster using an initial set of master-eligible nodes:
# Master-eligible nodes used to bootstrap the cluster on first startup
cluster.initial_master_nodes: ["node-1"]
#
# For more information, consult the discovery and cluster formation module documentation.
#
# ---------------------------------- Gateway -----------------------------------
#
# Block initial recovery after a full cluster restart until N nodes are started:
#
#gateway.recover_after_nodes: 3
#
# For more information, consult the gateway module documentation.
#
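# ---------------------------------- Various -----------------------------------
#
# Require explicit names when deleting indices:
#
#action.destructive_requires_name: true
# These two lines enable cross-origin (CORS) requests
http.cors.enabled: true
http.cors.allow-origin: "*"
# Whitelist of remote hosts needed when reindex is used to sync or migrate data; without it,
# remote reindex cannot be used. It means this node may pull data from the whitelisted nodes
# below, which are usually nodes of other clusters.
reindex.remote.whitelist: ["10.209.5.84:9200","10.209.5.78:9200","10.209.1.48:9200","10.209.1.35:5200","10.47.187.45:5200","10.47.195.38:5200"]
# Directories where cold-archived data is stored; the archived folders can be compressed and
# moved to other machines. Create the directories manually and grant permissions.
path.repo: ["/opt/soft/es_backups/backups", "/opt/soft/es_backups/longterm_backups"]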

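b:
In machine b's elasticsearch.yml everything else stays the same; change only:
# commented out (note: node.master defaults to true in 7.x, so this node stays master-eligible)
#node.master: true
# Node name
node.name: node-2

c:
In machine c's elasticsearch.yml everything else stays the same; change only:
# commented out (note: node.master defaults to true in 7.x, so this node stays master-eligible)
#node.master: true
# Node name
node.name: node-3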

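Set the heap size on every machine (example: a machine with 64 GB of RAM)
Edit config/jvm.options; the heap must not exceed 31 GB and should preferably stay at or below 50% of the machine's total memory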
-Xms30g
-Xmx30g
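
The sizing rule above can be written down as a small Python helper. This is only a sketch of the "at most half of RAM, never above 31 GB" rule as stated here; the function names are made up. For a 64 GB machine it yields the 31 GB ceiling, and the 30g used in this post simply stays a little below that ceiling.

```python
def recommended_heap_gb(total_ram_gb, cap_gb=31):
    """Heap size in GB: at most half of RAM, never above cap_gb."""
    return min(cap_gb, total_ram_gb // 2)


def jvm_heap_flags(total_ram_gb):
    """Return the two jvm.options lines for the computed heap size."""
    heap = recommended_heap_gb(total_ram_gb)
    # -Xms and -Xmx get the same value so the heap is not resized at runtime
    return [f"-Xms{heap}g", f"-Xmx{heap}g"]


print(jvm_heap_flags(64))
print(jvm_heap_flags(16))
```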

Optionally install the ik analyzer plugin
The version must be specified (here 7.4.1, matching the cluster version)

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.4.1/elasticsearch-analysis-ik-7.4.1.zip

Linux tuning

Disable swap so that memory swapping does not degrade performance

swapoff -a

vim /etc/security/limits.conf

# append at the end of the file
* soft nofile 65535
* hard nofile 131072
* soft nproc 4096
* hard nproc 4096

vim /etc/sysctl.conf

vm.max_map_count=262145 

# reload the configuration

sysctl -p 	

Elasticsearch refuses to start as root
# add a user

useradd esuser 

# switch to that user

su esuser 

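Start command:
Be sure to check that the firewall allows ports 9200 and 9300
Run the command from the directory where Elasticsearch was unpacked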

./bin/elasticsearch -d

Creating and tuning indices and mappings

Create index es_persist_3

URL:

PUT http://ip:port/es_persist_3

JSON body:

{
  "settings": {
    "number_of_shards": "12",
    "number_of_replicas": "1",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "60s",
    "index.translog.flush_threshold_size": "1024mb"
  }
}

Create the mapping for es_persist_3

URL:

POST http://ip:port/es_persist_3/_mapping

JSON body:

{
  "properties": {
    "servCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "httpMethod": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "type": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "servVersionProxyType": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "exceptionStack": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "exceptionTime": {"type": "date"},
    "@version": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "host": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "pAppName": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "id": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "receiveSize": {"type": "long"},
    "authType": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "externalTime": {"type": "long"},
    "cAppName": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "returnSize": {"type": "long"},
    "authCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "statusDesc": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "platformTime": {"type": "long"},
    "servName": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "componentPort": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "esbId": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "responseSize": {"type": "long"},
    "message": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "logTime": {"type": "date"},
    "tags": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "receiveTime": {"type": "long"},
    "@timestamp": {"type": "date"},
    "messageList": {
      "properties": {
        "sizeX": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
        "serialNumber": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
        "header": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
        "time": {"type": "long"},
        "body": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
        "type": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
        "url": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}}
      }
    },
    "componentHost": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "cAppCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "fromIp": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "complete": {"type": "boolean"},
    "requestSize": {"type": "long"},
    "logtime": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "callTime": {"type": "long"},
    "pAppCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "statusCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}}
  }
}
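
Because almost every string field in this mapping repeats the same text-plus-keyword definition, the body can also be generated instead of typed by hand. The following Python sketch shows one way to do that; the helper names text_with_keyword and build_properties are invented for illustration, and only a handful of the fields are listed to keep it short.

```python
import json


def text_with_keyword(ignore_above=256):
    """A text field with a keyword sub-field, as used throughout the mapping."""
    return {
        "type": "text",
        "fields": {"keyword": {"ignore_above": ignore_above, "type": "keyword"}},
    }


def build_properties(text_fields=(), long_fields=(), date_fields=(),
                     boolean_fields=()):
    """Build the "properties" object from plain lists of field names."""
    props = {}
    for name in text_fields:
        props[name] = text_with_keyword()
    typed_groups = (("long", long_fields), ("date", date_fields),
                    ("boolean", boolean_fields))
    for type_name, names in typed_groups:
        for name in names:
            props[name] = {"type": type_name}
    return props


mapping = {"properties": build_properties(
    text_fields=["servCode", "httpMethod", "statusCode"],
    long_fields=["callTime", "receiveTime"],
    date_fields=["logTime", "@timestamp"],
    boolean_fields=["complete"],
)}
print(json.dumps(mapping, indent=2))
```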

Create index es_persist_4

URL:

PUT http://ip:port/es_persist_4

JSON body:

{
  "settings": {
    "number_of_shards": "2",
    "number_of_replicas": "1",
    "index.translog.durability": "async",
    "index.translog.sync_interval": "30s",
    "index.translog.flush_threshold_size": "248mb"
  }
}

Create the mapping for es_persist_4

URL:

POST http://ip:port/es_persist_4/_mapping

JSON body:

{
  "properties": {
    "servCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "componentHost": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "exceptionCount": {"type": "long"},
    "sumCallTime": {"type": "long"},
    "maxCallTime": {"type": "long"},
    "cAppCode": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}},
    "minCallTime": {"type": "long"},
    "startTime": {"type": "long"},
    "endTime": {"type": "long"},
    "sumFlowSize": {"type": "long"},
    "totalCount": {"type": "long"},
    "servVersionProxyType": {"type": "text", "fields": {"keyword": {"ignore_above": 256, "type": "keyword"}}}
  }
}

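Common Elasticsearch commands

Delete an index; all of the index's data is physically removed
URL:

DELETE http://ip:port/<index_name>

Close an index; it still occupies disk space, but no read/write I/O is possible while it is closed
URL:

POST http://ip:port/<index_name>/_close

Open an index; it occupies disk space and, once open, supports read/write I/O and normal use
URL:

POST http://ip:port/<index_name>/_open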
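
The three calls above differ only in HTTP method and path, so a script can prepare them from one small table. This Python sketch builds urllib Request objects without sending them; the base URL http://localhost:9200 and the helper name index_request are assumptions made for the example.

```python
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # assumed address, replace with ip:port


def index_request(action, index_name, base_url=BASE_URL):
    """Prepare (but do not send) a delete, close or open request."""
    routes = {
        "delete": ("DELETE", f"/{index_name}"),
        "close": ("POST", f"/{index_name}/_close"),
        "open": ("POST", f"/{index_name}/_open"),
    }
    method, path = routes[action]
    return Request(base_url + path, method=method)


for action in ("close", "open", "delete"):
    req = index_request(action, "es_persist_3")
    print(req.get_method(), req.full_url)
```

Passing a prepared request to urllib.request.urlopen would actually execute it, so the delete in particular should only be sent deliberately.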

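Cross-cluster data migration

Migrating with reindex

Cluster b sends the request and pulls cluster a's data into cluster b (cluster b's configuration file must whitelist cluster a; see the configuration file in the cluster setup section).
query selects the data you want; the example below fetches the data for one month, and removing it copies all data.
"version_type": "internal" means that documents with a conflicting (identical) id are overwritten.
size is the batch size; too large a value may cause errors, too small a value makes the run slow.
wait_for_completion=false runs the operation asynchronously in the background.

POST http://bip:bport/_reindex?wait_for_completion=false

{
  "source": {
    "index": "<index_on_a>",
    "remote": {
      "host": "http://aip:aport"
    },
    "size": 1000,
    "query": {
      "range": {
        "receiveTime": {
          "gte": 1635696000000,
          "lt": 1638287999000
        }
      }
    }
  },
  "dest": {
    "index": "<index_on_b>",
    "version_type": "internal"
  }
}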
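
The month boundaries in the range query are epoch milliseconds. The Python sketch below shows one way to compute them, assuming the timestamps are meant in UTC+8 (an assumption; under it, 1635696000000 is 2021-11-01 00:00:00). It returns a half-open range ending at the first instant of the next month, which differs slightly from the upper bound 1638287999000 above: combined with lt, that value leaves out the final second of November. The helper names, the host and the index names are placeholders.

```python
from datetime import datetime, timedelta, timezone


def month_range_ms(year, month, utc_offset_hours=8):
    """Return (start_ms, end_ms) for a month as a half-open range."""
    tz = timezone(timedelta(hours=utc_offset_hours))  # assumed time zone
    start = datetime(year, month, 1, tzinfo=tz)
    if month == 12:
        end = datetime(year + 1, 1, 1, tzinfo=tz)
    else:
        end = datetime(year, month + 1, 1, tzinfo=tz)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000)


def remote_reindex_body(remote_host, source_index, dest_index, year, month):
    """Build the request body for pulling one month of data from a remote cluster."""
    start_ms, end_ms = month_range_ms(year, month)
    return {
        "source": {
            "index": source_index,
            "remote": {"host": remote_host},
            "size": 1000,  # batch size
            "query": {"range": {"receiveTime": {"gte": start_ms, "lt": end_ms}}},
        },
        "dest": {"index": dest_index, "version_type": "internal"},
    }


print(month_range_ms(2021, 11))
body = remote_reindex_body("http://aip:9200", "index_on_a", "index_on_b", 2021, 11)
print(body["source"]["query"])
```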

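Cancelling a reindex

If a reindex has not finished and you no longer want it to run, cancel it. Data that was already copied successfully is kept, and the remaining work is not continued.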

POST _tasks/node_id:task_id/_cancel

Checking reindex progress (shows node_id:task_id, the number of tasks, and so on)

GET _tasks
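
The task list is returned as JSON grouped by node. The Python sketch below estimates progress for reindex tasks by comparing created + updated + deleted against total. The response shape and field names reflect my understanding of the tasks API and should be checked against a real response; the sample dictionary is hand-written for illustration and not real cluster output, and the helper name reindex_progress is made up.

```python
def reindex_progress(tasks_response):
    """Return {task_id: percent_done} for reindex tasks in a _tasks response."""
    progress = {}
    for node in tasks_response.get("nodes", {}).values():
        for task_id, task in node.get("tasks", {}).items():
            if "reindex" not in task.get("action", ""):
                continue  # skip tasks that are not reindex tasks
            status = task.get("status", {})
            total = status.get("total", 0)
            done = sum(status.get(key, 0)
                       for key in ("created", "updated", "deleted"))
            progress[task_id] = 100.0 * done / total if total else 0.0
    return progress


# Hand-written sample shaped like a _tasks response (illustrative only)
sample = {
    "nodes": {
        "node_a": {
            "tasks": {
                "node_a:42": {
                    "action": "indices:data/write/reindex",
                    "status": {"total": 1000, "created": 200,
                               "updated": 50, "deleted": 0},
                },
                "node_a:43": {"action": "cluster:monitor/tasks/lists"},
            }
        }
    }
}
print(reindex_progress(sample))
```

The task ids returned here are in the node_id:task_id form that the cancel call expects.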
