Deleting Expired Data in Elasticsearch

Posted by cuishuai

I. Overview

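When Elasticsearch is used to collect and process logs, very old data eventually becomes useless or of little value, and at that point the expired data needs to be cleaned up. This post describes two ways to do that.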

1. curator

About versions:

[Image: not preserved]

Installation:

https://www.elastic.co/guide/en/elasticsearch/client/curator/current/installation.html

I am on Ubuntu, so I followed https://www.elastic.co/guide/en/elasticsearch/client/curator/current/apt-repository.html

wget -qO - https://packages.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -

vim  /etc/apt/sources.list.d/curator.list
deb [arch=amd64] https://packages.elastic.co/curator/5/debian stable main

sudo apt-get update && sudo apt-get install elasticsearch-curator

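I am using elasticsearch-6.5.1, so I installed curator 5.

After installation two commands are available: curator and curator_cli. For now we only use curator.

Two configuration files need to be created: a config file and an action file.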

mkdir  /etc/curator

config:

# cat config_file.yml
client:
  hosts:
    - 127.0.0.1
  port: 9200
  url_prefix:
  use_ssl: False
  certificate:
  client_cert:
  client_key:
  ssl_no_validate: False
  http_auth:
  timeout:
  master_only: true
logging:
  loglevel: INFO
  logfile: "/data/curator/action.log"
  logformat: default

action:

# cat action_file.yml
---
actions:
  1:
    action: delete_indices
    description: >-
      Delete indices older than 15 days (based on index name), for fluentd-k8s-
      prefixed indices. Ignore the error if the filter does not result in an
      actionable list of indices (ignore_empty_list) and exit cleanly.
    options:
      ignore_empty_list: True
      timeout_override:
      continue_if_exception: False
      disable_action: False
    filters:
    - filtertype: pattern
      kind: prefix
      value: fluentd-k8s-
      exclude:
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 15
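      exclude:
    # Protect specific indices: a filter with exclude: True removes its
    # matches from the list of indices to act on.
    - filtertype: pattern
      kind: regex
      value: '^fluentd-k8s-2018\.11\.(22|23)$'
      exclude: True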
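To make the age filter a bit more concrete, here is a rough bash sketch of the idea behind `source: name`: the date is read from the index name itself, not from the index creation time. This is only an illustration, not Curator's own implementation; the function names `index_age_days` and `is_expired` are made up for this example, it assumes GNU `date`, and the exact boundary handling in Curator may differ.

```shell
#!/bin/bash
# Illustrative sketch only; function names are hypothetical and this is not
# how Curator is implemented internally. Requires GNU date.

# index_age_days NAME [TODAY]
# Prints the age in whole days of an index whose name ends in YYYY.MM.DD.
# TODAY (YYYY-MM-DD) defaults to the current UTC date; it is a parameter so
# the function can be checked against fixed dates.
index_age_days() {
  local name=$1
  local today=${2:-$(date --utc +%Y-%m-%d)}
  local suffix=${name: -10}     # e.g. 2018.11.22
  local iso=${suffix//./-}      # e.g. 2018-11-22
  local then_s now_s
  then_s=$(date --utc --date "$iso" +%s) || return 1
  now_s=$(date --utc --date "$today" +%s) || return 1
  echo $(( (now_s - then_s) / 86400 ))
}

# is_expired NAME KEEP_DAYS [TODAY]
# Succeeds when the index is more than KEEP_DAYS days old.
is_expired() {
  local age
  age=$(index_age_days "$1" "${3:-}") || return 2
  (( age > $2 ))
}
```

For example, `is_expired fluentd-k8s-2018.11.22 15 2018-12-10` succeeds, because that index is 18 days old on 2018-12-10.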

 

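Here I keep 15 days of data. Important historical data is first persisted to HDFS and only then deleted. Choose the retention period based on your server's disk capacity and how important the logs are. If there is important data you don't want deleted, such as the Double 11 (Singles' Day) data, you can protect those indices with an exclude filter, or take a snapshot backup. After that, set up a scheduled job to perform the deletion.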

crontab -e
# Run once a day at 01:00
0 1 * * *  curator --config /etc/curator/config_file.yml  /etc/curator/action_file.yml
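Before handing the job to cron, it is probably worth running Curator once with its `--dry-run` flag, which only logs what would be done and changes nothing. Below is a small wrapper sketch for that. The paths are the ones used above; the function name `run_curator` and the `CURATOR_BIN` override are my own additions for illustration, not part of Curator.

```shell
#!/bin/bash
# Sketch of a wrapper around the curator command. CURATOR_BIN, CONFIG and
# ACTION are hypothetical overrides introduced for this example.
CURATOR_BIN=${CURATOR_BIN:-curator}
CONFIG=${CONFIG:-/etc/curator/config_file.yml}
ACTION=${ACTION:-/etc/curator/action_file.yml}

# run_curator [--dry-run]
# With --dry-run, Curator logs the indices it would act on and deletes nothing.
run_curator() {
  local args=(--config "$CONFIG")
  if [ "${1:-}" = "--dry-run" ]; then
    args+=(--dry-run)
  fi
  "$CURATOR_BIN" "${args[@]}" "$ACTION"
}
```

Call `run_curator --dry-run` first and check the log file configured in config_file.yml; once the list of indices looks right, `run_curator` without arguments performs the real deletion.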

 

2. Deleting with a script

 

# cat es-dele-indices.sh
#!/bin/bash
#delete elasticsearch indices
searchIndex=fluentd-k8s
elastic_url=127.0.0.1
elastic_port=9200

date2stamp(){
  date --utc --date "$1" +%s
}

dateDiff(){
  case $1 in
    -s)  sec=1;     shift;;
    -m)  sec=60;    shift;;
    -h)  sec=3600;  shift;;
    -d)  sec=86400; shift;;
     *)  sec=86400;;
  esac
  dte1=$(date2stamp "$1")
  dte2=$(date2stamp "$2")
  diffSec=$((dte2-dte1))
  if ((diffSec < 0)); then abs=-1; else abs=1; fi
  echo $((diffSec/sec*abs))
}

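for index in $(curl -s "${elastic_url}:${elastic_port}/_cat/indices?v" | grep -E " ${searchIndex}-20[0-9][0-9]\.[0-1][0-9]\.[0-3][0-9]" | awk '{ print $3 }'); do
  # The last 10 characters of the name are the date, e.g. 2018.11.22 -> 2018-11-22
  date=$(echo "${index: -10}" | sed 's/\./-/g')
  cond=$(date +%Y-%m-%d)
  diff=$(dateDiff -d "$date" "$cond")
  echo -n "${index} (${diff})"
  # Delete indices more than 1 day old; raise this threshold to keep more days
  if [ "$diff" -gt 1 ]; then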
    #echo "/ DELETE"
    curl -XDELETE "${elastic_url}:${elastic_port}/${index}?pretty"
  else
    echo ""
  fi
done
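Since the script above deletes immediately, a dry-run style helper that only prints the candidates can be handy when testing. The sketch below is one possible way to do it, not part of the original script: it reads index names on stdin and relies on the fact that zero-padded YYYY.MM.DD strings sort in date order, so a plain string comparison against a cutoff is enough. The function name `select_expired` and its parameters are made up for this example, and GNU `date` is assumed.

```shell
#!/bin/bash
# Sketch: list expired indices without deleting anything.
# The function name and parameters are hypothetical. Requires GNU date.

# select_expired PREFIX KEEP_DAYS [TODAY]
# Reads index names on stdin and prints those named PREFIX-YYYY.MM.DD whose
# date is more than KEEP_DAYS days before TODAY (YYYY-MM-DD, default: today).
select_expired() {
  local prefix=$1 keep_days=$2
  local today=${3:-$(date --utc +%Y-%m-%d)}
  local today_s cutoff name
  today_s=$(date --utc --date "$today" +%s) || return 1
  # Oldest date that is still kept, in the same format as the index suffix.
  cutoff=$(date --utc --date "@$(( today_s - keep_days * 86400 ))" +%Y.%m.%d) || return 1
  while IFS= read -r name; do
    if [[ $name =~ ^"$prefix"-([0-9]{4}\.[0-9]{2}\.[0-9]{2})$ ]]; then
      # String comparison works because the date fields are zero-padded.
      if [[ ${BASH_REMATCH[1]} < $cutoff ]]; then
        echo "$name"
      fi
    fi
  done
}
```

It can be fed from the same pipeline the script uses, for example `curl -s "127.0.0.1:9200/_cat/indices?v" | awk '{ print $3 }' | select_expired fluentd-k8s 15`, and the printed names can then be reviewed before any DELETE request is sent.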

 
