Installing the Chinese tokenizer plugin for Elasticsearch (elasticsearch-analysis-ik)

Posted by jiqing9006


Download the release that matches your Elasticsearch version (6.3.0 in this walkthrough):
https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip

In the Elasticsearch plugins directory, create an ik directory:

cd /usr/local/elasticsearch-6.3.0/plugins
mkdir ik

Extract the archive and place its contents in that directory.
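
The manual steps above can be sketched as a small shell script. This is only a sketch under assumptions: ES_HOME points at the install path used above, wget and unzip are available, and the helper names ik_url and install_ik are made up for illustration.

```shell
#!/bin/sh
# Assumed install location, taken from the path used above.
ES_HOME="${ES_HOME:-/usr/local/elasticsearch-6.3.0}"

# Build the release URL for a given plugin version (hypothetical helper).
ik_url() {
    printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}

# Download the archive and extract it into plugins/ik (hypothetical helper).
install_ik() {
    version="$1"
    target="$ES_HOME/plugins/ik"
    zip_file="${TMPDIR:-/tmp}/elasticsearch-analysis-ik-$version.zip"
    mkdir -p "$target" &&
    wget -O "$zip_file" "$(ik_url "$version")" &&
    unzip -o "$zip_file" -d "$target"
}
```

Calling install_ik 6.3.0 would then perform the download and extraction. Afterwards it is worth checking that plugin-descriptor.properties sits directly inside plugins/ik and not in a nested folder.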

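Restart the Elasticsearch service so the plugin is loaded (the exact command depends on how Elasticsearch is installed):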

elasticsearch restart

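The IK analyzers are now available. A field only uses them if its mapping names an IK analyzer (ik_smart or ik_max_word), and documents indexed earlier have to be inserted again so that they are re-tokenized.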
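
For the about field to be tokenized by IK, its mapping has to name an IK analyzer. A minimal sketch, assuming Elasticsearch 6.x listens on localhost:9200 and that the megacorp index does not exist yet (the analyzer of an existing field cannot be changed in place, so an existing index would have to be recreated). Using ik_max_word for indexing and ik_smart for searching is one common choice, not something the original post specifies, and the function names are made up for illustration.

```shell
#!/bin/sh
# Assumed address of the local node.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print a 6.x-style mapping body that analyzes "about" with IK.
ik_mapping_body() {
    cat <<'EOF'
{
  "mappings": {
    "employee": {
      "properties": {
        "about": {
          "type": "text",
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_smart"
        }
      }
    }
  }
}
EOF
}

# Create the index with that mapping (hypothetical helper).
create_megacorp_index() {
    ik_mapping_body | curl -s -X PUT "$ES_URL/megacorp" \
        -H 'Content-Type: application/json' \
        --data-binary @-
}
```

Running create_megacorp_index before inserting the employee documents would make the match query below operate on IK tokens.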

GET /megacorp/employee/_search
{
    "query" : {
        "match" : {
            "about" : "程序员 编程"
        }
    }
}

Search result:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.654172,
    "hits": [
      {
        "_index": "megacorp",
        "_type": "employee",
        "_id": "2",
        "_score": 1.654172,
        "_source": {
          "first_name": "张",
          "last_name": "三",
          "age": 24,
          "about": "一个php程序员,热爱编程,热爱生活,充满激情。",
          "interests": [
            "英雄联盟"
          ]
        }
      }
    ]
  }
}

Alternatively, install the plugin online with elasticsearch-plugin. The download can be slow.

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
-> Downloading https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
[=================================================] 100%
-> Installed analysis-ik

After the installation, a new folder appears in the plugins directory.
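
Whether the node actually loaded the plugin can also be checked through the _cat/plugins API. A small sketch, assuming a node on localhost:9200. The helper names has_ik_plugin and check_ik_plugin are invented here, and has_ik_plugin simply looks for analysis-ik in whatever it reads on standard input.

```shell
#!/bin/sh
# Assumed address of the local node.
ES_URL="${ES_URL:-http://localhost:9200}"

# Succeeds if the text on stdin mentions the analysis-ik plugin.
has_ik_plugin() {
    grep -q 'analysis-ik'
}

# Ask the node for its plugin list and report the result.
check_ik_plugin() {
    if curl -s "$ES_URL/_cat/plugins?h=component,version" | has_ik_plugin; then
        echo "analysis-ik is loaded"
    else
        echo "analysis-ik not found"
        return 1
    fi
}
```

Calling check_ik_plugin after the restart gives a quick yes or no.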

Usage

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "中华人民共和国国歌"
}

Response:

{
  "tokens": [
    {
      "token": "中华人民共和国",
      "start_offset": 0,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "国歌",
      "start_offset": 7,
      "end_offset": 9,
      "type": "CN_WORD",
      "position": 1
    }
  ]
}

Another example:

GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": "王者荣耀是最好玩的游戏"
}

Response:

{
  "tokens": [
    {
      "token": "王者",
      "start_offset": 0,
      "end_offset": 2,
      "type": "CN_WORD",
      "position": 0
    },
    {
      "token": "荣耀",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "是",
      "start_offset": 4,
      "end_offset": 5,
      "type": "CN_CHAR",
      "position": 2
    },
    {
      "token": "最",
      "start_offset": 5,
      "end_offset": 6,
      "type": "CN_CHAR",
      "position": 3
    },
    {
      "token": "好玩",
      "start_offset": 6,
      "end_offset": 8,
      "type": "CN_WORD",
      "position": 4
    },
    {
      "token": "的",
      "start_offset": 8,
      "end_offset": 9,
      "type": "CN_CHAR",
      "position": 5
    },
    {
      "token": "游戏",
      "start_offset": 9,
      "end_offset": 11,
      "type": "CN_WORD",
      "position": 6
    }
  ]
}
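
The same requests can be sent from a shell with curl, which also makes it easy to compare ik_smart with the plugin's other analyzer, ik_max_word (finer-grained splitting). A sketch, assuming a node on localhost:9200 and input text without double quotes or backslashes, since the helper does no JSON escaping. The function names are illustrative.

```shell
#!/bin/sh
# Assumed address of the local node.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print an _analyze request body. $1 = analyzer name, $2 = text.
# No JSON escaping is done, so the text must not contain " or \.
analyze_body() {
    printf '{"analyzer": "%s", "text": "%s"}\n' "$1" "$2"
}

# Send the body to the _analyze endpoint and pretty-print the reply.
ik_analyze() {
    analyze_body "$1" "$2" | curl -s -X POST "$ES_URL/_analyze?pretty" \
        -H 'Content-Type: application/json' \
        --data-binary @-
}
```

For instance, ik_analyze ik_smart "中华人民共和国国歌" followed by ik_analyze ik_max_word "中华人民共和国国歌" shows how the two analyzers split the same text.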
