ES: Getting Started

Posted 2022-11-23 wangyunzhong123

Reference:
https://es.xiaoleilu.com/010_Intro/10_Installing_ES.html

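1. Installation

https://www.elastic.co/cn/downloads/
Download ES 7.5 and Kibana from the official Elastic site above; the Kibana version should match the ES version (7.5 here). Kibana is the visualization and console tool. The download page also explains how to configure and start both, and it is simple: ES starts more or less as-is, and Kibana only needs elasticsearch.hosts (in config/kibana.yml) pointed at the ES node.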

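Usually you install locally first and then open http://localhost:5601/app/kibana
In the newer version (7.5), some plugins can be installed directly from the UI.

All operations below were run on ES 7.5. This version enforces a single type per index. The recommended type name is _doc; a custom name can also be given, but an index can have only one.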

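2. Concepts

Index: roughly corresponds to a database in a relational database system.
Type: roughly corresponds to a table in a database, though with differences; see "Understanding types in ES". In older versions that allowed several types per index, the type is combined with the document's _id to form a _uid field, so documents of different types in the same index could share the same _id value. See also: "The past and present of indices and types in ES".
Index (as a data structure): the inverted index.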
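
To make the inverted index idea concrete, here is a minimal Python sketch. It is a toy and not how Lucene actually stores its index: it only lowercases and splits on whitespace, whereas Elasticsearch runs a configurable analyzer. The two sample texts are the `about` fields of the documents used later in this post, and `build_inverted_index` is a hypothetical helper name.

```python
from collections import defaultdict

# Sample documents: id -> text of the `about` field (from the examples in this post).
DOCS = {
    1: "I love to go rock climbing",
    2: "I love to go swimming",
}


def build_inverted_index(docs):
    """Map each lowercased term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return dict(index)


index = build_inverted_index(DOCS)
print(sorted(index["love"]))  # ids of documents containing "love"
print(sorted(index["rock"]))  # ids of documents containing "rock"
```

Looking up a term is then a dictionary access rather than a scan over every document, which is the property that makes full-text search fast.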

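3. Creating an index

Newer ES versions no longer require a type. You operate on the index directly and use _doc as the uniform placeholder, since a type was only ever a logical concept.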

PUT /megacorp/_doc/1
{
    "first_name" : "John",
    "last_name" :  "Smith",
    "age" :        25,
    "about" :      "I love to go rock climbing",
    "interests": [ "sports", "music" ]
}

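The request above creates the megacorp index automatically and adds one document. It can be run repeatedly; later runs update the document. The response looks like this (taken from a second run, hence _version 2 and result "updated"):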


{
  "_index" : "megacorp",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}
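
Outside the Kibana console, the same indexing call is a plain HTTP PUT. The sketch below builds such a request with Python's standard library and shows one way to read the `result` field of the response. It assumes a node at `http://localhost:9200` (the default local address, not stated in this post); `build_index_request` and `was_created` are hypothetical helper names. The request is only constructed here, not sent.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: default local Elasticsearch address


def build_index_request(index, doc_id, document):
    """Build (but do not send) a PUT /<index>/_doc/<id> request."""
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(document).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def was_created(response):
    """True if the document was newly created, False if an existing one was updated."""
    return response["result"] == "created"


employee = {
    "first_name": "John",
    "last_name": "Smith",
    "age": 25,
    "about": "I love to go rock climbing",
    "interests": ["sports", "music"],
}
request = build_index_request("megacorp", 1, employee)

# To send it against a running node:
#     with urllib.request.urlopen(request) as resp:
#         print(was_created(json.load(resp)))
```

Checking `result` rather than `_version` keeps the intent explicit: the first run reports "created", repeated runs report "updated".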

4. Retrieving a document

GET /megacorp/_doc/1


{
  "_index" : "megacorp",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "_seq_no" : 1,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "first_name" : "John",
    "last_name" : "Smith",
    "age" : 25,
    "about" : "I love to go rock climbing",
    "interests" : [
      "sports",
      "music"
    ]
  }
}
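
When this response is consumed from code, the stored document sits under `_source`, and `found` says whether the id exists. A minimal Python sketch, with `extract_source` as a hypothetical helper name and a trimmed copy of the response above as sample data:

```python
def extract_source(response):
    """Return the stored document, or None when Elasticsearch reports found=false."""
    if not response.get("found", False):
        return None
    return response["_source"]


# Trimmed copy of the GET response for document 1.
sample = {
    "_index": "megacorp",
    "_id": "1",
    "found": True,
    "_source": {"first_name": "John", "last_name": "Smith", "age": 25},
}
print(extract_source(sample)["last_name"])
print(extract_source({"_index": "megacorp", "_id": "99", "found": False}))
```

Note that a missing document comes back with HTTP status 404, so a client such as `urllib` raises `HTTPError` before the body is parsed; the `found` check matters mostly for clients that return the body regardless of status.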

5. Simple search

GET /megacorp/_search

Response:


{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        }
      },
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "first_name" : "John",
          "last_name" : "jack",
          "age" : 27,
          "about" : "I love to go swimming",
          "interests" : [
            "sports",
            "movie"
          ]
        }
      }
    ]
  }
}
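
The search response nests the matching documents two levels deep (`hits.hits`), and in 7.x `hits.total` is an object with `value` and `relation` rather than a bare number. A small Python sketch of pulling out the useful parts, using a trimmed copy of the response above; `summarize_hits` is a hypothetical helper name:

```python
def summarize_hits(response):
    """Return (total, [(id, score, source), ...]) from a 7.x _search response."""
    hits = response["hits"]
    total = hits["total"]["value"]  # 7.x: total is an object, not a plain integer
    rows = [(h["_id"], h["_score"], h["_source"]) for h in hits["hits"]]
    return total, rows


# Trimmed copy of the search response for the two sample employees.
sample = {
    "hits": {
        "total": {"value": 2, "relation": "eq"},
        "max_score": 1.0,
        "hits": [
            {"_id": "1", "_score": 1.0,
             "_source": {"first_name": "John", "last_name": "Smith", "age": 25}},
            {"_id": "2", "_score": 1.0,
             "_source": {"first_name": "John", "last_name": "jack", "age": 27}},
        ],
    }
}

total, rows = summarize_hits(sample)
for doc_id, score, source in rows:
    print(doc_id, score, source["last_name"])
```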

Search for employees whose last name contains "Smith":

GET /megacorp/_search?q=last_name:Smith

This returns the document with id 1.
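
When the query-string form is built in code rather than typed into the console, the `field:value` expression should be URL-encoded. A minimal Python sketch, assuming a node at `http://localhost:9200` (not stated in this post); `uri_search_url` is a hypothetical helper name:

```python
from urllib.parse import urlencode

ES_URL = "http://localhost:9200"  # assumption: default local Elasticsearch address


def uri_search_url(index, field, value):
    """Build the URL for a query-string search such as ?q=last_name:Smith."""
    query = urlencode({"q": f"{field}:{value}"})
    return f"{ES_URL}/{index}/_search?{query}"


url = uri_search_url("megacorp", "last_name", "Smith")
print(url)
```

The colon is percent-encoded as `%3A`, which Elasticsearch decodes back into `last_name:Smith`.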

6. Querying with the DSL

The DSL (Domain Specific Language) takes the form of a JSON request body.

The earlier search for employees whose last name is Smith can be written like this:

GET /megacorp/_search
{
    "query" : {
        "match" : {
            "last_name" : "Smith"
        }
    }
}

This returns the same result as before.
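
Because the DSL is just JSON, a request body can be assembled as an ordinary dictionary. The Python sketch below builds the match query shown above; `match_query` is a hypothetical helper name. Many HTTP clients do not send a body with GET, and the `_search` endpoint also accepts POST, so POST is the usual choice from code.

```python
import json


def match_query(field, text):
    """Build the request body for a match query on a single field."""
    return {"query": {"match": {field: text}}}


body = match_query("last_name", "Smith")
print(json.dumps(body, indent=2))
```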

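7. A more complex query

Let's make the search a little more complex. We still want employees whose last name is "Smith", but only those older than 30. The query adds a filter, which lets us run a structured search efficiently: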

GET /megacorp/_search
{
    "query" : {
        "filtered" : {
            "filter" : {
                "range" : {
                    "age" : { "gt" : 30 }
                }
            },
            "query" : {
                "match" : {
                    "last_name" : "smith"
                }
            }
        }
    }
}

On the current version (ES 7.5), the query above returns an error:


{
  "error": {
    "root_cause": [
      {
        "type": "parsing_exception",
        "reason": "no [query] registered for [filtered]",
        "line": 3,
        "col": 22
      }
    ],
    "type": "parsing_exception",
    "reason": "no [query] registered for [filtered]",
    "line": 3,
    "col": 22
  },
  "status": 400
}

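The reason will be covered in detail later. In short, the filtered query was deprecated in Elasticsearch 2.0 and removed in 5.0; its replacement is the bool query with a filter clause.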
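
One way to express the intent of the failing query on ES 7.5 is a `bool` query: the full-text condition goes under `must` and the range condition under `filter`. The Python sketch below only builds that request body; `filtered_match` and its parameter names are hypothetical and chosen for illustration.

```python
import json


def filtered_match(field, text, range_field, gt):
    """Build a bool query: match on one field, filtered by a 'greater than' range."""
    return {
        "query": {
            "bool": {
                "must": {"match": {field: text}},
                "filter": {"range": {range_field: {"gt": gt}}},
            }
        }
    }


body = filtered_match("last_name", "smith", "age", 30)
print(json.dumps(body, indent=2))
```

With the two sample employees in this post (ages 25 and 27), such a query would be expected to return no hits, since neither passes the `age > 30` filter.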

8. Full-text search

Search for all employees who like "rock climbing":

GET /megacorp/_search
{
    "query" : {
        "match" : {
            "about" : "rock climbing"
        }
    }
}

This returns the document with id 1.

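9. Phrase search

So far we can search a field for individual words, which is fine, but sometimes you want to match an exact sequence of words, that is, a phrase. For example, we want employee records that contain both "rock" and "climbing", next to each other.

To do that, simply change the match query to a match_phrase query: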

GET /megacorp/_search
{
    "query" : {
        "match_phrase" : {
            "about" : "rock climbing"
        }
    }
}
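
The difference between the two queries can be illustrated with a toy model in Python. This is a simplification and not Elasticsearch's implementation: it tokenizes by lowercasing and splitting on whitespace, ignores scoring, and treats `match` as "any term present" (its default OR behaviour) and `match_phrase` as "terms adjacent and in order". The function names are hypothetical; the texts are the `about` fields of the two sample employees.

```python
DOCS = {
    1: "I love to go rock climbing",
    2: "I love to go swimming",
}


def tokenize(text):
    """Very rough stand-in for an analyzer: lowercase, split on whitespace."""
    return text.lower().split()


def toy_match(docs, query):
    """Ids of documents containing at least one of the query terms."""
    terms = set(tokenize(query))
    return [doc_id for doc_id, text in docs.items() if terms & set(tokenize(text))]


def toy_match_phrase(docs, query):
    """Ids of documents containing the query terms adjacent and in order."""
    phrase = tokenize(query)
    n = len(phrase)
    result = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        if any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1)):
            result.append(doc_id)
    return result


print(toy_match(DOCS, "climbing rock"))         # word order does not matter
print(toy_match_phrase(DOCS, "climbing rock"))  # word order and adjacency matter
```

In this toy model, "rock climbing" is found by both functions in document 1, while the reversed "climbing rock" is found only by `toy_match`.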

10. Highlighting

Highlighting matched fragments is easy in Elasticsearch. Add a highlight parameter to the previous query:

GET /megacorp/_search
{
    "query" : {
        "match_phrase" : {
            "about" : "rock climbing"
        }
    },
    "highlight": {
        "fields" : {
            "about" : {}
        }
    }
}

Each hit in the result now carries an extra highlight field:


{
  "took" : 195,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.3365866,
    "hits" : [
      {
        "_index" : "megacorp",
        "_type" : "employee",
        "_id" : "1",
        "_score" : 1.3365866,
        "_source" : {
          "first_name" : "John",
          "last_name" : "Smith",
          "age" : 25,
          "about" : "I love to go rock climbing",
          "interests" : [
            "sports",
            "music"
          ]
        },
        "highlight" : {
          "about" : [
            "I love to go <em>rock</em> <em>climbing</em>"
          ]
        }
      }
    ]
  }
}
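
What the highlighter does to the text can be mimicked in a few lines of Python. This is a toy illustration only: the real highlighter works on analyzed terms and offsets and can split long fields into fragments, while this sketch wraps whole words that equal a query term. `toy_highlight` is a hypothetical name; the `<em>` tags mirror the defaults visible in the response above, which the real API lets you change via `pre_tags` and `post_tags`.

```python
import re


def toy_highlight(text, query, pre="<em>", post="</em>"):
    """Wrap every word of `text` that equals a query term (case-insensitive)."""
    terms = {term.lower() for term in query.split()}

    def wrap(match):
        word = match.group(0)
        return f"{pre}{word}{post}" if word.lower() in terms else word

    return re.sub(r"\w+", wrap, text)


print(toy_highlight("I love to go rock climbing", "rock climbing"))
```

For the sample document, this yields the same string as the `highlight.about` fragment in the response.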

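11. Analytics

Finally, there is one more requirement: let managers run some analytics over the employee directory. Elasticsearch has a feature called aggregations, which lets you generate sophisticated analytics over your data. It is similar to GROUP BY in SQL, but much more powerful.

This topic is left for a later post.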
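
As a small preview of the GROUP BY analogy, the Python sketch below counts employees per interest locally, which is roughly what a terms aggregation would compute on the server. It also includes a possible request body as an untested assumption: the aggregation name `all_interests` is arbitrary, and the field `interests.keyword` assumes the index was created with the default dynamic mapping, where string fields get a `keyword` sub-field.

```python
from collections import Counter

# The two sample employees from this post, reduced to the relevant fields.
EMPLOYEES = [
    {"last_name": "Smith", "age": 25, "interests": ["sports", "music"]},
    {"last_name": "jack", "age": 27, "interests": ["sports", "movie"]},
]


def count_by_interest(employees):
    """Local equivalent of grouping by interest and counting documents."""
    counts = Counter()
    for employee in employees:
        counts.update(employee["interests"])
    return counts


# Assumed request body for the equivalent server-side terms aggregation.
AGG_BODY = {
    "size": 0,  # skip the hits, return only the aggregation
    "aggs": {"all_interests": {"terms": {"field": "interests.keyword"}}},
}

counts = count_by_interest(EMPLOYEES)
for interest, count in counts.most_common():
    print(interest, count)
```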
