Elasticsearch 检索相关

Posted asker009

tags:

篇首语:本文由小常识网(cha138.com)小编为大家整理,主要介绍了Elasticsearch 检索相关相关的知识,希望对你有一定的参考价值。

1、 检索所有文档

GET bus/product/_search

2、 term检索

term是代表完全匹配,也就是精确查询,搜索前不会再对搜索词进行分词,所以我们的搜索词必须是文档分词集合中的一个,如果没有安装分词插件,汉字分词按每个汉字来分。

查询不到内容:
GET bus/product/_search
{
  "query": {
    "term": {
        "producer": "公交"
    }
  }
}
producer中所有带“公”的文档都会被查询出来
GET bus/product/_search
{
  "query": {
    "term": {
        "producer": "公"
    }
  }
}

3、 match检索

match查询会先对搜索词进行分词,分词完毕后再逐个对分词结果进行匹配,因此相比于term的精确搜索,match是分词匹配搜索

描述中带有机场酒店四个字的各种组合的文档都会被返回
GET bus/product/_search
{
  "query": {
    "match": {
      "desc": "机场酒店"
    }
  }
}

4、 分页

GET bus/_search
{
  "from": 0, 
  "size": 3, 
  "query": {
    "match": {
      "desc": "机场酒店"
    }
  }
}

GET bus/_search
{
  "from": 0,
  "size": 5,
  "query": {
    "match_all": {}
  }
}


5、 过滤字段,类似select a,b from table中a,b

GET bus/_search
{
  "_source": ["name","desc"] ,
  "query": {
    "match": {
      "desc": "机场"
    }
  }
}

result:
{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 12,
    "max_score" : 2.1208954,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "9",
        "_score" : 2.1208954,
        "_source" : {
          "name" : "机场大巴A2线",
          "desc" : "机机场场"
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "10",
        "_score" : 2.1208954,
        "_source" : {
          "name" : "机场大巴A2线",
          "desc" : "机机场场"
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "6",
        "_score" : 0.62362677,
        "_source" : {
          "name" : "机场大巴A2线",
          "desc" : "机机场场"
        }
      }
    ]
  }
}

6、 检索结果显示版本

GET bus/_search
{
  "version": true, 
  "from": 0, 
  "size": 3, 
  "query": {
    "match": {
      "desc": "机场酒店"
    }
  }
}

7、 按评分检索过滤

GET bus/_search
{
  "version": true, 
  "min_score":"2.3", #大于2.3
  "from": 0, 
  "size": 3, 
  "query": {
    "match": {
      "desc": "机场酒店"
    }
  }
}

8、 高亮关键字

GET bus/_search
{
  "version": true, 
  "from": 0, 
  "size": 3, 
  "query": {
    "match": {
      "desc": "机场酒店"
    }
  }
  , "highlight": {
    "fields": {
      "desc": {}
    }
  }
}

9、  短语匹配match_phrase

与match query类似,但用于匹配精确短语,分词后所有词项都要出现在该字段中,字段中的词项顺序要一致。

GET bus/_search
{
  "query": {
    "match_phrase": {
      "name": "公交车122"
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 3.4102418,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "3",
        "_score" : 3.4102418,
        "_source" : {
          "name" : "公交车122路",
          "desc" : "从前兴路枢纽到东站",
          "price" : 2,
          "producer" : "公交集团",
          "tags" : [
            "单层",
            "空调"
          ]
        }
      }
    ]
  }
}


对比match
GET bus/_search
{
  "query": {
    "match": {
      "name": "公交车122"
    }
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 5.3417225,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "2",
        "_score" : 5.3417225,
        "_source" : {
          "name" : "公交车5路",
          "desc" : "从巫家坝到梁家河",
          "price" : 1,
          "producer" : "公交集团",
          "tags" : [
            "双层",
            "普通",
            "热门"
          ]
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "3",
        "_score" : 3.4102418,
        "_source" : {
          "name" : "公交车122路",
          "desc" : "从前兴路枢纽到东站",
          "price" : 2,
          "producer" : "公交集团",
          "tags" : [
            "单层",
            "空调"
          ]
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "1",
        "_score" : 2.1597636,
        "_source" : {
          "name" : "公交车5路",
          "desc" : "从巫家坝到梁家河",
          "price" : 1,
          "producer" : "公交集团",
          "tags" : [
            "双层",
            "普通",
            "热门"
          ]
        }
      }
    ]
  }
}

10、短语前缀查询match_phrase_prefix

match_phrase_prefix与match_phrase相同,只是它允许在文本中的最后一个词的前缀匹配

GET bus/_search
{
  "query": {
    "match_phrase_prefix": {
      "name": "公交车1"
    }
  }
}

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 1,
    "max_score" : 6.8204837,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "3",
        "_score" : 6.8204837,
        "_source" : {
          "name" : "公交车122路",
          "desc" : "从前兴路枢纽到东站",
          "price" : 2,
          "producer" : "公交集团",
          "tags" : [
            "单层",
            "空调"
          ]
        }
      }
    ]
  }
}

对比:
GET bus/_search
{
  "query": {
    "match_phrase": {
      "name": "公交车1"
    }
  }
}

{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

11、 多字段查询multi_match

GET bus/_search
{
  "query": {
    "multi_match": {
      "query": "空港",
      "fields": ["desc","name"]
    }
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 3,
    "successful" : 3,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 3.6836727,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "16",
        "_score" : 3.6836727,
        "_source" : {
          "name" : "机场大巴A2线",
          "desc" : "空港",
          "price" : 21,
          "producer" : "大巴",
          "tags" : [
            "单层",
            "空调",
            "大巴"
          ]
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "18",
        "_score" : 3.5525968,
        "_source" : {
          "name" : "空港大巴A2线",
          "desc" : "机场",
          "price" : 21,
          "producer" : "大巴",
          "tags" : [
            "单层",
            "空调",
            "大巴"
          ]
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "19",
        "_score" : 3.1757839,
        "_source" : {
          "name" : "空港大巴A2线",
          "desc" : "空港快线",
          "price" : 21,
          "producer" : "大巴",
          "tags" : [
            "单层",
            "空调",
            "大巴"
          ]
        }
      }
    ]
  }
}

12、范围检索,range查询用于匹配数值型、日期型或字符串型字段在某一范围内的文档。

GET bus/_search
{
  "from": 0, 
  "size": 10, 
  "query": {
    "range": {
      "updateDate": {
        "gte": "2018-12-01",
        "lte": "2018-12-31",
"format": "yyyy-MM-dd" } } } } { "took" : 1, "timed_out" : false, "_shards" : { "total" : 3, "successful" : 3, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 2, "max_score" : 1.0, "hits" : [ { "_index" : "bus", "_type" : "product", "_id" : "13", "_score" : 1.0, "_source" : { "name" : "机场大巴A2线", "desc" : "机机场场", "price" : 21, "producer" : "机场大巴", "tags" : [ "单层", "空调", "大巴" ], "updateDate" : "2018-12-20" } }, { "_index" : "bus", "_type" : "product", "_id" : "12", "_score" : 1.0, "_source" : { "name" : "机场大巴A2线", "desc" : "机机场场", "price" : 21, "producer" : "机场大巴", "tags" : [ "单层", "空调", "大巴" ], "updateDate" : "2018-12-01" } } ] } }

可以不加索引,表示搜索所有索引。

GET /_search
{
  "from": 0, 
  "size": 10, 
  "query": {
    "range": {
      "updateDate": {
        "gte": "2018-12-01",
        "lte": "2018-12-31",
"format": "yyyy-MM-dd" } } } } { "took" : 14, "timed_out" : false, "_shards" : { "total" : 22, "successful" : 20, "skipped" : 0, "failed" : 0 }, "hits" : { "total" : 4, "max_score" : 1.0, "hits" : [ { "_index" : "bus", "_type" : "product", "_id" : "13", "_score" : 1.0, "_source" : { "name" : "机场大巴A2线", "desc" : "机机场场", "price" : 21, "producer" : "机场大巴", "tags" : [ "单层", "空调", "大巴" ], "updateDate" : "2018-12-20" } }, { "_index" : "home", "_type" : "product", "_id" : "3", "_score" : 1.0, "_source" : { "title" : "My first post3", "memo" : "a test23", "updateDate" : "2018-12-13" } }, { "_index" : "bus", "_type" : "product", "_id" : "12", "_score" : 1.0, "_source" : { "name" : "机场大巴A2线", "desc" : "机机场场", "price" : 21, "producer" : "机场大巴", "tags" : [ "单层", "空调", "大巴" ], "updateDate" : "2018-12-01" } }, { "_index" : "home", "_type" : "product", "_id" : "1", "_score" : 1.0, "_source" : { "title" : "My first post2", "memo" : "a test2", "updateDate" : "2018-12-12" } } ] } }

13、检索字段或者值是否存在,exists查询

只能返回field字段存在,并且不为null的数据。

GET /_search
{
  "query": {
    "exists": {"field": "desc1"}
  }
}

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 22,
    "successful" : 20,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 0,
    "max_score" : null,
    "hits" : [ ]
  }
}

14、prefix query前缀查询,Matches documents that have fields containing terms with a specified prefix (not analyzed)

GET _search
{
  "from": 0, 
  "size": 2, 
  "query": {
    "prefix": {
      "name": "公" #根据分词
    }
  }
}

{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
    "total" : 22,
    "successful" : 20,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : 3,
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "name" : "公交车5路",
          "desc" : "从巫家坝到梁家河",
          "price" : 1,
          "producer" : "公交集团",
          "tags" : [
            "双层",
            "普通",
            "热门"
          ],
          "updateDate" : "2018-03-01"
        }
      },
      {
        "_index" : "bus",
        "_type" : "product",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "公交车5路",
          "desc" : "从巫家坝到梁家河",
          "price" : 1,
          "producer" : "公交集团",
          "tags" : [
            "双层",
            "普通",
            "热门"
          ],
          "updateDate" : "2018-02-01"
        }
      }
    ]
  }
}

15、wildcard查询(通配符查询)

支持*和?通配查询.

GET /_search
{
    "query": {
        "wildcard" : { "user" : "ki*y" }
    }
}

16、regexp查询(正则表达式查询)

https://www.elastic.co/guide/en/elasticsearch/reference//6.5/query-dsl-regexp-query.html

 

17、模糊查询fuzzy query

https://www.elastic.co/guide/en/elasticsearch/reference/6.5/query-dsl-fuzzy-query.html

18、ids查询

GET /_search
{
  "query": {
    "ids": {
      "type": "product", 
      "values": [1,2]
    }
  }
}

 
















































































































































































以上是关于Elasticsearch 检索相关的主要内容,如果未能解决你的问题,请参考以下文章

ElasticSearch 亿级数据检索深度优化

elasticsearch检索,聚合,建模,优化--《elasticsearch核心技术与实战》笔记

全文检索-ElasticSearch入门

MySQL数据同步至ElasticSearch的相关实现方法

《从Lucene到Elasticsearch:全文检索实战》学习笔记四

ElasticSearch 亿级数据检索深度优化