Multiple Ways to Search in Elasticsearch

Posted -wenli


I. Query String Search (searching via the query string)

1. Search all products

GET /shop_index/productInfo/_search

Response:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "2",
        "_score": 1,
        "_source": {
          "test": "test"
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 1,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        }
      }
    ]
  }
}

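Field explanations:

took: how many milliseconds the search took
timed_out: whether the search timed out (here it did not)
_shards: the index is split across 5 shards; the search queried all 5, all 5 returned data successfully, 0 failed, and 0 were skipped
hits.total: the number of matching documents, here 3
max_score: the highest relevance score (_score) among the matching documents; the more relevant a document is to the search, the higher its score
hits.hits: the detailed data of the documents that matched the search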
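
As a rough illustration of how a client might read these fields, here is a minimal Python sketch. The `response` dict is an abbreviated copy of the response above, and `summarize` is a hypothetical helper name, not part of any Elasticsearch client. It assumes `hits.total` is a plain number, as in this response; newer Elasticsearch versions return an object there instead.

```python
# Abbreviated copy of the response shown above (only the fields used here).
response = {
    "took": 8,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": 3,  # assumption: plain number, as in the response above
        "max_score": 1,
        "hits": [
            {"_id": "2", "_score": 1, "_source": {"test": "test"}},
            {"_id": "zyWpRGkB8mgaHjxk0Hfo", "_score": 1,
             "_source": {"name": "HuaWei P20", "price": 5300}},
            {"_id": "1", "_score": 1,
             "_source": {"name": "HuaWei Mate8", "price": 2500}},
        ],
    },
}


def summarize(resp):
    """Pull the fields explained above out of a search response dict."""
    shards = resp["_shards"]
    return {
        "took_ms": resp["took"],
        "timed_out": resp["timed_out"],
        # True only when every shard answered and none failed.
        "all_shards_ok": shards["successful"] == shards["total"]
        and shards["failed"] == 0,
        "total": resp["hits"]["total"],
        "ids": [hit["_id"] for hit in resp["hits"]["hits"]],
    }


print(summarize(response))
```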

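2. Search for products whose name contains HuaWei, sorted by price in descending order:
This form is where the name "Query String Search" comes from: the search parameters are all carried in the query string of the HTTP request.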

GET /shop_index/productInfo/_search?q=name:HuaWei&sort=price:desc

Response:

{
  "took": 23,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": null,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        },
        "sort": [
          5300
        ]
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": null,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        },
        "sort": [
          2500
        ]
      }
    ]
  }
}
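
When the request path is built in code rather than typed into a console, the `q` and `sort` parameters are ordinary URL query parameters. A minimal sketch using only the Python standard library follows; `query_string_search_path` is a hypothetical helper name, and the index and type names are the ones used in this article.

```python
from urllib.parse import urlencode


def query_string_search_path(index, doc_type, field, term,
                             sort_field=None, order="asc"):
    """Build the path and query string for a query-string search."""
    params = {"q": f"{field}:{term}"}
    if sort_field:
        params["sort"] = f"{sort_field}:{order}"
    # urlencode percent-encodes ':' as %3A; the server decodes it back.
    return f"/{index}/{doc_type}/_search?{urlencode(params)}"


print(query_string_search_path(
    "shop_index", "productInfo", "name", "HuaWei", "price", "desc"))
# prints /shop_index/productInfo/_search?q=name%3AHuaWei&sort=price%3Adesc
```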

 

II. Query DSL (DSL: Domain-Specific Language)

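With this approach, the search conditions are sent as JSON in the HTTP request body. It can express many kinds of complex queries and is more powerful than the query string form.

1. Search all products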

GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  }
}

Response omitted...

2. Find products whose name contains HuaWei, sorted by price in descending order

GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

Response omitted...
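
Because the request body is plain JSON, it can be assembled as a dict in application code and serialized before sending. The sketch below is one possible way to do that; `match_sorted` is a hypothetical helper name, and how the body is actually sent (HTTP library or Elasticsearch client) is left out on purpose.

```python
import json


def match_sorted(field, text, sort_field, order="desc"):
    """Build a match query on one field, sorted by another field."""
    return {
        "query": {"match": {field: text}},
        "sort": [{sort_field: {"order": order}}],
    }


body = match_sorted("name", "HuaWei", "price")
# Serialize to the JSON text that goes into the HTTP request body.
print(json.dumps(body, indent=2))
```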

3. Paginated query: fetch the second page, with 1 record per page

GET /shop_index/productInfo/_search
{
  "query": {
    "match_all": {}
  },
  "from": 1,
  "size": 1
}

Response:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 1,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 1,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      }
    ]
  }
}

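Notes:
(1) In a real project, if a conditional query also needs pagination, there is no need to run a separate count query: ES returns the total number of matching documents (hits.total), which can be used directly;
(2) The from parameter is a zero-based offset (default 0), so page n with a page size of size starts at from = (n - 1) * size.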
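
The two notes can be sketched in a few lines of Python. The helper names `page_to_from`, `paged_match_all`, and `total_pages` are hypothetical; pages are assumed to be numbered from 1, as in "the second page" above.

```python
def page_to_from(page, size):
    """Convert a 1-based page number into the zero-based 'from' offset."""
    if page < 1 or size < 1:
        raise ValueError("page and size must both be >= 1")
    return (page - 1) * size


def paged_match_all(page, size):
    """Build a match_all request body for the given page."""
    return {
        "query": {"match_all": {}},
        "from": page_to_from(page, size),
        "size": size,
    }


def total_pages(total_hits, size):
    """Number of pages needed for total_hits (ceiling division)."""
    return -(-total_hits // size)


# Second page, one record per page: same body as the request above.
print(paged_match_all(2, 1))
# hits.total was 3 in the response above, so with size=1 there are 3 pages.
print(total_pages(3, 1))
```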

4. Return only specific fields, for example name, desc, and price, and omit the rest

GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "HuaWei"
    }
  },
  "_source": ["name","desc","price"]
}

Response:

{
  "took": 27,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.2876821,
        "_source": {
          "price": 5300,
          "name": "HuaWei P20",
          "desc": "Expen but easy to use"
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "price": 2500,
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use"
        }
      }
    ]
  }
}
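
To make the effect of the `_source` list concrete, the following client-side Python sketch applies the same projection to one of the stored documents. It only illustrates the result; it is not how Elasticsearch implements source filtering, and `project_source` is a hypothetical helper name.

```python
def project_source(source, fields):
    """Keep only the listed top-level fields of a document."""
    return {name: source[name] for name in fields if name in source}


# Full document as stored (copied from the earlier responses).
stored = {
    "name": "HuaWei P20",
    "desc": "Expen but easy to use",
    "price": 5300,
    "producer": "HuaWei Producer",
    "tags": ["Expen", "Fast"],
}

print(project_source(stored, ["name", "desc", "price"]))
```

The producer and tags fields are dropped, which matches the `_source` objects in the response above.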

III. Query Filter (filtering the query results)

For example, find products whose name contains HuaWei and whose price is greater than 4000:

GET /shop_index/productInfo/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "name": "HuaWei"
          }
        }
      ], 
      "filter": {
        "range": {
          "price": {
            "gt": 4000
          }
        }
      }
    }
  }
}

Response:

{
  "took": 195,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.2876821,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      }
    ]
  }
}
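
The request at the start of this section can be generalized to other price bounds. Here is a small Python sketch, assuming the same name and price fields; `name_with_price_range` is a hypothetical helper name, while gt, gte, lt, and lte are the standard range query operators. Clauses under filter only include or exclude documents and do not contribute to _score, which is consistent with the score of 0.2876821 above being the same as in the plain match query on name.

```python
ALLOWED_BOUNDS = {"gt", "gte", "lt", "lte"}


def name_with_price_range(name_text, **bounds):
    """Build a bool query: match on name, filter on a price range."""
    unknown = set(bounds) - ALLOWED_BOUNDS
    if unknown:
        raise ValueError(f"unsupported range operators: {sorted(unknown)}")
    return {
        "query": {
            "bool": {
                # Scored clause: decides relevance.
                "must": [{"match": {"name": name_text}}],
                # Non-scored clause: only narrows the result set.
                "filter": {"range": {"price": dict(bounds)}},
            }
        }
    }


# Same body as the request above.
print(name_with_price_range("HuaWei", gt=4000))
# A closed range, for example 2000 <= price < 4000.
print(name_with_price_range("HuaWei", gte=2000, lt=4000))
```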

IV. Full-Text Search

Search for products whose producer field contains "HuaWei MateProducer":

GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "producer": "HuaWei MateProducer"
    }
  }
}

Response:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "SiUBRWkB8mgaHjxkJHyS",
        "_score": 0.5753642,
        "_source": {
          "name": "HuaWei Mate10",
          "desc": "Cheap and Beauti",
          "price": 2300,
          "producer": "HuaWei MateProducer",
          "tags": [
            "Cheap",
            "Beauti"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "1",
        "_score": 0.2876821,
        "_source": {
          "name": "HuaWei Mate8",
          "desc": "Cheap and easy to use",
          "price": 2500,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "zyWpRGkB8mgaHjxk0Hfo",
        "_score": 0.18232156,
        "_source": {
          "name": "HuaWei P20",
          "desc": "Expen but easy to use",
          "price": 5300,
          "producer": "HuaWei Producer",
          "tags": [
            "Expen",
            "Fast"
          ]
        }
      },
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "CSX8RGkB8mgaHjxkV3w1",
        "_score": 0.18232156,
        "_source": {
          "name": "HuaWei nova 4e",
          "desc": "cheap and look nice",
          "price": 1999,
          "producer": "HuaWei Producer",
          "tags": [
            "Cheap",
            "Nice"
          ]
        }
      }
    ]
  }
}

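From the results above we can see:
The record with id 'SiUBRWkB8mgaHjxkJHyS' has the highest score, meaning it is the best match.
Reason:
After analysis, the search text "HuaWei MateProducer" is split into these terms:
(1) HuaWei:
IDs of the records matching this term: 'SiUBRWkB8mgaHjxkJHyS', '1', 'CSX8RGkB8mgaHjxkV3w1', 'zyWpRGkB8mgaHjxk0Hfo'
(2) MateProducer:
IDs of the records matching this term: 'SiUBRWkB8mgaHjxkJHyS'
Both terms of "HuaWei MateProducer" match the record with ID 'SiUBRWkB8mgaHjxkJHyS', so that record has the highest score.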
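
The reasoning above can be reproduced with a toy inverted index in Python. This is a deliberately simplified sketch: it tokenizes by lowercasing and splitting on whitespace and only counts how many query terms hit each record, whereas Elasticsearch uses a configurable analyzer and a real relevance formula. The function names are hypothetical; the producer values are copied from the response above.

```python
from collections import defaultdict

# producer field of each record, copied from the response above.
producers = {
    "SiUBRWkB8mgaHjxkJHyS": "HuaWei MateProducer",
    "1": "HuaWei Producer",
    "zyWpRGkB8mgaHjxk0Hfo": "HuaWei Producer",
    "CSX8RGkB8mgaHjxkV3w1": "HuaWei Producer",
}


def tokenize(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def build_inverted_index(docs):
    """Map each term to the set of record IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def match_any(index, query):
    """Return {record ID: number of query terms matched}."""
    counts = defaultdict(int)
    for term in tokenize(query):
        for doc_id in index.get(term, ()):
            counts[doc_id] += 1
    return dict(counts)


index = build_inverted_index(producers)
matched = match_any(index, "HuaWei MateProducer")
# Records with more matching terms come first.
for doc_id, hits in sorted(matched.items(), key=lambda kv: -kv[1]):
    print(doc_id, hits)
```

All four records match at least one term, but only 'SiUBRWkB8mgaHjxkJHyS' matches both, which mirrors the ranking in the response.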

V. Phrase Search

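The difference between phrase search and full-text search:
(1) Full-text search: the search text is split into terms, each term is looked up in the inverted index, and a record matches as soon as any one of the terms matches;
(2) Phrase search: the specified field must contain the entire search string, with the same terms in the same order, for a record to match and be returned;
For example, search for products whose producer contains the phrase "HuaWei MateProducer":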

GET /shop_index/productInfo/_search
{
  "query": {
    "match_phrase": {
      "producer": "HuaWei MateProducer"
    }
  }
}

Response:

{
  "took": 158,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.5753642,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "SiUBRWkB8mgaHjxkJHyS",
        "_score": 0.5753642,
        "_source": {
          "name": "HuaWei Mate10",
          "desc": "Cheap and Beauti",
          "price": 2300,
          "producer": "HuaWei MateProducer",
          "tags": [
            "Cheap",
            "Beauti"
          ]
        }
      }
    ]
  }
}

As you can see, only the record containing "HuaWei MateProducer" is returned.
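
The contrast with full-text matching can be shown with a toy phrase check in Python: every term of the phrase must appear, adjacent and in the same order. This is a simplified sketch with whitespace tokenization, not Elasticsearch's position-based implementation, and `contains_phrase` is a hypothetical name.

```python
def tokenize(text):
    """Toy analyzer: lowercase and split on whitespace."""
    return text.lower().split()


def contains_phrase(field_text, phrase):
    """True if the phrase terms occur consecutively and in order."""
    tokens = tokenize(field_text)
    wanted = tokenize(phrase)
    if not wanted:
        return False
    n = len(wanted)
    # Slide a window of n tokens across the field and compare.
    return any(tokens[i:i + n] == wanted
               for i in range(len(tokens) - n + 1))


phrase = "HuaWei MateProducer"
print(contains_phrase("HuaWei MateProducer", phrase))   # True
print(contains_phrase("HuaWei Producer", phrase))       # False
print(contains_phrase("MateProducer HuaWei", phrase))   # False
```

"HuaWei Producer" would match a full-text search because it shares the term HuaWei, but it fails the phrase check, which is why only one record is returned above.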

VI. Highlight Search

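Highlight search marks up the terms that need emphasis in the search results so they can be displayed in a particular style.
For example, search for products whose name contains "Xiao‘Mi" and highlight the search keyword: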

GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao‘Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    }
  }
}

Response:

{
  "took": 348,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "HiX9RGkB8mgaHjxk4nxC",
        "_score": 0.2876821,
        "_source": {
          "name": "Xiao‘Mi 9",
          "desc": "Expen but nice and Beauti",
          "price": 3500,
          "producer": "XiaoMi Producer",
          "tags": [
            "Expen",
            "Beauti"
          ]
        },
        "highlight": {
          "name": [
            "<em>Xiao‘Mi</em> 9"
          ]
        }
      }
    ]
  }
}

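As you can see, "Xiao‘Mi" is returned wrapped in <em> tags, so it can be displayed in italics directly in HTML.
To use a custom highlight style, set pre_tags and post_tags. For example, to display the keyword in red: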

GET /shop_index/productInfo/_search
{
  "query": {
    "match": {
      "name": "Xiao‘Mi"
    }
  },
  "highlight": {
    "fields": {
      "name": {}
    },
    "pre_tags": [
      "<em style='color:red;'>"
    ],
    "post_tags": [
      "</em>"
    ]
  }
}

Response:

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.2876821,
    "hits": [
      {
        "_index": "shop_index",
        "_type": "productInfo",
        "_id": "HiX9RGkB8mgaHjxk4nxC",
        "_score": 0.2876821,
        "_source": {
          "name": "Xiao‘Mi 9",
          "desc": "Expen but nice and Beauti",
          "price": 3500,
          "producer": "XiaoMi Producer",
          "tags": [
            "Expen",
            "Beauti"
          ]
        },
        "highlight": {
          "name": [
            "<em style='color:red;'>Xiao‘Mi</em> 9"
          ]
        }
      }
    ]
  }
}

The search keyword in the response is now wrapped in a tag that carries the CSS style for red text.
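
What pre_tags and post_tags do to the returned text can be imitated with a toy highlighter in Python. This is only an illustration of the wrapping step: it matches whole whitespace-separated words case-insensitively, which is much simpler than Elasticsearch's analyzer-aware highlighters, and `highlight` is a hypothetical function name.

```python
import re


def highlight(text, query, pre_tag="<em>", post_tag="</em>"):
    """Wrap every whole-word occurrence of a query term in the tags."""
    terms = [re.escape(term) for term in query.split()]
    if not terms:
        return text
    # (?<!\S) and (?!\S) restrict matches to whitespace-delimited words.
    pattern = re.compile(
        r"(?<!\S)(" + "|".join(terms) + r")(?!\S)", re.IGNORECASE)
    return pattern.sub(lambda m: pre_tag + m.group(1) + post_tag, text)


# Default tags, as in the first highlight response.
print(highlight("HuaWei Mate8", "HuaWei"))
# Custom tags, as in the pre_tags/post_tags request.
print(highlight("HuaWei P20", "huawei",
                pre_tag="<em style='color:red;'>", post_tag="</em>"))
```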
