Hands-On Case Study: Basic Elasticsearch Operations

Posted by 彭宇成


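Problem

How do you perform simple management of an Elasticsearch cluster? What does it mean for Elasticsearch to be a document-oriented search and analytics engine? Which search methods are commonly used?

Scenario

Using the back-end system of an e-commerce website as an example, this article introduces the search methods commonly used in ES: query string search, query DSL, query filter, full-text search, phrase search, and highlight search.

Note: background of the e-commerce product management case

1) Perform CRUD operations on product information
2) Run simple structured queries
3) Run simple full-text searches as well as more complex phrase searches
4) Highlight the results of a full-text search
5) Run simple aggregation analysis on the data

II. Multiple ways to search for products: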

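Analysis

I. Document data format

A document-oriented search and analytics engine

1) The data structures in an application are object-oriented and complex
2) To store object data in a relational database, the object has to be taken apart and flattened into multiple tables, and every query has to reassemble it into an object again, which is cumbersome
3) ES is document-oriented: the data structure stored in a document is the same as the object-oriented data structure. On top of this document structure, ES provides sophisticated indexing, full-text search, and aggregation analysis
4) An ES document is expressed in JSON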
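
To make point 4 concrete, here is a minimal Python sketch of an application object being serialized into the kind of JSON document ES stores. The `Product` class is a hypothetical illustration; its field names are borrowed from the sample products used later in this article.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Product:
    # Hypothetical application object; fields mirror the sample products below.
    name: str
    desc: str
    price: int
    producer: str
    tags: list = field(default_factory=list)


product = Product(
    name="gaolujie yagao",
    desc="gaoxiao meibai",
    price=30,
    producer="gaolujie producer",
    tags=["meibai", "fangzhu"],
)

# The object maps to one JSON document without being split across tables.
document = json.dumps(asdict(product), indent=2)
print(document)
```

The nested `tags` list stays inside the document, which is the point of the comparison with flattened relational tables.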

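III. Simple cluster management

1) Quickly check the health of the cluster

GET /_cat/health?v

green: every index has all of its primary shards and replica shards active
yellow: every index has all of its primary shards active, but some replica shards are not active and are unavailable
red: not every primary shard is active; part of the data in some indices is unavailable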
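
The three colors can be read as a simple decision rule. The Python sketch below only illustrates that rule; it is not how ES computes health internally, and the `cluster_health` function and its `(kind, active)` tuples are assumptions made up for this example.

```python
def cluster_health(shards):
    """Classify health from a list of (kind, active) tuples.

    kind is "primary" or "replica"; active is a bool.
    This is an illustrative rule, not the ES implementation.
    """
    primaries_ok = all(active for kind, active in shards if kind == "primary")
    replicas_ok = all(active for kind, active in shards if kind == "replica")
    if not primaries_ok:
        return "red"      # at least one primary shard is not active
    if not replicas_ok:
        return "yellow"   # all primaries active, some replica is not
    return "green"        # every primary and replica shard is active


print(cluster_health([("primary", True), ("replica", False)]))
```

A single-node development cluster typically falls into the second case, because a replica cannot be allocated on the same node as its primary.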

2) Quickly list the indices in the cluster

GET /_cat/indices?v

3) Simple index operations

Create an index: PUT /test_index?pretty

Delete an index: DELETE /test_index?pretty

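IV. CRUD operations on products

1) Add a product: create a document and index it

Syntax: PUT /index/type/id
Note: the type segment in these paths applies to older ES releases; mapping types were removed in later versions.
Example:

PUT /ecommerce/product/1
{
  "name": "gaolujie yagao",
  "desc": "gaoxiao meibai",
  "price": 30,
  "producer": "gaolujie producer",
  "tags": ["meibai", "fangzhu"]
}

PUT /ecommerce/product/2
{
  "name": "jiajieshi yagao",
  "desc": "youxiao fangzhu",
  "price": 25,
  "producer": "jiajieshi producer",
  "tags": ["fangzhu"]
}

PUT /ecommerce/product/3
{
  "name": "zhonghua yagao",
  "desc": "caoben zhiwu",
  "price": 40,
  "producer": "zhonghua producer",
  "tags": ["qinxin"]
}

ES creates the index and the type automatically, so they do not need to be created in advance. By default, ES also builds an inverted index for every field of the document so that the field can be searched.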
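
As a rough illustration of what an inverted index is, the Python sketch below maps each term of the `name` field to the IDs of the documents that contain it. It assumes plain whitespace tokenization and lowercasing, which is a simplification of what an ES analyzer does; `build_inverted_index` is a hypothetical helper, not an ES API.

```python
from collections import defaultdict

# The name field of the three products indexed above.
names = {
    1: "gaolujie yagao",
    2: "jiajieshi yagao",
    3: "zhonghua yagao",
}


def build_inverted_index(field_values):
    """Map each term to the set of document IDs whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in field_values.items():
        for term in text.lower().split():  # simplified tokenization
            index[term].add(doc_id)
    return index


name_index = build_inverted_index(names)
print(sorted(name_index["yagao"]))
print(sorted(name_index["zhonghua"]))
```

Looking up a term is then a dictionary access rather than a scan over every document, which is what makes searching by term fast.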

2) Query a product: retrieve a document

Syntax: GET /index/type/id
Example:

GET /ecommerce/product/1

3) Modify a product: update a document

Syntax: POST /index/type/id/_update {"doc": {"fieldname": "value"}}
Example:

POST /ecommerce/product/1/_update
{
  "doc": {
    "name": "jiaqiangban gaolujie yagao"
  }
}

4) Delete a product: delete a document

Syntax: DELETE /index/type/id?pretty
Example:

DELETE /ecommerce/product/1?pretty

V. Search methods

1) query string search
Examples:
Search all products: GET /ecommerce/product/_search
Search for products whose name contains yagao, sorted by price in descending order:

GET /ecommerce/product/_search?q=name:yagao&sort=price:desc
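
Because a query string search carries everything in the URL, a client has to assemble the `q` and `sort` parameters itself. The following Python sketch shows one way to do that with the standard library; `query_string_url` is a hypothetical helper written for this article.

```python
from urllib.parse import urlencode


def query_string_url(index, doc_type, field, term, sort_field, order="desc"):
    """Build the path and query string for a simple query string search."""
    params = {
        "q": f"{field}:{term}",
        "sort": f"{sort_field}:{order}",
    }
    # safe=":" keeps the colons readable instead of percent-encoding them.
    return f"/{index}/{doc_type}/_search?" + urlencode(params, safe=":")


url = query_string_url("ecommerce", "product", "name", "yagao", "price")
print(url)
```

This style is convenient for quick ad hoc lookups; once a search needs several conditions, the URL becomes hard to read, which is where the request-body form in the next subsection helps.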

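2) query DSL (domain-specific language)
Examples:
Query all products:

GET /ecommerce/product/_search
{
  "query": {
    "match_all": {}
  }
}

Query products whose name contains yagao, sorted by price in descending order:

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    }
  ]
}

Or, using the shorter sort syntax:

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "name": "yagao"
    }
  },
  "sort": [
    { "price": "desc" }
  ]
}

Paginated query: there are 3 products in total; assuming each page shows 1 product, display the first page and return only each product's name and price.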
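
The `from` and `size` values used for pagination follow from the page number and page size. The Python sketch below shows that arithmetic under the assumption of 1-based page numbers; `search_body` is a hypothetical helper, and the dictionary it returns mirrors the request body that follows.

```python
def search_body(page, page_size, source_fields):
    """Build a match_all request body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be at least 1")
    return {
        "query": {"match_all": {}},
        "_source": source_fields,          # only return these fields
        "from": (page - 1) * page_size,    # number of hits to skip
        "size": page_size,                 # number of hits to return
    }


first_page = search_body(1, 1, ["name", "price"])
third_page = search_body(3, 1, ["name", "price"])
print(first_page["from"], third_page["from"])
```

The Elasticsearch request for the first page follows.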

GET /ecommerce/product/_search
{
  "query": { "match_all": {} },
  "_source": ["name", "price"],
  "from": 0,
  "size": 1
}

3) query filter

Search for products whose name contains yagao and whose price is greater than or equal to 40.
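
Conceptually, this kind of search combines a `must` clause that selects documents by text with a `filter` clause that only keeps or drops them and does not affect relevance scoring. The Python sketch below imitates that logic over the four sample products that appear in this article's search results; `bool_search` is a hypothetical helper and uses whitespace tokenization as a simplification.

```python
products = [
    {"id": 1, "name": "jiaqiangban gaolujie yagao", "price": 30},
    {"id": 2, "name": "jiajieshi yagao", "price": 25},
    {"id": 3, "name": "zhonghua yagao", "price": 40},
    {"id": 5, "name": "special yagao", "price": 50},
]


def bool_search(docs, must, filters):
    """Keep documents that satisfy the must clause and every filter."""
    return [d["id"] for d in docs if must(d) and all(f(d) for f in filters)]


name_contains_yagao = lambda d: "yagao" in d["name"].lower().split()
price_gte_40 = lambda d: d["price"] >= 40   # corresponds to range: gte 40

print(bool_search(products, name_contains_yagao, [price_gte_40]))
```

The Elasticsearch request itself follows.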

GET /ecommerce/product/_search
{
  "query": {
    "bool": {
      "must": {
        "match": {
          "name": "yagao"
        }
      },
      "filter": {
        "range": {
          "price": {
            "gte": 40
          }
        }
      }
    }
  }
}
          
        
      
    
  

4) full-text search

Full-text search: the search string is split into terms, and each term is looked up in the inverted index. A document is returned as a result as long as it matches any one of the terms.

GET /ecommerce/product/_search
{
  "query": {
    "match": {
      "producer": "yagao producer"
    }
  }
}

Result (it includes a product with ID 5, which was added to the index in addition to the three documents created earlier):

{
  "took": 10,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 0.51623213,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "5",
        "_score": 0.51623213,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "yagao producer",
          "tags": "meibai"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 0.25811607,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 0.25811607,
        "_source": {
          "name": "jiaqiangban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 0.25811607,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qinxin"
          ]
        }
      }
    ]
  }
}
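
The behavior seen in this result, where every product matches because each producer contains the term producer, can be imitated with a few lines of Python. This is a simplified sketch: it assumes whitespace tokenization and ranks by the number of matching terms, whereas ES uses a real relevance score. `match` is a hypothetical helper.

```python
# The producer field of the four products in the result above.
producers = {
    1: "gaolujie producer",
    2: "jiajieshi producer",
    3: "zhonghua producer",
    5: "yagao producer",
}


def match(field_values, query):
    """Return IDs of documents containing ANY query term, best match first."""
    query_terms = set(query.lower().split())
    matched = {}
    for doc_id, text in field_values.items():
        overlap = query_terms & set(text.lower().split())
        if overlap:
            matched[doc_id] = len(overlap)  # crude stand-in for a score
    return sorted(matched, key=lambda doc_id: -matched[doc_id])


hits = match(producers, "yagao producer")
print(hits)
```

Product 5 comes first because it matches both terms, which is consistent with its higher `_score` in the ES response.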
  

5) phrase search and highlighting

Phrase search: the search string must appear in the text of the specified field exactly as entered, with the same terms in the same order, for the document to count as a match.

GET /ecommerce/product/_search
{
  "query": {
    "match_phrase": {
      "producer": "yagao producer"
    }
  },
  "highlight": {
    "fields": {
      "producer": {}
    }
  }
}

Result:

{
  "took": 45,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.51623213,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "5",
        "_score": 0.51623213,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "yagao producer",
          "tags": "meibai"
        },
        "highlight": {
          "producer": [
            "<em>yagao</em> <em>producer</em>"
          ]
        }
      }
    ]
  }
}
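
The difference from full-text search is that the terms must appear consecutively and in order. The Python sketch below imitates phrase matching and the `<em>` highlighting seen in this result, again assuming whitespace tokenization; `match_phrase` here is a hypothetical Python helper, not the ES query itself.

```python
producers = {
    1: "gaolujie producer",
    2: "jiajieshi producer",
    3: "zhonghua producer",
    5: "yagao producer",
}


def match_phrase(field_values, phrase):
    """Return {doc_id: highlighted text} for documents containing the phrase."""
    wanted = phrase.lower().split()
    hits = {}
    for doc_id, text in field_values.items():
        tokens = text.lower().split()
        # Slide a window over the tokens looking for the exact term sequence.
        for start in range(len(tokens) - len(wanted) + 1):
            end = start + len(wanted)
            if tokens[start:end] == wanted:
                marked = [f"<em>{t}</em>" for t in wanted]
                hits[doc_id] = " ".join(tokens[:start] + marked + tokens[end:])
                break
    return hits


print(match_phrase(producers, "yagao producer"))
```

Only product 5 qualifies, since the other producers contain producer but not the full sequence.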
  

Aggregation analysis

1) Count the number of products under each tag

Aggregating on a text field requires fielddata to be enabled on that field first:

PUT /ecommerce/_mapping/product
{
  "properties": {
    "tags": {
      "type": "text",
      "fielddata": true
    }
  }
}

GET /ecommerce/product/_search
{
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "size": 10
      }
    }
  }
}

Result:

{
  "took": 69,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "yagao producer",
          "tags": "meibai"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "jiaqiangban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qinxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2
        },
        {
          "key": "meibai",
          "doc_count": 2
        },
        {
          "key": "qinxin",
          "doc_count": 1
        }
      ]
    }
  }
}

2) Search for products whose name contains gaolujie, then group them by tag

GET /ecommerce/product/_search
{
  "size": 0,
  "query": {
    "match": {
      "name": "gaolujie"
    }
  },
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}

Result:

{
  "took": 7,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 1
        },
        {
          "key": "meibai",
          "doc_count": 1
        }
      ]
    }
  }
}

3) Compute the average price of the products under each tag

GET /ecommerce/product/_search
{
  "size": 1,
  "aggs": {
    "group_by_tags": {
      "terms": {
        "field": "tags",
        "size": 10
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}

Result:

{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "yagao producer",
          "tags": "meibai"
        }
      }
    ]
  },
  "aggregations": {
    "group_by_tags": {
      "doc_count_error_upper_bound": 0,
      "sum_other_doc_count": 0,
      "buckets": [
        {
          "key": "fangzhu",
          "doc_count": 2,
          "avg_price": {
            "value": 27.5
          }
        },
        {
          "key": "meibai",
          "doc_count": 2,
          "avg_price": {
            "value": 40
          }
        },
        {
          "key": "qinxin",
          "doc_count": 1,
          "avg_price": {
            "value": 40
          }
        }
      ]
    }
  }
}
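
The bucket values in this result can be checked by hand. The Python sketch below groups the four sample products by tag and averages their prices, which is the same group-then-average logic as a terms aggregation with a nested avg. `avg_price_by_tag` and `as_list` are hypothetical helpers; note that product 5 stores its tag as a plain string rather than a list.

```python
from collections import defaultdict

products = [
    {"id": 1, "price": 30, "tags": ["meibai", "fangzhu"]},
    {"id": 2, "price": 25, "tags": ["fangzhu"]},
    {"id": 3, "price": 40, "tags": ["qinxin"]},
    {"id": 5, "price": 50, "tags": "meibai"},
]


def as_list(value):
    """Treat a single value the same as a one-element list."""
    return value if isinstance(value, list) else [value]


def avg_price_by_tag(docs):
    """Group prices by tag, then compute doc_count and average per tag."""
    prices = defaultdict(list)
    for doc in docs:
        for tag in as_list(doc["tags"]):
            prices[tag].append(doc["price"])
    return {
        tag: {"doc_count": len(p), "avg_price": sum(p) / len(p)}
        for tag, p in prices.items()
    }


for tag, bucket in sorted(avg_price_by_tag(products).items()):
    print(tag, bucket)
```

A product with two tags is counted in both buckets, which is why the bucket counts add up to 5 even though there are only 4 products.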
    
  

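4) Compute the average price of the products under each tag, sorted by average price in descending order

GET /ecommerce/product/_search
{
  "size": 0,
  "aggs": {
    "all_tags": {
      "terms": {
        "field": "tags",
        "size": 10,
        "order": {
          "avg_price": "desc"
        }
      },
      "aggs": {
        "avg_price": {
          "avg": {
            "field": "price"
          }
        }
      }
    }
  }
}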
          
        
      
    
  

5) Group products by the specified price ranges, then group by tag within each range, and finally compute the average price of each group

GET /ecommerce/product/_search
{
  "size": 20,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          {
            "from": 0,
            "to": 20
          },
          {
            "from": 20,
            "to": 40
          },
          {
            "from": 40,
            "to": 50
          }
        ]
      },
      "aggs": {
        "group_by_tags": {
          "terms": {
            "field": "tags"
          },
          "aggs": {
            "average_price": {
              "avg": {
                "field": "price"
              }
            }
          }
        }
      }
    }
  }
}

Result:

{
  "took": 5,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 4,
    "max_score": 1,
    "hits": [
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "5",
        "_score": 1,
        "_source": {
          "name": "special yagao",
          "desc": "special meibai",
          "price": 50,
          "producer": "yagao producer",
          "tags": "meibai"
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "2",
        "_score": 1,
        "_source": {
          "name": "jiajieshi yagao",
          "desc": "youxiao fangzhu",
          "price": 25,
          "producer": "jiajieshi producer",
          "tags": [
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "1",
        "_score": 1,
        "_source": {
          "name": "jiaqiangban gaolujie yagao",
          "desc": "gaoxiao meibai",
          "price": 30,
          "producer": "gaolujie producer",
          "tags": [
            "meibai",
            "fangzhu"
          ]
        }
      },
      {
        "_index": "ecommerce",
        "_type": "product",
        "_id": "3",
        "_score": 1,
        "_source": {
          "name": "zhonghua yagao",
          "desc": "caoben zhiwu",
          "price": 40,
          "producer": "zhonghua producer",
          "tags": [
            "qinxin"
          ]
        }
      }
    ]
  },
  "aggregations": {
    "group_by_price": {
      "buckets": [
        {
          "key": "0.0-20.0",
          "from": 0,
          "to": 20,
          "doc_count": 0,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": []
          }
        },
        {
          "key": "20.0-40.0",
          "from": 20,
          "to": 40,
          "doc_count": 2,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "fangzhu",
                "doc_count": 2,
                "average_price": {
                  "value": 27.5
                }
              },
              {
                "key": "meibai",
                "doc_count": 1,
                "average_price": {
                  "value": 30
                }
              }
            ]
          }
        },
        {
          "key": "40.0-50.0",
          "from": 40,
          "to": 50,
          "doc_count": 1,
          "group_by_tags": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
              {
                "key": "qinxin",
                "doc_count": 1,
                "average_price": {
                  "value": 40
                }
              }
            ]
          }
        }
      ]
    }
  }
}
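
One detail in this result is easy to miss: each range includes its `from` value and excludes its `to` value, so the product priced at 40 lands in the 40-50 bucket while the product priced at 50 falls outside every range. The Python sketch below reproduces the bucketing under that half-open rule; `range_buckets` and `as_list` are hypothetical helpers.

```python
from collections import defaultdict

products = [
    {"id": 1, "price": 30, "tags": ["meibai", "fangzhu"]},
    {"id": 2, "price": 25, "tags": ["fangzhu"]},
    {"id": 3, "price": 40, "tags": ["qinxin"]},
    {"id": 5, "price": 50, "tags": "meibai"},
]


def as_list(value):
    return value if isinstance(value, list) else [value]


def range_buckets(docs, ranges):
    """Bucket by half-open price range [low, high), then by tag, then average."""
    result = {}
    for low, high in ranges:
        members = [d for d in docs if low <= d["price"] < high]
        by_tag = defaultdict(list)
        for doc in members:
            for tag in as_list(doc["tags"]):
                by_tag[tag].append(doc["price"])
        result[f"{low:.1f}-{high:.1f}"] = {
            "doc_count": len(members),
            "group_by_tags": {
                tag: {"doc_count": len(p), "average_price": sum(p) / len(p)}
                for tag, p in by_tag.items()
            },
        }
    return result


buckets = range_buckets(products, [(0, 20), (20, 40), (40, 50)])
for key, bucket in buckets.items():
    print(key, bucket)
```

To include the product priced at 50, the last range would need a higher upper bound or no upper bound at all.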
    
  

Summary

References
