Retrieving Data

Posted 2021-03-14 qg000

Sample data

Each document in the test data has the following format:

{
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "gender": "F",
    "address": "244 Columbus Place",
    "employer": "Euron",
    "email": "bradshawmckenzie@euron.com",
    "city": "Hobucken",
    "state": "CO"
}

Then, from the directory that contains accounts.json, run the following commands to import the data and list the indices:

curl -H "Content-Type: application/json" -XPOST "localhost:9200/bank/_bulk?pretty&refresh" --data-binary "@accounts.json"
curl "http://localhost:9200/_cat/indices?v"
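The _bulk endpoint expects a newline-delimited body in which each document is preceded by an action line. As a minimal sketch of that layout, the helper below (to_bulk_ndjson is a hypothetical name, and using account_number as the _id is an assumption) renders a list of documents into such a body. The second sample document is made up for illustration.

```python
import json

def to_bulk_ndjson(docs, id_field="account_number"):
    """Render documents as the newline-delimited body expected by _bulk."""
    lines = []
    for doc in docs:
        # Action line: index this document, using one of its fields as the _id (assumption).
        lines.append(json.dumps({"index": {"_id": str(doc[id_field])}}))
        # Source line: the document itself, serialized on a single line.
        lines.append(json.dumps(doc))
    # The bulk body must be terminated by a newline.
    return "\n".join(lines) + "\n"

sample = [
    {"account_number": 0, "balance": 16623, "firstname": "Bradshaw", "state": "CO"},
    {"account_number": 1, "balance": 100, "firstname": "Example", "state": "IL"},  # illustrative only
]
body = to_bulk_ndjson(sample)
print(body, end="")
```

Written to a file, a body like this could be sent with the same curl --data-binary command shown above.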

Search API

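There are two basic ways to run a search: send the search parameters in the REST request URI, or send them in the REST request body. This is analogous to passing parameters in the query string of a GET request versus in the body of a POST request.

curl -X GET "localhost:9200/bank/_search?q=*&sort=account_number:asc&pretty"

Here we search the "bank" index. The q=* parameter matches all documents; sort=account_number:asc sorts the results by each document's account_number field in ascending order; and pretty asks for pretty-printed JSON in the response.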

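The response consists of the following parts:

took: the time Elasticsearch took to execute the search, in milliseconds
timed_out: whether the search timed out
_shards: how many shards were searched, and how many of them succeeded or failed
hits: the search results
hits.total: the total number of documents matching the search criteria
hits.hits: the actual array of search results (the first 10 documents by default)
hits.sort: the sort key of each result (absent when sorting by score)
hits._score and max_score: ignore these fields for now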
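To make the response fields concrete, here is a small sketch that pulls them out of a parsed response. The summarize function is a hypothetical helper, and the example response is hand-written and trimmed for illustration, not real cluster output. It also allows for the fact that Elasticsearch 7 and later report hits.total as an object with a value field, while older versions use a bare integer.

```python
def summarize(response):
    """Extract the headline fields from a parsed _search response."""
    hits = response["hits"]
    total = hits["total"]
    # Elasticsearch 7+ returns {"value": N, "relation": ...}; older versions return an integer.
    if isinstance(total, dict):
        total = total["value"]
    results = hits["hits"]
    return {
        "took_ms": response["took"],
        "timed_out": response["timed_out"],
        "failed_shards": response["_shards"]["failed"],
        "total": total,
        "returned": len(results),
        # The sort key is reported on each hit; it is absent when sorting by score.
        "first_sort_key": results[0].get("sort") if results else None,
    }

# Hand-written example response (illustrative values only).
example = {
    "took": 63,
    "timed_out": False,
    "_shards": {"total": 5, "successful": 5, "skipped": 0, "failed": 0},
    "hits": {
        "total": {"value": 1000, "relation": "eq"},
        "max_score": None,
        "hits": [
            {"_index": "bank", "_id": "0", "_score": None,
             "_source": {"account_number": 0, "balance": 16623},
             "sort": [0]},
        ],
    },
}
print(summarize(example))
```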

Here is the same search as above, this time using the request body:

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": [
    { "account_number": "asc" }
  ]
}
'

The difference is that instead of passing q=* in the URI, we send a JSON-style query request body to the _search API.
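The two request styles can also be compared side by side outside of curl. The following sketch builds both forms with the Python standard library. The helper names and the BASE address are assumptions (a local single-node cluster, as in the curl examples), and the body form uses POST, which _search accepts as well as GET. Nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:9200"  # assumption: local cluster, as in the curl examples

def uri_search_url(index, **params):
    """Request-URI form: search parameters travel in the query string."""
    return f"{BASE}/{index}/_search?{urlencode(params)}"

def body_search_request(index, body):
    """Request-body form: search parameters travel as a JSON document."""
    return Request(
        f"{BASE}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

url = uri_search_url("bank", q="*", sort="account_number:asc")
req = body_search_request("bank", {
    "query": {"match_all": {}},
    "sort": [{"account_number": "asc"}],
})
print(url)
print(req.get_method(), req.full_url)
# Sending the request would be urllib.request.urlopen(req), which needs a running cluster.
```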

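An important point: once the search results are returned, Elasticsearch is completely done with the request. It does not maintain any server-side resources or open cursors for the results. This is in sharp contrast to many other platforms, such as SQL databases, where a result set may be held open behind a server-side cursor.

Query language

Elasticsearch provides a JSON-style language for running queries, called the Query DSL.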

curl -X GET "localhost:9200/bank/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} }
}
'

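The query part tells us what the query definition is, and the match_all part is simply the type of query we want to run. Here, match_all searches for all documents in the specified index.

Besides the query parameter, we can pass other parameters to influence the search results. The earlier example passed sort; here we pass size: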

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "size": 1
}
'

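Note: if size is not specified, it defaults to 10.

The following example runs match_all and returns documents 10 through 19 (counting from 0, that is, the 11th through 20th hits):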

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "from": 10,
  "size": 10
}
'

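The from parameter (zero-based) specifies the offset of the first document to return, and the size parameter specifies how many documents to return starting from that offset. This is useful for implementing paginated search.

Note: if from is not specified, it defaults to 0.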
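Since from is an offset rather than a page number, paging code usually has to convert between the two. A minimal sketch, assuming zero-based page numbers and a hypothetical helper named page_body:

```python
def page_body(page, page_size=10, query=None):
    """Build a search body for a zero-based page number."""
    if page < 0 or page_size < 1:
        raise ValueError("page must be >= 0 and page_size >= 1")
    return {
        "query": query or {"match_all": {}},
        "from": page * page_size,  # zero-based offset of the first hit to return
        "size": page_size,         # number of hits to return from that offset
    }

print(page_body(0))  # first page
print(page_body(1))  # second page, the same body as the from/size example above
```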

This example runs match_all, sorts the results by account balance in descending order, and returns the top 10 documents (the default size).

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "sort": { "balance": { "order": "desc" } }
}
'
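The examples so far write sort in two notations: a list such as [{ "account_number": "asc" }] and an object such as { "balance": { "order": "desc" } }. The sketch below normalizes both into the same (field, order) pairs and applies a single-key sort to an in-memory list, as a rough local analogy for what the search does. The function names and the account balances are illustrative assumptions, and only the dict-based notations shown in this article are handled.

```python
def normalize_sort(sort):
    """Return [(field, order), ...] for the dict-based sort notations."""
    items = sort if isinstance(sort, list) else [sort]
    pairs = []
    for item in items:
        for field, spec in item.items():
            # spec is either "asc"/"desc" or an object such as {"order": "desc"}
            order = spec if isinstance(spec, str) else spec.get("order", "asc")
            pairs.append((field, order))
    return pairs

def sort_locally(docs, sort, size=10):
    """Apply a single-key sort in memory and keep the first `size` documents."""
    field, order = normalize_sort(sort)[0]
    return sorted(docs, key=lambda d: d[field], reverse=(order == "desc"))[:size]

# Illustrative documents, not taken from accounts.json.
accounts = [
    {"account_number": 0, "balance": 16623},
    {"account_number": 1, "balance": 39225},
    {"account_number": 2, "balance": 28838},
    {"account_number": 3, "balance": 44947},
]
top = sort_locally(accounts, {"balance": {"order": "desc"}}, size=2)
print([doc["account_number"] for doc in top])
```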

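Searching

By default, the full JSON document (that is, all of its fields) is returned as part of every search. This is called the source (the _source field of each hit).

If we do not want the entire source document, we can request only a few of its fields.

Return two fields from each document, account_number and balance: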

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match_all": {} },
  "_source": ["account_number", "balance"]
}
'

Roughly equivalent to SELECT account_number, balance FROM bank
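The effect of source filtering on a single hit can be pictured as a projection over the document. This is only a local sketch with a hypothetical helper name; it handles plain top-level field names, not wildcard patterns or nested paths.

```python
def filter_source(source, fields):
    """Keep only the requested top-level fields of a source document."""
    return {name: source[name] for name in fields if name in source}

doc = {
    "account_number": 0,
    "balance": 16623,
    "firstname": "Bradshaw",
    "lastname": "Mckenzie",
    "age": 29,
    "state": "CO",
}
print(filter_source(doc, ["account_number", "balance"]))
```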

Return the account whose account_number is 20:

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "account_number": 20 } }
}
'

Roughly equivalent to SELECT * FROM bank WHERE account_number = 20

Return all accounts whose address contains the term "mill" (case-insensitive):

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill" } }
}
'

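Roughly equivalent to SELECT * FROM bank WHERE address LIKE '%mill%'. The comparison is loose: with the default analyzer, match works on whole terms, so it does not match substrings such as "miller".

Return all accounts whose address contains "mill" or "lane":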

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": { "match": { "address": "mill lane" } }
}
'

Roughly equivalent to SELECT * FROM bank WHERE address LIKE '%mill%' OR address LIKE '%lane%'
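The SQL LIKE comparison is only an approximation, because match compares analyzed terms rather than substrings. The sketch below imitates that behavior locally under a strong simplification: the analyze function just lowercases and splits on word characters, which is a rough stand-in for the default analyzer and not its actual implementation. The addresses are made up for illustration.

```python
import re

def analyze(text):
    """Very rough stand-in for the default analyzer: lowercase word tokens."""
    return re.findall(r"\w+", str(text).lower())

def match(doc, field, query_text, operator="or"):
    """Return True if the field contains any (or, with "and", all) of the query terms."""
    doc_terms = set(analyze(doc.get(field, "")))
    query_terms = analyze(query_text)
    if not query_terms:
        return False
    found = [term in doc_terms for term in query_terms]
    return all(found) if operator == "and" else any(found)

# Illustrative addresses, not taken from accounts.json.
docs = [
    {"address": "990 Mill Road"},
    {"address": "198 Mill Lane"},
    {"address": "12 Miller Court"},
    {"address": "671 Bristol Street"},
]
print([d["address"] for d in docs if match(d, "address", "mill lane")])
```

The default operator for match is or, which is why "mill lane" matches addresses containing either term. Passing operator="and" in this sketch mirrors the operator option of the match query and requires both terms.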

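Now let's introduce the bool query, which lets us compose smaller queries into bigger ones using Boolean logic.

The following example combines two match queries and returns all accounts whose address contains both "mill" and "lane":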

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

Roughly equivalent to SELECT * FROM bank WHERE address LIKE '%mill%' AND address LIKE '%lane%'

The example above is a bool must query; the next one is a bool should query:

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "should": [
        { "match": { "address": "mill" } },
        { "match": { "address": "lane" } }
      ]
    }
  }
}
'

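must corresponds to AND, should to OR, and must_not to NOT: the logical operators and/or/not are written here as must/should/must_not.

We can combine must, should, and must_not clauses in a single bool query. We can also nest bool queries inside any of these clauses to model arbitrarily complex, multi-level Boolean logic.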

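The following example combines must and must_not to return the accounts of everyone who is 40 years old and does not live in ID: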

curl -X GET "localhost:9200/bank/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "age": "40" } }
      ],
      "must_not": [
        { "match": { "state": "ID" } }
      ]
    }
  }
}
'

Roughly equivalent to SELECT * FROM bank WHERE age = 40 AND state <> 'ID'
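To see how must, should, and must_not combine, including nested bool queries, here is a small local evaluator. It is a sketch under stated assumptions, not how Elasticsearch executes queries: it only decides whether a document matches (no scoring), it supports only match leaves with a simplified lowercase word tokenizer, and it ignores filter clauses. It does follow the rule that should clauses are required only when the bool query has no must clause. All function names and sample documents are hypothetical.

```python
import re

def terms(text):
    """Simplified tokenizer: the set of lowercase word tokens."""
    return set(re.findall(r"\w+", str(text).lower()))

def leaf_match(doc, clause):
    """Evaluate a {"match": {field: text}} clause with OR semantics over terms."""
    field, text = next(iter(clause["match"].items()))
    return bool(terms(doc.get(field, "")) & terms(text))

def evaluate(doc, query):
    """Return True if the document satisfies a match or (possibly nested) bool query."""
    if "match" in query:
        return leaf_match(doc, query)
    clauses = query["bool"]
    must = clauses.get("must", [])
    should = clauses.get("should", [])
    must_not = clauses.get("must_not", [])
    if not all(evaluate(doc, q) for q in must):
        return False
    if any(evaluate(doc, q) for q in must_not):
        return False
    # Without a must clause, at least one should clause has to match;
    # otherwise should clauses only affect scoring, which this sketch ignores.
    if should and not must:
        return any(evaluate(doc, q) for q in should)
    return True

combined = {
    "bool": {
        "must": [{"match": {"age": "40"}}],
        "must_not": [{"match": {"state": "ID"}}],
    }
}
people = [
    {"age": 40, "state": "ID"},
    {"age": 40, "state": "CO"},
    {"age": 29, "state": "CO"},
]
print([evaluate(p, combined) for p in people])
```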
