Advanced Usage of the Distributed Search Engine Elasticsearch
Posted by mirson
I. Filtered Queries (Pagination, Fuzzy Matching, filter)
1. Searching for documents that match a condition:
Create the test data:
PUT account/_doc/1
{ "account": 10001, "balance": 10000, "name": "test1"}
PUT account/_doc/2
{ "account": 10002, "balance": 20000, "name": "test2"}
PUT account/_doc/3
{ "account": 10003, "balance": 30000, "name": "张三"}
PUT account/_doc/4
{ "account": 10004, "balance": 30000, "name": "王五"}
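The four PUT requests above could also be sent in a single round trip through the _bulk API (POST account/_bulk), which expects newline-delimited JSON: an action line followed by the document source, with a trailing newline at the end. The sketch below only builds that request body and does not send it; the helper name build_bulk_body is hypothetical.

```python
import json


def build_bulk_body(docs):
    """Return an NDJSON body for the _bulk API from {doc_id: source} pairs."""
    lines = []
    for doc_id, source in docs.items():
        # Action line: index the next line's document under this _id.
        lines.append(json.dumps({"index": {"_id": str(doc_id)}}))
        # Source line: the document itself (keep Chinese characters readable).
        lines.append(json.dumps(source, ensure_ascii=False))
    # The _bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


accounts = {
    1: {"account": 10001, "balance": 10000, "name": "test1"},
    2: {"account": 10002, "balance": 20000, "name": "test2"},
    3: {"account": 10003, "balance": 30000, "name": "张三"},
    4: {"account": 10004, "balance": 30000, "name": "王五"},
}

body = build_bulk_body(accounts)
print(body)
```

The printed text can be pasted under a POST account/_bulk line in the console, or sent by an HTTP client with the Content-Type application/x-ndjson.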
Look up a document by account number (the "account" field):
GET /account/_search
{
  "query": {
    "match": {
      "account": "10002"
    }
  }
}
Response:
{
  ...
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "account" : 10002,
          "balance" : 20000,
          "name" : "test2"
        }
      }
    ]
  }
  ...
}
The query matches, and the requested document is returned.
2. Pagination with "from" (offset of the first hit) and "size" (page size):
GET /account/_search
{
  "query": {
    "match_all": {}
  },
  "from": 0,
  "size": 2
}
This returns 2 documents:
"hits" : {
  "total" : {
    "value" : 3,
    "relation" : "eq"
  },
  "max_score" : 1.0,
  "hits" : [
    {
      "_index" : "account",
      "_type" : "_doc",
      "_id" : "1",
      "_score" : 1.0,
      "_source" : {
        "account" : 10001,
        "balance" : 10000,
        "name" : "test1"
      }
    },
    {
      "_index" : "account",
      "_type" : "_doc",
      "_id" : "2",
      "_score" : 1.0,
      "_source" : {
        "account" : 10002,
        "balance" : 20000,
        "name" : "test2"
      }
    }
  ]
}
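Applications usually think in page numbers rather than offsets. As a minimal sketch, the helper below (page_request is a hypothetical name) converts a 1-based page number into the "from" and "size" values used above. It also guards against the default index.max_result_window of 10000, beyond which Elasticsearch rejects from + size requests.

```python
def page_request(page, page_size, max_window=10_000):
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    offset = (page - 1) * page_size
    # index.max_result_window defaults to 10000; deeper pages need
    # another mechanism such as search_after.
    if offset + page_size > max_window:
        raise ValueError("from + size exceeds the result window")
    return {"query": {"match_all": {}}, "from": offset, "size": page_size}


first = page_request(1, 2)   # same body as the request above
second = page_request(2, 2)  # next page: "from" becomes 2
print(first)
print(second)
```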
3. Fuzzy matching:
Numeric fields are a poor fit for fuzzy matching, so this test uses a string field and a full-text match query:
GET /account/_search
{
  "query": {
    "match": {
      "name": "三四"
    }
  }
}
Response:
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 0.2876821,
    "hits" : [
      {
        "_index" : "account",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 0.2876821,
        "_source" : {
          "accountNo" : 10009,
          "balance" : 1000000,
          "name" : "张三"
        }
      }
    ]
  }
}
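Note that the default analyzer splits Chinese text into single characters, so the keyword "三四" is broken into "三" and "四", and a document matches if its name contains either of them. Here "张三" matches on the character "三".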
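The matching behaviour described above can be imitated in a few lines. This is only a rough sketch: standard_like_tokens is a hypothetical stand-in that emits one token per Chinese character and one token per run of ASCII letters and digits, which approximates but does not reproduce the real analyzer. The match_any function imitates a match query with its default OR operator.

```python
import re


def standard_like_tokens(text):
    """Approximate the default analyzer: single CJK characters, whole ASCII words."""
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", text.lower())


def match_any(query, field_value):
    """Imitate a match query (OR operator): one shared token is enough."""
    return bool(set(standard_like_tokens(query)) & set(standard_like_tokens(field_value)))


names = ["test1", "test2", "张三", "王五"]
hits = [name for name in names if match_any("三四", name)]

print(standard_like_tokens("三四"))  # ['三', '四']
print(hits)                          # ['张三']
```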
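4. Filtering with filter:
GET /account/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "name": "张三"
          }
        }
      ]
    }
  }
}
A term query is an exact lookup: the value must equal an indexed term as a whole. Inside filter, no relevance score is computed.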
Response:
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  }
}
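Nothing matches. The term query looks up the whole word "张三" as a single term, but at index time Elasticsearch's default analysis split the name into the single-character terms "张" and "三", so the index contains no term "张三".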
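The empty result is easier to see with a toy inverted index. The sketch below is a simplified model, not Elasticsearch internals: analyze is a hypothetical approximation of the default analyzer, and the un-analyzed index stands in for a keyword field. That is an assumption about the mapping: with default dynamic mapping a string field normally gets a "name.keyword" sub-field, which this index may or may not have.

```python
import re
from collections import defaultdict


def analyze(text):
    # Rough stand-in for the default analyzer: single CJK characters, whole ASCII words.
    return re.findall(r"[\u4e00-\u9fff]|[a-z0-9]+", text.lower())


def build_index(docs, analyzed):
    """Map each indexed term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, name in docs.items():
        # A keyword-style field stores the whole value as one term.
        terms = analyze(name) if analyzed else [name]
        for term in terms:
            index[term].add(doc_id)
    return index


def term_query(index, value):
    # A term query does not analyze its input: it is a direct term lookup.
    return sorted(index.get(value, set()))


names = {"1": "test1", "2": "test2", "3": "张三", "4": "王五"}
text_index = build_index(names, analyzed=True)      # like the "name" text field
keyword_index = build_index(names, analyzed=False)  # like a keyword field

print(term_query(text_index, "张三"))     # []
print(term_query(text_index, "三"))       # ['3']
print(term_query(keyword_index, "张三"))  # ['3']
```

If the sub-field exists, a term query on "name.keyword" with the value "张三" would be the exact-match equivalent of the last lookup.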
II. bool Queries (should, must, must_not)
should: a document matches if at least one of the clauses is true.
GET /movies/_search
{
  "query": {
    "bool": {
      "should": [
        {"match": {"title": "good hearts sea"}},
        {"match": {"overview": "good hearts sea"}}
      ]
    }
  }
}
must: every clause has to be true.
GET /movies/_search
{
  "query": {
    "bool": {
      "must": [
        {"match": {"title": "good hearts sea"}},
        {"match": {"overview": "good hearts sea"}}
      ]
    }
  }
}
must_not: none of the clauses may be true.
GET /movies/_search
{
  "query": {
    "bool": {
      "must_not": [
        {"match": {"title": "good hearts sea"}},
        {"match": {"overview": "good hearts sea"}}
      ]
    }
  }
}
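To compare the three clause types side by side, the following sketch evaluates them over a small in-memory list. The sample movies, the whitespace tokenizer, and the helper names match and bool_query are all hypothetical. Scoring is ignored: only the set of matching documents is modelled.

```python
def tokens(text):
    return set(text.lower().split())


def match(field, query):
    """Return a predicate that is true when the field shares a word with the query."""
    return lambda doc: bool(tokens(doc.get(field, "")) & tokens(query))


def bool_query(docs, must=(), should=(), must_not=()):
    hits = []
    for doc in docs:
        if not all(clause(doc) for clause in must):
            continue  # every must clause has to match
        if any(clause(doc) for clause in must_not):
            continue  # no must_not clause may match
        # Without must clauses, at least one should clause has to match.
        if should and not must and not any(clause(doc) for clause in should):
            continue
        hits.append(doc["id"])
    return hits


movies = [  # hypothetical sample documents
    {"id": 1, "title": "Good Will", "overview": "a calm sea"},
    {"id": 2, "title": "Good Night", "overview": "city lights"},
    {"id": 3, "title": "Quiet Town", "overview": "nothing happens"},
]
clauses = [match("title", "good hearts sea"), match("overview", "good hearts sea")]

print(bool_query(movies, should=clauses))    # [1, 2]
print(bool_query(movies, must=clauses))      # [1]
print(bool_query(movies, must_not=clauses))  # [3]
```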
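III. Aggregations (aggs)
Group the accounts by their balance and count the documents in each group:
GET /account/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "account": {
              "gte": 10001
            }
          }
        }
      ]
    }
  },
  "sort": [
    {
      "balance": {
        "order": "desc"
      }
    }
  ],
  "aggs": {
    "group_by_balance": {
      "terms": {
        "field": "balance"
      }
    }
  }
}
The request selects the accounts whose number is greater than or equal to 10001, sorts the hits by balance in descending order, and uses a terms aggregation under aggs to count the documents for each balance value:
"aggregations" : {
  "group_by_balance" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : 30000,
        "doc_count" : 2
      },
      {
        "key" : 10000,
        "doc_count" : 1
      },
      {
        "key" : 20000,
        "doc_count" : 1
      }
    ]
  }
}
The response ends with the summary for each group. The "sort" clause only orders the hits. By default the buckets are ordered by doc_count, largest first.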
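As a cross-check of the bucket counts, the same grouping can be reproduced over the four sample documents in plain Python. This is a sketch of the logic only; the function name terms_aggregation and its min_account parameter are hypothetical. The ordering rule (doc_count descending, ties broken by ascending key) mirrors the default order of a terms aggregation.

```python
from collections import Counter

accounts = [
    {"account": 10001, "balance": 10000, "name": "test1"},
    {"account": 10002, "balance": 20000, "name": "test2"},
    {"account": 10003, "balance": 30000, "name": "张三"},
    {"account": 10004, "balance": 30000, "name": "王五"},
]


def terms_aggregation(docs, field, min_account=10001):
    """Filter by account number, then count documents per distinct field value."""
    selected = [doc for doc in docs if doc["account"] >= min_account]
    counts = Counter(doc[field] for doc in selected)
    # Default bucket order: doc_count descending, then key ascending.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "doc_count": count} for key, count in ordered]


buckets = terms_aggregation(accounts, "balance")
print(buckets)
```

For the sample data this yields the same three buckets, in the same order, as the response shown above.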
This article was written and shared by mirson. For further discussion, join QQ group 19310171 or visit www.softart.cn