Elasticsearch: Index boost
Posted by 中国社区官方博客 (China Community Official Blog)
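When searching multiple indices, you can use the indices_boost parameter to boost results from one or more specified indices. This is useful when hits from some indices matter more than hits from others.
Note: you cannot use indices_boost with data streams.
Below, I walk through an example that shows how to use indices_boost to boost specific indices.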
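Before the walkthrough, here is a minimal Python sketch of the request body shape. indices_boost is an ordered array of single-key objects, so the sketch builds it from a list of pairs to keep the order explicit. The helper name build_boosted_search and the index names index_a and index_b are hypothetical and only serve as an illustration; the sketch does not talk to a cluster.

```python
import json


def build_boosted_search(boosts, query=None):
    """Build a search body with an ordered indices_boost array.

    boosts: list of (index_alias_or_pattern, boost) pairs, in priority order.
    query:  optional Query DSL dict; omitted from the body when None.
    """
    body = {"indices_boost": [{name: float(boost)} for name, boost in boosts]}
    if query is not None:
        body["query"] = query
    return body


# Hypothetical index names, used only for illustration.
body = build_boosted_search([("index_a", 10.0), ("index_b", 2.0)])
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted as the body of a GET .../_search request in the Kibana console.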
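Example
In today's example, we use a twitter index. Because this index contains location information, we must first define a mapping for the twitter index, so that location is indexed as the geo_point data type we need when the data is imported: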
PUT twitter
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}
The command above creates an index called twitter. Next, we use the bulk API to import our data:
POST _bulk
{ "index" : { "_index" : "twitter", "_id": 1} }
{"user":"双榆树-张三","message":"今儿天气不错啊,出去转转去","uid":2,"age":20,"city":"北京","province":"北京","country":"中国","address":"中国北京市海淀区","location":{"lat":"39.970718","lon":"116.325747"}}
{ "index" : { "_index" : "twitter", "_id": 2} }
{"user":"虹桥-老吴","message":"好友来了都今天我生日,好友来了,什么 birthday happy 就成!","uid":2,"age":90,"city":"上海","province":"上海","country":"中国","address":"中国上海市闵行区","location":{"lat":"31.175927","lon":"121.383328"}}
{ "index" : { "_index" : "twitter", "_id": 3} }
{"user":"东城区-李四","message":"happy birthday!","uid":4,"age":30,"city":"北京","province":"北京","country":"中国","address":"中国北京市东城区","location":{"lat":"39.893801","lon":"116.408986"}}
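If you would rather generate the bulk body from code than type it by hand, the following Python sketch shows the newline-delimited format used above: one action line, then one source line per document, with a trailing newline at the end. The helper name to_bulk_ndjson is hypothetical, and the documents are trimmed versions of the ones in this post.

```python
import json


def to_bulk_ndjson(index, docs):
    """Serialize (doc_id, source) pairs into a bulk API request body.

    Each document becomes two lines: an "index" action line and the
    document source. The body must end with a newline.
    """
    lines = []
    for doc_id, source in docs:
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # ensure_ascii=False keeps the Chinese text readable in the payload.
        lines.append(json.dumps(source, ensure_ascii=False))
    return "\n".join(lines) + "\n"


# Trimmed sample documents, for illustration only.
docs = [
    (1, {"user": "双榆树-张三", "city": "北京"}),
    (2, {"user": "虹桥-老吴", "city": "上海"}),
]
payload = to_bulk_ndjson("twitter", docs)
print(payload, end="")
```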
Above, I indexed three documents. For convenience, we use the reindex API to copy the twitter index into another index called twitter1.
PUT twitter1
{
  "mappings": {
    "properties": {
      "location": {
        "type": "geo_point"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "twitter"
  },
  "dest": {
    "index": "twitter1"
  }
}
As a result, twitter1 contains exactly the same three documents as twitter.
Next, we run the following search:
GET twitter*/_search
{
  "indices_boost": [
    {
      "twitter": 10.0
    },
    {
      "twitter": 2.0
    }
  ]
}
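In the request above, the twitter index is boosted by 10.0. Note that the second entry names twitter again rather than twitter1, so twitter1 receives no boost; to boost twitter1 by 2.0, the second entry should read "twitter1": 2.0. The search returns the following, with the twitter hits scoring 10.0 and the twitter1 hits keeping the default score of 1.0: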
"hits" : [
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 10.0,
    "_source" : {
      "user" : "双榆树-张三",
      "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2,
      "age" : 20,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : {
        "lat" : "39.970718",
        "lon" : "116.325747"
      }
    }
  },
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 10.0,
    "_source" : {
      "user" : "虹桥-老吴",
      "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2,
      "age" : 90,
      "city" : "上海",
      "province" : "上海",
      "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : {
        "lat" : "31.175927",
        "lon" : "121.383328"
      }
    }
  },
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "3",
    "_score" : 10.0,
    "_source" : {
      "user" : "东城区-李四",
      "message" : "happy birthday!",
      "uid" : 4,
      "age" : 30,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : {
        "lat" : "39.893801",
        "lon" : "116.408986"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 1.0,
    "_source" : {
      "user" : "双榆树-张三",
      "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2,
      "age" : 20,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : {
        "lat" : "39.970718",
        "lon" : "116.325747"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 1.0,
    "_source" : {
      "user" : "虹桥-老吴",
      "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2,
      "age" : 90,
      "city" : "上海",
      "province" : "上海",
      "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : {
        "lat" : "31.175927",
        "lon" : "121.383328"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "3",
    "_score" : 1.0,
    "_source" : {
      "user" : "东城区-李四",
      "message" : "happy birthday!",
      "uid" : 4,
      "age" : 30,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : {
        "lat" : "39.893801",
        "lon" : "116.408986"
      }
    }
  }
]
From the results above, we can see that all documents from twitter rank first, and the documents from twitter1 rank after them.
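To make the ranking easier to reason about, here is a toy Python model of the effect. It assumes that the boost simply multiplies each hit's base score, which is consistent with the scores shown in this post (a search without a query gives every hit a base score of 1.0) but is only a sketch, not how Elasticsearch is implemented internally. The function name apply_indices_boost is hypothetical.

```python
def apply_indices_boost(hits, boosts):
    """Scale each hit's score by its index boost and sort by score.

    hits:   list of (index, doc_id, base_score) tuples.
    boosts: dict mapping index name to boost; unlisted indices use 1.0.
    Assumption: the boost is a plain multiplier on the base score.
    """
    boosted = [
        (index, doc_id, score * boosts.get(index, 1.0))
        for index, doc_id, score in hits
    ]
    # sorted() is stable, so hits with equal scores keep their input order.
    return sorted(boosted, key=lambda hit: hit[2], reverse=True)


# Three documents per index, each with a base score of 1.0.
hits = [(index, doc_id, 1.0)
        for index in ("twitter1", "twitter")
        for doc_id in ("1", "2", "3")]

ranked = apply_indices_boost(hits, {"twitter": 10.0})
for index, doc_id, score in ranked:
    print(index, doc_id, score)

# With twitter1 boosted as well, its hits would score 2.0 in this model.
ranked_both = apply_indices_boost(hits, {"twitter": 10.0, "twitter1": 2.0})
```

In this model the twitter hits come first even though twitter1 was listed first in the input, because only the boosted score decides the order.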
You can also use aliases and index patterns. Let's create the following alias:
PUT twitter/_alias/city_shanghai
{
  "filter": {
    "term": {
      "city.keyword": "上海"
    }
  }
}
The request above defines an alias called city_shanghai. Next, we run the following search:
GET twitter*/_search
{
  "indices_boost": [
    {
      "city_shanghai": 10.0
    },
    {
      "twitter1": 2.0
    }
  ],
  "query": {
    "match": {
      "country": "中国"
    }
  }
}
The search returns:
"hits" : [
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 2.6706278,
    "_source" : {
      "user" : "双榆树-张三",
      "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2,
      "age" : 20,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : {
        "lat" : "39.970718",
        "lon" : "116.325747"
      }
    }
  },
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 2.6706278,
    "_source" : {
      "user" : "虹桥-老吴",
      "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2,
      "age" : 90,
      "city" : "上海",
      "province" : "上海",
      "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : {
        "lat" : "31.175927",
        "lon" : "121.383328"
      }
    }
  },
  {
    "_index" : "twitter",
    "_type" : "_doc",
    "_id" : "3",
    "_score" : 2.6706278,
    "_source" : {
      "user" : "东城区-李四",
      "message" : "happy birthday!",
      "uid" : 4,
      "age" : 30,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : {
        "lat" : "39.893801",
        "lon" : "116.408986"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "1",
    "_score" : 0.53412557,
    "_source" : {
      "user" : "双榆树-张三",
      "message" : "今儿天气不错啊,出去转转去",
      "uid" : 2,
      "age" : 20,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市海淀区",
      "location" : {
        "lat" : "39.970718",
        "lon" : "116.325747"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "2",
    "_score" : 0.53412557,
    "_source" : {
      "user" : "虹桥-老吴",
      "message" : "好友来了都今天我生日,好友来了,什么 birthday happy 就成!",
      "uid" : 2,
      "age" : 90,
      "city" : "上海",
      "province" : "上海",
      "country" : "中国",
      "address" : "中国上海市闵行区",
      "location" : {
        "lat" : "31.175927",
        "lon" : "121.383328"
      }
    }
  },
  {
    "_index" : "twitter1",
    "_type" : "_doc",
    "_id" : "3",
    "_score" : 0.53412557,
    "_source" : {
      "user" : "东城区-李四",
      "message" : "happy birthday!",
      "uid" : 4,
      "age" : 30,
      "city" : "北京",
      "province" : "北京",
      "country" : "中国",
      "address" : "中国北京市东城区",
      "location" : {
        "lat" : "39.893801",
        "lon" : "116.408986"
      }
    }
  }
]
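If multiple matches are found, the first match is used. For example, if an index is included in the city_shanghai alias and also matches a twitter* pattern listed later in indices_boost, the boost value of 10.0 is applied.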
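The first-match rule can be sketched in a few lines of Python. This is an illustrative model only: the function name resolve_boost and the aliases dictionary are hypothetical, and fnmatchcase stands in for index pattern matching (it also understands ? and [] wildcards, which this sketch does not rely on).

```python
from fnmatch import fnmatchcase


def resolve_boost(index, boosts, aliases):
    """Return the boost of the first indices_boost entry covering `index`.

    boosts:  ordered list of (name, boost); name may be an index name,
             an alias, or a wildcard pattern such as "twitter*".
    aliases: dict mapping alias name to the set of indices behind it.
    Indices that match no entry keep the neutral boost of 1.0.
    """
    for name, boost in boosts:
        if name in aliases:
            if index in aliases[name]:
                return boost
        elif fnmatchcase(index, name):
            return boost
    return 1.0


aliases = {"city_shanghai": {"twitter"}}
boosts = [("city_shanghai", 10.0), ("twitter*", 2.0)]

# twitter is covered by both entries; the alias comes first, so 10.0 wins.
print(resolve_boost("twitter", boosts, aliases))
# twitter1 is not behind the alias, so only the pattern applies.
print(resolve_boost("twitter1", boosts, aliases))
```

This also fits the scores in the last search, where every hit from twitter (2.6706278) scores about five times higher than the same document in twitter1 (0.53412557), matching the boosts of 10.0 and 2.0.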