Fetch only the matched values and their field names in ElasticSearch

In Elasticsearch, suppose I have documents like these:

{
  "name": "John",
  "department": "Biology",
  "address": "445 Mount Eden Road"
},
{
  "name": "Jane",
  "department": "Chemistry",
  "address": "32 Wilson Street"
},
{
  "name": "Laura",
  "department": "BioTechnology",
  "address": "21 Greens Road"
},
{
  "name": "Mark",
  "department": "Physics",
  "address": "Random UNESCO Bio-reserve"
}

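My use case: when I type "bio" into the search bar, Elasticsearch should return the matching field values together with the names of the fields they came from.

For this example,

Input: "bio"

Expected output: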

{
  "field": "department",
  "value": "Biology"
},
{
  "field": "department",
  "value": "BioTechnology"
},
{
  "field": "address",
  "value": "Random UNESCO Bio-reserve"
}

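What type of query should I use? I could use an NGram Tokenizer and then a match query. However, I am not sure how to get only the matched field values (not the whole document) and the corresponding field names as output.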

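Answer

After reading further about Completion Suggesters and Context Suggesters, I was able to solve this as follows:

1) Keep a separate "suggest" field of type "completion" on each record, with a context mapping of type "category". The mapping I created looks like this: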

{
  "properties": {
    "suggest": {
      "type": "completion",
      "contexts": [
        {
          "name": "field_type",
          "type": "category",
          "path": "cat"
        }
      ]
    },
    "name": {
      "type": "text"
    },
    "department": {
      "type": "text"
    },
    "address": {
      "type": "text"
    }
  }
}

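2) Then I insert the records as shown below, adding the search metadata to the "suggest" field with the appropriate "contexts".

For example, to insert the first record I run:

POST: localhost:9200/test_index/test_type/1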

{
    "suggest": [
        {
            "input": ["john"],
            "contexts": {
                "field_type": ["name"] 
            }
        },
        {
            "input": ["biology"],
            "contexts": {
                "field_type": ["department"] 
            }
        },
        {
            "input": ["445 mount eden road"],
            "contexts": {
                "field_type": ["address"] 
            }
        }
    ],
    "name": "john",
    "department": "biology",
    "address": "445 mount eden road"
}
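
The shape of the request body in step 2 can be produced mechanically from a plain record. The following is a minimal sketch, not part of the original answer; the helper names `build_suggest_entries` and `build_document` are hypothetical, and it assumes every field value is a string that should be lowercased, as in the example above.

```python
def build_suggest_entries(record):
    """Return one completion entry per field, tagged with the field name as its context."""
    entries = []
    for field, value in record.items():
        entries.append({
            "input": [value.lower()],             # text the suggester will match on
            "contexts": {"field_type": [field]},  # category context = source field name
        })
    return entries


def build_document(record):
    """Return the full body to index: the 'suggest' list plus the lowercased fields."""
    doc = {"suggest": build_suggest_entries(record)}
    doc.update({field: value.lower() for field, value in record.items()})
    return doc


record = {"name": "John", "department": "Biology", "address": "445 Mount Eden Road"}
doc = build_document(record)
```

Here `doc` has the same structure as the JSON body shown above and could be sent to the index endpoint with any HTTP client.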

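3) If we want to match terms that occur in the middle of a value (the search term "bio" occurs in the middle of the address field of the 4th record), we can index the entry as follows:

POST: localhost:9200/test_index/test_type/4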

{
    "suggest": [
        {
            "input": ["mark"],
            "contexts": {
                "field_type": ["name"] 
            }
        },
        {
            "input": ["physics"],
            "contexts": {
                "field_type": ["department"] 
            }
        },
        {
            "input": ["random unesco bio-reserve", "bio-reserve"],
            "contexts": {
                "field_type": ["address"] 
            }
        }
    ],
    "name": "mark",
    "department": "physics",
    "address": "random unesco bio-reserve"
}
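
Because a completion suggester matches on prefixes of each input, the extra "bio-reserve" input is what makes the mid-sentence term findable. One way to generate such inputs automatically is to index every suffix that starts at a word boundary. This is a sketch of a generalization, not what the answer does literally (the answer adds only the one suffix it needs), and `suffix_inputs` is a hypothetical helper name.

```python
def suffix_inputs(text):
    """Return every suffix of text that starts at a whitespace-delimited word, lowercased."""
    words = text.lower().split()
    return [" ".join(words[i:]) for i in range(len(words))]


inputs = suffix_inputs("Random UNESCO Bio-reserve")
# inputs == ["random unesco bio-reserve", "unesco bio-reserve", "bio-reserve"]
```

The split is on whitespace only, so "reserve" inside "bio-reserve" would still not be matched by a search for "res". The number of inputs also grows with the number of words in each value, which increases the size of the suggest field.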

4) Then search for the keyword "bio" as follows:

localhost:9200/test_index/test_type/_search

{
    "_source": false,
    "suggest": {
        "suggestion" : {
            "text" : "bio",
            "completion" : {
                "field" : "suggest",
                "size": 10,
                "contexts": {
                    "field_type": [ "name", "department", "address" ]
                }
            }
        }
    }
}

Response:

{
    "hits": {
        "total": 0,
        "max_score": 0,
        "hits": []
    },
    "suggest": {
        "suggestion": [
            {
                "text": "bio",
                "offset": 0,
                "length": 3,
                "options": [
                    {
                        "text": "bio-reserve",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "4",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "address"
                            ]
                        }
                    },
                    {
                        "text": "biology",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "1",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "department"
                            ]
                        }
                    },
                    {
                        "text": "biotechnology",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "3",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "department"
                            ]
                        }
                    }
                ]
            }
        ]
    }
}
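
To get from this response to the field/value pairs asked for in the question, the options can be flattened on the client side. The sketch below assumes the response has already been parsed into a Python dict with the structure shown above; `matched_fields` is a hypothetical helper name, and the sample dict is a trimmed copy of the response.

```python
def matched_fields(response, suggester_name="suggestion", context_name="field_type"):
    """Flatten completion-suggester options into {"field", "value", "id"} dicts."""
    results = []
    for entry in response.get("suggest", {}).get(suggester_name, []):
        for option in entry.get("options", []):
            # Each option carries the context(s) it was indexed under.
            for field in option.get("contexts", {}).get(context_name, []):
                results.append({"field": field, "value": option["text"], "id": option["_id"]})
    return results


# Trimmed copy of the response above, keeping only the keys the helper reads.
response = {
    "suggest": {
        "suggestion": [{
            "text": "bio",
            "options": [
                {"text": "bio-reserve", "_id": "4", "contexts": {"field_type": ["address"]}},
                {"text": "biology", "_id": "1", "contexts": {"field_type": ["department"]}},
                {"text": "biotechnology", "_id": "3", "contexts": {"field_type": ["department"]}},
            ],
        }]
    }
}

pairs = matched_fields(response)
```

Note that "value" here is the matched suggest input ("bio-reserve"), not the full original field value ("Random UNESCO Bio-reserve") from the expected output in the question. Recovering the full value would require looking it up by `_id` or returning the source fields instead of setting "_source" to false.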

Can anyone suggest a better approach?
