Get only the matching values and corresponding fields in ElasticSearch

Posted 2021-03-31


In Elasticsearch, suppose I have documents like these:

{
  "name": "John",
  "department": "Biology",
  "address": "445 Mount Eden Road"
},
{
  "name": "Jane",
  "department": "Chemistry",
  "address": "32 Wilson Street"
},
{
  "name": "Laura",
  "department": "BioTechnology",
  "address": "21 Greens Road"
},
{
  "name": "Mark",
  "department": "Physics",
  "address": "Random UNESCO Bio-reserve"
}

My use case: when I type "bio" into the search bar, Elasticsearch should return the matching field values together with their field names.

For this example,

Input: "bio"

Expected output:

{
  "field": "department",
  "value": "Biology"
},
{
  "field": "department",
  "value": "BioTechnology"
},
{
  "field": "address",
  "value": "Random UNESCO Bio-reserve"
}

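What type of query should I use? I could use an NGram Tokenizer followed by a match query. However, I am not sure how to get only the matching field values (not the whole document) and the corresponding field names as output.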
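To pin down the requirement, here is a brute-force reference in plain Python. It is not an Elasticsearch query, only a sketch of the desired result under the assumption that "matching" means a case-insensitive substring match; the function name `matching_fields` is made up for illustration.

```python
def matching_fields(docs, term):
    """Return {"field", "value"} pairs whose value contains term, ignoring case."""
    needle = term.lower()
    results = []
    for doc in docs:
        for field, value in doc.items():
            if needle in str(value).lower():
                results.append({"field": field, "value": value})
    return results


# The four sample documents from the question.
docs = [
    {"name": "John", "department": "Biology", "address": "445 Mount Eden Road"},
    {"name": "Jane", "department": "Chemistry", "address": "32 Wilson Street"},
    {"name": "Laura", "department": "BioTechnology", "address": "21 Greens Road"},
    {"name": "Mark", "department": "Physics", "address": "Random UNESCO Bio-reserve"},
]

for match in matching_fields(docs, "bio"):
    print(match)
```

Scanning every field of every document like this does not scale, which is why the question asks for a way to let Elasticsearch do the matching.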

Answer

After reading further about Completion Suggesters and Context Suggesters, I was able to solve this as follows:

1) Keep a separate "suggest" field of type "completion" on every record, with a context mapping of type "category". The mapping I created looks like this:

{
  "properties": {
    "suggest": {
      "type": "completion",
      "contexts": [
        {
          "name": "field_type",
          "type": "category",
          "path": "cat"
        }
      ]
    },
    "name": {
      "type": "text"
    },
    "department": {
      "type": "text"
    },
    "address": {
      "type": "text"
    }
  }
}

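2) Then I insert records as shown below, adding the search metadata to the "suggest" field with the appropriate "contexts".

For example, to insert the first record, I do the following:

POST: localhost:9200/test_index/test_type/1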

{
    "suggest": [
        {
            "input": ["john"],
            "contexts": {
                "field_type": ["name"] 
            }
        },
        {
            "input": ["biology"],
            "contexts": {
                "field_type": ["department"] 
            }
        },
        {
            "input": ["445 mount eden road"],
            "contexts": {
                "field_type": ["address"] 
            }
        }
    ],
    "name": "john",
    "department": "biology",
    "address": "445 mount eden road"
}

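3) If we also want to match terms that occur in the middle of a sentence (the search term "bio" appears in the middle of the address field of the 4th record), we can index the entry like this:

POST: localhost:9200/test_index/test_type/4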

{
    "suggest": [
        {
            "input": ["mark"],
            "contexts": {
                "field_type": ["name"] 
            }
        },
        {
            "input": ["physics"],
            "contexts": {
                "field_type": ["department"] 
            }
        },
        {
            "input": ["random unesco bio-reserve", "bio-reserve"],
            "contexts": {
                "field_type": ["address"] 
            }
        }
    ],
    "name": "mark",
    "department": "physics",
    "address": "random unesco bio-reserve"
}
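Writing the "suggest" entries by hand for every record gets tedious, so they could be generated. The helper below is a minimal sketch, not part of the original answer: the name `build_suggest` is hypothetical, and it assumes every word-start suffix of a value should be indexed as an input (the example above lists only the full value and "bio-reserve").

```python
def build_suggest(doc, fields=("name", "department", "address")):
    """Build completion-suggester entries for one document.

    For each field, the inputs are the lowercased value plus every suffix
    that starts at a later word, so a prefix lookup can also reach words in
    the middle of the value.
    """
    entries = []
    for field in fields:
        words = str(doc[field]).lower().split()
        inputs = [" ".join(words[i:]) for i in range(len(words))]
        entries.append({"input": inputs, "contexts": {"field_type": [field]}})
    return entries


record = {"name": "Mark", "department": "Physics", "address": "Random UNESCO Bio-reserve"}
body = {"suggest": build_suggest(record)}
# Store the lowercased values too, as the hand-written examples do.
body.update({field: value.lower() for field, value in record.items()})
print(body)
```

Indexing every suffix increases the size of the suggester data, so for long text fields it may be worth restricting the inputs to selected words.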

4) Then search for the keyword "bio" as follows:

localhost:9200/test_index/test_type/_search

{
    "_source": false,
    "suggest": {
        "suggestion" : {
            "text" : "bio",
            "completion" : {
                "field" : "suggest",
                "size": 10,
                "contexts": {
                    "field_type": [ "name", "department", "address" ]
                }
            }
        }
    }
}

Response:

{
    "hits": {
        "total": 0,
        "max_score": 0,
        "hits": []
    },
    "suggest": {
        "suggestion": [
            {
                "text": "bio",
                "offset": 0,
                "length": 3,
                "options": [
                    {
                        "text": "bio-reserve",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "4",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "address"
                            ]
                        }
                    },
                    {
                        "text": "biology",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "1",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "department"
                            ]
                        }
                    },
                    {
                        "text": "biotechnology",
                        "_index": "test_index",
                        "_type": "test_type",
                        "_id": "3",
                        "_score": 1,
                        "contexts": {
                            "field_type": [
                                "department"
                            ]
                        }
                    }
                ]
            }
        ]
    }
}
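The response still has to be reduced to the field/value pairs the question asks for. A minimal sketch, assuming the response has already been parsed into a Python dict with the shape shown above; the function name `extract_matches` is hypothetical.

```python
def extract_matches(response, suggester="suggestion"):
    """Flatten a completion-suggester response into {"field", "value"} pairs."""
    matches = []
    for entry in response.get("suggest", {}).get(suggester, []):
        for option in entry.get("options", []):
            # Each option carries the context(s) it was indexed under.
            for field in option.get("contexts", {}).get("field_type", []):
                matches.append({"field": field, "value": option["text"]})
    return matches


# A trimmed copy of the response above, keeping only the keys that are read.
response = {
    "suggest": {
        "suggestion": [
            {
                "text": "bio",
                "options": [
                    {"text": "bio-reserve", "_id": "4",
                     "contexts": {"field_type": ["address"]}},
                    {"text": "biology", "_id": "1",
                     "contexts": {"field_type": ["department"]}},
                    {"text": "biotechnology", "_id": "3",
                     "contexts": {"field_type": ["department"]}},
                ],
            }
        ]
    }
}

for match in extract_matches(response):
    print(match)
```

Note that the value returned is the indexed suggestion input (lowercased, and for record 4 only the "bio-reserve" suffix), not the original field value. Recovering "Random UNESCO Bio-reserve" would require fetching the document by its `_id` or enabling `_source` in the request.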

Can anyone suggest a better approach?
