Removing boost term from scoring in Elasticsearch

Posted 2023-02-24


【Title】: Removing boost term from scoring in elasticsearch 【Posted】: 2020-09-05 21:27:30 【Question】:

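Is there any way to remove the default boost term from Elasticsearch relevance scoring, or to set it to 1 so that it is not reflected in the score? For example,

Edit: the input query is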

GET test_index/_search
{
  "query": {
    "match": {
      "text": "asia pacific"
    }
  },
  "explain": true
}

Part of the result is

"details" : [
        {
          "value" : 6.5127892,
          "description" : "weight(text:asia in 5417) [PerFieldSimilarity], result of:",
          "details" : [
            {
              "value" : 6.5127892,
              "description" : "score(freq=6.0), product of:",
              "details" : [
                {
                  "value" : 2.2,
                  "description" : "boost",
                  "details" : [ ]
                },
                {
                  "value" : 3.8113363,
                  "description" : "idf, computed as log(1 + (N - n + 0.5) / (n + 0.5)) from:",
                  "details" : [
                    {
                      "value" : 462,
                      "description" : "n, number of documents containing term",
                      "details" : [ ]
                    },
                    {
                      "value" : 20909,
                      "description" : "N, total number of documents with field",
                      "details" : [ ]
                    }
                  ]
                },
                {
                  "value" : 0.7767246,
                  "description" : "tf, computed as freq / (freq + k1 * (1 - b + b * dl / avgdl)) from:",
                  "details" : [
                    {
                      "value" : 6.0,
                      "description" : "freq, occurrences of term within document",
                      "details" : [ ]
                    },
                    {
                      "value" : 1.2,
                      "description" : "k1, term saturation parameter",
                      "details" : [ ]
                    },
                    {
                      "value" : 0.75,
                      "description" : "b, length normalization parameter",
                      "details" : [ ]
                    },
                    {
                      "value" : 1048.0,
                      "description" : "dl, length of field (approximate)",
                      "details" : [ ]
                    },
                    {
                      "value" : 662.01294,
                      "description" : "avgdl, average length of field",
                      "details" : [ ]
                    }
                  ]
                }
              ]
            }
          ]
        },

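This is the explanation for one term in a simple match query. In the example above, I want to remove the boost term of 2.2 or make it equal to 1. Please suggest how to do this.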
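To make the request concrete, the per-term score in the explain output can be recomputed by hand. The following is a minimal Python sketch that only replays the two formulas printed in the "description" fields, using the values shown above. The variable names are chosen here for illustration. Elasticsearch computes these values in single precision, so the results agree only approximately. Note that 2.2 happens to equal k1 + 1 for the k1 = 1.2 shown in the output; whether that is where the value comes from is an assumption here, not something stated in the question.

```python
import math

# Values copied from the explain output for the term "asia".
boost = 2.2
n, N = 462, 20909              # docs containing the term, docs with the field
freq, k1, b = 6.0, 1.2, 0.75   # term frequency and BM25 parameters
dl, avgdl = 1048.0, 662.01294  # field length and average field length

# Formulas exactly as printed in the "description" fields.
idf = math.log(1 + (N - n + 0.5) / (n + 0.5))
tf = freq / (freq + k1 * (1 - b + b * dl / avgdl))

score = boost * idf * tf          # what Elasticsearch reports (about 6.51)
score_without_boost = idf * tf    # what the question is asking for

print("idf:", idf)
print("tf:", tf)
print("score with boost:", score)
print("score with boost = 1:", score_without_boost)
```

Because the boost is a plain multiplicative factor, setting it to 1 amounts to dividing this term's score by 2.2.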

【Question discussion】:

【Answer 1】:

Use bool->filter instead of bool->must in the query. This is exactly what filter is for.

【Discussion】:

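I am currently using a simple query like GET test_index/_search {"query": {"match": {"text": "asia pacific"}}} to get results, but if I use GET test_index/_search {"query": {"bool": {"filter": {"term": {"text": "asia pacific"}}}}}, it returns 0 results. Any suggestions on what I am doing wrong here?

Term vs match has already been answered here: ***.com/a/23151332/8160318 If you want to keep a score of 1, use this: {"query": {"bool": {"filter": [{"match": {"text": "asia pacific"}}]}}}

Thanks for the suggestion. But I am trying to set the boost to 1 while still keeping the BM25-based TF and IDF scores. Using this bool->filter->match format gives a constant score, which is not the result I want.

Oh, I misread your question. I don't think you can cherry-pick parts of the score... You can adjust the final score with a script or a function score (elastic.co/guide/en/elasticsearch/reference/current/…), but you cannot break the match scoring strategy apart.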
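One way to read the last comment is to wrap the match query in a script_score query and divide the final score by the boost. The sketch below only builds such a request body in Python and prints it as JSON. The helper name unboosted_match_query is hypothetical. The sketch assumes that every term in the match query carries the same constant boost of 2.2, so dividing the summed score by 2.2 leaves the sum of the idf * tf products. It has not been run against the index in the question.

```python
import json


def unboosted_match_query(field, text, boost=2.2):
    """Build a request body that divides the match score by `boost`.

    Hypothetical helper: assumes the same constant boost applies to
    every term of the match query.
    """
    return {
        "query": {
            "script_score": {
                # The original match query still selects and scores the hits.
                "query": {"match": {field: text}},
                # _score is the score produced by the wrapped query.
                "script": {
                    "source": "_score / params.boost",
                    "params": {"boost": boost},
                },
            }
        },
        "explain": True,
    }


body = unboosted_match_query("text", "asia pacific")
print(json.dumps(body, indent=2))
```

The printed body could be sent with GET test_index/_search. The ranking of the hits should stay the same, since all scores are divided by the same constant; only the reported score values change.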
