Elasticsearch not highlighting all matches

【Posted】: 2020-08-19 22:03:12 【Question】:

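I am having a hard time understanding why the following query object does not make ES highlight all of the matching words in the _source fields.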


{
    _source: [
        'baseline',
        'cdrp',
        'date',
        'description',
        'dev_status',
        'element',
        'event',
        'id'
    ],
    track_total_hits: true,
    query: {
        bool: {
            filter: [],
            should: [
                {
                    multi_match: {
                        query: "imposed calcs",
                        fields: ["cdrp","description","narrative.*","title","cop"]
                    }
                }
            ]
        }
    },
    highlight: { fields: { '*': {} } },
    sort: [],
    from: 0,
    size: 50
}

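Running this query returns the following highlight object. Note that only the word "calcs" is highlighted. How should I structure the highlight object so that ES also highlights "Imposed"?

"highlight": {
    "description": [
        "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed <em>calcs</em> mising"
    ]
}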

I am using the following mapping for "description":

"description": {
    "type": "text",
    "analyzer": "search_synonyms"
},

"analysis": {
    "analyzer": {
        "search_synonyms": {
            "tokenizer": "whitespace",
            "filter": [
                "graph_synonyms"
            ],
            "normalizer": [
                "normalizer_1"
            ]
        }
    },
    "filter": {
        "graph_synonyms": {
            "type": "synonym_graph",
            "synonyms_path": "synonym.txt"
        }
    },
    "normalizer": {
        "normalizer_1": {
            "type": "custom",
            "char_filter": [],
            "filter": ["lowercase", "asciifolding"]
        }
    }
}

【Comments】:

【Answer 1】:

EDIT

I think your graph_synonyms filter is overriding the normalizer's filters. Try this:

PUT highlighter
{
  "settings": {
    "analysis": {
      "analyzer": {
        "search_synonyms": {
          "tokenizer": "whitespace",
          "filter": [
            "graph_synonyms",
            "lowercase",
            "asciifolding"
          ]
        }
      },
      "filter": {
        "graph_synonyms": {
          "type": "synonym_graph",
          "synonyms_path": "synonym.txt"
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "search_synonyms"
      }
    }
  }
}

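ORIGINAL

I suspect some setting in your mapping is preventing the match, because I could not reproduce the problem with a semi-default mapping: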

PUT highlighter
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "fields": {
          "lowercase": {
            "type": "text",
            "analyzer": "my_analyzer"
          }
        }
      }
    }
  }
}

POST highlighter/_doc
{
  "description": "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed calcs mising"
}

Plugging in your query (with "description" swapped for "description.lowercase"):

GET highlighter/_search
{
  "_source": [
    "baseline",
    "cdrp",
    "date",
    "description",
    "dev_status",
    "element",
    "event",
    "id"
  ],
  "track_total_hits": true,
  "query": {
    "bool": {
      "filter": [],
      "should": [
        {
          "multi_match": {
            "query": "imposed calcs",
            "fields": [
              "cdrp",
              "description.lowercase",
              "narrative.*",
              "title",
              "cop"
            ]
          }
        }
      ]
    }
  },
  "highlight": {
    "fields": {
      "*": {}
    }
  },
  "sort": [],
  "from": 0,
  "size": 50
}

yields

[
  {
    "_index":"highlighter",
    "_type":"_doc",
    "_id":"Bf5F5HEBW-D5QnrWwTyh",
    "_score":0.5753642,
    "_source":{
      "description":"GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN Imposed calcs mising"
    },
    "highlight":{
      "description":[
        "GAP Sub-window conn ONe-e: heve PP-BE Defined ASST requirem RV confsng, des MAN <em>Imposed</em> <em>calcs</em> mising"
      ]
    }
  }
]
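The difference between the two highlight results can be illustrated without a cluster. The sketch below is not Elasticsearch code; it is a toy model, assuming only that the whitespace tokenizer splits on whitespace and leaves case untouched, and that a lowercase token filter lowercases every token on both the index side and the query side. The names `whitespace_analyze` and `highlight` are hypothetical helpers, and synonyms and asciifolding are ignored entirely.

```python
def whitespace_analyze(text, lowercase=False):
    """Toy analyzer: split on whitespace, optionally lowercase each token."""
    tokens = text.split()
    return [t.lower() for t in tokens] if lowercase else tokens


def highlight(text, query, lowercase=False):
    """Wrap each word whose analyzed form equals an analyzed query term in <em> tags."""
    query_terms = set(whitespace_analyze(query, lowercase))
    out = []
    for word in text.split():
        # Apply the same (toy) analysis to the document word as to the query.
        term = word.lower() if lowercase else word
        out.append(f"<em>{word}</em>" if term in query_terms else word)
    return " ".join(out)


doc = "des MAN Imposed calcs mising"

# Without a lowercase filter, "Imposed" and "imposed" are different terms.
case_sensitive = highlight(doc, "imposed calcs")
# With lowercasing on both sides, both words match.
case_insensitive = highlight(doc, "imposed calcs", lowercase=True)

print(case_sensitive)
print(case_insensitive)
```

Under these assumptions the first call highlights only "calcs", as in the question, and the second highlights both words, as in the result above.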

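【Discussion】:

Please see my updated answer. Do I need to update my search_synonym analyzer?
Thanks for the reply. Unfortunately, it did not work. I cannot get rid of normalizer_1 because other fields are using it.
Then don't remove it; instead add "lowercase", "asciifolding" to the analyzer's filter[].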
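The last suggestion can be sketched as a small settings transformation: keep the top-level normalizer_1 definition for the other fields, and append the two token filters to the search_synonyms analyzer. This is only an illustration under assumptions: `add_token_filters` is a hypothetical helper, the filter order follows the edited answer, and dropping the analyzer-level "normalizer" entry also mirrors the edited answer rather than anything confirmed in the thread. The resulting dictionary would still have to be sent to Elasticsearch when creating or reindexing the index, which is not shown here.

```python
import copy
import json


def add_token_filters(settings, analyzer_name, extra_filters):
    """Return a copy of `settings` with `extra_filters` appended to one analyzer."""
    updated = copy.deepcopy(settings)
    analyzer = updated["analysis"]["analyzer"][analyzer_name]
    # Assumption: follow the edited answer, which has no "normalizer" key
    # inside the analyzer. The top-level normalizer definition is untouched.
    analyzer.pop("normalizer", None)
    for name in extra_filters:
        if name not in analyzer["filter"]:
            analyzer["filter"].append(name)
    return updated


# Analysis settings as posted in the question.
settings = {
    "analysis": {
        "analyzer": {
            "search_synonyms": {
                "tokenizer": "whitespace",
                "filter": ["graph_synonyms"],
                "normalizer": ["normalizer_1"],
            }
        },
        "filter": {
            "graph_synonyms": {
                "type": "synonym_graph",
                "synonyms_path": "synonym.txt",
            }
        },
        "normalizer": {
            "normalizer_1": {
                "type": "custom",
                "char_filter": [],
                "filter": ["lowercase", "asciifolding"],
            }
        },
    }
}

new_settings = add_token_filters(
    settings, "search_synonyms", ["lowercase", "asciifolding"]
)
print(json.dumps(new_settings, indent=2))
```

Because the helper works on a deep copy, the original settings stay unchanged and normalizer_1 remains available to any keyword fields that reference it.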
