MongoDB indexed text search only works for exact match


[Title]: MongoDB indexed text search only works for exact match [Posted]: 2021-05-19 06:55:43 [Question]:

My field "user_name" is populated with data. This code gives me no results:

history = db.history
history.create_index([('user_name', 'text')])
history.find({'$text': {'$search': 'a'}})

But when I specify the exact name, it works:

history.find({'$text': {'$search': 'exact name'}})

Here is the explain() output for the 'a' search:


{
    "executionSuccess": true,
    "nReturned": 0,
    "executionTimeMillis": 0,
    "totalKeysExamined": 0,
    "totalDocsExamined": 0,
    "executionStages": {
        "stage": "TEXT",
        "nReturned": 0,
        "executionTimeMillisEstimate": 0,
        "works": 1,
        "advanced": 0,
        "needTime": 0,
        "needYield": 0,
        "saveState": 0,
        "restoreState": 0,
        "isEOF": 1,
        "indexPrefix": {},
        "indexName": "user_name_text",
        "parsedTextQuery": { "terms": [], "negatedTerms": [], "phrases": [], "negatedPhrases": [] },
        "textIndexVersion": 3,
        "inputStage": {
            "stage": "TEXT_MATCH",
            "nReturned": 0,
            "executionTimeMillisEstimate": 0,
            "works": 0,
            "advanced": 0,
            "needTime": 0,
            "needYield": 0,
            "saveState": 0,
            "restoreState": 0,
            "isEOF": 1,
            "docsRejected": 0,
            "inputStage": {
                "stage": "FETCH",
                "nReturned": 0,
                "executionTimeMillisEstimate": 0,
                "works": 0,
                "advanced": 0,
                "needTime": 0,
                "needYield": 0,
                "saveState": 0,
                "restoreState": 0,
                "isEOF": 1,
                "docsExamined": 0,
                "alreadyHasObj": 0,
                "inputStage": { "stage": "OR", "nReturned": 0, "executionTimeMillisEstimate": 0, "works": 0, "advanced": 0, "needTime": 0, "needYield": 0, "saveState": 0, "restoreState": 0, "isEOF": 1, "dupsTested": 0, "dupsDropped": 0 }
            }
        }
    },
    "allPlansExecution": []
}

Here is the explain() output for an exact match on the user name ('akkcess'):


{
    "executionSuccess": true,
    "nReturned": 39,
    "executionTimeMillis": 1,
    "totalKeysExamined": 39,
    "totalDocsExamined": 39,
    "executionStages": {
        "stage": "TEXT",
        "nReturned": 39,
        "executionTimeMillisEstimate": 0,
        "works": 40,
        "advanced": 39,
        "needTime": 0,
        "needYield": 0,
        "saveState": 0,
        "restoreState": 0,
        "isEOF": 1,
        "indexPrefix": {},
        "indexName": "user_name_text",
        "parsedTextQuery": { "terms": ["akkcess"], "negatedTerms": [], "phrases": [], "negatedPhrases": [] },
        "textIndexVersion": 3,
        "inputStage": {
            "stage": "TEXT_MATCH",
            "nReturned": 39,
            "executionTimeMillisEstimate": 0,
            "works": 40,
            "advanced": 39,
            "needTime": 0,
            "needYield": 0,
            "saveState": 0,
            "restoreState": 0,
            "isEOF": 1,
            "docsRejected": 0,
            "inputStage": {
                "stage": "FETCH",
                "nReturned": 39,
                "executionTimeMillisEstimate": 0,
                "works": 40,
                "advanced": 39,
                "needTime": 0,
                "needYield": 0,
                "saveState": 0,
                "restoreState": 0,
                "isEOF": 1,
                "docsExamined": 39,
                "alreadyHasObj": 0,
                "inputStage": {
                    "stage": "OR",
                    "nReturned": 39,
                    "executionTimeMillisEstimate": 0,
                    "works": 40,
                    "advanced": 39,
                    "needTime": 0,
                    "needYield": 0,
                    "saveState": 0,
                    "restoreState": 0,
                    "isEOF": 1,
                    "dupsTested": 39,
                    "dupsDropped": 0,
                    "inputStage": {
                        "stage": "IXSCAN",
                        "nReturned": 39,
                        "executionTimeMillisEstimate": 0,
                        "works": 40,
                        "advanced": 39,
                        "needTime": 0,
                        "needYield": 0,
                        "saveState": 0,
                        "restoreState": 0,
                        "isEOF": 1,
                        "keyPattern": { "_fts": "text", "_ftsx": 1 },
                        "indexName": "user_name_text",
                        "isMultiKey": false,
                        "isUnique": false,
                        "isSparse": false,
                        "isPartial": false,
                        "indexVersion": 2,
                        "direction": "backward",
                        "indexBounds": {},
                        "keysExamined": 39,
                        "seeks": 1,
                        "dupsTested": 0,
                        "dupsDropped": 0
                    }
                }
            }
        }
    },
    "allPlansExecution": []
}

Do you know why it behaves this way? According to the docs and tutorials, this should work.

[Comments]:

According to which documentation?

[Answer 1]:

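"a" is almost certainly a stop word. Almost all natural-language text contains it, so if it were searched for, you would get every single document in your result set. Since that is not very useful, text search removes stop words like "a" from the query. The explain() output for the 'a' search is consistent with this: "parsedTextQuery" has an empty "terms" list, so nothing is left to look up in the index.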
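One way to check whether a search string survived query parsing is to read "parsedTextQuery" out of the explain() document. The following is a minimal sketch, not part of the original answer; the helper names are hypothetical, and the two sample dicts are trimmed copies of the explain() outputs in the question. It searches the document recursively instead of assuming a fixed path, since the layout of explain output can differ between server versions.

```python
def find_parsed_text_query(plan):
    """Depth-first search of an explain() document for 'parsedTextQuery'."""
    if isinstance(plan, dict):
        if "parsedTextQuery" in plan:
            return plan["parsedTextQuery"]
        for value in plan.values():
            found = find_parsed_text_query(value)
            if found is not None:
                return found
    elif isinstance(plan, list):
        for item in plan:
            found = find_parsed_text_query(item)
            if found is not None:
                return found
    return None


def search_terms_survived(plan):
    """Return True if any positive term or phrase is left after parsing."""
    parsed = find_parsed_text_query(plan) or {}
    return bool(parsed.get("terms") or parsed.get("phrases"))


# Trimmed copies of the two explain() outputs shown in the question.
explain_a = {
    "executionStages": {
        "stage": "TEXT",
        "parsedTextQuery": {
            "terms": [], "negatedTerms": [], "phrases": [], "negatedPhrases": []
        },
    }
}
explain_akkcess = {
    "executionStages": {
        "stage": "TEXT",
        "parsedTextQuery": {
            "terms": ["akkcess"], "negatedTerms": [], "phrases": [], "negatedPhrases": []
        },
    }
}

print(search_terms_survived(explain_a))        # False
print(search_terms_survived(explain_akkcess))  # True
```

With PyMongo, the input would typically come from something like history.find({'$text': {'$search': 'a'}}).explain(), which returns the explain document as a dict.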

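Also, MongoDB text search does have exact-match (phrase) functionality, but it requires wrapping the search string in double quotes, which you have not done. The queries you posted therefore use regular stemmed term matching, not exact matching.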
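To illustrate the difference, here is a small sketch of how the two kinds of filter documents could be built. The helper names phrase_search and term_search are hypothetical, and the sketch assumes the phrase itself contains no double quotes.

```python
def term_search(terms):
    """Stemmed term search: space-separated terms are ORed together."""
    return {"$text": {"$search": terms}}


def phrase_search(phrase):
    """Phrase search: the phrase is wrapped in double quotes inside $search."""
    # Assumption: no embedded double quotes, so no extra escaping is attempted.
    if '"' in phrase:
        raise ValueError("phrase must not contain double quotes")
    return {"$text": {"$search": f'"{phrase}"'}}


print(term_search("exact name"))
print(phrase_search("exact name"))
```

Either dict can be passed to history.find(...). Note that phrase search still goes through the text index, so it does not turn text search into substring matching.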

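[Discussion]:

I don't want exact matching. I have users named "berry" in my database. I search for "berr" and it returns 0 results. But when I search for berry, it works.

"berr" is not a word, so I wouldn't be surprised if it doesn't stem to "berry".

So how do I find all user names that contain "berr"?

You could try a regular expression. MongoDB does not have a text substring extraction operator.

I have already looked into regular expressions. I thought an index might be more efficient. Anyway, thanks for your help.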
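As a rough sketch of the regular-expression suggestion, the filters below match user names that contain or start with a fragment such as "berr". The helper names are hypothetical. The fragment is escaped with Python's re.escape, on the assumption that the result is also a valid pattern for the server's regex engine, which should hold for ordinary names.

```python
import re


def substring_filter(field, fragment, ignore_case=True):
    """Filter matching documents whose `field` contains `fragment` anywhere."""
    spec = {"$regex": re.escape(fragment)}
    if ignore_case:
        spec["$options"] = "i"
    return {field: spec}


def prefix_filter(field, prefix):
    """Filter matching documents whose `field` starts with `prefix`.

    Case-sensitive and anchored with ^, which is the form of regex that
    can make good use of an ordinary (non-text) index on the field.
    """
    return {field: {"$regex": "^" + re.escape(prefix)}}


print(substring_filter("user_name", "berr"))
print(prefix_filter("user_name", "berr"))
```

Usage would look like history.find(substring_filter('user_name', 'berr')). A text index does not help these queries. An unanchored or case-insensitive regex generally has to examine every value, while the anchored, case-sensitive prefix form can be served efficiently by a regular index such as history.create_index([('user_name', 1)]).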
