MongoDB index and non-index performance

Posted 2023-02-23

【Title】MongoDB index and non-index performance 【Posted】2017-02-07 11:10:16 【Question】:

I read the following in the MongoDB documentation:

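For case sensitive regular expression queries, if an index exists for the field, then MongoDB matches the regular expression against the values in the index, which can be faster than a collection scan. Further optimization can occur if the regular expression is a "prefix expression", which means that all potential matches start with the same string. This allows MongoDB to construct a "range" from that prefix and only match against those values from the index that fall within that range.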
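As a rough illustration of the "range" idea in the quoted passage, the sketch below turns a literal prefix into a half-open key range and filters a sorted list of keys with it. The helper `prefixToRange` and the sample keys are hypothetical; this is plain JavaScript that mimics the idea, not MongoDB's actual implementation.

```javascript
// Hypothetical helper: turn a literal prefix into a half-open key range
// [lower, upper) by incrementing the last UTF-16 code unit of the prefix.
function prefixToRange(prefix) {
  if (prefix.length === 0) {
    // An empty prefix constrains nothing, so there is no upper bound.
    return { lower: "", upper: null };
  }
  const last = prefix.charCodeAt(prefix.length - 1);
  // Assumes the last code unit is not 0xFFFF, so adding 1 stays valid.
  const upper = prefix.slice(0, -1) + String.fromCharCode(last + 1);
  return { lower: prefix, upper };
}

// Toy stand-in for the sorted keys of an index on `username`.
const sortedKeys = ["alice", "an", "ana", "andrew", "bob", "joan", "zan"];

const { lower, upper } = prefixToRange("an");
const inRange = sortedKeys.filter(
  (k) => k >= lower && (upper === null || k < upper)
);

console.log(lower, upper); // an ao
console.log(inRange);      // [ 'an', 'ana', 'andrew' ]
```

Note that "joan" and "zan" contain "an" but fall outside the range, which is why this shortcut only applies when the pattern is anchored to the start of the string.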

Query:

db.getCollection('contacts').find({username: {$regex: 'an'}}).explain()

Here are the stats without an index on username:

"cursor" : "BasicCursor",
"isMultiKey" : false,
"n" : 14234,
"nscannedObjects" : 107721,
"nscanned" : 107721,
"nscannedObjectsAllPlans" : 107721,
"nscannedAllPlans" : 107721,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 841,
"nChunkSkips" : 0,
"millis" : 108,
"server" : "random-ubunto:3001",
"filterSet" : false

And here are the stats with an index on username:

"cursor" : "BtreeCursor username_1",
"isMultiKey" : false,
"n" : 14234,
"nscannedObjects" : 14234,
"nscanned" : 106898,
"nscannedObjectsAllPlans" : 14234,
"nscannedAllPlans" : 106898,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 835,
"nChunkSkips" : 0,
"millis" : 142,
"indexBounds" : {
    "username" : [ 
        [ 
            "", 
            {}
        ], 
        [ 
            /an/, 
            /an/
        ]
    ]
},
"server" : "random-ubunto:3001",
"filterSet" : false

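Yes, I can see the difference in nscannedObjects. That's good, but my question is why millis is larger with the index than without it. If we are talking about performance, it should be the other way around. Currently: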

millis (Without Indexing) : 108
millis (With Indexing) : 142
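One rough way to read these numbers is to tally how many items each plan touched. The sketch below copies the counters from the two explain() outputs above; treating every index key examined and every document fetched as one unit of work is a simplifying assumption, not how MongoDB accounts for time.

```javascript
// Counters copied from the two explain() outputs above.
const withoutIndex = { nscanned: 107721, nscannedObjects: 107721, millis: 108 };
const withIndex = { nscanned: 106898, nscannedObjects: 14234, millis: 142 };

// Assumption: in the collection scan, nscanned and nscannedObjects count the
// same documents, so the work is just the number of documents examined.
const collScanWork = withoutIndex.nscannedObjects;

// Assumption: in the index scan, each index key examined and each document
// fetched afterwards counts as one unit of work.
const indexScanWork = withIndex.nscanned + withIndex.nscannedObjects;

console.log(collScanWork);  // 107721
console.log(indexScanWork); // 121132

// Ratio of index keys examined to documents examined by the collection scan.
const ratio = (withIndex.nscanned / withoutIndex.nscanned).toFixed(3);
console.log(ratio); // 0.992
```

Under this simplified model the unanchored pattern still walks almost the whole index and then fetches the 14234 matching documents on top of that, which is consistent with the indexed run not being faster.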

【Comments】:

【Answer 1】:

You should take a look at this:

MongoDB, performance of query by regular expression on indexed fields

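The linked answer says:

For the /Jon Skeet/ regex, mongo will do a full scan of the keys in the index and then fetch the matching documents, which can be faster than a collection scan.

For the /^Jon Skeet/ regex, mongo will only scan the range of the index that starts with the regex prefix, which will be faster.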
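The difference between the two cases can be sketched by counting how many keys each strategy has to examine. Everything below is a toy model in plain JavaScript: `indexKeys`, `scanAllKeys`, and `scanPrefixRange` are hypothetical names, and a sorted array stands in for the B-tree index.

```javascript
// Toy stand-in for the sorted keys of an index on a name field.
const indexKeys = [
  "Amy Jon Skeet",
  "Jon Skeet",
  "Jon Skeet Jr",
  "Jon Snow",
  "Zed",
].sort();

// Unanchored regex: every key must be tested, because a match can start anywhere.
function scanAllKeys(keys, regex) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    examined += 1;
    if (regex.test(key)) matches.push(key);
  }
  return { examined, matches };
}

// Anchored prefix: start at the first key >= prefix and stop at the first key
// that no longer starts with the prefix.
function scanPrefixRange(keys, prefix) {
  let examined = 0;
  const matches = [];
  for (const key of keys) {
    if (key < prefix) continue; // a real B-tree would seek here instead of stepping
    examined += 1;
    if (!key.startsWith(prefix)) break; // sorted order: nothing later can match
    matches.push(key);
  }
  return { examined, matches };
}

const unanchored = scanAllKeys(indexKeys, /Jon Skeet/);
const anchored = scanPrefixRange(indexKeys, "Jon Skeet");

console.log(unanchored.examined, unanchored.matches.length); // 5 3
console.log(anchored.examined, anchored.matches.length);     // 3 2
```

The unanchored scan touches every key, while the anchored scan only touches the keys in the prefix range plus the one key that ends it. The two patterns also return different results: "Amy Jon Skeet" matches /Jon Skeet/ but not /^Jon Skeet/.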

【Comments】:

Yes, I have already tried that, but there is not much difference. millis is around 120-130 with the index and 100-110 without it. Why?
