MongoDB: sorting in descending order on the _id field is very slow
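【Title】Mongodb Sorting Desc for _id field very slow 【Posted】2019-07-12 14:35:42
【Question】I have a MongoDB collection with 30 million documents. It holds one month of data at 1 million documents per day, so 30 x 1 = 30 million documents in total. I want to list the records between 2018-07-01 and 2018-07-03 in descending order, which is 2 million documents. Each document looks like this:
{
    "_id":"5c66cf5b67011aa76ca597b6",
    "timestamp":"2018-07-01 15:45:37.000",
    "category":"category_1"
}
I added a descending index on the timestamp field.
When I sort ascending I get a response in 0.1 seconds, but when I sort descending the response takes 702 seconds.
My Python code: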
from pymongo import MongoClient
import datetime
import time

# Connect to the local server (the original created the client twice).
client = MongoClient('localhost', 27017)
db = client.MongoBencmarkTestDB
indicator_collections = db.IndicatorCollections

# Time window to query: 2018-07-01 00:00 up to 2018-07-03 00:00.
dstart = datetime.datetime(2018, 7, 1, 0, 0, 0)
dfinish = datetime.datetime(2018, 7, 3, 0, 0, 0)

# Range filter on timestamp, sorted by _id descending, first 1000 results.
for indicator_collection in indicator_collections.find(
    {
        "$and": [
            {"timestamp": {"$lte": dfinish, "$gte": dstart}},
        ]
    }
).sort([("_id", -1)]).skip(0).limit(1000):
    print(indicator_collection['_id'])
When I run explain on a descending sort by the _id field:
db.IndicatorCollections.find().sort({_id : -1}).explain()
I get this response:
{
    "queryPlanner" : {
        "plannerVersion" : 1,
        "namespace" : "MongoBencmarkTestDB.IndicatorCollections",
        "indexFilterSet" : false,
        "parsedQuery" : {
        },
        "winningPlan" : {
            "stage" : "FETCH",
            "inputStage" : {
                "stage" : "IXSCAN",
                "keyPattern" : {
                    "_id" : 1
                },
                "indexName" : "_id_",
                "isMultiKey" : false,
                "multiKeyPaths" : {
                    "_id" : [ ]
                },
                "isUnique" : true,
                "isSparse" : false,
                "isPartial" : false,
                "indexVersion" : 2,
                "direction" : "backward",
                "indexBounds" : {
                    "_id" : [
                        "[MaxKey, MinKey]"
                    ]
                }
            }
        },
        "rejectedPlans" : [ ]
    },
    "serverInfo" : {
        "host" : "reterius-pc-MacBook-Pro.local",
        "port" : 27017,
        "version" : "4.0.3",
        "gitVersion" : "7ea530946fa7880364d88c8d8b6026bbc9ffa48c"
    },
    "ok" : 1
}
My indexes:
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "MongoBencmarkTestDB.IndicatorCollections"
    },
    {
        "v" : 2,
        "key" : {
            "timestamp" : -1
        },
        "name" : "timestamp_-1",
        "ns" : "MongoBencmarkTestDB.IndicatorCollections"
    }
]
I need a fast response because this is very important.
【Question comments】:
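What indexes do you have exactly? Please print them (db.collection.getIndexes()).
@RichieK [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "MongoBencmarkTestDB.IndicatorCollections" }, { "v" : 2, "key" : { "timestamp" : -1 }, "name" : "timestamp_-1", "ns" : "MongoBencmarkTestDB.IndicatorCollections" } ]
I thought you might be using a compound index, because on a single-field index the direction doesn't matter. But I guess you need a compound index on the timestamp and _id fields, with the directions you want: { timestamp: 1, _id: -1 }. Note that the field you sort on must be the last field in the index.
@RichieK I am not using a compound index; my indexes are shown below. I just want to sort descending on the _id field. My indexes: [ { "v" : 2, "key" : { "_id" : 1 }, "name" : "_id_", "ns" : "MongoBencmarkTestDB.IndicatorCollections" }, { "v" : 2, "key" : { "timestamp" : -1 }, "name" : "timestamp_-1", "ns" : "MongoBencmarkTestDB.IndicatorCollections" } ]
【Answer 1】: So you need to run db.collection.createIndex({ timestamp: 1, _id: -1 }) and check again whether it is faster (it should be). As I wrote before, Mongo uses only one index for a query, and without an index that contains timestamp and a descending _id field, the query is slow.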
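For readers working from Python, the index suggested in this answer could be written as a key specification like the one below. This is a minimal sketch: `SORT_INDEX_KEYS` and `default_index_name` are hypothetical names introduced here, and the helper only reproduces the default field_direction naming that shows up in the explain output later in the thread (the built-in `_id_` index is a special case and is not covered).

```python
# Key specification for the suggested compound index, in the list-of-tuples
# form accepted by pymongo's Collection.create_index
# (1 = ascending, -1 = descending).
SORT_INDEX_KEYS = [("timestamp", 1), ("_id", -1)]


def default_index_name(keys):
    """Join field/direction pairs the way MongoDB names an index by default."""
    return "_".join(f"{field}_{direction}" for field, direction in keys)


# With a live connection (not run here), the index would be created with:
#   db.IndicatorCollections.create_index(SORT_INDEX_KEYS)
print(default_index_name(SORT_INDEX_KEYS))  # timestamp_1__id_-1
```

Knowing the default name makes it easier to spot which index each IXSCAN stage in an explain result refers to.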
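【Comments】:
I added db.collection.createIndex({ timestamp: 1, _id: -1 }), but nothing changed; it is still slow. Response time: 491 seconds. Not solved.
Then can you run explain on the query that your Python script produces? Are you leaving out some part of the query? At the very least, your snippet contains some commas that shouldn't be there. If you also query on category, you need to add that field to the index as well...
I posted the explain output of my query as an answer below.
【Answer 2】: When I run explain on my query with pymongo, the result is: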
{'queryPlanner': {'plannerVersion': 1, 'namespace': 'MongoBencmarkTestDB.IndicatorCollections', 'indexFilterSet': False, 'parsedQuery': {'$and': [{'timestamp': {'$lte': datetime.datetime(2018, 7, 3, 0, 0)}}, {'timestamp': {'$gte': datetime.datetime(2018, 7, 1, 0, 0)}}]},
  'winningPlan': {'stage': 'SORT', 'sortPattern': {'_id': -1}, 'limitAmount': 1000, 'inputStage': {'stage': 'SORT_KEY_GENERATOR', 'inputStage': {'stage': 'FETCH', 'inputStage': {'stage': 'IXSCAN', 'keyPattern': {'timestamp': -1.0}, 'indexName': 'timestamp_-1', 'isMultiKey': False, 'multiKeyPaths': {'timestamp': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'timestamp': ['[new Date(1530576000000), new Date(1530403200000)]']}}}}},
  'rejectedPlans': [
    {'stage': 'SORT', 'sortPattern': {'_id': -1}, 'limitAmount': 1000, 'inputStage': {'stage': 'SORT_KEY_GENERATOR', 'inputStage': {'stage': 'FETCH', 'inputStage': {'stage': 'IXSCAN', 'keyPattern': {'timestamp': 1.0, '_id': -1.0}, 'indexName': 'timestamp_1__id_-1', 'isMultiKey': False, 'multiKeyPaths': {'timestamp': [], '_id': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'timestamp': ['[new Date(1530403200000), new Date(1530576000000)]'], '_id': ['[MaxKey, MinKey]']}}}}},
    {'stage': 'LIMIT', 'limitAmount': 1000, 'inputStage': {'stage': 'FETCH', 'filter': {'$and': [{'timestamp': {'$lte': datetime.datetime(2018, 7, 3, 0, 0)}}, {'timestamp': {'$gte': datetime.datetime(2018, 7, 1, 0, 0)}}]}, 'inputStage': {'stage': 'IXSCAN', 'keyPattern': {'_id': 1}, 'indexName': '_id_', 'isMultiKey': False, 'multiKeyPaths': {'_id': []}, 'isUnique': True, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'backward', 'indexBounds': {'_id': ['[MaxKey, MinKey]']}}}}]},
 'executionStats': {'executionSuccess': True, 'nReturned': 1000, 'executionTimeMillis': 552284, 'totalKeysExamined': 2000000, 'totalDocsExamined': 2000000,
  'executionStages': {'stage': 'SORT', 'nReturned': 1000, 'executionTimeMillisEstimate': 137134, 'works': 2001003, 'advanced': 1000, 'needTime': 2000002, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'sortPattern': {'_id': -1}, 'memUsage': 9056307, 'memLimit': 33554432, 'limitAmount': 1000, 'inputStage': {'stage': 'SORT_KEY_GENERATOR', 'nReturned': 2000000, 'executionTimeMillisEstimate': 134910, 'works': 2000002, 'advanced': 2000000, 'needTime': 1, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'inputStage': {'stage': 'FETCH', 'nReturned': 2000000, 'executionTimeMillisEstimate': 132229, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'docsExamined': 2000000, 'alreadyHasObj': 0, 'inputStage': {'stage': 'IXSCAN', 'nReturned': 2000000, 'executionTimeMillisEstimate': 2077, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'keyPattern': {'timestamp': -1.0}, 'indexName': 'timestamp_-1', 'isMultiKey': False, 'multiKeyPaths': {'timestamp': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'timestamp': ['[new Date(1530576000000), new Date(1530403200000)]']}, 'keysExamined': 2000000, 'seeks': 1, 'dupsTested': 0, 'dupsDropped': 0, 'seenInvalidated': 0}}}},
  'allPlansExecution': [
    {'nReturned': 101, 'executionTimeMillisEstimate': 137134, 'totalKeysExamined': 2000000, 'totalDocsExamined': 2000000, 'executionStages': {'stage': 'SORT', 'nReturned': 101, 'executionTimeMillisEstimate': 137134, 'works': 2000103, 'advanced': 101, 'needTime': 2000002, 'needYield': 0, 'saveState': 53742, 'restoreState': 53742, 'isEOF': 0, 'invalidates': 0, 'sortPattern': {'_id': -1}, 'memUsage': 9056307, 'memLimit': 33554432, 'limitAmount': 1000, 'inputStage': {'stage': 'SORT_KEY_GENERATOR', 'nReturned': 2000000, 'executionTimeMillisEstimate': 134910, 'works': 2000002, 'advanced': 2000000, 'needTime': 1, 'needYield': 0, 'saveState': 53742, 'restoreState': 53742, 'isEOF': 1, 'invalidates': 0, 'inputStage': {'stage': 'FETCH', 'nReturned': 2000000, 'executionTimeMillisEstimate': 132229, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53742, 'restoreState': 53742, 'isEOF': 1, 'invalidates': 0, 'docsExamined': 2000000, 'alreadyHasObj': 0, 'inputStage': {'stage': 'IXSCAN', 'nReturned': 2000000, 'executionTimeMillisEstimate': 2077, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53742, 'restoreState': 53742, 'isEOF': 1, 'invalidates': 0, 'keyPattern': {'timestamp': -1.0}, 'indexName': 'timestamp_-1', 'isMultiKey': False, 'multiKeyPaths': {'timestamp': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'timestamp': ['[new Date(1530576000000), new Date(1530403200000)]']}, 'keysExamined': 2000000, 'seeks': 1, 'dupsTested': 0, 'dupsDropped': 0, 'seenInvalidated': 0}}}}},
    {'nReturned': 101, 'executionTimeMillisEstimate': 286826, 'totalKeysExamined': 2000000, 'totalDocsExamined': 2000000, 'executionStages': {'stage': 'SORT', 'nReturned': 101, 'executionTimeMillisEstimate': 286826, 'works': 2000103, 'advanced': 101, 'needTime': 2000002, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 0, 'invalidates': 0, 'sortPattern': {'_id': -1}, 'memUsage': 9056307, 'memLimit': 33554432, 'limitAmount': 1000, 'inputStage': {'stage': 'SORT_KEY_GENERATOR', 'nReturned': 2000000, 'executionTimeMillisEstimate': 284785, 'works': 2000002, 'advanced': 2000000, 'needTime': 1, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'inputStage': {'stage': 'FETCH', 'nReturned': 2000000, 'executionTimeMillisEstimate': 128225, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'docsExamined': 2000000, 'alreadyHasObj': 0, 'inputStage': {'stage': 'IXSCAN', 'nReturned': 2000000, 'executionTimeMillisEstimate': 1579, 'works': 2000001, 'advanced': 2000000, 'needTime': 0, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 1, 'invalidates': 0, 'keyPattern': {'timestamp': 1.0, '_id': -1.0}, 'indexName': 'timestamp_1__id_-1', 'isMultiKey': False, 'multiKeyPaths': {'timestamp': [], '_id': []}, 'isUnique': False, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'forward', 'indexBounds': {'timestamp': ['[new Date(1530403200000), new Date(1530576000000)]'], '_id': ['[MaxKey, MinKey]']}, 'keysExamined': 2000000, 'seeks': 1, 'dupsTested': 0, 'dupsDropped': 0, 'seenInvalidated': 0}}}}},
    {'nReturned': 0, 'executionTimeMillisEstimate': 126373, 'totalKeysExamined': 2000103, 'totalDocsExamined': 2000103, 'executionStages': {'stage': 'LIMIT', 'nReturned': 0, 'executionTimeMillisEstimate': 126373, 'works': 2000103, 'advanced': 0, 'needTime': 2000103, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 0, 'invalidates': 0, 'limitAmount': 1000, 'inputStage': {'stage': 'FETCH', 'filter': {'$and': [{'timestamp': {'$lte': datetime.datetime(2018, 7, 3, 0, 0)}}, {'timestamp': {'$gte': datetime.datetime(2018, 7, 1, 0, 0)}}]}, 'nReturned': 0, 'executionTimeMillisEstimate': 126232, 'works': 2000103, 'advanced': 0, 'needTime': 2000103, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 0, 'invalidates': 0, 'docsExamined': 2000103, 'alreadyHasObj': 0, 'inputStage': {'stage': 'IXSCAN', 'nReturned': 2000103, 'executionTimeMillisEstimate': 2205, 'works': 2000103, 'advanced': 2000103, 'needTime': 0, 'needYield': 0, 'saveState': 53750, 'restoreState': 53750, 'isEOF': 0, 'invalidates': 0, 'keyPattern': {'_id': 1}, 'indexName': '_id_', 'isMultiKey': False, 'multiKeyPaths': {'_id': []}, 'isUnique': True, 'isSparse': False, 'isPartial': False, 'indexVersion': 2, 'direction': 'backward', 'indexBounds': {'_id': ['[MaxKey, MinKey]']}, 'keysExamined': 2000103, 'seeks': 1, 'dupsTested': 0, 'dupsDropped': 0, 'seenInvalidated': 0}}}}]},
 'serverInfo': {'host': 'reterius-pc-MacBook-Pro.local', 'port': 27017, 'version': '4.0.3', 'gitVersion': '7ea530946fa7880364d88c8d8b6026bbc9ffa48c'}, 'ok': 1.0}
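An explain result this large is easier to read if you only list the stage names of a plan from top to bottom. The sketch below does that over a plain dict; `stage_chain` and `has_in_memory_sort` are hypothetical helper names, the sample plan is a trimmed copy of the winning plan above, and the walker assumes each stage has at most one `inputStage` (plans that use `inputStages` are not handled).

```python
def stage_chain(plan):
    """Return the stage names from the top of a plan down its inputStage chain."""
    stages = []
    while plan is not None:
        stages.append(plan["stage"])
        plan = plan.get("inputStage")  # None once the leaf stage is reached
    return stages


def has_in_memory_sort(plan):
    """True when the plan has a SORT stage, i.e. the index did not provide the order."""
    return "SORT" in stage_chain(plan)


# Trimmed copy of the winning plan from the explain output above.
winning_plan = {
    "stage": "SORT",
    "inputStage": {
        "stage": "SORT_KEY_GENERATOR",
        "inputStage": {
            "stage": "FETCH",
            "inputStage": {"stage": "IXSCAN", "indexName": "timestamp_-1"},
        },
    },
}

print(stage_chain(winning_plan))
print(has_in_memory_sort(winning_plan))
```

In this output the winning plan scans timestamp_-1 and then sorts all 2,000,000 fetched documents in memory, which matches the executionStats figures above.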
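【Comments】:
Not sure why, but the winning plan does not use your compound index, so there is no index for the sort. Do you need the single-field timestamp index? Maybe try dropping it. But I really don't know why it doesn't use the compound index...
***.com/questions/45497433/… explains it at least a little...
Hm, I tried it locally, but the query planner still prefers not to use the compound index... no idea why.
@RichieK Interesting, I don't know.
【Answer 3】: The question is a few months old, but in case you still need an answer (or someone else does):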
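You are sorting on the field "_id", but it is the second field (a non-prefix) of the index { timestamp: 1, _id: -1 }. A sort on a non-prefix of an index can use the index only when the query has equality conditions on all the index keys that precede the sort keys. Your condition on timestamp is a range, which is why your sort does not use the compound index.
Change your index to { _id: -1, timestamp: 1 }, then check whether your explain output uses that index. Alternatively, if it serves your purpose, you could sort on the "timestamp" field instead.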
Reference: https://docs.mongodb.com/manual/tutorial/sort-results-with-indexes/#sort-and-non-prefix-subset-of-an-index
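The prefix rule in this answer can be illustrated without a database. The sketch below models a { timestamp: 1, _id: -1 } index as a sorted list of (timestamp, _id) pairs; the six sample pairs and the helper names `scan` and `is_descending` are made up for illustration, and this is only a toy model of index ordering, not how MongoDB stores an index.

```python
# Toy documents as (timestamp, _id) pairs; the values are invented.
docs = [(1, 10), (1, 12), (2, 11), (2, 15), (3, 13), (3, 14)]

# Entries ordered like a { timestamp: 1, _id: -1 } index:
# timestamp ascending, then _id descending within each timestamp.
index = sorted(docs, key=lambda d: (d[0], -d[1]))


def scan(entries, lo, hi):
    """Return _id values with lo <= timestamp <= hi, in index order."""
    return [_id for ts, _id in entries if lo <= ts <= hi]


def is_descending(values):
    """True when every value is >= the one that follows it."""
    return all(a >= b for a, b in zip(values, values[1:]))


range_ids = scan(index, 1, 3)  # range condition on timestamp
equal_ids = scan(index, 2, 2)  # equality condition on timestamp

print(range_ids, is_descending(range_ids))  # [12, 10, 15, 11, 14, 13] False
print(equal_ids, is_descending(equal_ids))  # [15, 11] True
```

With a range on timestamp, the _id values come back descending only within each timestamp, so a separate sort is still needed; with a single timestamp value they are already in order.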