Mongo aggregation: Return count of distinct values from Key Value object

[Posted]: 2021-07-21 21:13:09

[Question]:

I really hope someone can help me here, this problem is driving me crazy :-D

So I have multiple documents (100,000+) that look like this:

{
    "_id" : ObjectId("60e5ae42fcc92f14c3a41208"),
    "userId" : "xxxx",
    "projectCreator" : {
        "userId" : "xxx|xxxx"
    },
    "hashTags" : [
        "Spring",
        "Java"
    ],
    "projectCategories" : {
        "60d76ef0597444095b8ab4b2" : "Backend",
        "60d76ef0597444095b8ab232" : "Infrastructure"
    },
    "createdDate" : ISODate("2021-07-07T13:38:10.655Z"),
    "updatedAt" : ISODate("2021-07-08T11:48:36.200Z"),

    "_class" : "xxxx.model.project.Project"
}

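I want a query that does the following:

    1. Extract all unique projectCategories values (the string values, not the ids) from all documents in the collection
    2. Count the number of occurrences of each value

So the result should look like this: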

Backend : NUMBER OF OCCURRENCES
FrontEnd : NUMBER OF OCCURRENCES
Infrastructure: NUMBER OF OCCURRENCES

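I "think" I need to do an aggregation, group the values, and then count them, but honestly I can't wrap my head around it.

I tried this query:

db.projects.aggregate([
    { $match: { isDeleted: { $ne: true } } },
    { $match: { projectCategories: { $exists: true, $ne: null } } },
    { $project: { result: { $objectToArray: "$projectCategories" } } },
    { $unwind: "$result" }
])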

This returns:

{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76f1759744461468ab4b8", "v" : "Development Tools" } }
{ "_id" : ObjectId("60c313e2905d344c7dd117f1"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60cfb59f30b2647610a6c931"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60cfb59f30b2647610a6c931"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60cfb69730b2647610a6c932"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76ef0597444095b8ab4b2", "v" : "Backend" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }
{ "_id" : ObjectId("60df83e84d8b6341d49cff4e"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60e5ae42fcc92f14c3a41208"), "result" : { "k" : "60d76ef0597444095b8ab4b2", "v" : "Backend" } }
{ "_id" : ObjectId("60f0abf9f5c82b27af712ad7"), "result" : { "k" : "60d76f2559744477168ab4bb", "v" : "Games" } }
{ "_id" : ObjectId("60f0abf9f5c82b27af712ad7"), "result" : { "k" : "60d76ef659744422d68ab4b3", "v" : "Fullstack" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76f295974444b818ab4bc", "v" : "Apps" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76f0e5974448f038ab4b7", "v" : "Open Source" } }
{ "_id" : ObjectId("60f68d2df9710f58c1e9c872"), "result" : { "k" : "60d76eeb597444b9da8ab4b1", "v" : "Frontend" } }

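To make the shape of that output easier to follow, here is a minimal plain-JavaScript sketch of what the $match, $objectToArray and $unwind stages do to each document. It is not MongoDB code and only imitates the stages in memory; the sample documents and the function name unwindCategories are hypothetical, made up for illustration.

```javascript
// Hypothetical sample documents (not taken from the real collection).
const docs = [
  { _id: "a", projectCategories: { c1: "Backend", c2: "Frontend" } },
  { _id: "b", projectCategories: { c2: "Frontend" } },
  { _id: "c", isDeleted: true, projectCategories: { c1: "Backend" } },
  { _id: "d" }, // no projectCategories field at all
];

// In-memory imitation of the stages in the query above.
function unwindCategories(documents) {
  return documents
    // $match: { isDeleted: { $ne: true } } also keeps documents without the field
    .filter((d) => d.isDeleted !== true)
    // $match: { projectCategories: { $exists: true, $ne: null } }
    .filter((d) => d.projectCategories != null)
    // $objectToArray turns { id: name } into [{ k: id, v: name }];
    // $unwind then emits one row per array element.
    .flatMap((d) =>
      Object.entries(d.projectCategories).map(([k, v]) => ({
        _id: d._id,
        result: { k, v },
      }))
    );
}

const unwound = unwindCategories(docs);
console.log(unwound);
```

Each input document yields one row per category it has, which is why the same _id appears several times in the output above.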
So where I'm stuck now is how to go from the unwound rows to output like this:

Backend : NUMBER OF OCCURRENCES
FrontEnd : NUMBER OF OCCURRENCES
Infrastructure: NUMBER OF OCCURRENCES

Can someone help me with this?

Thanks!

Update: I've managed to get close with this query:

db.projects.aggregate([
    { $match: { isDeleted: { $ne: true } } },
    { $match: { projectCategories: { $exists: true, $ne: null } } },
    { $project: { result: { $objectToArray: "$projectCategories" } } },
    { $unwind: "$result" },
    { $group: { _id: "$result.v", count: { $sum: 1 } } }
])

But now the output looks like this:

{ "_id" : "Development Tools", "count" : 1 }
{ "_id" : "Games", "count" : 1 }
{ "_id" : "Fullstack", "count" : 3 }
{ "_id" : "Open Source", "count" : 1 }
{ "_id" : "Frontend", "count" : 4 }
{ "_id" : "Apps", "count" : 3 }
{ "_id" : "Backend", "count" : 2 }

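The counting step, { $group: { _id: "$result.v", count: { $sum: 1 } } }, buckets the unwound rows by category name and adds 1 per row. A rough plain-JavaScript equivalent, run on the 15 "v" values listed in the earlier output, gives the same counts. The helper name countByValue is hypothetical, and this is an in-memory imitation rather than what the server actually executes.

```javascript
// The "v" values of the 15 unwound rows shown earlier, in the same order.
const values = [
  "Apps", "Development Tools", "Frontend", "Frontend", "Fullstack",
  "Apps", "Backend", "Frontend", "Fullstack", "Backend",
  "Games", "Fullstack", "Apps", "Open Source", "Frontend",
];
const rows = values.map((v) => ({ result: { v } }));

// Imitates { $group: { _id: "$result.v", count: { $sum: 1 } } }.
function countByValue(unwoundRows) {
  const counts = new Map();
  for (const row of unwoundRows) {
    const name = row.result.v;
    counts.set(name, (counts.get(name) || 0) + 1);
  }
  // One output document per distinct name, shaped like the $group output.
  return Array.from(counts, ([_id, count]) => ({ _id, count }));
}

const grouped = countByValue(rows);
console.log(grouped);
```

The field is called _id because $group always stores its grouping key there; MongoDB does not guarantee the order of the groups unless a $sort stage follows.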
Is it possible to remove the _id?


[Answer 1]:

You can group a second time to push the keys and values into an array, then use $replaceRoot with $arrayToObject:

{ $group: {
    _id: null,
    results: { $push: { k: "$_id", v: "$count" } }
} },
{ $replaceRoot: { newRoot: { $arrayToObject: "$results" } } }
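Put together, the whole pipeline might look like the sketch below, assuming the two stages from this answer are simply appended to the query from the question's update. The variable names pipeline, grouped and reshaped are illustrative. The aggregate call only runs inside the mongo shell, where db is defined; the last part imitates the final two stages in plain JavaScript, using the counts shown in the question.

```javascript
// Stages from the question's update, followed by the two stages from this answer.
const pipeline = [
  { $match: { isDeleted: { $ne: true } } },
  { $match: { projectCategories: { $exists: true, $ne: null } } },
  { $project: { result: { $objectToArray: "$projectCategories" } } },
  { $unwind: "$result" },
  { $group: { _id: "$result.v", count: { $sum: 1 } } },
  // Collect every { name, count } pair into one array on a single document...
  { $group: { _id: null, results: { $push: { k: "$_id", v: "$count" } } } },
  // ...then turn that array into an object and make it the whole document.
  { $replaceRoot: { newRoot: { $arrayToObject: "$results" } } },
];

// Only executed in the mongo shell, where `db` and `printjson` exist.
if (typeof db !== "undefined") {
  printjson(db.projects.aggregate(pipeline).toArray());
}

// Plain-JavaScript imitation of the last two stages, fed with the
// counts from the question's update.
const grouped = [
  { _id: "Development Tools", count: 1 },
  { _id: "Games", count: 1 },
  { _id: "Fullstack", count: 3 },
  { _id: "Open Source", count: 1 },
  { _id: "Frontend", count: 4 },
  { _id: "Apps", count: 3 },
  { _id: "Backend", count: 2 },
];
const reshaped = Object.fromEntries(grouped.map((g) => [g._id, g.count]));
console.log(reshaped);
```

The result is a single document with one field per category name and no _id. Because everything ends up in one document, this approach suits a small set of distinct categories; a single BSON document is limited to 16 MB.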

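[Comments]:

Actually Joe, if you don't mind, one more question. What if I want to get both k and v back, so that a result looks like this: { "_id" : "60d76f295974444b818ab4bc", "name" : "backend", "count" : 2 }? How can I change my query to get that?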
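The thread has no reply to this follow-up. One possible approach, offered as a sketch rather than the answerer's method, is to group on the key instead of the value and carry the name along with $first. This assumes every category id always maps to the same name. The names groupByKey and countByKey are hypothetical, and the plain-JavaScript part merely imitates the stage on three rows taken from the question's output.

```javascript
// Assumption: this stage replaces the $group stage in the question's
// update; the second $group and $replaceRoot from the answer are not used.
const groupByKey = {
  $group: {
    _id: "$result.k",              // the category id
    name: { $first: "$result.v" }, // the name seen for that id
    count: { $sum: 1 },
  },
};

// A few rows from the question's unwound output.
const rows = [
  { k: "60d76ef0597444095b8ab4b2", v: "Backend" },
  { k: "60d76eeb597444b9da8ab4b1", v: "Frontend" },
  { k: "60d76ef0597444095b8ab4b2", v: "Backend" },
];

// In-memory imitation of the stage above.
function countByKey(unwoundRows) {
  const byKey = new Map();
  for (const { k, v } of unwoundRows) {
    // The first name seen for a key is kept, like $first.
    const entry = byKey.get(k) || { _id: k, name: v, count: 0 };
    entry.count += 1;
    byKey.set(k, entry);
  }
  return Array.from(byKey.values());
}

const perKey = countByKey(rows);
console.log(perKey);
```

If the same id could carry different names across documents, grouping on a compound key such as { k: "$result.k", v: "$result.v" } would keep those cases separate instead.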
