MongoDB aggregation: grouping similar documents that are next to each other

Posted: 2016-03-20 12:55:56

[Question]:

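I have a MongoDB collection to which a document of sampled data is added every day. I want to track how a field changes over time.

Using MongoDB aggregation, I want to collapse each run of adjacent, similar items into the first item of that run: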

+--+-------------------------+
|id|field             | date |
+--+-------------------------+
| 1|hello             | date1|
+--+-------------------------+
| 2|foobar            | date2|  \_   Condense these into one row with date2
+--+-------------------------+  /
| 3|foobar            | date3|
+--+-------------------------+
| 4|hello             | date4|
+--+-------------------------+
| 5|world             | date5|  \__   Condense these into a row with date5
+--+-------------------------+  /
| 6|world             | date6|
+--+-------------------------+
| 7|puppies           | date7|
+--+-------------------------+
| 8|kittens           | date8|  \__   Condense these into a row with date8
+--+-------------------------+  /
| 9|kittens           | date9|
+--+-------------------------+

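Is it possible to build a MongoDB aggregation for this?

Here is an answer to a similar question for MySQL: Grouping similar rows next to each other in MySQL

Sample data

The data is already sorted by date.

These documents: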

{ "_id" : "566ee064d56d02e854df756e", "date" : "2015-12-14T15:29:40.432Z", "score" : 59 },
{ "_id" : "566a8c70520d55771f2e9871", "date" : "2015-12-11T08:42:23.880Z", "score" : 60 },
{ "_id" : "566932f5572bd1720db7a4ef", "date" : "2015-12-10T08:08:21.514Z", "score" : 60 },
{ "_id" : "5667e652c021206f34e2c9e4", "date" : "2015-12-09T08:29:06.696Z", "score" : 60 },
{ "_id" : "5666a468cc45e9d9a82b81c9", "date" : "2015-12-08T09:35:35.837Z", "score" : 61 },
{ "_id" : "56653fe099799049b66dab97", "date" : "2015-12-07T08:14:24.494Z", "score" : 60 },
{ "_id" : "5663f6b3b7d0b00b74d9fdf9", "date" : "2015-12-06T08:49:55.299Z", "score" : 60 },
{ "_id" : "56629fb56099dfe31b0c72be", "date" : "2015-12-05T08:26:29.510Z", "score" : 60 }

should be condensed into:

{ "_id" : "566ee064d56d02e854df756e", "date" : "2015-12-14T15:29:40.432Z", "score" : 59 }
{ "_id" : "566a8c70520d55771f2e9871", "date" : "2015-12-11T08:42:23.880Z", "score" : 60 }
{ "_id" : "5666a468cc45e9d9a82b81c9", "date" : "2015-12-08T09:35:35.837Z", "score" : 61 }
{ "_id" : "56653fe099799049b66dab97", "date" : "2015-12-07T08:14:24.494Z", "score" : 60 }

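[Comments]:

How do you define two rows as being next to each other?

@BatScream - I have added sample data. Rows are adjacent when ordered by date.

Are the records sorted by date in descending order before grouping?

@BatScream - Yes. My aggregation pipeline includes a sort.

How about using $skip? docs.mongodb.org/v3.0/reference/operator/aggregation/skip

[Answer 1]: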

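If you do not insist on using the aggregation framework, this can be done by iterating over the query results and comparing each document with the previous one: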

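// Load the documents newest-first; toArray() returns a plain array, not a cursor.
var docs = db.test.find().sort({date: -1}).toArray();
var result = [];
if (docs.length > 0) {
    result.push(docs[0]); // the first document always starts a run
}
for (var i = 1; i < docs.length; i++) {
    // Keep a document only when its score differs from the previous one.
    if (docs[i].score != docs[i-1].score) {
        result.push(docs[i]);
    }
}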

Result:

[
    {
        "_id" : "566ee064d56d02e854df756e",
        "date" : "2015-12-14T15:29:40.432Z",
        "score" : 59
    },
    {
        "_id" : "566a8c70520d55771f2e9871",
        "date" : "2015-12-11T08:42:23.880Z",
        "score" : 60
    },
    {
        "_id" : "5666a468cc45e9d9a82b81c9",
        "date" : "2015-12-08T09:35:35.837Z",
        "score" : 61
    },
    {
        "_id" : "56653fe099799049b66dab97",
        "date" : "2015-12-07T08:14:24.494Z",
        "score" : 60
    }
]
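
For readers who do want to stay inside the aggregation framework, the same "compare with the previous document" idea can probably be expressed with window functions. The following is only a sketch and not part of the original answer: it assumes MongoDB 5.0 or later (where the `$setWindowFields` stage and the `$shift` operator are available, which was not the case when the question was asked), and the field name `prevScore` is a hypothetical helper introduced here for illustration. It also assumes `score` is never null, since null is used as the "no previous document" marker.

```javascript
// Sketch only: assumes MongoDB 5.0+ ($setWindowFields / $shift).
// "prevScore" is a hypothetical temporary field, not from the original post.
const pipeline = [
    {
        // Walk the documents newest-first and attach the score of the
        // document that comes immediately before each one in that order.
        $setWindowFields: {
            sortBy: { date: -1 },
            output: {
                prevScore: {
                    $shift: { output: "$score", by: -1, default: null }
                }
            }
        }
    },
    // Keep only documents whose score differs from the previous one,
    // i.e. the first document of every run of equal scores.
    { $match: { $expr: { $ne: ["$score", "$prevScore"] } } },
    // Re-sort explicitly so the output order does not rely on the window stage.
    { $sort: { date: -1 } },
    // Drop the temporary helper field from the output.
    { $project: { prevScore: 0 } }
];

// In the mongo shell, against the collection used in the answer:
// db.test.aggregate(pipeline)
```

With the sample data from the question, this should keep the same four documents as the loop above, because the first document has no predecessor (its `prevScore` is null) and every later document is kept only when its score changes.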
