MongoDB group by ID and then by date

Posted: 2019-06-06 04:20:20

Question:

I have a collection in my MongoDB database that stores durations for people who belong to groups. It looks like this:

[
    {
        "_id": "5c378eecd11e570240a9b0ac",
        "state": "DRAFT",
        "groupId": "5c378eebd11e570240a9ae49",
        "personId": "5c378eebd11e570240a9aee1",
        "date": "2019-01-07T00:00:00.000Z",
        "duration": 480,
        "__v": 0
    },
    {
        "_id": "5c378eecd11e570240a9b0bb",
        "state": "DRAFT",
        "groupId": "5c378eebd11e570240a9ae58",
        "personId": "5c378eebd11e570240a9aeac",
        "date": "2019-01-07T00:00:00.000Z",
        "duration": 480,
        "__v": 0
    },
    {
        "_id": "5c378eecd11e570240a9b0c5",
        "state": "DRAFT",
        "groupId": "5c378eebd11e570240a9ae3e",
        "personId": "5c378eebd11e570240a9aef6",
        "date": "2019-01-07T00:00:00.000Z",
        "duration": 480,
        "__v": 0
    }
]

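I would like to run an aggregation query that returns a collection of personIds, each with its durations grouped by day along with the corresponding groupId, like this: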

[
    {
        "personId": "5c378eebd11e570240a9aee1",
        "time": [
            {
                "date": "2019-01-07T00:00:00.000Z",
                "entries": [
                    {
                        "groupId": "5c378eebd11e570240a9ae49",
                        "duration": 480,
                        "state": "DRAFT"
                    }
                ]
            }
        ]
    },
    {
        "personId": "5c378eebd11e570240a9aeac",
        "time": [
            {
                "date": "2019-01-07T00:00:00.000Z",
                "entries": [
                    {
                        "groupId": "5c378eebd11e570240a9ae58",
                        "duration": 480,
                        "state": "DRAFT"
                    }
                ]
            }
        ]
    },
    {
        "personId": "5c378eebd11e570240a9aef6",
        "time": [
            {
                "date": "2019-01-07T00:00:00.000Z",
                "entries": [
                    {
                        "groupId": "5c378eebd11e570240a9ae3e",
                        "duration": 480,
                        "state": "DRAFT"
                    }
                ]
            }
        ]
    }
]

So far I have written the following aggregation (I am using Mongoose, hence the syntax):

Time.aggregate()
    .match({ date: { $gte: new Date(start), $lte: new Date(end) } })
    .group({
      _id: '$personId',
      time: { $push: { date: '$date', duration: '$duration', state: '$state' } },
    })
    .project({ _id: false, personId: '$_id', time: '$time' })

It returns the following:

[
    {
        "personId": "5c378eebd11e570240a9aed1",
        "time": [
            {
                "date": "2019-01-11T00:00:00.000Z",
                "duration": 480,
                "state": "DRAFT"
            },
            {
                "date": "2019-01-11T00:00:00.000Z",
                "duration": 480,
                "state": "DRAFT"
            }
        ]
        // ...
    }
]

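As you can see, the durations are grouped by personId, but I can't work out how to apply a second grouping to the time array, because the dates are repeated whenever a personId has more than one duration on a given date.

Is it possible to group by ID, push to an array, and then group the values in that array within the aggregation, or does my application need to map/reduce the results?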


Answer 1:

I suggest running two $group stages in a row:

db.time.aggregate([
  {
    // first, group all entries by personId and date
    $group: {
      _id: {
        personId: "$personId",
        date: "$date"
      },
      entries: {
        $push: {
          groupId: "$groupId",
          duration: "$duration",
          state: "$state"
        }
      }
    }
  },
  {
    // then, group previously aggregated entries by personId
    $group: {
      _id: "$_id.personId",
      time: {
        $push: {
          date: "$_id.date",
          entries: "$entries"
        }
      }
    }
  },
  {
    // finally, rename _id to personId
    $project: {
      _id: 0,
      personId: "$_id",
      time: "$time"
    }
  }
])

In Mongoose it should look like this:

Time.aggregate()
  .match({
    date: {
      $gte: new Date(start),
      $lte: new Date(end)
    }
  })
  .group({
    _id: {
      personId: '$personId',
      date: '$date'
    },
    entries: {
      $push: {
        groupId: '$groupId',
        duration: '$duration',
        state: '$state'
      }
    }
  })
  .group({
    _id: '$_id.personId',
    time: {
      $push: {
        date: '$_id.date',
        entries: '$entries'
      }
    }
  })
  .project({
    _id: false,
    personId: '$_id',
    time: '$time'
  })
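
To make the shape produced by the two grouping stages easier to see, here is a minimal plain-JavaScript sketch that mimics them on an in-memory array. It is only an illustration, not how MongoDB executes the pipeline: the function name groupByPersonThenDate and the shortened IDs (p1, g1, and so on) are hypothetical and do not come from the question.

```javascript
// Hypothetical in-memory sample shaped like the documents in the question.
const docs = [
  { personId: 'p1', groupId: 'g1', date: '2019-01-07T00:00:00.000Z', duration: 480, state: 'DRAFT' },
  { personId: 'p1', groupId: 'g2', date: '2019-01-07T00:00:00.000Z', duration: 120, state: 'DRAFT' },
  { personId: 'p1', groupId: 'g1', date: '2019-01-08T00:00:00.000Z', duration: 480, state: 'DRAFT' },
  { personId: 'p2', groupId: 'g3', date: '2019-01-07T00:00:00.000Z', duration: 240, state: 'DRAFT' },
];

// Mimics "group by personId + date", then "group by personId",
// then the final rename of _id to personId.
function groupByPersonThenDate(items) {
  const byPerson = new Map(); // personId -> Map(date -> entries[])
  for (const { personId, date, groupId, duration, state } of items) {
    if (!byPerson.has(personId)) byPerson.set(personId, new Map());
    const byDate = byPerson.get(personId);
    if (!byDate.has(date)) byDate.set(date, []);
    byDate.get(date).push({ groupId, duration, state });
  }
  // Flatten the nested maps into the { personId, time: [{ date, entries }] } shape.
  return [...byPerson].map(([personId, byDate]) => ({
    personId,
    time: [...byDate].map(([date, entries]) => ({ date, entries })),
  }));
}

const result = groupByPersonThenDate(docs);
console.log(JSON.stringify(result, null, 2));
```

In this sample, p1 ends up with two date buckets, the first holding two entries, which is the nesting the question asks for. Unlike this sketch, $group does not guarantee the order of its output documents.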

Comments:

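This works great, thanks! Do you know if there is a way to sort the values in entries by the groupId property, and then the values in time by date?

Managed to figure it out: I sorted by groupId after the $match stage and then by date after the first $group stage.

@edcs Actually, you only need one $sort call. You can sort the data by both fields at once: { $sort: { date: 1, groupId: 1 } }

Answer 2:
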
db.getCollection("dummyCollection").aggregate(
[
    {
        "$group" : {
            "_id" : "$personId",
            "time" : {
                "$push" : {
                    "date" : "$date",
                    "duration" : "$duration",
                    "state" : "$state"
                }
            }
        }
    },
    {
        "$project" : {
            "_id" : false,
            "personId" : "$_id",
            "time" : "$time"
        }
    },
    {
        "$unwind" : "$time"
    },
    {
        "$group" : {
            "_id" : "$time.date",
            "time" : {
                "$addToSet" : "$time"
            }
        }
    }
]
);

$addToSet returns an array of all the unique values that result from applying an expression to each document in a group of documents that share the same group-by key.
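
The difference between $push and $addToSet can be sketched in plain JavaScript. This is a rough emulation under stated assumptions, not MongoDB's implementation: the helper addToSet and the sample values are hypothetical, and comparing values through JSON.stringify only approximates how MongoDB compares embedded documents.

```javascript
// Hypothetical values collected for one group; the first two are identical.
const values = [
  { date: '2019-01-11T00:00:00.000Z', duration: 480, state: 'DRAFT' },
  { date: '2019-01-11T00:00:00.000Z', duration: 480, state: 'DRAFT' },
  { date: '2019-01-11T00:00:00.000Z', duration: 240, state: 'DRAFT' },
];

// $push-like behaviour: keep every value, duplicates included.
const pushed = values.slice();

// $addToSet-like behaviour: keep each distinct value only once.
function addToSet(items) {
  const seen = new Set();
  const unique = [];
  for (const item of items) {
    const key = JSON.stringify(item); // approximation of document equality
    if (!seen.has(key)) {
      seen.add(key);
      unique.push(item);
    }
  }
  return unique;
}

const asSet = addToSet(values);
console.log(pushed.length, asSet.length); // prints "3 2"
```

One consequence worth checking against your data: two entries with the same date, duration, and state collapse into one under $addToSet, so it may drop records that $push would keep. MongoDB also leaves the order of elements in the $addToSet result unspecified.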
