MongoDB group by ID and then by date
Posted: 2019-06-06 04:20:20
Question:
I have a collection in my MongoDB database that stores durations for people who belong to groups. It looks like this:
[
  {
    "_id": "5c378eecd11e570240a9b0ac",
    "state": "DRAFT",
    "groupId": "5c378eebd11e570240a9ae49",
    "personId": "5c378eebd11e570240a9aee1",
    "date": "2019-01-07T00:00:00.000Z",
    "duration": 480,
    "__v": 0
  },
  {
    "_id": "5c378eecd11e570240a9b0bb",
    "state": "DRAFT",
    "groupId": "5c378eebd11e570240a9ae58",
    "personId": "5c378eebd11e570240a9aeac",
    "date": "2019-01-07T00:00:00.000Z",
    "duration": 480,
    "__v": 0
  },
  {
    "_id": "5c378eecd11e570240a9b0c5",
    "state": "DRAFT",
    "groupId": "5c378eebd11e570240a9ae3e",
    "personId": "5c378eebd11e570240a9aef6",
    "date": "2019-01-07T00:00:00.000Z",
    "duration": 480,
    "__v": 0
  }
]
I would like to run an aggregation query that returns, for each personId, the durations grouped by day together with the corresponding groupId, like this:
[
  {
    "personId": "5c378eebd11e570240a9aee1",
    "time": [
      {
        "date": "2019-01-07T00:00:00.000Z",
        "entries": [
          {
            "groupId": "5c378eebd11e570240a9ae49",
            "duration": 480,
            "state": "DRAFT"
          }
        ]
      }
    ]
  },
  {
    "personId": "5c378eebd11e570240a9aeac",
    "time": [
      {
        "date": "2019-01-07T00:00:00.000Z",
        "entries": [
          {
            "groupId": "5c378eebd11e570240a9ae58",
            "duration": 480,
            "state": "DRAFT"
          }
        ]
      }
    ]
  },
  {
    "personId": "5c378eebd11e570240a9aef6",
    "time": [
      {
        "date": "2019-01-07T00:00:00.000Z",
        "entries": [
          {
            "groupId": "5c378eebd11e570240a9ae3e",
            "duration": 480,
            "state": "DRAFT"
          }
        ]
      }
    ]
  }
]
So far I have written the following aggregation (I'm using Mongoose, hence the syntax):
Time.aggregate()
  .match({ date: { $gte: new Date(start), $lte: new Date(end) } })
  .group({
    _id: '$personId',
    time: { $push: { date: '$date', duration: '$duration', state: '$state' } },
  })
  .project({ _id: false, personId: '$_id', time: '$time' })
This returns the following:
[
  {
    "personId": "5c378eebd11e570240a9aed1",
    "time": [
      {
        "date": "2019-01-11T00:00:00.000Z",
        "duration": 480,
        "state": "DRAFT"
      },
      {
        "date": "2019-01-11T00:00:00.000Z",
        "duration": 480,
        "state": "DRAFT"
      }
      // ...
    ]
  }
]
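Hopefully you can see that the durations are grouped by personId, but I can't work out how to apply another grouping to the time array, because the dates are duplicated whenever a personId has more than one duration on a given date.
Is it possible to group by ID, push to an array, and then group the values in that array within the aggregation, or does my application need to map/reduce the results?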
Answer 1:
I suggest running two $group stages in succession:
db.time.aggregate([
  // first, group all entries by personId and date
  {
    $group: {
      _id: {
        personId: "$personId",
        date: "$date"
      },
      entries: {
        $push: {
          groupId: "$groupId",
          duration: "$duration",
          state: "$state"
        }
      }
    }
  },
  // then, group previously aggregated entries by personId
  {
    $group: {
      _id: "$_id.personId",
      time: {
        $push: {
          date: "$_id.date",
          entries: "$entries"
        }
      }
    }
  },
  // finally, rename _id to personId
  {
    $project: {
      _id: 0,
      personId: "$_id",
      time: "$time"
    }
  }
])
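To make the effect of the two stages easier to follow, here is a minimal plain-JavaScript sketch that emulates them in memory. It is an illustration only, not how MongoDB executes the pipeline. The sample documents, their shortened IDs, and the function name groupByPersonThenDate are made up for this example, and dates are kept as strings here rather than Date values.

```javascript
// Hypothetical sample documents in the same shape as the question's collection.
const docs = [
  { groupId: "g1", personId: "p1", date: "2019-01-07T00:00:00.000Z", duration: 480, state: "DRAFT" },
  { groupId: "g2", personId: "p1", date: "2019-01-07T00:00:00.000Z", duration: 120, state: "DRAFT" },
  { groupId: "g1", personId: "p1", date: "2019-01-08T00:00:00.000Z", duration: 480, state: "DRAFT" },
  { groupId: "g3", personId: "p2", date: "2019-01-07T00:00:00.000Z", duration: 480, state: "DRAFT" },
];

function groupByPersonThenDate(documents) {
  // Step 1 (first $group): one bucket per (personId, date) pair.
  const byPersonAndDate = new Map();
  for (const { personId, date, groupId, duration, state } of documents) {
    const key = JSON.stringify([personId, date]);
    if (!byPersonAndDate.has(key)) {
      byPersonAndDate.set(key, { personId, date, entries: [] });
    }
    byPersonAndDate.get(key).entries.push({ groupId, duration, state });
  }

  // Step 2 (second $group): one bucket per personId, collecting the per-date buckets.
  const byPerson = new Map();
  for (const { personId, date, entries } of byPersonAndDate.values()) {
    if (!byPerson.has(personId)) {
      byPerson.set(personId, { personId, time: [] });
    }
    byPerson.get(personId).time.push({ date, entries });
  }

  // Step 3 ($project): the buckets already carry personId instead of _id.
  return [...byPerson.values()];
}

const grouped = groupByPersonThenDate(docs);
console.log(JSON.stringify(grouped, null, 2));
```

Each date now appears once per person, with every duration for that day collected under entries, which is the shape the question asks for.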
In Mongoose it should look like this:
Time.aggregate()
  .match({
    date: {
      $gte: new Date(start),
      $lte: new Date(end)
    }
  })
  .group({
    _id: {
      personId: '$personId',
      date: '$date'
    },
    entries: {
      $push: {
        groupId: '$groupId',
        duration: '$duration',
        state: '$state'
      }
    }
  })
  .group({
    _id: '$_id.personId',
    time: {
      $push: {
        date: '$_id.date',
        entries: '$entries'
      }
    }
  })
  .project({
    _id: false,
    personId: '$_id',
    time: '$time'
  })
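If the order of entries and time matters, one possible approach (the one the asker reports using in the comments on this answer) is to sort by groupId after the match and by date after the first group. The sketch below builds such a pipeline as a plain array. The helper name buildPipeline and the example date range are hypothetical, and the approach assumes that $push keeps documents in the order they arrive, which is worth checking against the MongoDB documentation for your server version.

```javascript
// Builds the aggregation pipeline as a plain array of stages.
// `start` and `end` can be anything the Date constructor accepts.
function buildPipeline(start, end) {
  return [
    { $match: { date: { $gte: new Date(start), $lte: new Date(end) } } },
    // Sort before grouping so entries are pushed in groupId order (assumption).
    { $sort: { groupId: 1 } },
    {
      $group: {
        _id: { personId: "$personId", date: "$date" },
        entries: {
          $push: { groupId: "$groupId", duration: "$duration", state: "$state" },
        },
      },
    },
    // Sort the per-day buckets so time is pushed in date order (assumption).
    { $sort: { "_id.date": 1 } },
    {
      $group: {
        _id: "$_id.personId",
        time: { $push: { date: "$_id.date", entries: "$entries" } },
      },
    },
    { $project: { _id: false, personId: "$_id", time: "$time" } },
  ];
}

const pipeline = buildPipeline("2019-01-01", "2019-01-31");
// Print the stage names in order to check the pipeline layout.
console.log(pipeline.map((stage) => Object.keys(stage)[0]));
```

With Mongoose, the array could then be passed as Time.aggregate(buildPipeline(start, end)), since Model.aggregate accepts a pipeline array.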
Comments:
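This works great, thank you! Do you know whether there is a way to sort the values in entries by the groupId property, and then sort the values in time by date?
Managed to figure it out: I sorted by groupId after the match, and then by date after the first group.
@edcs Actually, you only need a single $sort stage. You can sort the data by both fields at once: { $sort: { date: 1, groupId: 1 } }.
Answer 2: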
db.getCollection("dummyCollection").aggregate(
  [
    {
      "$group" : {
        "_id" : "$personId",
        "time" : {
          "$push" : {
            "date" : "$date",
            "duration" : "$duration",
            "state" : "$state"
          }
        }
      }
    },
    {
      "$project" : {
        "_id" : false,
        "personId" : "$_id",
        "time" : "$time"
      }
    },
    {
      "$unwind" : "$time"
    },
    {
      "$group" : {
        "_id" : "$time.date",
        "time" : {
          "$addToSet" : "$time"
        }
      }
    }
  ]
);
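This uses $addToSet, which returns an array of all the unique values that result from applying an expression to each document in a group of documents that share the same group-by key.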
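The difference between $push and $addToSet can be sketched in plain JavaScript. This is a rough emulation under stated assumptions, not MongoDB's implementation: the function names pushAll and addToSet and the sample values are made up for the example, and equality is approximated with JSON.stringify, whereas MongoDB compares BSON values and does not guarantee the order of the resulting array.

```javascript
// Emulates $push: keeps every value, duplicates included.
function pushAll(values) {
  return [...values];
}

// Emulates $addToSet: keeps one copy of each distinct value.
// Equality is approximated by comparing JSON strings (a simplification).
function addToSet(values) {
  const seen = new Map();
  for (const value of values) {
    const key = JSON.stringify(value);
    if (!seen.has(key)) {
      seen.set(key, value);
    }
  }
  return [...seen.values()];
}

// Hypothetical unwound time values that share the same date.
const times = [
  { date: "2019-01-11T00:00:00.000Z", duration: 480, state: "DRAFT" },
  { date: "2019-01-11T00:00:00.000Z", duration: 480, state: "DRAFT" },
  { date: "2019-01-11T00:00:00.000Z", duration: 240, state: "DRAFT" },
];

console.log(pushAll(times).length);
console.log(addToSet(times).length);
```

Note that identical date, duration, and state values collapse into a single element, so in the pipeline above two people with the same duration on the same day would contribute only one element to time.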