Aggregate documents in MongoDB of double nested arrays

Posted: 2019-05-10 18:54:57

Question:

I'm trying to count documents under different conditions. Here is a simplified collection of texts (documents):

{
  "teamId": "1",
  "stage": "0",
  "answeredBy": [userId_1, userId_2],
  "skippedBy": [userId_3],
  "answers": []
},
{
  "teamId": "1",
  "stage": "0",
  "answeredBy": [userId_2],
  "skippedBy": [userId_1],
  "answers": []
},
{
  "teamId": "1",
  "stage": "0",
  "answeredBy": [userId_3],
  "skippedBy": [userId_2],
  "answers": []
},
{
  "teamId": "1",
  "stage": "1",
  "answeredBy": [userId_3],
  "skippedBy": [userId_1, userId_2],
  "answers": [
    { "readBy": [userId_1] },
    { "readBy": [userId_1, userId_2] },
    { "readBy": [userId_3, userId_1] }
  ]
},
{
  "teamId": "1",
  "stage": "1",
  "answeredBy": [userId_3],
  "skippedBy": [userId_1, userId_2],
  "answers": [
    { "readBy": [userId_1] },
    { "readBy": [userId_1, userId_2] },
    { "readBy": [userId_3] }
  ]
}

I want a single query that, for a given user ID, stage, and team ID, counts the following (so the first $match has to filter by team ID and by stage "0" or "1"):

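- how many stage "0" documents contain the user ID in either the answeredBy or the skippedBy array (I call these documents "Answered");
- how many stage "0" documents contain the user ID in neither the answeredBy nor the skippedBy array (I call these documents "Unanswered");
- how many stage "1" documents have at least one readBy array inside the answers array that does not contain the user (I call these documents "UnRead").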

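I've tried several approaches, but the hardest part is iterating over the readBy arrays nested inside the answers array, finding any that does not contain the given user, and counting that document as unread.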
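The three rules can be pinned down in plain JavaScript before translating them into an aggregation pipeline. This is only a sketch under stated assumptions: `docs`, `classify`, and `countStatuses` are hypothetical names, the sample data mirrors the documents above with user IDs written as strings, and a stage "1" document in which every readBy array contains the user is labelled "read", a label the question itself does not define.

```javascript
// Sample documents from the question, with user IDs as plain strings (assumption).
const docs = [
  { teamId: "1", stage: "0", answeredBy: ["userId_1", "userId_2"], skippedBy: ["userId_3"], answers: [] },
  { teamId: "1", stage: "0", answeredBy: ["userId_2"], skippedBy: ["userId_1"], answers: [] },
  { teamId: "1", stage: "0", answeredBy: ["userId_3"], skippedBy: ["userId_2"], answers: [] },
  { teamId: "1", stage: "1", answeredBy: ["userId_3"], skippedBy: ["userId_1", "userId_2"],
    answers: [{ readBy: ["userId_1"] }, { readBy: ["userId_1", "userId_2"] }, { readBy: ["userId_3", "userId_1"] }] },
  { teamId: "1", stage: "1", answeredBy: ["userId_3"], skippedBy: ["userId_1", "userId_2"],
    answers: [{ readBy: ["userId_1"] }, { readBy: ["userId_1", "userId_2"] }, { readBy: ["userId_3"] }] },
];

// Hypothetical helper: label one document for one user.
function classify(doc, userId) {
  if (doc.stage === "0") {
    const touched = doc.answeredBy.includes(userId) || doc.skippedBy.includes(userId);
    return touched ? "answered" : "unanswered";
  }
  // Stage "1": unread as soon as one nested readBy array lacks the user.
  const hasUnread = doc.answers.some((answer) => !answer.readBy.includes(userId));
  return hasUnread ? "unread" : "read";
}

// Hypothetical helper: filter by team and stage (the $match step), then count labels.
function countStatuses(allDocs, teamId, userId) {
  const counts = { answered: 0, unanswered: 0, unread: 0, read: 0 };
  for (const doc of allDocs) {
    if (doc.teamId !== teamId) continue;
    if (doc.stage !== "0" && doc.stage !== "1") continue;
    counts[classify(doc, userId)] += 1;
  }
  return counts;
}

console.log(countStatuses(docs, "1", "userId_1"));
```

For userId_1 this counts 2 answered, 1 unanswered, and 1 unread document, plus 1 document under the extra "read" label.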

Possible results (either shape would work):

{
   answered: 2,
   unanswered: 1,
   unread: 1
}

[
  { _id: 'answered', count: 2 },
  { _id: 'unanswered', count: 1 },
  { _id: 'unread', count: 1 }
]

I got stuck after writing this query, because I don't know how to iterate over the readBy arrays:

db.texts.aggregate([
  { $match: { teamId: 1, $or: [{ currStage: 0 }, { currStage: 1 }] } },
  { $project: { 'stage': { $switch: { branches: [
    { case: {
        $and: [ { $eq: [ '$currStage', 0 ] },
        { $not: [ { $or: [ { $in: [ userId_1, '$answeredBy' ] },
        { $in: [ userId_1, '$skippedBy' ] } ] } ] } ] },
      then: 'unanswered' },
    { case: {
        $and: [ { $eq: [ '$currStage', 0 ] },
        { $or: [ { $in: [ userId_1, '$answeredBy' ] },
        { $in: [ userId_1, '$skippedBy' ] } ] } ] },
      then: 'answered' },
    { case: {
        $and: [ { $eq: [ '$currStage', 1 ] },
        { $not: [ { $in: [ userId_1, '$answers.readBy' ] } ] } ] },
      then: 'unread' },
  ] } } } },
  { $group: { _id: '$stage', count: { $sum: 1 } } },
]);

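Comments:

What are the criteria for these conditions? "how many texts was answered or skipped (I called it "Answered"); how many texts was unanswered or not skipped (I called it "Unanswered"); how many texts has all answers read by set user (I called it "Read")". Can you explain which fields we can use for unanswered, skipped, and Read?

@AnthonyWinzlet I have a user ID (currently userId_1). For this particular user I want to count how many texts he answered or skipped, and how many have at least one unread answer (I made a mistake: it should be unread, not read).

OK, so for answered I would look at answeredBy, for unread at answers.readBy, and for skipped at skippedBy.

It is present in answeredBy, skippedBy, and answers.readBy. The biggest problem is that I need to check all the readBy arrays of one text (a document of the collection).

@AnthonyWinzlet I kindly ask you to look at my question again; I have made some changes to the expected result. I apologize for the inconvenience.

Answer 1: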

Try this. I'm assuming userid = userId_1.

db.getCollection('answers').aggregate([
  { $match: { teamId: '1', $or: [{ stage: '0' }, { stage: '1' }] } },
  { $project: {
      // per-document flags: answered if the user is in answeredBy or skippedBy
      counts: { $cond: [
        { $or: [{ $in: ["userId_1", "$answeredBy"] }, { $in: ["userId_1", "$skippedBy"] }] },
        { $literal: { answered: 1, unanswered: 0 } },
        { $literal: { answered: 0, unanswered: 1 } }
      ] },
      // product stays 1 only while no readBy array contains the user
      unread: { $cond: [
        { $gt: [{ $reduce: {
            input: "$answers",
            initialValue: 1,
            in: { $multiply: ["$$value",
              { $cond: [
                { $in: ["userId_1", "$$this.readBy"] },
                { $literal: 0 },
                { $literal: 1 }
              ] }
            ] }
          } },
          0
        ] },
        { $literal: 1 },
        { $literal: 0 }
      ] }
  } },
  { $group: { _id: null, answered: { $sum: "$counts.answered" }, unanswered: { $sum: "$counts.unanswered" }, unread: { $sum: "$unread" } } }
])
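To see what the `unread` expression in this query evaluates to, here is a plain JavaScript emulation of the `$reduce` / `$multiply` / `$cond` chain. It is a sketch only: `unreadFlag` is a hypothetical helper, and user IDs are assumed to be strings. It suggests the flag is 1 only when the user appears in none of the readBy arrays, and also when the answers array is empty, which is not the same as the question's rule of at least one readBy array lacking the user.

```javascript
// Hypothetical helper emulating the `unread` expression for one document.
function unreadFlag(answers, userId) {
  // $reduce with initialValue 1: multiply by 0 when the user is in readBy, by 1 otherwise.
  const product = answers.reduce(
    (value, answer) => value * (answer.readBy.includes(userId) ? 0 : 1),
    1
  );
  // Outer $cond on { $gt: [product, 0] }.
  return product > 0 ? 1 : 0;
}

// Empty answers array (every stage "0" sample document): the initial value 1 survives.
console.log(unreadFlag([], "userId_1")); // 1

// One answer read, one not: the product collapses to 0, so the document is not counted.
console.log(unreadFlag([{ readBy: ["userId_1"] }, { readBy: ["userId_3"] }], "userId_1")); // 0

// No answer read at all: counted as unread.
console.log(unreadFlag([{ readBy: ["userId_2"] }, { readBy: ["userId_3"] }], "userId_1")); // 1
```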

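Comments:

This solution looks interesting, but it does not give me the result I need. For example, after running the query against my current database I got: {"_id":null,"answered":5,"unanswered":0,"read":2}. According to your answer there are 7 documents in my collection, but I only have 5.

I don't see where you look for "userId_1" in each array. That is the main task: finding where this user appears.

I'm posting the same comment as on the other suggested solution. Your solution counts both of the last two documents in the collection as read, even though one of them does not contain userId_1 in every readBy array. When at least one readBy array does not contain userId_1, the document is unread.

I kindly ask you to look at my question again; I have made some changes to the expected result. I apologize for the inconvenience.

This is what I got after running the query: {"_id":null,"answered":4,"unanswered":0,"unread":3}. It is not what I expected. I guess my question was not clear. I have edited it again; maybe it is clearer now. Sorry again, I don't have much experience posting questions here.

Answer 2: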

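Here is my working solution. Thanks to everyone who tried to solve it and helped me.

db.test.aggregate([
  // keep stage "0" documents, plus stage "1" documents with at least one
  // answer whose readBy array does not contain userId_1
  { $match: { teamId: "1", $or: [{ stage: "0" }, { stage: "1", "answers": { $elemMatch: { "readBy": { $nin: ["userId_1"] } } } }] } },
  { $project: { 'stage': { $switch: { branches: [
    { case: {
        $and: [ { $eq: [ '$stage', "0" ] },
        { $not: [ { $or: [ { $in: [ "userId_1", '$answeredBy' ] },
        { $in: [ "userId_1", '$skippedBy' ] } ] } ] } ] },
      then: 'unanswered' },
    { case: {
        $and: [ { $eq: [ '$stage', "0" ] },
        { $or: [ { $in: [ "userId_1", '$answeredBy' ] },
        { $in: [ "userId_1", '$skippedBy' ] } ] } ] },
      then: 'answered' },
    // every stage "1" document that survived $match is unread
    { case: { $eq: [ '$stage', "1" ] },
      then: 'unread' },
  ] } } } },
  { $group: { _id: '$stage', count: { $sum: 1 } } },
])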

Maybe I should look for a better solution, but for now this is what I need.
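One possible variant keeps the readBy check inside `$project` instead of `$match`, using `$map` to turn each answer into a "not read by this user" boolean and `$anyElementTrue` to test whether any of them is true. This is a sketch that has not been run against the asker's data; `userId` and `pipeline` are hypothetical names, it assumes every document has answeredBy, skippedBy, and answers arrays with a readBy array in each answer, and it adds a "read" group that the original solution filters out.

```javascript
const userId = "userId_1"; // assumption: IDs are stored as strings

const pipeline = [
  { $match: { teamId: "1", stage: { $in: ["0", "1"] } } },
  { $project: { status: { $switch: {
      branches: [
        // stage "0" and the user answered or skipped
        { case: { $and: [
            { $eq: ["$stage", "0"] },
            { $or: [{ $in: [userId, "$answeredBy"] }, { $in: [userId, "$skippedBy"] }] }
          ] },
          then: "answered" },
        // any remaining stage "0" document
        { case: { $eq: ["$stage", "0"] }, then: "unanswered" },
        // stage "1": at least one nested readBy array lacks the user
        { case: { $anyElementTrue: [{ $map: {
            input: "$answers",
            as: "a",
            in: { $not: [{ $in: [userId, "$$a.readBy"] }] }
          } }] },
          then: "unread" }
      ],
      // stage "1" documents where every answer was read
      default: "read"
  } } } },
  { $group: { _id: "$status", count: { $sum: 1 } } }
];

// In the mongo shell: db.test.aggregate(pipeline)
```

The branches are evaluated in order, so the second branch only sees stage "0" documents that the first one did not match, and the `default` avoids a `$switch` error for fully read stage "1" documents.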
