MongoDB $reduce (aggregation): group with the sum of nested documents in arrays and count by group

Posted: 2020-01-13 06:30:07

Question:

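MongoDB aggregation framework query using $group, $project, $addFields, and $reduce.

Use case: I have multiple documents in a collection, each with an array of nested documents, and I need a result grouped by a key, with the sum over each group's items as a cumulative volume. In addition, there is a match parameter on the year (of a date): only documents from the matching year should be grouped, and the sum of their volumes (from the nested document arrays) returned.

The documents in the collection: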


    {
        "_id": "1",
        "LSD": {
            "name": "TDL 05",
            "LSDNumber": "031"
        },
        "POD": [
            {
                "Volume": 35.40,
                "VolUnit": "m3"
            },
            {
                "Volume": 20.75,
                "VolUnit": "m3"
            },
            {
                "Volume": 15,
                "VolUnit": "m3"
            }
        ],
        "createdon": {
            "$date": "2014-08-02T18:49:17.000Z"
        }
    },
    {
        "_id": "2",
        "LSD": {
            "name": "Stock Watering",
            "LSDNumber": "01"
        },
        "POD": [
            {
                "Volume": 105,
                "VolUnit": "m3"
            },
            {
                "Volume": 70,
                "VolUnit": "m3"
            },
            {
                "Volume": 35,
                "VolUnit": "m3"
            }
        ],
        "createdon": {
            "$date": "2014-08-02T18:49:17.000Z"
        }
    },
    {
        "_id": "3",
        "LSD": {
            "name": "TDL 30 Stock Water",
            "LSDNumber": "030"
        },
        "POD": [
            {
                "Volume": 87,
                "VolUnit": "m3"
            }
        ],
        "createdon": {
            "$date": "2019-08-02T18:49:17.000Z"
        }
    },
    {
        "_id": "4",
        "LSD": {
            "name": "TDL 30 Stock Water",
            "LSDNumber": "030"
        },
        "POD": [
            {
                "Volume": 25.12,
                "VolUnit": "m3"
            }
        ],
        "createdon": {
            "$date": "2019-08-02T18:49:17.000Z"
        }
    },
    {
        "_id": "5",
        "LSD": {
            "name": "TDL 05",
            "LSDNumber": "031"
        },
        "POD": [
            {
                "Volume": 21,
                "VolUnit": "m3"
            }
        ],
        "createdon": {
            "$date": "2014-08-02T18:49:17.000Z"
        }
    }

I have a query (C# Driver 2.0) that groups by "LSD.LSDNumber" and sums "POD.Volume". No match parameter is applied here. This works fine.

Query:


    aggregate([{
        "$group": {
            "_id": "$LSD.LSDNumber",
            "doc": {
                "$push": "$POD"
            },
            "data": {
                "$first": "$$ROOT"
            }
        }
    }, {
        "$addFields": {
            "LSDNumber": "$_id",
            "GroupByDocCount": {
                "$size": "$doc"
            },
            "Cumulative": {
                "$reduce": {
                    "input": "$doc",
                    "initialValue": [],
                    "in": {
                        "$concatArrays": ["$$value", "$$this"]
                    }
                }
            }
        }
    }, {
        "$project": {
            "LSDNumber": 1,
            "GroupByDocCount": 1,
            "CumulativeVol": {
                "$sum": "$Cumulative.Volume"
            }
        }
    }])
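
To make the $reduce step concrete: after $group, "doc" is an array of arrays (one POD array per grouped document), so $size counts documents, while $reduce with $concatArrays flattens them so that $sum can add up every Volume. The sketch below reproduces that data flow in plain JavaScript on three of the sample documents. It is only an illustration, not how the server implements these stages, and the function name summarizeByLsd is hypothetical.

```javascript
// Plain-JavaScript illustration of the pipeline's data flow (not MongoDB code).
// summarizeByLsd is a hypothetical helper name used only for this sketch.
function summarizeByLsd(docs) {
  // $group with $push: collect one POD array per document, keyed by LSD.LSDNumber
  const groups = new Map();
  for (const d of docs) {
    const key = d.LSD.LSDNumber;
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(d.POD);
  }
  return [...groups].map(([LSDNumber, doc]) => {
    // $reduce + $concatArrays: flatten the array of arrays into one array
    const cumulative = doc.reduce((value, current) => value.concat(current), []);
    return {
      LSDNumber,
      GroupByDocCount: doc.length, // $size counts grouped documents
      CumulativeVol: cumulative.reduce((sum, p) => sum + p.Volume, 0), // $sum
    };
  });
}

// Trimmed copies of sample documents 1, 5, and 2
const sample = [
  { LSD: { LSDNumber: "031" }, POD: [{ Volume: 35.4 }, { Volume: 20.75 }, { Volume: 15 }] },
  { LSD: { LSDNumber: "031" }, POD: [{ Volume: 21 }] },
  { LSD: { LSDNumber: "01" }, POD: [{ Volume: 105 }, { Volume: 70 }, { Volume: 35 }] },
];

const summary = summarizeByLsd(sample);
console.log(summary);
```

Note that GroupByDocCount is taken before flattening, which is why it counts documents rather than POD entries.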

The result:

    {
        "LSDNumber": "031",
        "GroupByDocCount": 2,
        "CumulativeVol": 92.15
    },
    {
        "LSDNumber": "030",
        "GroupByDocCount": 2,
        "CumulativeVol": 112.12
    },
    {
        "LSDNumber": "01",
        "GroupByDocCount": 1,
        "CumulativeVol": 210
    }

However, I want to match documents by the year of the "createdon" date, in addition to grouping by LSD.LSDNumber and summing POD.Volume. For example, if the year is 2014, the result should be:

    {
        "LSDNumber": "031",
        "GroupByDocCount": 2,
        "CumulativeVol": 92.15,
        "Year": 2014
    },
    {
        "LSDNumber": "01",
        "GroupByDocCount": 1,
        "CumulativeVol": 210,
        "Year": 2014
    }

The query I tried always returns nothing.


    aggregate([{
        "$project": {
            "LSDNumber": 1,
            "GroupByDocCount": 1,
            "CumulativeVol": {
                "$sum": "$Cumulative.Volume"
            },
            "year": {
                "$year": "$data.createdon"
            }
        }
    }, {
        "$match": {
            "year": 2014
        }
    }, {
        "$group": {
            "_id": "$LSD.LSDNumber",
            "year": {
                "$first": "$year"
            },
            "doc": {
                "$push": "$POD"
            },
            "data": {
                "$first": "$$ROOT"
            }
        }
    }, {
        "$addFields": {
            "LSDNumber": "$_id",
            "yearCreate": "$year",
            "GroupByDocCount": {
                "$size": "$doc"
            },
            "Cumulative": {
                "$reduce": {
                    "input": "$doc",
                    "initialValue": [],
                    "in": {
                        "$concatArrays": ["$$value", "$$this"]
                    }
                }
            }
        }
    }])

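What is going wrong here? Any help would be appreciated!

Comments:

Why is there $data in "$year": "$data.createdon" in the $project stage? It should just be $createdon.

@DaveStSomeWhere Sorry for the late reply. Anyway, I applied the $createdon correction in place of $data.createdon, but the result is the same.

Answer 1:

You can add a Year field in the $addFields stage and then add a $match stage.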


    {
        "$group": {
            "_id": "$LSD.LSDNumber",
            "doc": {
                "$push": "$POD"
            },
            "data": {
                "$first": "$$ROOT"
            }
        }
    }, {
        "$addFields": {
            "LSDNumber": "$_id",
            "GroupByDocCount": {
                "$size": "$doc"
            },
            "Cumulative": {
                "$reduce": {
                    "input": "$doc",
                    "initialValue": [],
                    "in": {
                        "$concatArrays": ["$$value", "$$this"]
                    }
                }
            },
            "Year": {
                "$year": "$data.createdon"
            }
        }
    }, {
        "$match" : { "Year" : 2014 }
    }, {
        "$project": {
            "LSDNumber": 1,
            "GroupByDocCount": 1,
            "CumulativeVol": {
                "$sum": "$Cumulative.Volume"
            },
            "Year" : "$Year"
        }
    }

=== Result ===

    /* 1 */
    {
        "_id" : "01",
        "LSDNumber" : "01",
        "GroupByDocCount" : 1,
        "CumulativeVol" : 210,
        "Year" : 2014
    }

    /* 2 */
    {
        "_id" : "031",
        "LSDNumber" : "031",
        "GroupByDocCount" : 2,
        "CumulativeVol" : 92.15,
        "Year" : 2014
    }
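Answer 2:

A bit late, but here is my answer. We just need to add one extra $project stage to the pipeline at the final stage. That said, @Valijon's answer meets the same requirement.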


    aggregate([{
        "$project": {
            "LSDNumber": "$LSD.LSDNumber",
            "year": {
                "$year": "$createdon"
            },
            // presumably the array shown as "POD" in the sample documents
            "PointOfDiversionVolumeDetails": 1
        }
    }, {
        "$match": {
            "year": 2014
        }
    }, {
        "$group": {
            "_id": "$LSDNumber",
            "doc": {
                "$push": "$PointOfDiversionVolumeDetails"
            }
        }
    }, {
        "$addFields": {
            "GroupByDocCount": {
                "$size": "$doc"
            },
            "Cumulative": {
                "$reduce": {
                    "input": "$doc",
                    "initialValue": [],
                    "in": {
                        "$concatArrays": ["$$value", "$$this"]
                    }
                }
            }
        }
    }, {
        "$project": {
            "CumulativeVol": {
                "$sum": "$Cumulative.Volume"
            },
            "LSDNumber": 1,
            "GroupByDocCount": 1
        }
    }, {
        "$sort": {
            "GroupByDocCount": -1
        }
    }])
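
The year filter at the start of this pipeline can also be written as a plain date-range $match, which avoids computing $year for every document and lets the server use an index on "createdon" if one exists. This is a minimal sketch, assuming "createdon" is a BSON date stored in UTC; yearRangeMatch is a hypothetical helper name.

```javascript
// Sketch: build a $match stage selecting one calendar year (UTC) by date range.
// yearRangeMatch is a hypothetical helper name used only for illustration.
function yearRangeMatch(year) {
  return {
    $match: {
      createdon: {
        $gte: new Date(Date.UTC(year, 0, 1)),    // inclusive: Jan 1 of the year
        $lt: new Date(Date.UTC(year + 1, 0, 1)), // exclusive: Jan 1 of the next year
      },
    },
  };
}

const stage = yearRangeMatch(2014);
console.log(stage.$match.createdon.$gte.toISOString()); // 2014-01-01T00:00:00.000Z
console.log(stage.$match.createdon.$lt.toISOString());  // 2015-01-01T00:00:00.000Z
```

Such a stage could stand in for the leading $project and $match stages, provided the $group stage then reads the grouping key and the array field from their original document paths.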
