MongoDB aggregation pipeline: how to limit a group push

[Title]: Mongodb aggregation pipeline how to limit a group push
[Posted]: 2014-08-26 23:00:58
[Question]:

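I can't find a way to limit the number of elements pushed in a group function with the aggregation pipeline. Is this possible? A small example:

Data: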

[
    {
        "submitted": date,
        "loc": { "lng": 13.739251, "lat": 51.049893 },
        "name": "first",
        "preview": "my first"
    },
    {
        "submitted": date,
        "loc": { "lng": 13.639241, "lat": 51.149883 },
        "name": "second",
        "preview": "my second"
    },
    {
        "submitted": date,
        "loc": { "lng": 13.715422, "lat": 51.056384 },
        "name": "nearpoint2",
        "preview": "my nearpoint2"
    }
]

Here is my aggregation pipeline:

var pipeline = [
    // I want to limit the data to a certain area
    { $match: {
        loc: {
            $geoWithin: {
                $box: [
                    [locBottomLeft.lng, locBottomLeft.lat],
                    [locUpperRight.lng, locUpperRight.lat]
                ]
            }
        }
    }},
    // I just want to get the latest entries
    { $sort: { submitted: -1 } },
    // I group by name
    {
      $group: {
          // get name
          _id: "$name",
          // get the latest date
          submitted: { $max: "$submitted" },
          // push every loc into an array THIS SHOULD BE LIMITED TO AN AMOUNT 5 or 10
          locs: { $push: "$loc" },
          preview: { $first: "$preview" }
      }
    },
    // Limit the query to at most 10 entries.
    { $limit: 10 }
];

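How can I limit the locs array to 10 or any other size? I tried something with $each and $slice, but that doesn't seem to work.

[Comments on the question]:

[Answer 1]:

Starting in Mongo 5.2, this is a perfect use case for the new $topN aggregation accumulator: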

// { submitted: ISODate("2021-12-05"), group: "group1", value: "plop" }
// { submitted: ISODate("2021-12-07"), group: "group2", value: "smthg" }
// { submitted: ISODate("2021-12-06"), group: "group1", value: "world" }
// { submitted: ISODate("2021-12-12"), group: "group1", value: "hello" }
db.collection.aggregate([
  { $group: {
    _id: "$group",
    top: { $topN: { n: 2, sortBy: { submitted: -1 }, output: "$value" } }
  }}
])
// { _id: "group1", top: [ "hello", "world" ] }
// { _id: "group2", top: [ "smthg" ] }

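This applies the $topN group accumulator, which:

- takes the top 2 (n: 2) elements for each group
- ranks "top" by sortBy: { submitted: -1 } (descending order)
- extracts the value field from each grouped record (output: "$value")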
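
Applied to the schema from the question, the same idea might look like the sketch below. It assumes MongoDB 5.2 or later; buildTopNPipeline is a hypothetical helper name introduced here for illustration, and $top (the single-element sibling of $topN) stands in for the question's $sort plus $first combination.

```javascript
// Sketch only: requires MongoDB 5.2+ for the $topN and $top accumulators.
// buildTopNPipeline is a hypothetical helper, not part of any driver API.
function buildTopNPipeline(bottomLeft, upperRight, maxLocs) {
  return [
    // Restrict the data to the bounding box, as in the question.
    { $match: { loc: { $geoWithin: { $box: [
        [bottomLeft.lng, bottomLeft.lat],
        [upperRight.lng, upperRight.lat]
    ] } } } },
    { $group: {
        _id: "$name",
        submitted: { $max: "$submitted" },
        // Keep only the maxLocs most recently submitted locs per name.
        locs: { $topN: { n: maxLocs, sortBy: { submitted: -1 }, output: "$loc" } },
        // Preview of the most recently submitted document in the group.
        preview: { $top: { sortBy: { submitted: -1 }, output: "$preview" } }
    } },
    // Return at most 10 groups.
    { $limit: 10 }
  ];
}

// Example call with made-up box corners; run it in mongosh with:
// db.collection.aggregate(pipeline)
const pipeline = buildTopNPipeline(
  { lng: 13.6, lat: 51.0 },
  { lng: 13.8, lat: 51.2 },
  5
);
```

Because the accumulators carry their own sortBy, this version does not need a separate $sort stage before $group.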

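[Comments]:

[Answer 2]:

I solved this by (1) letting the group stage push all values, and then (2) adding a $filter in a subsequent $project stage. In the $filter, eliminate all array members whose values do not qualify.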

https://docs.mongodb.com/manual/reference/operator/aggregation/filter/
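
A minimal sketch of such a $project stage follows. The answer does not say which values count as unqualified, so the condition here (keep only locs whose lat is at least minLat) and the helper name buildFilterStage are purely hypothetical.

```javascript
// Sketch only: the filter condition is an assumed example, not the
// answer author's actual rule. buildFilterStage is a hypothetical helper.
function buildFilterStage(minLat) {
  return {
    $project: {
      submitted: 1,
      preview: 1,
      locs: {
        $filter: {
          input: "$locs",   // array built by $push in the $group stage
          as: "loc",        // variable name for each array element
          cond: { $gte: ["$$loc.lat", minLat] } // keep elements that match
        }
      }
    }
  };
}

// Place the stage after $group in the pipeline.
const filterStage = buildFilterStage(51.05);
```

Note that a $filter with only a cond removes elements by condition and does not cap the array at a fixed length on its own.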

[Comments]:

[Answer 3]:

You can achieve this by passing the $slice operator directly to $push.

var pipeline = [
    // I want to limit the data to a certain area
    {
        $match: {
            loc: {
                $geoWithin: {
                    $box: [
                        [locBottomLeft.lng, locBottomLeft.lat],
                        [locUpperRight.lng, locUpperRight.lat]
                    ]
                }
            }
        }
    },
    // I just want to get the latest entries
    {
        $sort: {
            submitted: -1
        }
    },
    // I group by name
    {
        $group: {
            _id: "$name",
            // <-- get name
            submitted: {
                $max: "$submitted"
            },
            // <-- get the latest date
            locs: {
                $push: {
                    $slice: 10
                }
            },
            // <-- push every loc into an array THIS SHOULD BE LIMITED TO AN AMOUNT 5 or 10
            preview: {
                $first: "$preview"
            }
        }
    },
    // Limit the query to at least 10 entries.
    {
        $limit: 10
    }
];

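[Comments]:

This results in the following error: MongoError: Expression $slice takes at least 2 arguments, and at most 3, but 1 were passed in

[Answer 4]:

Assume the bottom-left and upper-right coordinates are [0, 0] and [100, 100] respectively. Starting with MongoDB 3.2, you can use the $slice operator to return the subset of the array you want.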

db.collection.aggregate([
    { "$match": {
        "loc": {
            "$geoWithin": {
                "$box": [
                    [0, 0],
                    [100, 100]
                ]
            }
        }
    }},
    { "$group": {
        "_id": "$name",
        "submitted": { "$max": "$submitted" },
        "preview": { "$first": "$preview" },
        "locs": { "$push": "$loc" }
    }},
    // Trim each locs array to its first 5 elements
    { "$project": {
        "locs": { "$slice": [ "$locs", 5 ] },
        "preview": 1,
        "submitted": 1
    }},
    { "$limit": 10 }
])
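
The pipeline in this answer leaves out the $sort stage from the question, so neither $first nor the leading elements of locs are tied to the newest documents. The sketch below puts the $sort back. It rests on an assumption worth verifying on your own deployment: that $push collects values in the order documents arrive at $group. buildSlicePipeline is a hypothetical helper name.

```javascript
// Sketch only: assumes $push preserves the order produced by $sort.
// buildSlicePipeline is a hypothetical helper, not part of any driver API.
function buildSlicePipeline(maxLocs) {
  return [
    { $match: { loc: { $geoWithin: { $box: [[0, 0], [100, 100]] } } } },
    // Newest documents first, so the array starts with the latest locs.
    { $sort: { submitted: -1 } },
    { $group: {
        _id: "$name",
        submitted: { $max: "$submitted" },
        preview: { $first: "$preview" },
        locs: { $push: "$loc" }
    } },
    // Two-argument $slice: keep the first maxLocs elements of the array.
    { $project: {
        locs: { $slice: ["$locs", maxLocs] },
        preview: 1,
        submitted: 1
    } },
    { $limit: 10 }
  ];
}

// Run in mongosh with: db.collection.aggregate(slicePipeline)
const slicePipeline = buildSlicePipeline(5);
```

The full array is still built in memory during $group and only trimmed afterwards, which is the trade-off the comments below refer to.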

[Comments]:

Is there a way to do this directly at push time?

There is currently no way to do this at push time, @Nickpick.
