Aggregation with group by, inner join and nested conditions in MongoDB

Posted: 2021-12-17 07:34:42

Question:

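First of all, sorry if this is a basic question; I'm new to MongoDB queries. What I need is to find the latest record for each worker in my WorkerLocationContext collection and the latest record for each sensor in my HeatMeasureContext collection, join them by their location, and then apply some filters. Here are my schemas: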

HeatMeasureContext:

const mongoose = require('mongoose')
const Schema = mongoose.Schema

const heatMeasureContextSchema = new mongoose.Schema({
    sensor: { type: Schema.Types.ObjectId, ref: 'MeasureSensor', required: true },
    humid: { type: Schema.Types.Number, required: true },
    globe: { type: Schema.Types.Number, required: true },
    mercury: { type: Schema.Types.Number, required: true },
    internal: { type: Schema.Types.Number, required: true },
    external: { type: Schema.Types.Number, required: true }
}, { timestamps: true })

MeasureSensor:

const measureSensorSchema = new mongoose.Schema({
    name: { type: String, required: true },
    description: { type: String, required: false },
    type: { type: String, required: false, uppercase: true,
        enumValues: ['MEASURE'], default: 'MEASURE' },
    location: { type: Schema.Types.ObjectId, ref: 'Location' },
    sensorType: { type: String, required: false, uppercase: true,
        enumValues: ['WORKER_ATTACHED', 'ENVIRONMENT'], default: 'ENVIRONMENT' },
    measurerType: { type: String, required: false, uppercase: true,
        enumValues: ['HEAT', 'RUID'] },
    placementType: { type: String, required: false, uppercase: true,
        enumValues: ['INTERNAL', 'EXTERNAL'], default: 'INTERNAL' }
})

WorkerLocationContext:

const workerLocationContextSchema = new mongoose.Schema({
    sensor: { type: Schema.Types.ObjectId, ref: 'LocationSensor', required: true },
    worker: { type: Schema.Types.ObjectId, ref: 'Worker', required: true }
}, { timestamps: true })

Location:

const locationSchema = new mongoose.Schema({
    name: { type: String, required: true },
    description: { type: String, required: false },
    type: { type: String, required: false, uppercase: true,
        enumValues: ['REST', 'ROOM', 'COURTYARD'], default: 'ROOM' }
})

Worker:

const workerSchema = new mongoose.Schema({
    name: { type: String, required: true },
    workGroup: { type: Schema.Types.ObjectId, ref: 'WorkGroup', required: false }
})

I have built a query like this:

WorkerLocationContext.aggregate([
    {
        "$lookup": {
            "from": "HeatMeasureContext",
            "localField": "sensor.location._id",
            "foreignField": "sensor.location._id",
            "as": "HMContext"
        }
    },
    {
        "$match": {
            "$and": [
                { "$or": [
                    { "$and": [
                        {
                            "HMContext.sensor.placementType": { "$eq": "INTERNAL" },
                            "HMContext.internal": { "$gte": limit }
                        },
                        {
                            "HMContext.sensor.placementType": { "$eq": "EXTERNAL" },
                            "HMContext.external": { "$gte": limit }
                        },
                    ]},
                ]},
                { "WorkerLocationContext.worker.location.type": { "$ne": "REST" } }
            ]
        }
    },
    {
        "$group": {
            "_id": "null",
            "workers": {
              "$count": {}
            },
            "hmDatetime": {
                "$max": "$HMContext.createdAt"
            },
            "wlDatetime": {
                "$max": "$WorkerLocationContext.createdAt"
            }
        }
    }
]);

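Basically, my goal is to count how many workers match the condition based on their current location, using only the latest record in each context collection. I tried some simulations in mongoplayground, without success. Can this be done in MongoDB? Could you help me?

Thanks in advance!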
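The goal described above can be sketched in plain JavaScript, without MongoDB, to pin down the expected result. This is only an illustration under stated assumptions: the records are flattened so that `location` and `placementType` sit directly on each record (in the schemas they live on the referenced sensor documents), the ids and the `locationTypes` map are made up, and the heat condition is read as "(INTERNAL and internal >= limit) or (EXTERNAL and external >= limit)", which seems to be the intent of the nested `$or`/`$and`.

```javascript
// Illustrative in-memory data, loosely modeled on the sample data in the question.
// Assumption: `location` and `placementType` are flattened onto each record.
const workerLocations = [
  { worker: "w1", location: "loc2", createdAt: "2021-11-03T22:40:00.000Z" },
  { worker: "w1", location: "loc2", createdAt: "2021-11-03T22:48:21.887Z" },
  { worker: "w3", location: "loc2", createdAt: "2021-11-03T22:48:40.507Z" },
  { worker: "w4", location: "loc1", createdAt: "2021-11-03T22:50:00.000Z" },
];
const heatMeasures = [
  { sensor: "s1", location: "loc2", placementType: "INTERNAL",
    internal: 24.1, external: 24.13, createdAt: "2021-11-03T23:30:01.640Z" },
  { sensor: "s1", location: "loc2", placementType: "INTERNAL",
    internal: 27.56, external: 27.72, createdAt: "2021-11-03T23:31:21.080Z" },
];
// Hypothetical lookup table standing in for the Location collection.
const locationTypes = { loc1: "REST", loc2: "ROOM" };

// Keep only the newest record per key.
// ISO-8601 UTC strings in the same format compare chronologically as strings.
function latestBy(records, keyOf) {
  const latest = new Map();
  for (const rec of records) {
    const key = keyOf(rec);
    const seen = latest.get(key);
    if (!seen || rec.createdAt > seen.createdAt) latest.set(key, rec);
  }
  return [...latest.values()];
}

// Count workers whose latest location is not a REST place and whose location
// has a sensor whose latest measurement reaches the limit.
function countExposedWorkers(limit) {
  const hotLocations = new Set(
    latestBy(heatMeasures, (m) => m.sensor)
      .filter((m) =>
        (m.placementType === "INTERNAL" && m.internal >= limit) ||
        (m.placementType === "EXTERNAL" && m.external >= limit))
      .map((m) => m.location)
  );
  return latestBy(workerLocations, (w) => w.worker)
    .filter((w) => locationTypes[w.location] !== "REST" && hotLocations.has(w.location))
    .length;
}
```

With this made-up data, a limit of 27 counts the two workers whose latest position is `loc2`; raising the limit to 28 counts none, because the latest reading of the only sensor stays below it.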

Edit 1

Sample data


- Worker
[
    { "_id": "6181de993fca98374cf901f6", "name": "Worker 1", "workGroup": "6181de3e3fca98374cf901f4", "__v": 0 },
    { "_id": "6181dec33fca98374cf901f7", "name": "Worker 2", "workGroup": "6181de4a3fca98374cf901f5", "__v": 0 },
    { "_id": "6181decc3fca98374cf901f8", "name": "Worker 3", "workGroup": "6181de4a3fca98374cf901f5", "__v": 0 },
    { "_id": "6181ded13fca98374cf901f9", "name": "Worker 4", "workGroup": "6181de4a3fca98374cf901f5", "__v": 0 }
]

- Location
[
    { "_id": "6181df293fca98374cf901fa", "name": "Location 1", "description": "Rest place", "__v": 0, "type": "ROOM" },
    { "_id": "6181df3b3fca98374cf901fb", "name": "Location 2", "description": "Room 1", "__v": 0, "type": "ROOM" }
]

- MeasureSensor
[
    { "_id": "6181e5ae3fca98374cf901fc", "name": "Sensor 1", "description": "Heat Sensor 1", "location": "6181df3b3fca98374cf901fb", "measurerType": "HEAT", "__v": 0, "placementType": "INTERNAL", "sensorType": "ENVIRONMENT", "type": "MEASURE" }
]

- LocationSensor
[
    { "_id": "6181e5f83fca98374cf901fd", "name": "Location Sensor 1", "description": "Location sensor for Location 2", "location": "6181df3b3fca98374cf901fb", "trackerType": "RFID", "__v": 0, "sensorType": "ENVIRONMENT", "type": "LOCATION" }
]

- WorkerLocationContext
[
    { "_id": "615676c885ccad55a493503b", "updatedAt": "2021-10-01T02:47:36.207Z", "createdAt": "2021-10-01T02:47:36.207Z", "sensor": "615657572079a55f7814947b", "worker": "6153dcfb58ad722c747eb42d", "__v": 0 },
    { "_id": "618311b56b77f445ecf73277", "updatedAt": "2021-11-03T22:48:21.887Z", "createdAt": "2021-11-03T22:48:21.887Z", "sensor": "6181e5f83fca98374cf901fd", "worker": "6181de993fca98374cf901f6", "__v": 0 },
    { "_id": "618311c86b77f445ecf73278", "updatedAt": "2021-11-03T22:48:40.507Z", "createdAt": "2021-11-03T22:48:40.507Z", "sensor": "6181e5f83fca98374cf901fd", "worker": "6181decc3fca98374cf901f8", "__v": 0 }
]

- HeatMeasureContext
[
    { "_id": "61831b796b77f445ecf7327b", "updatedAt": "2021-11-03T23:30:01.640Z", "createdAt": "2021-11-03T23:30:01.640Z", "sensor": "6181e5ae3fca98374cf901fc", "mercury": 25.8, "humid": 23.5, "globe": 25.5, "external": 24.13, "internal": 24.1, "__v": 0 },
    { "_id": "61831bc96b77f445ecf7327c", "updatedAt": "2021-11-03T23:31:21.080Z", "createdAt": "2021-11-03T23:31:21.080Z", "sensor": "6181e5ae3fca98374cf901fc", "mercury": 28.6, "humid": 27.8, "globe": 27, "external": 27.72, "internal": 27.56, "__v": 0 }
]

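Edit 2

I had to simplify my query a bit, because some expressions such as heatMeasureContex.sensor.location don't work there (as far as I can tell). Here is a simple attempt that doesn't work and isn't even half of what I need: mongopplaygroung.net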
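One likely reason a path such as `sensor.location` does not resolve is that the context documents store only the sensor's ObjectId, so the sensor document has to be joined in before its fields can be used. A minimal sketch of such stages follows. The collection name "MeasureSensor" and the output field names `sensorDoc`, `location` and `placementType` are assumptions made for illustration; the real collection name depends on how Mongoose names the collection.

```javascript
// Stages that resolve the location of each HeatMeasureContext document.
// Assumption: the sensor documents live in a collection named "MeasureSensor".
const resolveSensorLocation = [
  {
    // Join the referenced sensor document by its _id.
    $lookup: {
      from: "MeasureSensor",
      localField: "sensor",
      foreignField: "_id",
      as: "sensorDoc",
    },
  },
  // $lookup always yields an array; turn it into a single embedded document.
  { $unwind: "$sensorDoc" },
  {
    // Copy the fields needed by later stages to the top level.
    $addFields: {
      location: "$sensorDoc.location",
      placementType: "$sensorDoc.placementType",
    },
  },
];
```

These stages could be placed at the start of a pipeline passed to `HeatMeasureContext.aggregate(...)`, after which `$location` and `$placementType` can be referenced directly. The same pattern, pointed at the location sensor collection, would apply to WorkerLocationContext.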

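Comments:

- Can you add sample data?
- Your playground link is empty. It would help if you could populate it with your sample data and your current attempt.
- @mohammadNaimi I just added some sample data.
- @ray I just added a link with some data and a simple query. It is only part of what I need to do, and it doesn't work .-.
- Is this what you are looking for?

Answer 1: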

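You can start the aggregation pipeline from the HeatMeasureContext collection:

1. $match on the internal and external fields.
2. $lookup the WorkerLocationContext collection with a sub-pipeline. In the sub-pipeline, $sum the worker count and take the $max as wlDatetime.
3. $unwind the result for further processing.
4. $group again on HeatMeasureContext.location, using $first to pick up the result of the sub-pipeline and $max to get hmDatetime.
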
db.HeatMeasureContext.aggregate([
  {
    $match: {
      $expr: {
        $or: [
          {
            $gte: [
              "$internal",
              27
            ]
          },
          {
            $gte: [
              "$external",
              27
            ]
          }
        ]
      }
    }
  },
  {
    "$lookup": {
      "from": "WorkerLocationContext",
      let: {
        loc: "$location"
      },
      pipeline: [
        {
          $match: {
            $expr: {
              $eq: [
                "$$loc",
                "$location"
              ]
            }
          }
        },
        {
          $group: {
            _id: "$location",
            "workers": {
              "$sum": 1
            },
            "wlDatetime": {
              "$max": "$createdAt"
            }
          }
        }
      ],
      "as": "workerAggResult"
    }
  },
  {
    $unwind: "$workerAggResult"
  },
  {
    $group: {
      _id: "$location",
      "hmDatetime": {
        $max: "$createdAt"
      },
      "wlDatetime": {
        $first: "$workerAggResult.wlDatetime"
      },
      "workers": {
        $first: "$workerAggResult.workers"
      }
    }
  }
])

Here is the Mongo playground for your reference.
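As written, the pipeline above matches every measurement that reaches the threshold, not only the latest one per sensor, and it counts every WorkerLocationContext record at a location rather than each worker's latest one. If "latest record per sensor" or "latest record per worker" is required, one common pattern is to sort by `createdAt` descending and keep the first document of each group. The sketch below is a hedged illustration of that pattern; the helper name `latestPer` is hypothetical and not part of the original answer.

```javascript
// Build stages that keep only the newest document per value of `field`.
// `latestPer` is a hypothetical helper name used for illustration.
function latestPer(field) {
  return [
    // Newest documents first, so $first picks the latest one in each group.
    { $sort: { createdAt: -1 } },
    // $$ROOT refers to the whole document currently being processed.
    { $group: { _id: "$" + field, latest: { $first: "$$ROOT" } } },
    // Restore the original document shape for the following stages.
    { $replaceRoot: { newRoot: "$latest" } },
  ];
}

// Example: latest measurement per sensor, then the threshold check from the answer.
const latestHotMeasures = [
  ...latestPer("sensor"),
  {
    $match: {
      $expr: { $or: [{ $gte: ["$internal", 27] }, { $gte: ["$external", 27] }] },
    },
  },
];
```

Under the same assumptions, `latestPer("worker")` could open the sub-pipeline of the `$lookup` so that each worker contributes only the most recent position.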
