Mongoose: find() ignore duplicate values

Posted 2023-03-11

【Title】: Mongoose: find() ignore duplicate values 【Posted】: 2020-09-10 03:46:37 【Question】:

I have a "chat" Mongoose schema with the following properties:

const schema = mongoose.Schema({
    // ...
    recipient: {
        type: mongoose.Types.ObjectId,
        required: true,
        ref: 'User',
    },
    sender: {
        type: mongoose.Types.ObjectId,
        required: true,
        ref: 'User',
    },
    content: {
        type: String,
    },
    // ...
}, {
    timestamps: true,
});

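In general, I want to fetch the last message of every conversation a user has. That is, given a user ID (which may be stored in either the sender or the recipient field), I want to get back the last message (as indicated by createdAt) that the user exchanged with each other user.

Example: suppose I have the following documents: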

[
  {
    recipient: "One",
    sender: "Two",
    createdAt: ISODate("2014-01-01T08:00:00Z")
  },
  {
    recipient: "One",
    sender: "Three",
    createdAt: ISODate("2014-02-15T08:00:00Z")
  },
  {
    recipient: "Two",
    sender: "One",
    createdAt: ISODate("2014-02-16T12:05:10Z")
  }
]

With "One" as the input, the desired result of Model.find(...) is:

[
  {
    recipient: "One",
    sender: "Three",
    createdAt: ISODate("2014-02-15T08:00:00Z")
  },
  {
    recipient: "Two",
    sender: "One",
    createdAt: ISODate("2014-02-16T12:05:10Z")
  }
]

【Comments】:

Does this answer your question? Get distinct records values

【Answer 1】:

You can do this with an aggregation, as shown in the query below.

Working example - https://mongoplayground.net/p/wEi4Y6IZJ2v

db.collection.aggregate([
  {
    $sort: {
      recipient: 1,
      createdAt: 1
    }
  },
  {
    $group: {
      _id: "$recipient",
      createdAt: {
        $last: "$createdAt"
      }
    }
  },
  {
    $project: {
      _id: 0,
      recipient: "$_id",
      createdAt: "$createdAt"
    }
  }
])

If you have two fields to match, you can use the query below.

Working example - https://mongoplayground.net/p/Rk5MxuphLOT

db.collection.aggregate([
  {
    $match: {
      $or: [
        {
          sender: "One"
        },
        {
          recipient: "One"
        }
      ]
    }
  },
  {
    $addFields: {
      other: {
        $cond: {
          if: {
            $eq: [
              "$recipient",
              "One"
            ]
          },
          then: "$sender",
          else: "$recipient"
        }
      }
    }
  },
  {
    $sort: {
      createdAt: 1
    }
  },
  {
    $group: {
      _id: "$other",
      createdAt: {
        $last: "$createdAt"
      },
      recipient: {
        $last: "$recipient"
      },
      sender: {
        $last: "$sender"
      }
    }
  },
  {
    $project: {
      _id: 0,
      recipient: "$recipient",
      sender: "$sender",
      createdAt: "$createdAt"
    }
  }
])
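
To run the two-field pipeline above from Mongoose rather than the shell, the hard-coded "One" has to become a parameter. The following is a minimal sketch under some assumptions: the helper name buildLastMessagePipeline is hypothetical, and so is the model name Chat in the usage comment. One point worth keeping in mind is that Mongoose does not cast values inside aggregation pipelines, so the id passed in should already be an ObjectId rather than a string.

```javascript
// Hypothetical helper: builds the "last message per conversation" pipeline
// for an arbitrary user id instead of the hard-coded "One".
function buildLastMessagePipeline(userId) {
  return [
    // Keep only messages the user sent or received.
    { $match: { $or: [{ sender: userId }, { recipient: userId }] } },
    // "other" is the conversation partner: whoever is not the user.
    {
      $addFields: {
        other: {
          $cond: {
            if: { $eq: ["$recipient", userId] },
            then: "$sender",
            else: "$recipient",
          },
        },
      },
    },
    // Oldest first, so that $last picks the newest message of each group.
    { $sort: { createdAt: 1 } },
    {
      $group: {
        _id: "$other",
        createdAt: { $last: "$createdAt" },
        recipient: { $last: "$recipient" },
        sender: { $last: "$sender" },
      },
    },
    { $project: { _id: 0, recipient: 1, sender: 1, createdAt: 1 } },
  ];
}

// Usage with Mongoose (assumes a model named Chat; not executed here):
//   const userId = new mongoose.Types.ObjectId(userIdString);
//   const lastMessages = await Chat.aggregate(buildLastMessagePipeline(userId));
```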

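【Comments】:

mongoplayground.net/p/E0G67cotQvL

Thanks. What if I want the same thing, but for two fields? I mean, I want to search for a specific user and get the documents where that user appears in either sender or recipient, and then keep only the latest ones. To be clearer: I have a collection of messages between people (private chats), and for every conversation a user takes part in, I want the last message (whether the user was the sender or the recipient).

@RonRofe Updated the answer with a new example for the use case mentioned in the comments.

【Answer 2】:

With the sample data: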

[
  {
    recipient: "One",
    sender: "Two",
    createdAt: ISODate("2014-01-01T08:00:00Z"),
    content: "Hi Mr. One! - Two"
  },
  {
    recipient: "One",
    sender: "Three",
    createdAt: ISODate("2014-02-15T08:00:00Z"),
    content: "Hello One! - Three"
  },
  {
    recipient: "Two",
    sender: "One",
    createdAt: ISODate("2014-02-16T12:05:10Z"),
    content: "Whats up, Two? - One"
  }
]

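Take a look at the following aggregation: https://mongoplayground.net/p/DTSDWX3aLWe

It...

uses $match to filter all messages by recipient or sender, returning only those that involve the current user (One)
uses $addFields to add a conversationWith field, which holds the recipient if the message was sent by user One, or the sender if the message was sent to user One
uses $sort to sort the messages by date, newest first
uses $group to group all messages by the new conversationWith field and returns the most recent message of each group as firstMessage

The complete aggregation pipeline: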

db.collection.aggregate([
  {
    $match: {
      $and: [
        {
          $or: [
            {
              recipient: "One"
            },
            {
              sender: "One"
            }
          ]
        }
      ]
    }
  },
  {
    $addFields: {
      conversationWith: {
        $cond: {
          if: {
            $eq: [
              "$sender",
              "One"
            ]
          },
          then: "$recipient",
          else: "$sender"
        }
      }
    }
  },
  {
    $sort: {
      createdAt: -1
    }
  },
  {
    $group: {
      _id: "$conversationWith",
      firstMessage: {
        $first: "$$ROOT"
      }
    }
  }
])

With mongoplayground you can remove the aggregation stages one at a time to see what each of them does.

Try:

only the $match stage
$match + $addFields
$match + $addFields + $sort
[..]

for a better understanding.
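
For readers without the playground at hand, the same stage-by-stage inspection can be approximated in plain JavaScript. This is only an in-memory imitation of the four stages on the sample data, with array methods standing in for $match, $addFields, $sort and $group and ISO date strings standing in for ISODate; it is not how MongoDB executes the pipeline. Logging each intermediate variable shows what the corresponding stage contributes.

```javascript
const messages = [
  { recipient: "One", sender: "Two", createdAt: "2014-01-01T08:00:00Z", content: "Hi Mr. One! - Two" },
  { recipient: "One", sender: "Three", createdAt: "2014-02-15T08:00:00Z", content: "Hello One! - Three" },
  { recipient: "Two", sender: "One", createdAt: "2014-02-16T12:05:10Z", content: "Whats up, Two? - One" },
];
const user = "One";

// Stand-in for $match: messages the user sent or received.
const matched = messages.filter((m) => m.recipient === user || m.sender === user);

// Stand-in for $addFields: conversationWith is whoever is not the user.
const withPartner = matched.map((m) => ({
  ...m,
  conversationWith: m.sender === user ? m.recipient : m.sender,
}));

// Stand-in for $sort with createdAt: -1 (newest first).
const sorted = [...withPartner].sort(
  (a, b) => Date.parse(b.createdAt) - Date.parse(a.createdAt)
);

// Stand-in for $group with $first: keep the first (newest) message per partner.
const grouped = [];
for (const m of sorted) {
  if (!grouped.some((g) => g._id === m.conversationWith)) {
    grouped.push({ _id: m.conversationWith, firstMessage: m });
  }
}

console.log(grouped.map((g) => [g._id, g.firstMessage.content]));
// [ [ 'Two', 'Whats up, Two? - One' ], [ 'Three', 'Hello One! - Three' ] ]
```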

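【Comments】:

【Answer 3】:

If you want to rely on mongodb to filter out duplicates, it is better not to allow duplicates in the first place, by creating a unique index.

Since your recipient seems to be nested in a parent schema, I would filter the duplicates in nodejs, because it is hard to untangle this in a mongodb query. If you need to do it in mongodb, use the distinct function or the aggregate pipeline, taking inspiration from this article.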
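
Filtering in Node.js, as suggested here, might look like the sketch below. It assumes the user's messages have already been loaded into an array (for example with a find() that matches the user in either sender or recipient), and lastMessagePerPartner is a hypothetical helper name. Ids are compared through String() so that the same code should work for plain strings and for ObjectId values.

```javascript
// Hypothetical helper: for each conversation partner of `userId`, keeps only
// the message with the latest createdAt. Assumes every message in `messages`
// has `userId` as either sender or recipient.
function lastMessagePerPartner(messages, userId) {
  const latest = new Map();
  for (const msg of messages) {
    const partner = String(msg.sender) === String(userId) ? msg.recipient : msg.sender;
    const key = String(partner);
    const current = latest.get(key);
    if (!current || new Date(msg.createdAt) > new Date(current.createdAt)) {
      latest.set(key, msg);
    }
  }
  return [...latest.values()];
}

// The sample documents from the question, with Date objects for createdAt.
const docs = [
  { recipient: "One", sender: "Two", createdAt: new Date("2014-01-01T08:00:00Z") },
  { recipient: "One", sender: "Three", createdAt: new Date("2014-02-15T08:00:00Z") },
  { recipient: "Two", sender: "One", createdAt: new Date("2014-02-16T12:05:10Z") },
];

const result = lastMessagePerPartner(docs, "One");
console.log(result.map((m) => `${m.sender} -> ${m.recipient}`));
```

The trade-off is that every message of the user travels from the database to the application before most of them are discarded, which the aggregation-based answers avoid.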

【Comments】:

So your suggestion is to fetch all the documents and filter them with js?
