Group by query in Django ORM

【Title】: Group by query in Django ORM 【Posted】: 2021-10-19 13:38:19 【Question】:

I'm confused about how to write a Django query to fetch my data. I have two tables, 'ticket' and 'ticket_details'. Their schemas are below.

Ticket(id, name, type, user)
TicketDetails(ticket_id, message, created_time)

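Note: multiple messages can be associated with a single ticket ID.

ticket_id is a foreign key to the Ticket table.

I want to fetch all columns from both tables, but for a given ticket ID only the latest message from the TicketDetails table should be selected.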

Example:
Ticket
id, name, type, user
1,install, application, usr1

TicketDetails
ticket_id, message, created_time
1, <message1>, 12:00 PM
1, <message2>, 04:00 PM
1, <message3>, 05:00 PM -->latest entry

Expected Output:
id, name, type, user, message, created_time
1, install, application, usr1, <message3>, 05:00PM

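Thanks in advance.

【Comments】:

Can you give a proper example?

Do you mean you want to fetch data from both tables based on ticket_id? Ticket.objects.filter(id=1234).select_related('TicketDetails').latest('created_time') ??? Let me know if it works.

@k33da_the_bug No, it doesn't work. I've added an example in case it helps you understand my requirement.

【Answer 1】: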

I made some assumptions about your models, since you didn't provide them:

from django.db import models


class Ticket(models.Model):
    name = models.CharField(max_length=50)
    type = models.CharField(max_length=50)
    user = models.ForeignKey('auth.User', on_delete=models.CASCADE)


# Model names should NEVER end with "s"
class TicketDetail(models.Model):
    ticket = models.ForeignKey(Ticket, on_delete=models.CASCADE)
    message = models.CharField(max_length=50)
    created_time = models.DateTimeField(auto_now_add=True)

You have two options:

    You can write it in plain SQL, but then you lose the ability to filter the result like a normal queryset

    sql = """
    SELECT ticket.id, ticket.name, ticket.type, ticket.user_id, detail.message
      FROM {ticket} ticket
      LEFT JOIN (
        SELECT detail.ticket_id, detail.message
          FROM {detail} detail
         INNER JOIN (
            SELECT MAX(id) id, ticket_id
              FROM {detail}
             GROUP BY ticket_id
        ) detail_message ON detail.id = detail_message.id
    ) detail ON detail.ticket_id = ticket.id
    """.format(ticket=Ticket._meta.db_table, detail=TicketDetail._meta.db_table)
    tickets = Ticket.objects.raw(sql)
    
    for ticket in tickets:
        print(ticket.id, ticket.message)
    

    Write it the "Django" way

    latest_messages = TicketDetail.objects.values('ticket_id').annotate(id=models.Max('id')).values('id')
    tickets = Ticket.objects.prefetch_related(models.Prefetch('ticketdetail_set', TicketDetail.objects.filter(id__in=latest_messages))).order_by('id')
    
    for ticket in tickets:
        print(ticket.id)
        # this iteration will only ever yield 1 result.. or nothing.
        for detail in ticket.ticketdetail_set.all():
            print(detail.message)
    

Here are the tests:

# uses factory_boy and faker to fill in the data
import factory
from django.contrib import auth
from django.db import models
from django.test import TestCase
from faker import Faker

fake = Faker()


class UserFactory(factory.django.DjangoModelFactory):
    class Meta:
        model = auth.models.User
        django_get_or_create = ('username',)

    first_name = fake.first_name()
    last_name = fake.last_name()
    email = factory.LazyAttribute(lambda obj: "{}.{}@gmail.com".format(obj.last_name, obj.first_name).lower())
    username = factory.Sequence(lambda n: 'user' + str(n))


class SimpleTestCase(TestCase):
    
    def setUp(self):
        ticket1 = Ticket.objects.create(user=UserFactory(), type='A', name='Number 1')
        TicketDetail.objects.create(ticket=ticket1, message='you wont see this')
        TicketDetail.objects.create(ticket=ticket1, message='you wont see this either')
        TicketDetail.objects.create(ticket=ticket1, message='YES!!')

        ticket2 = Ticket.objects.create(user=UserFactory(), type='B', name='Number 2')
        TicketDetail.objects.create(ticket=ticket2, message='you also wont see this')
        TicketDetail.objects.create(ticket=ticket2, message='you also wont see this either')
        TicketDetail.objects.create(ticket=ticket2, message='also YES!!')

    def test_flatten_pure_sql(self):
        sql = """
        SELECT ticket.id, ticket.name, ticket.type, ticket.user_id, detail.message
          FROM {ticket} ticket
          LEFT JOIN (
            SELECT detail.ticket_id, detail.message
              FROM {detail} detail
             INNER JOIN (
                SELECT MAX(id) id, ticket_id
                  FROM {detail}
                 GROUP BY ticket_id
            ) detail_message ON detail.id = detail_message.id
        ) detail ON detail.ticket_id = ticket.id
        """.format(ticket=Ticket._meta.db_table, detail=TicketDetail._meta.db_table)
        self.assertEqual(['YES!!', 'also YES!!'], [x.message for x in Ticket.objects.raw(sql)])

    def test_orm_way(self):
        latest_messages = TicketDetail.objects.values('ticket_id').annotate(id=models.Max('id')).values('id')
        tickets = Ticket.objects.prefetch_related(models.Prefetch('ticketdetail_set', TicketDetail.objects.filter(id__in=latest_messages))).order_by('id')
        self.assertEqual(['Number 1', 'Number 2'], [x.name for x in tickets])
        self.assertEqual(['YES!!'], [x.message for x in tickets[0].ticketdetail_set.all()])
        self.assertEqual(['also YES!!'], [x.message for x in tickets[1].ticketdetail_set.all()])
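
The raw SQL from the first option can also be tried outside Django. The sketch below runs the same greatest-per-group join against an in-memory SQLite database using only the standard library. The table names `ticket` and `ticket_detail` are assumptions standing in for whatever `Model._meta.db_table` returns in a real project, and an `ORDER BY` is added so the row order is deterministic. Like the answer's SQL, it treats the row with the highest `id` as the latest message, which only holds if ids grow with insertion order.

```python
import sqlite3

# Hypothetical table names; in Django they would come from Model._meta.db_table.
TICKET, DETAIL = "ticket", "ticket_detail"

conn = sqlite3.connect(":memory:")
conn.executescript(f"""
    CREATE TABLE {TICKET} (id INTEGER PRIMARY KEY, name TEXT, type TEXT, user_id INTEGER);
    CREATE TABLE {DETAIL} (
        id INTEGER PRIMARY KEY,
        ticket_id INTEGER REFERENCES {TICKET}(id),
        message TEXT
    );
    INSERT INTO {TICKET} VALUES
        (1, 'Number 1', 'A', 1), (2, 'Number 2', 'B', 2), (3, 'Number 3', 'C', 3);
    INSERT INTO {DETAIL} (ticket_id, message) VALUES
        (1, 'you wont see this'), (1, 'YES!!'),
        (2, 'you also wont see this'), (2, 'also YES!!');
""")

sql = """
SELECT ticket.id, ticket.name, ticket.type, ticket.user_id, detail.message
  FROM {ticket} ticket
  LEFT JOIN (
    SELECT detail.ticket_id, detail.message
      FROM {detail} detail
     INNER JOIN (
        SELECT MAX(id) id, ticket_id
          FROM {detail}
         GROUP BY ticket_id
    ) detail_message ON detail.id = detail_message.id
) detail ON detail.ticket_id = ticket.id
 ORDER BY ticket.id
""".format(ticket=TICKET, detail=DETAIL)

# One row per ticket; the LEFT JOIN keeps ticket 3, which has no messages.
rows = conn.execute(sql).fetchall()
for row in rows:
    print(row)
```

Ticket 3 has no detail rows, so the `LEFT JOIN` keeps it with a `NULL` message (`None` in Python).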

【Comments】:

Thanks for your answer. I understand now.
