Django ORM: Equivalent of SQL `NOT IN`? `exclude` and `Q` objects do not work
[Posted]: 2020-06-25 13:25:42
[Question]:
The Problem
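I'm trying to use the Django ORM to do the equivalent of a SQL NOT IN clause, providing a list of IDs in a subselect to bring back a set of records from a logging table. I can't figure out whether this is possible.
The Model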
from django.db import models

class JobLog(models.Model):
    job_number = models.BigIntegerField(blank=True, null=True)
    name = models.TextField(blank=True, null=True)
    username = models.TextField(blank=True, null=True)
    event = models.TextField(blank=True, null=True)
    time = models.DateTimeField(blank=True, null=True)
My Attempts
My first attempt was to use exclude, but this emits a NOT that negates the entire Subquery, rather than the desired NOT IN:
query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .exclude(
        job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)
Unfortunately, this yields the following SQL:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log"
WHERE (
    "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
    AND NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:22:28.300590+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:22:28.300600+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
)
What I need is for the third AND clause to be AND "view_job_log"."job_number" NOT IN instead of AND NOT (.
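The two shapes are not interchangeable when job_number is NULL on the outer row. The following is a minimal sketch of that difference, assuming SQLite (via Python's sqlite3 module) rather than the PostgreSQL backend in the question, with a hypothetical job_log table and made-up rows:

```python
import sqlite3

# Hypothetical stand-in for the logging table; only the columns the example needs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, name TEXT, event TEXT)")
conn.executemany(
    "INSERT INTO job_log VALUES (?, ?, ?)",
    [
        (1, "a", "delivered"),
        (2, "b", "delivered"),
        (None, "c", "delivered"),  # delivered row with no job number
        (2, "b", "finished"),
    ],
)

FINISHED = "SELECT job_number FROM job_log WHERE event = 'finished'"


def names(condition):
    """Return the names of delivered rows that satisfy `condition`."""
    sql = (
        "SELECT name FROM job_log "
        f"WHERE event = 'delivered' AND {condition} ORDER BY name"
    )
    return [row[0] for row in conn.execute(sql)]


# Shape of the SQL that exclude(job_number__in=...) produced above.
orm_style = names(f"NOT (job_number IN ({FINISHED}) AND job_number IS NOT NULL)")
# Shape of the SQL the question asks for.
plain_not_in = names(f"job_number NOT IN ({FINISHED})")

print(orm_style)     # ['a', 'c']
print(plain_not_in)  # ['a']
```

Row "c" survives the ORM-style condition because the IS NOT NULL guard turns the unknown comparison into a definite FALSE before it is negated; with a bare NOT IN the comparison stays NULL and the row is dropped.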
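I've also tried running the sub-select as its own query first and passing it to exclude, as suggested here:
Django equivalent of SQL not in
However, this yields the same problematic result. I then tried a Q object, which produced a similar query: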
query = (
    JobLog.objects.values(
        "username", "subscriber_code", "job_number", "name", "time",
    )
    .filter(
        ~models.Q(job_number__in=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )),
        time__gte=start,
        time__lte=end,
        event="delivered",
    )
)
The Q object attempt again yields the following SQL, without the NOT IN:
SELECT "view_job_log"."username", "view_job_log"."group", "view_job_log"."job_number", "view_job_log"."name", "view_job_log"."time"
FROM "view_job_log" WHERE (
    NOT (
        "view_job_log"."job_number" IN (
            SELECT U0."job_number"
            FROM "view_job_log" U0
            WHERE (
                U0."event" = 'finished'
                AND U0."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
                AND U0."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
            )
        )
        AND "view_job_log"."job_number" IS NOT NULL
    )
    AND "view_job_log"."event" = 'delivered'
    AND "view_job_log"."time" >= '2020-03-12T11:33:28.098653+00:00'::timestamptz
    AND "view_job_log"."time" <= '2020-03-13T11:33:28.098678+00:00'::timestamptz
)
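Is there any way to make Django's ORM do something equivalent to AND job_number NOT IN (12345, 12346, 12347)? Or am I going to have to drop down to raw SQL to accomplish this?
Thanks in advance for reading this entire wall-of-text question. Explicit is better than implicit. :)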
[Discussion]:
Your model is missing the event field.
@Lotram Ah, I removed the irrelevant fields from the model and was a bit overzealous. Thanks for the catch!
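Not sure if this is expected, but since JobLog.job_number is nullable, your NOT IN will not match anything as soon as your subquery returns a NULL (***.com/questions/129077/…). That is what the "view_job_log"."job_number" IS NOT NULL SQL that the ORM adds when you do exclude(field__in) is trying to protect you from, but if you use your own notin lookup you will be on your own.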
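The NULL behaviour described in the comment above can be reproduced outside Django. This is a minimal sketch, assuming SQLite through Python's sqlite3 module instead of PostgreSQL, and a hypothetical job_log table with made-up rows. It also shows one common remedy, which is to filter NULLs out of the subquery; in ORM terms that would presumably mean adding job_number__isnull=False to the inner queryset.

```python
import sqlite3

# Hypothetical table and rows; one 'finished' row has no job number.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO job_log VALUES (?, ?)",
    [
        (1, "delivered"), (2, "delivered"), (3, "delivered"),
        (2, "finished"),
        (None, "finished"),
    ],
)


def delivered_not_in(subquery):
    """Job numbers of delivered rows whose number is NOT IN `subquery`."""
    sql = (
        "SELECT job_number FROM job_log "
        f"WHERE event = 'delivered' AND job_number NOT IN ({subquery}) "
        "ORDER BY job_number"
    )
    return [row[0] for row in conn.execute(sql)]


# The subquery yields (2, NULL): every NOT IN comparison is FALSE or unknown.
with_null = delivered_not_in(
    "SELECT job_number FROM job_log WHERE event = 'finished'"
)
# Filtering the NULL out of the subquery restores the expected rows.
null_filtered = delivered_not_in(
    "SELECT job_number FROM job_log "
    "WHERE event = 'finished' AND job_number IS NOT NULL"
)

print(with_null)      # []
print(null_filtered)  # [1, 3]
```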
[Answer 1]:
I think the easiest way to do this is to define a custom lookup, similar to this one or the built-in in lookup:
from django.db.models import Field
from django.db.models.lookups import In as LookupIn

class NotIn(LookupIn):
    lookup_name = "notin"

    def get_rhs_op(self, connection, rhs):
        # Emit NOT IN where the built-in lookup emits IN.
        return "NOT IN %s" % rhs

Field.register_lookup(NotIn)
or

from django.db import models

class NotIn(models.Lookup):
    lookup_name = "notin"

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        # Concatenate instead of extend(): works for lists and tuples alike.
        params = (*lhs_params, *rhs_params)
        return "%s NOT IN %s" % (lhs, rhs), params
Then use it in your query:
query = (
    JobLog.objects.values(
        "username", "job_number", "name", "time",
    )
    .filter(time__gte=start, time__lte=end, event="delivered")
    .filter(
        job_number__notin=models.Subquery(
            JobLog.objects.values_list("job_number", flat=True).filter(
                time__gte=start, time__lte=end, event="finished",
            )
        )
    )
)
This generates the SQL:
SELECT
    "people_joblog"."username",
    "people_joblog"."job_number",
    "people_joblog"."name",
    "people_joblog"."time"
FROM
    "people_joblog"
WHERE ("people_joblog"."event" = delivered
    AND "people_joblog"."time" >= 2020-03-13 15:24:34.691222+00:00
    AND "people_joblog"."time" <= 2020-03-13 15:24:41.678069+00:00
    AND "people_joblog"."job_number" NOT IN (
        SELECT
            U0."job_number"
        FROM
            "people_joblog" U0
        WHERE (U0."event" = finished
            AND U0."time" >= 2020-03-13 15:24:34.691222+00:00
            AND U0."time" <= 2020-03-13 15:24:41.678069+00:00)))
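[Discussion]:
Thank you, @Lotram! Works like a charm. I'm considering submitting a notin PR to Django. :)
@Lotram, this is nice, but it does not work as expected. You get different SQL than with ~Q, and you get wrong results when doing, for example, job_number__notin=[].
@MurphyAdam The question was indeed not about generating the same SQL, but about introducing a NOT IN clause in the generated SQL, which is what my answer does. I did not test every edge case, so it may not always work as expected. If you want to be safer, you can use @simon-charrette's solution; as a Django core developer, he probably knows best.
[Answer 2]: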
Using Exists and special-casing NULLs might get you the same result.
# Needs: from django.db.models import Exists, OuterRef, Q
.filter(
    ~Exists(
        JobLog.objects.filter(
            Q(job_number=None) | Q(job_number=OuterRef('job_number')),
            time__gte=start,
            time__lte=end,
            event='finished',
        )
    )
)
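To see why the NULL special case is there, it may help to compare the two SQL shapes directly. The sketch below assumes SQLite via Python's sqlite3 module, a hypothetical job_log table, and hand-written SQL that approximates what ~Exists(...) with the Q(...) | Q(...) condition would compile to; it is not output captured from Django. The time filters are left out for brevity.

```python
import sqlite3

# Hypothetical table and rows standing in for the logging table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE job_log (job_number INTEGER, event TEXT)")
conn.executemany(
    "INSERT INTO job_log VALUES (?, ?)",
    [(1, "delivered"), (2, "delivered"), (3, "delivered"), (2, "finished")],
)

NOT_IN = """
    SELECT d.job_number FROM job_log d
    WHERE d.event = 'delivered'
      AND d.job_number NOT IN (
          SELECT f.job_number FROM job_log f WHERE f.event = 'finished')
    ORDER BY d.job_number
"""
# Approximation of ~Exists(...) with the explicit NULL branch.
NOT_EXISTS = """
    SELECT d.job_number FROM job_log d
    WHERE d.event = 'delivered'
      AND NOT EXISTS (
          SELECT 1 FROM job_log f
          WHERE f.event = 'finished'
            AND (f.job_number IS NULL OR f.job_number = d.job_number))
    ORDER BY d.job_number
"""


def run(sql):
    return [row[0] for row in conn.execute(sql)]


without_null = (run(NOT_IN), run(NOT_EXISTS))
# Add a 'finished' row with no job number and compare again.
conn.execute("INSERT INTO job_log VALUES (NULL, 'finished')")
with_null = (run(NOT_IN), run(NOT_EXISTS))

print(without_null)  # ([1, 3], [1, 3])
print(with_null)     # ([], [])
```

On this data the two forms agree both with and without a NULL in the subquery. They can still differ for outer rows whose own job_number is NULL, which NOT IN drops and this NOT EXISTS form keeps.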
[Discussion]:
Thanks for the help, but isn't that rather a lot of gymnastics for what should be a fairly simple ORM operation?
[Answer 3]:
You can try this:
JobLog.objects.filter(time__gte=start, time__lte=end, event="delivered").exclude(time__gte=start, event='finished').exclude(time__lte=end, event='finished')