Django Inline for ManyToMany generates duplicate queries

【Posted】: 2017-04-01 15:17:15 【Question】:

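I'm running into major performance problems with my Django admin. It issues a lot of duplicate queries, and the number grows with how many inlines I have.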

models.py

class Setting(models.Model):
    name = models.CharField(max_length=50, unique=True)

    class Meta:
        ordering = ('name',)

    def __str__(self):
        return self.name


class DisplayedGroup(models.Model):
    name = models.CharField(max_length=30, unique=True)
    priority = models.PositiveSmallIntegerField(default=100)

    class Meta:
        ordering = ('priority',)

    def __str__(self):
        return self.name


class Machine(models.Model):
    name = models.CharField(max_length=20, unique=True)
    settings = models.ManyToManyField(
        Setting, through='Arrangement', blank=True
    )

    class Meta:
        ordering = ('name',)

    def __str__(self):
        return self.name


class Arrangement(models.Model):
    machine = models.ForeignKey(Machine, on_delete=models.CASCADE)
    setting = models.ForeignKey(Setting, on_delete=models.CASCADE)
    displayed_group = models.ForeignKey(
        DisplayedGroup, on_delete=models.PROTECT,
        default=1)
    priority = models.PositiveSmallIntegerField(
        default=100,
        help_text='Smallest number will be displayed first'
    )

    class Meta:
        ordering = ('priority',)
        unique_together = (("machine", "setting"),)

admin.py

class ArrangementInline(admin.TabularInline):
    model = Arrangement
    extra = 1


class MachineAdmin(admin.ModelAdmin):
    inlines = (ArrangementInline,)

If I add 3 settings on the inline form, plus the 1 extra form, I get about 10 duplicate queries:

SELECT "corps_setting"."id", "corps_setting"."name", "corps_setting"."user_id", "corps_setting"."tagged", "corps_setting"."created", "corps_setting"."modified" FROM "corps_setting" ORDER BY "corps_setting"."name" ASC
- Duplicated 5 times

SELECT "corps_displayedgroup"."id", "corps_displayedgroup"."name", "corps_displayedgroup"."color", "corps_displayedgroup"."priority", "corps_displayedgroup"."created", "corps_displayedgroup"."modified" FROM "corps_displayedgroup" ORDER BY "corps_displayedgroup"."priority" ASC
- Duplicated 5 times.

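Can anyone tell me what I'm doing wrong here? I've spent 3 days trying to figure it out myself, with no luck.

It gets worse when a machine has about 50 inline settings: then I end up with about 100 queries.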


【Question comments】:

【Answer 1】:

Edit 2020:

Check out the answer by @isobolev below, who took this answer and improved it to make it more generic. :)


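This is pretty much normal behaviour in Django: it doesn't do the optimization for you, but it gives you decent tools to do it yourself. And don't sweat it, 100 queries isn't really a big problem that needs fixing right away (I've seen 16k queries on one page). But if your amount of data is going to grow rapidly, then it's of course wise to deal with it.

The main weapons you'll be equipped with are the queryset methods select_related() and prefetch_related(). There's really no need to go deep into them, since they're well documented in the Django documentation, but just a general pointer:

Use select_related() when the object you're querying has only one related object (FK or one2one).

Use prefetch_related() when the object you're querying has many related objects (the other end of an FK, or an M2M).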
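As a rough, Django-free illustration of that pointer, the sketch below uses the standard-library sqlite3 module with two made-up tables loosely modelled on the question. The `run` helper, the table layout and the function names are assumptions for this example, not Django internals. It contrasts one lookup per row with a single JOIN (roughly what select_related() does) and with one extra IN query whose results are matched up in Python (roughly what prefetch_related() does).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE setting (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE arrangement (
        id INTEGER PRIMARY KEY,
        setting_id INTEGER REFERENCES setting(id)
    );
    INSERT INTO setting (id, name) VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO arrangement (id, setting_id) VALUES (1, 1), (2, 2), (3, 1);
    """
)

query_log = []  # every statement issued through run() is recorded here


def run(sql, params=()):
    """Execute one statement, log it, and return all rows."""
    query_log.append(sql)
    return conn.execute(sql, params).fetchall()


def one_query_per_row():
    # 1 query for the arrangements, then 1 more per arrangement for its setting.
    rows = run("SELECT id, setting_id FROM arrangement ORDER BY id")
    return [
        (arr_id, run("SELECT name FROM setting WHERE id = ?", (sid,))[0][0])
        for arr_id, sid in rows
    ]


def joined():
    # Roughly the select_related() idea: follow the FK with a JOIN.
    return run(
        "SELECT a.id, s.name FROM arrangement a "
        "JOIN setting s ON s.id = a.setting_id ORDER BY a.id"
    )


def batched():
    # Roughly the prefetch_related() idea: one extra query, matched in Python.
    rows = run("SELECT id, setting_id FROM arrangement ORDER BY id")
    ids = sorted({sid for _, sid in rows})
    marks = ", ".join("?" for _ in ids)
    names = dict(run(f"SELECT id, name FROM setting WHERE id IN ({marks})", ids))
    return [(arr_id, names[sid]) for arr_id, sid in rows]


for strategy in (one_query_per_row, joined, batched):
    query_log.clear()
    result = strategy()
    print(strategy.__name__, "->", len(query_log), "queries", result)
```

All three strategies return the same rows; only the number of statements differs, and only the first one grows with the number of arrangements.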

How do you use them in the Django admin, you ask? Elementary, my dear Watson. Override the admin page method get_queryset(self, request) so that it looks something like this:

from django.contrib import admin

class SomeRandomAdmin(admin.ModelAdmin):
    def get_queryset(self, request):
        return super().get_queryset(request).select_related('field1', 'field2').prefetch_related('field3')    

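EDIT: After reading your comment, I realized that my initial interpretation of your question was absolutely wrong. I have several solutions for your actual problem as well:

    The simple one, which I use most of the time and recommend: just replace Django's default select widgets with raw_id_fields widgets, and no choices queries are made. Simply set raw_id_fields = ('setting', 'displayed_group') in the inline admin and you're done.

    If you don't want to get rid of the select boxes, I can offer some half-hacky code that does the trick, but it is rather lengthy and not very pretty. The idea is to override the formset that creates the forms and to specify the choices for these fields in the formset, so that they are queried from the database only once.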
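For the first option, a minimal sketch of the inline might look like the fragment below. It reuses the question's admin classes and assumes the models are importable from app.models, which is the module path used in this answer's own code; adjust it to your project. It is an admin configuration fragment rather than a standalone script. The lookup popup next to each ID input is only offered when the related model is itself registered with the admin, so you may want to register Setting and DisplayedGroup too.

```python
from django.contrib import admin

from app.models import Arrangement, Machine  # assumed module path


class ArrangementInline(admin.TabularInline):
    model = Arrangement
    extra = 1
    # Render both foreign keys as plain ID inputs instead of <select> boxes,
    # so no per-form query is needed to build a list of choices.
    raw_id_fields = ('setting', 'displayed_group')


class MachineAdmin(admin.ModelAdmin):
    inlines = (ArrangementInline,)


admin.site.register(Machine, MachineAdmin)
```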

Here is the code for the second, formset-based approach:

from django import forms
from django.contrib import admin
from app.models import Arrangement, Machine, Setting, DisplayedGroup


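class ChoicesFormSet(forms.BaseInlineFormSet):
    # Class attributes are evaluated once, when this module is imported, so
    # each list costs a single query but stays fixed until the process restarts.
    setting_choices = list(Setting.objects.values_list('id', 'name'))
    displayed_group_choices = list(DisplayedGroup.objects.values_list('id', 'name'))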

    def _construct_form(self, i, **kwargs):
        kwargs['setting_choices'] = self.setting_choices
        kwargs['displayed_group_choices'] = self.displayed_group_choices
        return super()._construct_form(i, **kwargs)


class ArrangementInlineForm(forms.ModelForm):
    class Meta:
        model = Arrangement
        exclude = ()

    def __init__(self, *args, **kwargs):
        setting_choices = kwargs.pop('setting_choices', [((), ())])
        displayed_group_choices = kwargs.pop('displayed_group_choices', [((), ())])

        super().__init__(*args, **kwargs)

        # This ensures that you can still save the form without setting all 50 (see extra value) inline values.
        # When you save, the field value is checked against the "initial" value
        # of a field and you only get a validation error if you've changed any of the initial values.
        self.fields['setting'].choices = [('-', '---')] + setting_choices
        self.fields['setting'].initial = self.fields['setting'].choices[0][0]
        self.fields['setting'].empty_values = (self.fields['setting'].choices[0][0],)

        self.fields['displayed_group'].choices = displayed_group_choices
        self.fields['displayed_group'].initial = self.fields['displayed_group'].choices[0][0]


class ArrangementInline(admin.TabularInline):
    model = Arrangement
    extra = 50
    form = ArrangementInlineForm
    formset = ChoicesFormSet

    def get_queryset(self, request):
        return super().get_queryset(request).select_related('setting')


class MachineAdmin(admin.ModelAdmin):
    inlines = (ArrangementInline,)


admin.site.register(Machine, MachineAdmin)
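Why handing the choices to each form helps can be shown without Django. The sketch below is a simplified model, not Django code: `ChoiceSource`, `SelfLoadingForm` and `PreloadedForm` are hypothetical stand-ins for a table, a form whose select field loads its own options, and a form that receives ready-made options from its formset.

```python
class ChoiceSource:
    """Hypothetical stand-in for a table; counts how often it is read."""

    def __init__(self, rows):
        self._rows = list(rows)
        self.reads = 0

    def fetch(self):
        self.reads += 1  # pretend this is one SELECT
        return list(self._rows)


class SelfLoadingForm:
    """Each form loads its own choices, one read per form."""

    def __init__(self, source):
        self.choices = source.fetch()


class PreloadedForm:
    """The form is handed ready-made choices by whoever builds it."""

    def __init__(self, choices):
        self.choices = choices


def build_self_loading(source, count):
    return [SelfLoadingForm(source) for _ in range(count)]


def build_preloaded(source, count):
    choices = source.fetch()  # read once, up front
    return [PreloadedForm(choices) for _ in range(count)]


settings = [(1, "alpha"), (2, "beta")]

naive_source = ChoiceSource(settings)
naive_forms = build_self_loading(naive_source, 4)

cached_source = ChoiceSource(settings)
cached_forms = build_preloaded(cached_source, 4)

print(naive_source.reads, cached_source.reads)  # prints "4 1"
```

With self-loading forms the number of reads equals the number of forms; with preloaded forms it stays at one however many inline rows are rendered, which is the effect the formset above aims for.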

Let me know if you find something that could be improved or if you have any questions.

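【Comments】:

I've spent the past few days trying select_related and prefetch_related in many places (MachineAdmin, ArrangementAdmin, SettingAdmin, ArrangementInline and so on), with no luck. The problem is the Select/Choices inline querysets: for each inline, it makes 1 query to the database to get "setting" and 1 query to get "displayed group". If I have 10 inlines, that is 20 queries. Meanwhile, the MachineAdmin queryset itself doesn't seem to have any effect on the inline Select/Choices querysets.

@HBui Hey, sorry, I misunderstood your question. I've updated the answer. Tested on Django 1.10.

@makeveli You. Are. A. Genius. I posted this question without expecting anyone to answer in this much detail. Everything works. I didn't even need to change anything. Genius. You're awesome. Now one last question: how can I vote for you or do something that benefits you? I'm new here, so I don't know how this works.

@HBui I think you've given me more than enough praise already, thanks for the kind words. :)

I'd like to bother you a little more. Everything still works great, except I just realized that every time I click "Add another Arrangement", both dropdowns (setting and displayed group) are empty, with only a "()" option. Am I doing something wrong? "extra = some_number" works, but clicking "Add another Arrangement" doesn't. If that could be solved it would be perfect!

【Answer 2】: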

Nowadays (kudos to a related question), BaseFormSet accepts form_kwargs and keeps it as an attribute.

The ChoicesFormSet code from the accepted answer can be modified slightly:

class ChoicesFormSet(forms.BaseInlineFormSet):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        setting_choices = list(Setting.objects.values_list('id', 'name'))
        displayed_group_choices = list(DisplayedGroup.objects.values_list('id', 'name'))
        # Use the local lists built above; this class has no such attributes.
        self.form_kwargs['setting_choices'] = setting_choices
        self.form_kwargs['displayed_group_choices'] = displayed_group_choices

The rest of the code stays unchanged, as described in the accepted answer:

class ArrangementInlineForm(forms.ModelForm):
    class Meta:
        model = Arrangement
        exclude = ()

    def __init__(self, *args, **kwargs):
        setting_choices = kwargs.pop('setting_choices', [((), ())])
        displayed_group_choices = kwargs.pop('displayed_group_choices', [((), ())])

        super().__init__(*args, **kwargs)

        # This ensures that you can still save the form without setting all 50 (see extra value) inline values.
        # When you save, the field value is checked against the "initial" value
        # of a field and you only get a validation error if you've changed any of the initial values.
        self.fields['setting'].choices = [('-', '---')] + setting_choices
        self.fields['setting'].initial = self.fields['setting'].choices[0][0]
        self.fields['setting'].empty_values = (self.fields['setting'].choices[0][0],)

        self.fields['displayed_group'].choices = displayed_group_choices
        self.fields['displayed_group'].initial = self.fields['displayed_group'].choices[0][0]


class ArrangementInline(admin.TabularInline):
    model = Arrangement
    extra = 50
    form = ArrangementInlineForm
    formset = ChoicesFormSet

    def get_queryset(self, request):
        return super().get_queryset(request).select_related('setting')


class MachineAdmin(admin.ModelAdmin):
    inlines = (ArrangementInline,)


admin.site.register(Machine, MachineAdmin)

【Comments】:

【Answer 3】:

Based on @makaveli's answer, I've put together a generic solution that doesn't seem to have the problem mentioned in the comments:

from django import forms


class CachingModelChoicesFormSet(forms.BaseInlineFormSet):
    """
    Used to avoid duplicate DB queries by caching choices and passing them all the forms.
    To be used in conjunction with `CachingModelChoicesForm`.
    """

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        sample_form = self._construct_form(0)
        self.cached_choices = {}
        try:
            model_choice_fields = sample_form.model_choice_fields
        except AttributeError:
            pass
        else:
            for field_name in model_choice_fields:
                if field_name in sample_form.fields and not isinstance(
                    sample_form.fields[field_name].widget, forms.HiddenInput):
                    self.cached_choices[field_name] = [c for c in sample_form.fields[field_name].choices]

    def get_form_kwargs(self, index):
        kwargs = super().get_form_kwargs(index)
        kwargs['cached_choices'] = self.cached_choices
        return kwargs


class CachingModelChoicesForm(forms.ModelForm):
    """
    Gets cached choices from `CachingModelChoicesFormSet` and uses them in model choice fields in order to reduce
    number of DB queries when used in admin inlines.
    """

    @property
    def model_choice_fields(self):
        return [fn for fn, f in self.fields.items()
            if isinstance(f, (forms.ModelChoiceField, forms.ModelMultipleChoiceField,))]

    def __init__(self, *args, **kwargs):
        cached_choices = kwargs.pop('cached_choices', {})
        super().__init__(*args, **kwargs)
        for field_name, choices in cached_choices.items():
            if choices is not None and field_name in self.fields:
                self.fields[field_name].choices = choices

All you need to do is subclass your model form from CachingModelChoicesForm and use CachingModelChoicesFormSet in your inline class:

class ArrangementInlineForm(CachingModelChoicesForm):
    class Meta:
        model = Arrangement
        exclude = ()


class ArrangementInline(admin.TabularInline):
    model = Arrangement
    extra = 50
    form = ArrangementInlineForm
    formset = CachingModelChoicesFormSet

【Comments】:
