Optimizing a paginated one-to-many query that uses a subselect


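I was hoping to get some expert eyes on my query to see why its performance varies so much.

The problem I'm trying to solve: I need orders, each of which can have one or more items, and the orders need to be paginated.

To do this I've taken the following approach. I use a subquery to filter the orders by the required item attributes, then I join items again to fetch the item fields I need. This way the pagination applies to orders rather than to joined rows, so an order with 2 or more items is not counted several times or cut off at a page boundary.

I'm seeing intermittently slow queries. Running them a second time is much faster. I assume this is because Postgres is loading the indexes and so on into memory?

I don't fully understand what is going on from the EXPLAIN output. It looks like it needs to scan every order and check whether it has an item that matches the subquery? I'm confused by the following line. It says it needs to scan 286853 rows, but also only 165?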

Index Scan Backward using orders_created_at_idx on orders  (cost=0.42..2708393.65 rows=286853 width=301) (actual time=64.598..2114.676 rows=165 loops=1)

Is there a way to make Postgres filter by the items first, or am I reading this wrong and it is already doing that?

Query:

SELECT
  "orders"."id_orders" as "orders.id_orders",
  "items"."id_items" as "items.id_items",
  ..., 
  orders.created_at, orders.updated_at 
FROM (
  SELECT 
    orders.id_orders,
    orders.created_at,
    orders.updated_at
  FROM orders 
  WHERE orders.status in ('completed','pending') AND 
  (
    SELECT fk_vendor_id FROM items
    WHERE (
      items.fk_order_id = orders.id_orders AND
      items.fk_vendor_id = '0012800001YVccUAAT' AND
      items.fk_offer = '0060I00000RAKFYQA5' AND
      items.status IN ('completed','cancelled')
    ) LIMIT 1
  ) IS NOT NULL ORDER BY orders.created_at DESC LIMIT 50 OFFSET 150
) as orders INNER JOIN items ON items.fk_order_id = orders.id_orders;

First EXPLAIN:

Nested Loop  (cost=1417.11..2311.77 rows=67 width=1705) (actual time=2785.221..17025.325 rows=17 loops=1)
  ->  Limit  (cost=1416.68..1888.77 rows=50 width=301) (actual time=2785.216..17024.918 rows=15 loops=1)
        ->  Index Scan Backward using orders_created_at_idx on orders  (cost=0.42..2708393.65 rows=286853 width=301) (actual time=1214.013..17024.897 rows=165 loops=1)
              Filter: ((status = ANY ('{completed,pending}'::orders_status_enum[])) AND ((SubPlan 1) IS NOT NULL))
              Rows Removed by Filter: 313631
              SubPlan 1
                ->  Limit  (cost=0.42..8.45 rows=1 width=19) (actual time=0.047..0.047 rows=0 loops=287719)
                      ->  Index Scan using items_fk_order_id_index on items items_1  (cost=0.42..8.45 rows=1 width=19) (actual time=0.047..0.047 rows=0 loops=287719)
                            Index Cond: (fk_order_id = orders.id_orders)
                            Filter: ((status = ANY ('{completed,cancelled}'::items_status_enum[])) AND (fk_vendor_id = '0012800001YVccUAAT'::text) AND (fk_offer = '0060I00000RAKFYQA5'::text))
                            Rows Removed by Filter: 1
  ->  Index Scan using items_fk_order_id_index on items  (cost=0.42..8.44 rows=1 width=1404) (actual time=0.002..0.026 rows=1 loops=15)
        Index Cond: (fk_order_id = orders.id_orders)
Planning time: 1.791 ms
Execution time: 17025.624 ms
(15 rows)

Second EXPLAIN:

Nested Loop  (cost=1417.11..2311.77 rows=67 width=1705) (actual time=115.659..2114.739 rows=17 loops=1)
  ->  Limit  (cost=1416.68..1888.77 rows=50 width=301) (actual time=115.654..2114.691 rows=15 loops=1)
        ->  Index Scan Backward using orders_created_at_idx on orders  (cost=0.42..2708393.65 rows=286853 width=301) (actual time=64.598..2114.676 rows=165 loops=1)
              Filter: ((status = ANY ('{completed,pending}'::orders_status_enum[])) AND ((SubPlan 1) IS NOT NULL))
              Rows Removed by Filter: 313631
              SubPlan 1
                ->  Limit  (cost=0.42..8.45 rows=1 width=19) (actual time=0.006..0.006 rows=0 loops=287719)
                      ->  Index Scan using items_fk_order_id_index on items items_1  (cost=0.42..8.45 rows=1 width=19) (actual time=0.006..0.006 rows=0 loops=287719)
                            Index Cond: (fk_order_id = orders.id_orders)
                            Filter: ((status = ANY ('{completed,cancelled}'::items_status_enum[])) AND (fk_vendor_id = '0012800001YVccUAAT'::text) AND (fk_offer = '0060I00000RAKFYQA5'::text))
                            Rows Removed by Filter: 1
  ->  Index Scan using items_fk_order_id_index on items  (cost=0.42..8.44 rows=1 width=1404) (actual time=0.002..0.002 rows=1 loops=15)
        Index Cond: (fk_order_id = orders.id_orders)
Planning time: 2.011 ms
Execution time: 2115.052 ms
(15 rows)

Indexes on orders:

"cart_pkey" PRIMARY KEY, btree (id_orders)
"orders_legacy_id_uindex" UNIQUE, btree (legacy_id_orders)
"orders_transaction_key_uindex" UNIQUE, btree (transaction_key)
"orders_created_at_idx" btree (created_at)
"orders_customer_email_idx" gin (customer_email gin_trgm_ops)
"orders_customer_full_name_idx" gin (customer_full_name gin_trgm_ops)
Referenced by:
TABLE "items" CONSTRAINT "items_fk_order_id_fkey" FOREIGN KEY (fk_order_id) REFERENCES orders(id_orders) ON DELETE RESTRICT
TABLE "items_log" CONSTRAINT "items_log_fk_order_id_fkey" FOREIGN KEY (fk_order_id) REFERENCES orders(id_orders)

Indexes on items:

"items_pkey" PRIMARY KEY, btree (id_items)
"items_fk_vendor_id_booking_number_unique" UNIQUE, btree (fk_vendor_id, booking_number) WHERE legacy_id_items IS NULL
"items_legacy_id_uindex" UNIQUE, btree (legacy_id_items)
"items_transaction_key_uindex" UNIQUE, btree (transaction_key)
"items_booking_number_index" btree (booking_number)
"items_fk_order_id_index" btree (fk_order_id)
"items_fk_vendor_id_index" btree (fk_vendor_id)
"items_status_index" btree (status)

Foreign-key constraints:
"items_fk_order_id_fkey" FOREIGN KEY (fk_order_id) REFERENCES orders(id_orders) ON DELETE RESTRICT

Answer

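The difference in execution time is probably just the effect of caching. You can use EXPLAIN (ANALYZE, BUFFERS) to see how many pages were found in the database cache.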
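
As a sketch of that check, the statement below wraps the inner query from the question, unchanged, in EXPLAIN (ANALYZE, BUFFERS). Running it twice in a row and comparing the "Buffers:" lines of the two plans should show whether the slow run had to read many more pages than the fast one. The comments describe how PostgreSQL reports these counters; the actual numbers depend on your data and cache state.

```sql
-- "shared hit" counts pages that were already in PostgreSQL's shared buffers.
-- "read" counts pages that had to be fetched from outside shared buffers
-- (the operating system cache or disk).
EXPLAIN (ANALYZE, BUFFERS)
SELECT
  orders.id_orders,
  orders.created_at,
  orders.updated_at
FROM orders
WHERE orders.status IN ('completed', 'pending')
  AND (
        SELECT fk_vendor_id
        FROM items
        WHERE items.fk_order_id = orders.id_orders
          AND items.fk_vendor_id = '0012800001YVccUAAT'
          AND items.fk_offer = '0060I00000RAKFYQA5'
          AND items.status IN ('completed', 'cancelled')
        LIMIT 1
      ) IS NOT NULL
ORDER BY orders.created_at DESC
LIMIT 50 OFFSET 150;
```

A high "read" count on the first run and mostly "shared hit" on the second would support the caching explanation.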

To make your query more readable, you should rewrite

WHERE (
   SELECT fk_vendor_id FROM items
   WHERE (
     items.fk_order_id = orders.id_orders AND
     items.fk_vendor_id = '0012800001YVccUAAT' AND
     items.fk_offer = '0060I00000RAKFYQA5' AND
     items.status IN ('completed','cancelled')
   ) LIMIT 1
) IS NOT NULL

to

WHERE EXISTS
   (SELECT 1 FROM items
    WHERE items.fk_order_id = orders.id_orders
      AND items.fk_vendor_id = '0012800001YVccUAAT'
      AND items.fk_offer = '0060I00000RAKFYQA5'
      AND items.status IN ('completed','cancelled')
   )

The best thing you can do to speed up the query is to create an index:

CREATE INDEX ON items(fk_order_id, fk_vendor_id, fk_offer);
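
For reference, here is a sketch of how the whole query might look with the EXISTS form in place. It selects only the columns that are named in the question (the elided part of the select list is left out), and the aliases o and i are introduced here purely for readability. The outer ORDER BY is an addition that is not in the original query: a join does not guarantee that the ordering of the subquery is preserved, so it is safer to state it explicitly.

```sql
SELECT
  orders.id_orders AS "orders.id_orders",
  items.id_items   AS "items.id_items",
  orders.created_at,
  orders.updated_at
FROM (
  -- Paginate on orders only, so LIMIT/OFFSET count orders, not joined rows.
  SELECT o.id_orders, o.created_at, o.updated_at
  FROM orders AS o
  WHERE o.status IN ('completed', 'pending')
    AND EXISTS (
          SELECT 1
          FROM items AS i
          WHERE i.fk_order_id = o.id_orders
            AND i.fk_vendor_id = '0012800001YVccUAAT'
            AND i.fk_offer = '0060I00000RAKFYQA5'
            AND i.status IN ('completed', 'cancelled')
        )
  ORDER BY o.created_at DESC
  LIMIT 50 OFFSET 150
) AS orders
-- Join items again to fetch every item of the selected orders.
INNER JOIN items ON items.fk_order_id = orders.id_orders
-- Added here, not part of the original query.
ORDER BY orders.created_at DESC;
```

With the three-column index above, the EXISTS probe for each order can be answered from index conditions on fk_order_id, fk_vendor_id and fk_offer, leaving only the status check as a filter.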
