PostgreSQL query inexplicably slowed down by adding a WHERE constraint
I have the following PostgreSQL query, which contains a couple of subqueries. The query runs almost instantaneously until I add the WHERE lb.type = 'Marketing' constraint, at which point it takes about 3 minutes. I find it baffling that adding such a simple constraint causes such an extreme slowdown, but I suspect it points to a fundamental flaw in my approach.
I'd appreciate help on a few fronts:
- Am I using the subqueries appropriately to select the latest record from each of the relevant tables, or will they cause performance problems?
- What should I look for in the execution plan when trying to diagnose issues? (see the sketch after this list)
- How should I work out which indexes to create for complex queries like this one?
- Why does the additional WHERE constraint cause such a massive slowdown?
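For reference, a minimal sketch of how plans like the ones below can be captured (standard PostgreSQL EXPLAIN options; BUFFERS is optional and adds I/O counts). The first things to compare are the planner's estimated "rows=" against the "actual ... rows=" counts, along with any large "Rows Removed by Join Filter" lines. The join shown here is just the core of the full query below:
-- Capture timing, row counts and I/O for the core join of the query below.
EXPLAIN (ANALYZE, BUFFERS)
SELECT l.*
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON l.batch_id = lb.batch_id
WHERE lb.type = 'Marketing' AND lb.uploaded = 1 AND l.merged = 0;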
The table structures are as follows:
CREATE TABLE sales.leads
(
lead_id integer NOT NULL DEFAULT nextval('sales.leads_lead_id_seq'::regclass),
batch_id integer,
expired integer NOT NULL DEFAULT 0,
closed integer NOT NULL DEFAULT 0,
merged integer NOT NULL DEFAULT 0,
CONSTRAINT leads_pkey PRIMARY KEY (lead_id)
)
CREATE TABLE sales.lead_batches
(
batch_id integer NOT NULL DEFAULT nextval('sales.lead_batches_batch_id_seq'::regclass),
inserted_datetime timestamp without time zone,
type character varying(100) COLLATE pg_catalog."default",
uploaded smallint NOT NULL DEFAULT '0'::smallint,
CONSTRAINT lead_batches_pkey PRIMARY KEY (batch_id)
)
CREATE TABLE sales.lead_results
(
lead_result_id integer NOT NULL DEFAULT nextval('sales.lead_results_lead_result_id_seq'::regclass),
lead_id integer,
assigned_datetime timestamp without time zone NOT NULL,
user_id character varying(255) COLLATE pg_catalog."default" NOT NULL,
resulted_datetime timestamp without time zone,
result character varying(255) COLLATE pg_catalog."default",
CONSTRAINT lead_results_pkey PRIMARY KEY (lead_result_id)
)
CREATE TABLE sales.personal_details
(
lead_id integer,
title character varying(50) COLLATE pg_catalog."default",
first_name character varying(100) COLLATE pg_catalog."default",
surname character varying(255) COLLATE pg_catalog."default",
email_address character varying(100) COLLATE pg_catalog."default",
updated_date date NOT NULL
)
CREATE TABLE sales.users
(
user_id character varying(50) COLLATE pg_catalog."default" NOT NULL,
surname character varying(255) COLLATE pg_catalog."default",
name character varying(255) COLLATE pg_catalog."default"
)
The query:
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON l.batch_id = lb.batch_id
LEFT JOIN (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
) sub ON pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date
) pd ON l.lead_id = pd.lead_id
LEFT JOIN (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id) sub
ON lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime
) lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
The execution plan:
Nested Loop Left Join (cost=10485.51..17604.18 rows=34 width=158) (actual time=717.862..168709.593 rows=18001 loops=1)
Join Filter: (l.lead_id = pd_sub.lead_id)
Rows Removed by Join Filter: 687818215
-> Nested Loop Left Join (cost=6487.82..12478.42 rows=34 width=135) (actual time=658.141..64951.950 rows=18001 loops=1)
Join Filter: (l.lead_id = lr_sub.lead_id)
Rows Removed by Join Filter: 435482960
-> Hash Join (cost=131.01..1816.10 rows=34 width=60) (actual time=1.948..126.067 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.032..69.763 rows=32621 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=130.96..130.96 rows=4 width=20) (actual time=1.894..1.894 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on lead_batches lb (cost=0.00..130.96 rows=4 width=20) (actual time=1.078..1.884 rows=4 loops=1)
Filter: (((type)::text = 'Marketing'::text) AND (uploaded = 1))
Rows Removed by Filter: 3866
-> Materialize (cost=6356.81..10661.81 rows=1 width=79) (actual time=0.006..1.362 rows=24197 loops=17998)
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=96.246..633.701 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=96.203..202.086 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=134.595..166.341 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.033..17.333 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=134.260..134.260 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=122.823..129.022 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..71.768 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.007 rows=73 loops=24197)
-> Materialize (cost=3997.68..5030.85 rows=187 width=31) (actual time=0.003..2.033 rows=38211 loops=18001)
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.802..85.774 rows=38211 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.330..35.345 rows=38212 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.014..4.636 rows=38232 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=29.058..29.058 rows=38211 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.026..17.231 rows=38232 loops=1)
Planning time: 1.966 ms
Execution time: 168731.769 ms
I have an index on lead_id on all of the tables, plus an additional index on (type, uploaded) in lead_batches.
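For concreteness, a sketch of those indexes as DDL. The name lead_batches_type_idx is visible in the plans below; the other names are my assumptions, and leads.lead_id is already covered by its primary key:
CREATE INDEX personal_details_lead_id_idx ON sales.personal_details (lead_id);
CREATE INDEX lead_results_lead_id_idx ON sales.lead_results (lead_id);
CREATE INDEX lead_batches_type_idx ON sales.lead_batches (type, uploaded);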
Any help would be much appreciated!
Edit:
Execution plan without the additional WHERE constraint:
Hash Left Join (cost=15861.46..17780.37 rows=30972 width=158) (actual time=765.076..844.512 rows=32053 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Left Join (cost=10829.21..12630.45 rows=30972 width=135) (actual time=667.460..724.297 rows=32053 loops=1)
Hash Cond: (l.lead_id = lr_sub.lead_id)
-> Hash Join (cost=167.39..1852.48 rows=30972 width=60) (actual time=2.579..36.683 rows=32050 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.034..22.166 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=121.40..121.40 rows=3679 width=20) (actual time=2.503..2.503 rows=3679 loops=1)
Buckets: 4096 Batches: 1 Memory Usage: 234kB
-> Seq Scan on lead_batches lb (cost=0.00..121.40 rows=3679 width=20) (actual time=0.011..1.809 rows=3679 loops=1)
Filter: (uploaded = 1)
Rows Removed by Filter: 193
-> Hash (cost=10661.81..10661.81 rows=1 width=79) (actual time=664.855..664.855 rows=24197 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2821kB
-> Nested Loop Left Join (cost=6356.81..10661.81 rows=1 width=79) (actual time=142.634..647.146 rows=24197 loops=1)
Join Filter: ((u.user_id)::text = (lr_sub.user_id)::text)
Rows Removed by Join Filter: 1742184
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=142.590..241.913 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=141.250..171.403 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.027..15.322 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=140.917..140.917 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=127.911..135.076 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.020..74.626 rows=107051 loops=3)
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.002..0.006 rows=73 loops=24197)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=97.561..97.561 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.712..85.099 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.831..35.015 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.012..4.995 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=28.468..28.468 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.024..17.089 rows=38234 loops=1)
Planning time: 2.058 ms
Execution time: 849.460 ms
Execution plan with nested loops disabled:
Hash Left Join (cost=13088.17..17390.71 rows=34 width=158) (actual time=277.646..343.924 rows=18001 loops=1)
Hash Cond: (l.lead_id = pd_sub.lead_id)
-> Hash Right Join (cost=8055.91..12358.31 rows=34 width=135) (actual time=181.614..238.365 rows=18001 loops=1)
Hash Cond: (lr_sub.lead_id = l.lead_id)
-> Hash Left Join (cost=6359.43..10661.82 rows=1 width=79) (actual time=156.498..201.533 rows=24197 loops=1)
Hash Cond: ((lr_sub.user_id)::text = (u.user_id)::text)
-> Gather (cost=6356.81..10659.19 rows=1 width=72) (actual time=156.415..190.934 rows=24197 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Hash Join (cost=5356.81..9659.09 rows=1 width=72) (actual time=143.387..178.653 rows=8066 loops=3)
Hash Cond: ((lr_sub.lead_id = lead_results.lead_id) AND (lr_sub.resulted_datetime = (max(lead_results.resulted_datetime))))
-> Parallel Seq Scan on lead_results lr_sub (cost=0.00..3622.05 rows=44605 width=72) (actual time=0.036..22.404 rows=35684 loops=3)
-> Hash (cost=5110.36..5110.36 rows=16430 width=12) (actual time=143.052..143.052 rows=24194 loops=3)
Buckets: 32768 Batches: 1 Memory Usage: 1391kB
-> HashAggregate (cost=4781.76..4946.06 rows=16430 width=12) (actual time=131.793..137.760 rows=24204 loops=3)
Group Key: lead_results.lead_id
-> Seq Scan on lead_results (cost=0.00..4246.51 rows=107051 width=12) (actual time=0.023..78.918 rows=107051 loops=3)
-> Hash (cost=1.72..1.72 rows=72 width=23) (actual time=0.061..0.061 rows=73 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 12kB
-> Seq Scan on users u (cost=0.00..1.72 rows=72 width=23) (actual time=0.031..0.039 rows=73 loops=1)
-> Hash (cost=1696.05..1696.05 rows=34 width=60) (actual time=25.068..25.068 rows=17998 loops=1)
Buckets: 32768 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2084kB
-> Hash Join (cost=10.96..1696.05 rows=34 width=60) (actual time=0.208..18.630 rows=17998 loops=1)
Hash Cond: (l.batch_id = lb.batch_id)
-> Seq Scan on leads l (cost=0.00..1273.62 rows=32597 width=44) (actual time=0.043..13.065 rows=32623 loops=1)
Filter: (merged = 0)
Rows Removed by Filter: 5595
-> Hash (cost=10.91..10.91 rows=4 width=20) (actual time=0.137..0.137 rows=4 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Index Scan using lead_batches_type_idx on lead_batches lb (cost=0.28..10.91 rows=4 width=20) (actual time=0.091..0.129 rows=4 loops=1)
Index Cond: ((type)::text = 'Marketing'::text)
Filter: (uploaded = 1)
-> Hash (cost=5029.92..5029.92 rows=187 width=31) (actual time=96.005..96.005 rows=38213 loops=1)
Buckets: 65536 (originally 1024) Batches: 1 (originally 1) Memory Usage: 2660kB
-> Hash Join (cost=3997.68..5029.92 rows=187 width=31) (actual time=52.166..84.592 rows=38213 loops=1)
Hash Cond: ((personal_details.lead_id = pd_sub.lead_id) AND ((max(personal_details.updated_date)) = pd_sub.updated_date))
-> HashAggregate (cost=1811.38..2186.06 rows=37468 width=8) (actual time=23.785..34.403 rows=38214 loops=1)
Group Key: personal_details.lead_id
-> Seq Scan on personal_details (cost=0.00..1623.92 rows=37492 width=8) (actual time=0.013..4.680 rows=38234 loops=1)
-> Hash (cost=1623.92..1623.92 rows=37492 width=35) (actual time=27.960..27.960 rows=38213 loops=1)
Buckets: 65536 Batches: 1 Memory Usage: 2809kB
-> Seq Scan on personal_details pd_sub (cost=0.00..1623.92 rows=37492 width=35) (actual time=0.019..15.350 rows=38234 loops=1)
Planning time: 2.469 ms
Execution time: 346.590 ms
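For reference, a plan like the one above can be reproduced with the standard planner setting enable_nestloop; this is a session-local diagnostic toggle, not something to leave enabled in production:
SET enable_nestloop = off;
-- re-run the EXPLAIN (ANALYZE) of the query here
RESET enable_nestloop;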
You are basically missing some important indexes.
To test for improvements, I set up the tables myself and tried to fill them with test data of a similar distribution to what can be read from the explain plans.
My baseline performance was ~160 seconds: https://explain.depesz.com/s/WlKO
The first thing I did was create indexes for the foreign key references (not all of which will turn out to be strictly necessary):
CREATE INDEX idx_personal_details_leads ON sales.personal_details (lead_id);
CREATE INDEX idx_leads_batches ON sales.leads (batch_id);
CREATE INDEX idx_lead_results_users ON sales.lead_results (user_id);
That brought us down to roughly 112 seconds: https://explain.depesz.com/s/aRcf
Now most of the time is actually spent in the self-joins (joining personal_details against its latest updated_date, and lead_results against its latest resulted_datetime). Based on that, I came up with the following two indexes:
CREATE INDEX idx_personal_details_updated ON sales.personal_details (lead_id, updated_date DESC);
CREATE INDEX idx_lead_results_resulted ON sales.lead_results (lead_id, resulted_datetime DESC);
...which immediately brought us down to ~110 ms: https://explain.depesz.com/s/dDfk
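As an aside on the greatest-n-per-group pattern: with (lead_id, ... DESC) indexes like the two above, PostgreSQL's DISTINCT ON is a common and often simpler alternative to the MAX-plus-self-join sub-selects. A sketch for personal_details, not part of the original query:
-- Latest personal_details row per lead; ORDER BY must lead with the
-- DISTINCT ON expression, and the index above matches this ordering.
SELECT DISTINCT ON (pd.lead_id) pd.*
FROM sales.personal_details pd
ORDER BY pd.lead_id, pd.updated_date DESC;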
Debugging help
What helped me figure out which indexes would be most effective: I first rewrote the query to eliminate the sub-selects and use a dedicated CTE for each of them instead:
WITH
leads_update_latest AS (
SELECT lead_id, MAX(updated_date) AS updated_date
FROM sales.personal_details
GROUP BY lead_id
),
pd AS (
SELECT pd_sub.*
FROM sales.personal_details pd_sub
INNER JOIN leads_update_latest sub ON (pd_sub.lead_id = sub.lead_id AND pd_sub.updated_date = sub.updated_date)
),
leads_result_latest AS (
SELECT lead_id, MAX(resulted_datetime) AS resulted_datetime
FROM sales.lead_results
GROUP BY lead_id
),
lr AS (
SELECT lr_sub.*
FROM sales.lead_results lr_sub
INNER JOIN leads_result_latest sub ON (lr_sub.lead_id = sub.lead_id AND lr_sub.resulted_datetime = sub.resulted_datetime)
),
leads AS (
SELECT l.*
FROM sales.leads l
INNER JOIN sales.lead_batches lb ON (l.batch_id = lb.batch_id)
WHERE lb.type = 'Marketing'
AND lb.uploaded = 1
AND l.merged = 0
)
SELECT l.*, pd.*, lr.resulted_datetime, u.name
FROM leads l
LEFT JOIN pd ON l.lead_id = pd.lead_id
LEFT JOIN lr ON l.lead_id = lr.lead_id
LEFT JOIN sales.users u ON u.user_id = lr.user_id
;
Surprisingly, merely by rewriting the query with individual CTEs, the PostgreSQL planner got much faster: only ~2.3 seconds without any of the indexes: https://explain.depesz.com/s/lqzq
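A likely reason (my note, not the answerer's): up to PostgreSQL 11, every CTE is an optimization fence, so each intermediate result is computed once and hash-joined rather than re-evaluated inside a nested loop. From PostgreSQL 12 on, single-reference CTEs can be inlined again, so to keep this fencing behavior you would have to materialize them explicitly, e.g.:
WITH leads_update_latest AS MATERIALIZED (
    SELECT lead_id, MAX(updated_date) AS updated_date
    FROM sales.personal_details
    GROUP BY lead_id
)
SELECT * FROM leads_update_latest;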
...and then optimized:
- the FK indexes bring it down to ~230 ms: https://explain.depesz.com/s/a6wT
However, the CTE version degrades with the other combined indexes:
- the combined descending indexes push it up to ~270 ms: https://explain.depesz.com/s/TNNm
However, since those combined indexes speed up the original query dramatically, they also grow considerably faster than single-column indexes, and they are an extra write cost to weigh for database scalability.
It may therefore make sense to stick with the CTE version: it executes slightly slower, but fast enough, and lets you omit two additional indexes that the database would otherwise have to maintain.
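If you do settle on the CTE version, the trade-off amounts to dropping only the two combined indexes created above while keeping the FK indexes:
DROP INDEX sales.idx_personal_details_updated;
DROP INDEX sales.idx_lead_results_resulted;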