Optimize join query from multiple tables
Asked: 2021-08-11 11:34:09
Question: I have tables that are connected to each other through foreign keys (PostgreSQL 13.1).
order: order_id, name
sub_order: mainorder, order_id (foreign key to order), detail
task_group: id, group_name
tasks: id, taskname, task_group_id (foreign key to task_group)
task_kind: id, kind_name
task_task_kind: id, kind_id (fk to task_kind), task_id (fk to task)
time_per_project: person, start_time, stop_time, part, order_id (foreign key to sub_order)
I hope that is enough description. My query for the materialized view is below, and it works well:
SELECT
    so.order_id AS order_id,
    MIN(so.status) AS status_id,
    -- total time booked on the order
    SUM(AGE(tpp.stop_time, tpp.start_time)) AS total,
    -- time split by task group, valve part and task kind
    SUM(CASE WHEN tasksgroups.id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS srut,
    SUM(CASE WHEN tpp.valve_part_id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS korpus,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS zwykle,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 6 THEN AGE(tpp.stop_time, tpp.start_time) END) AS wyprawki
FROM
    intranet.sub_orders so
    LEFT JOIN intranet.time_per_project tpp ON so.mainorder = tpp.project_id
    LEFT JOIN intranet.task_task_kind tasks_with_kinds ON tasks_with_kinds.id = tpp.task
    LEFT JOIN intranet.task tasks ON tasks.id = tasks_with_kinds.task_id
    LEFT JOIN intranet.task_group tasksgroups ON tasksgroups.id = tasks.task_group
GROUP BY
    so.order_id
HAVING SUM(AGE(tpp.stop_time, tpp.start_time)) > interval '0 minutes';
I want to add another join, to tables that look like this:
article_group: id, group_name
article_cost: id, group_id (fk to article_group), order_id (fk to sub_orders)
I ended up joining a subquery, because with a plain join the same row was counted two or more times for some projects:
SELECT
    so.order_id AS order_id,
    MIN(so.status) AS status_id,
    SUM(AGE(tpp.stop_time, tpp.start_time)) AS total,
    SUM(CASE WHEN tasksgroups.id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS srut,
    SUM(CASE WHEN tpp.valve_part_id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS korpus,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS zwykle,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 6 THEN AGE(tpp.stop_time, tpp.start_time) END) AS wyprawki,
    ac.transport,
    ac.service
FROM
    intranet.sub_orders so
    LEFT JOIN intranet.time_per_project tpp ON so.mainorder = tpp.project_id
    LEFT JOIN intranet.task_task_kind tasks_with_kinds ON tasks_with_kinds.id = tpp.task
    LEFT JOIN intranet.task tasks ON tasks.id = tasks_with_kinds.task_id
    LEFT JOIN intranet.task_group tasksgroups ON tasksgroups.id = tasks.task_group
    -- costs are aggregated to one row per order before joining,
    -- so they are not multiplied by the time_per_project rows
    LEFT JOIN (
        SELECT
            soa.order_id AS ordid,
            SUM(CASE WHEN group_id = 14 THEN cost END) AS transport,
            SUM(CASE WHEN group_id = 11 THEN cost END) AS service
        FROM
            intranet.article_costs
            INNER JOIN intranet.sub_orders soa ON soa.mainorder = project_id
        GROUP BY
            soa.order_id
    ) ac ON ac.ordid = so.order_id
WHERE so.order_id = 2074
GROUP BY
    so.order_id, ac.transport, ac.service
HAVING SUM(AGE(tpp.stop_time, tpp.start_time)) > interval '0 minutes'
    OR ac.transport > 0
    OR ac.service > 0;
Do you think this query is OK for a materialized view? If so, is it possible to get the same behavior without a subquery that contains a nested join?
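The double counting described above, and why joining a pre-aggregated subquery avoids it, can be reproduced on a tiny example. This is only a sketch: the `demo_*` tables, their columns, and the values are hypothetical stand-ins, not the real `intranet` schema.

```sql
-- Hypothetical miniature tables; names and values are illustrative only.
CREATE TEMP TABLE demo_orders (order_id int PRIMARY KEY);
CREATE TEMP TABLE demo_time   (order_id int, minutes int);
CREATE TEMP TABLE demo_costs  (order_id int, cost numeric);

INSERT INTO demo_orders VALUES (1);
INSERT INTO demo_time   VALUES (1, 30), (1, 45);    -- two time entries
INSERT INTO demo_costs  VALUES (1, 100), (1, 50);   -- two cost rows

-- Joining both child tables directly produces 2 x 2 = 4 rows for the order,
-- so both sums are doubled: minutes = 150 (not 75), cost = 300 (not 150).
SELECT o.order_id, SUM(t.minutes) AS minutes, SUM(c.cost) AS cost
FROM demo_orders o
LEFT JOIN demo_time  t ON t.order_id = o.order_id
LEFT JOIN demo_costs c ON c.order_id = o.order_id
GROUP BY o.order_id;

-- Aggregating costs to one row per order first keeps that side of the
-- join at most one row per order: minutes = 75, cost = 150.
SELECT o.order_id, SUM(t.minutes) AS minutes, c.cost
FROM demo_orders o
LEFT JOIN demo_time t ON t.order_id = o.order_id
LEFT JOIN (
    SELECT order_id, SUM(cost) AS cost
    FROM demo_costs
    GROUP BY order_id
) c ON c.order_id = o.order_id
GROUP BY o.order_id, c.cost;
```

The second form mirrors the shape of the query in the question: the cost side is collapsed to one row per order before it meets the time rows.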
Comments:
Answer 1: As for getting the same behavior without a subquery, the aggregated costs can be moved into a CTE:
WITH ac AS (
    -- one row per order with its transport and service costs
    SELECT
        soa.order_id AS ordid,
        SUM(CASE WHEN group_id = 14 THEN cost END) AS transport,
        SUM(CASE WHEN group_id = 11 THEN cost END) AS service
    FROM
        intranet.article_costs
        INNER JOIN intranet.sub_orders soa ON soa.mainorder = project_id
    GROUP BY
        soa.order_id
)
SELECT
    so.order_id AS order_id,
    MIN(so.status) AS status_id,
    SUM(AGE(tpp.stop_time, tpp.start_time)) AS total,
    SUM(CASE WHEN tasksgroups.id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS srut,
    SUM(CASE WHEN tpp.valve_part_id = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS korpus,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 1 THEN AGE(tpp.stop_time, tpp.start_time) END) AS zwykle,
    SUM(CASE WHEN tasks_with_kinds.task_kind = 6 THEN AGE(tpp.stop_time, tpp.start_time) END) AS wyprawki,
    ac.transport,
    ac.service
FROM
    intranet.sub_orders so
    LEFT JOIN intranet.time_per_project tpp ON so.mainorder = tpp.project_id
    LEFT JOIN intranet.task_task_kind tasks_with_kinds ON tasks_with_kinds.id = tpp.task
    LEFT JOIN intranet.task tasks ON tasks.id = tasks_with_kinds.task_id
    LEFT JOIN intranet.task_group tasksgroups ON tasksgroups.id = tasks.task_group
    LEFT JOIN ac ON ac.ordid = so.order_id
WHERE so.order_id = 2074
GROUP BY
    so.order_id, ac.transport, ac.service
HAVING SUM(AGE(tpp.stop_time, tpp.start_time)) > interval '0 minutes'
    OR ac.transport > 0
    OR ac.service > 0;
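As for whether this query is OK for a materialized view: if the data volume or the query time is too large, use materialization, but optimize the query first.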
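A minimal sketch of that order of operations in PostgreSQL: inspect the plan first, then materialize. The `demo_*` tables, the view name `demo_order_time`, and the sample row are hypothetical; the view body is a shortened stand-in for the full query, not the asker's actual definition.

```sql
-- Hypothetical stand-ins for intranet.sub_orders and intranet.time_per_project.
CREATE TABLE demo_sub_orders (mainorder int PRIMARY KEY, order_id int);
CREATE TABLE demo_time_per_project (
    project_id int,
    start_time timestamp,
    stop_time  timestamp
);

INSERT INTO demo_sub_orders VALUES (10, 1);
INSERT INTO demo_time_per_project
VALUES (10, '2021-08-11 08:00', '2021-08-11 09:30');

-- Step 1: look at the actual plan and timings before materializing anything.
EXPLAIN (ANALYZE, BUFFERS)
SELECT so.order_id, SUM(AGE(tpp.stop_time, tpp.start_time)) AS total
FROM demo_sub_orders so
LEFT JOIN demo_time_per_project tpp ON so.mainorder = tpp.project_id
GROUP BY so.order_id;

-- Step 2: store the result once it is worth caching.
CREATE MATERIALIZED VIEW demo_order_time AS
SELECT so.order_id, SUM(AGE(tpp.stop_time, tpp.start_time)) AS total
FROM demo_sub_orders so
LEFT JOIN demo_time_per_project tpp ON so.mainorder = tpp.project_id
GROUP BY so.order_id;

-- REFRESH ... CONCURRENTLY needs a unique index on the materialized view.
CREATE UNIQUE INDEX demo_order_time_order_id ON demo_order_time (order_id);

-- Re-run the stored query without blocking readers of the view.
REFRESH MATERIALIZED VIEW CONCURRENTLY demo_order_time;
```

A materialized view only changes when it is refreshed, so how often to refresh depends on how stale the totals are allowed to be.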