GROUP BY query utilizes indexes, but window function query doesn't
Asked: 2021-12-09 03:41:25
Question:
I use a COTS system from IBM called Maximo Asset Management. The system has a WORKORDER table with 350,000 rows.
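Maximo has a concept called relationships that can be used to pull in data from related records.
How relationships work:
For each individual WORKORDER record, the system runs a select query, using the WHERE clause from the relationship, to pull in the related record (screenshot).
Related records:
In this case, the related records are rows in a custom database view called WOTASKROLLUP_VW.
In a related post, I explored different SQL rollup techniques that I could use in the view: Group by x, get other fields too. The options I explored performed similarly to each other when I ran them on the full WORKORDER table.
However, in reality, Maximo is designed to get only one row at a time, via individual select statements. As a result, the queries perform very differently when only a single WORKORDER record is selected.
I've wrapped each query in an outer query with a WHERE clause that selects a specific work order. I did this to mimic what Maximo does when it uses relationships.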
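Query 1b: (GROUP BY; selective aggregates)
Performance is very good, even when only selecting a single record, because an index is used (only 37 milliseconds).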
select
    *
from
    (
        select
            wogroup as wonum,
            sum(actlabcost) as actlabcost_tasks_incl,
            sum(actmatcost) as actmatcost_tasks_incl,
            sum(acttoolcost) as acttoolcost_tasks_incl,
            sum(actservcost) as actservcost_tasks_incl,
            sum(actlabcost + actmatcost + acttoolcost + actservcost) as acttotalcost_tasks_incl,
            max(case when istask = 0 then rowstamp end) as other_wo_columns
        from
            maximo.workorder
        group by
            wogroup
    )
where
    wonum in ('WO360996')
------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                 |     1 |    34 |     4   (0)| 00:00:01 |
|   1 |  SORT GROUP BY NOSORT        |                 |     1 |    34 |     4   (0)| 00:00:01 |
|   2 |   TABLE ACCESS BY INDEX ROWID| WORKORDER       |     1 |    34 |     4   (0)| 00:00:01 |
|*  3 |    INDEX RANGE SCAN          | WORKORDER_NDX32 |     1 |       |     3   (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   3 - access("WOGROUP"='WO360996')
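The post does not say how the plan above was captured. As a rough sketch, one common way to get this kind of output in Oracle is EXPLAIN PLAN followed by DBMS_XPLAN.DISPLAY. The query here is a trimmed version of 1b, shortened only for brevity:

```sql
-- Sketch only: the original post does not show how its plans were produced.
-- Trimmed version of query 1b; same table and filter as in the post.
explain plan for
select
    *
from
    (
        select
            wogroup as wonum,
            sum(actlabcost) as actlabcost_tasks_incl
        from
            maximo.workorder
        group by
            wogroup
    )
where
    wonum in ('WO360996');

-- Show the most recently explained statement from the plan table.
select * from table(dbms_xplan.display);
```

The line to look for is the predicate section: an `access` predicate on an index step means the filter drove the lookup, while a `filter` predicate higher up means rows were read first and discarded afterwards.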
Query #2: (SUM window function)
Performance is relatively slow when selecting a single record, because no index is used (3 seconds).
select
    *
from
    (
        select
            wonum,
            actlabcost_tasks_incl,
            actmatcost_tasks_incl,
            acttoolcost_tasks_incl,
            actservcost_tasks_incl,
            acttotalcost_tasks_incl,
            other_wo_columns
        from
            (
                select
                    wonum,
                    istask,
                    sum(actlabcost ) over (partition by wogroup) as actlabcost_tasks_incl,
                    sum(actmatcost ) over (partition by wogroup) as actmatcost_tasks_incl,
                    sum(acttoolcost) over (partition by wogroup) as acttoolcost_tasks_incl,
                    sum(actservcost) over (partition by wogroup) as actservcost_tasks_incl,
                    sum(actlabcost + actmatcost + acttoolcost + actservcost) over (partition by wogroup) as acttotalcost_tasks_incl,
                    rowstamp as other_wo_columns
                from
                    maximo.workorder
            )
        where
            istask = 0
    )
where
    wonum in ('WO360996')
-----------------------------------------------------------------------------------------
| Id  | Operation           | Name      | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT    |           |   355K|    61M|       | 14789   (1)| 00:00:01 |
|*  1 |  VIEW               |           |   355K|    61M|       | 14789   (1)| 00:00:01 |
|   2 |   WINDOW SORT       |           |   355K|    14M|    21M| 14789   (1)| 00:00:01 |
|   3 |    TABLE ACCESS FULL| WORKORDER |   355K|    14M|       | 10863   (2)| 00:00:01 |
-----------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
   1 - filter("WONUM"='WO360996' AND "ISTASK"=0)
Question:
Why can the GROUP BY query in #1b use an index (fast), while the SUM window function query in #2 cannot (slow)?
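Answer 1:
Your two queries are different. In the first you use:
select wogroup as wonum,
and in the second you just use:
select wonum,
This means the second query will not use the index on WOGROUP, because you are filtering on the real WONUM column and not on the WOGROUP column (which, in the first query, happened to be aliased to WONUM). The plan for query #2 shows this: the filter on "WONUM" is applied at the VIEW step, after the WINDOW SORT over the full table scan.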
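To see why a filter on WONUM cannot simply be moved below the window function, here is a small sketch with made-up data. The `workorder_sample` rows and their values are hypothetical, not taken from the post: one parent work order and one task that share a WOGROUP.

```sql
-- Hypothetical sample data (not from the post): parent WO1 plus task T1,
-- both in wogroup WO1.
with workorder_sample as (
    select 'WO1' as wonum, 'WO1' as wogroup, 0 as istask, 10 as actlabcost from dual
    union all
    select 'T1' as wonum, 'WO1' as wogroup, 1 as istask, 5 as actlabcost from dual
)
-- Variant A: window sum over the whole group, then filter on wonum.
select
    'filter after window' as variant,
    actlabcost_tasks_incl
from
    (
        select
            wonum,
            sum(actlabcost) over (partition by wogroup) as actlabcost_tasks_incl
        from
            workorder_sample
    )
where
    wonum = 'WO1'
union all
-- Variant B: filter on wonum first, so the task row never reaches the window.
select
    'filter before window' as variant,
    sum(actlabcost) over (partition by wogroup) as actlabcost_tasks_incl
from
    workorder_sample
where
    wonum = 'WO1';
-- 'filter after window' returns 15, 'filter before window' returns 10
```

Because the two variants give different totals, the database has to compute the window sums before applying a WONUM filter. A filter on WOGROUP, the PARTITION BY column, keeps or drops whole partitions, so it does not have this problem.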
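It looks like your second query can be corrected and reduced (by moving the filter into the inner sub-query and getting rid of the partitioning, since you are already filtering on that column):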
select wonum,
       actlabcost_tasks_incl,
       actmatcost_tasks_incl,
       acttoolcost_tasks_incl,
       actservcost_tasks_incl,
       acttotalcost_tasks_incl,
       other_wo_columns
from   (
  select wogroup AS wonum,
         istask,
         sum(actlabcost ) over () as actlabcost_tasks_incl,
         sum(actmatcost ) over () as actmatcost_tasks_incl,
         sum(acttoolcost) over () as acttoolcost_tasks_incl,
         sum(actservcost) over () as actservcost_tasks_incl,
         sum(actlabcost + actmatcost + acttoolcost + actservcost) over () as acttotalcost_tasks_incl,
         rowstamp as other_wo_columns
  from   maximo.workorder
  where  wogroup = 'WO360996'
)
where  istask = 0;
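If the query has to live in a view such as WOTASKROLLUP_VW, the work order literal cannot sit inside it, because the question says Maximo supplies the WHERE clause from the relationship. A possible variant, offered only as a sketch, keeps `partition by wogroup` and exposes WOGROUP itself as the column to filter on. It assumes, as query 1b implies by aliasing WOGROUP to WONUM, that a parent work order's WOGROUP equals its WONUM. It has not been run against the poster's database.

```sql
-- Sketch under assumptions: the filterable column is the PARTITION BY column,
-- so a predicate on it restricts whole partitions rather than rows within one.
select
    *
from
    (
        select
            wogroup as wonum,  -- assumed equal to the parent's wonum
            istask,
            sum(actlabcost ) over (partition by wogroup) as actlabcost_tasks_incl,
            sum(actmatcost ) over (partition by wogroup) as actmatcost_tasks_incl,
            sum(acttoolcost) over (partition by wogroup) as acttoolcost_tasks_incl,
            sum(actservcost) over (partition by wogroup) as actservcost_tasks_incl,
            sum(actlabcost + actmatcost + acttoolcost + actservcost) over (partition by wogroup) as acttotalcost_tasks_incl,
            rowstamp as other_wo_columns
        from
            maximo.workorder
    )
where
    wonum in ('WO360996')  -- now a filter on wogroup, via the alias
    and istask = 0;        -- must still be applied after the window sums
```

Whether the optimizer actually pushes the WOGROUP predicate down and uses WORKORDER_NDX32 depends on the Oracle version and statistics, so the plan should be checked rather than assumed.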