Optimizing wmsys.wm_concat with the LISTAGG Analytic Function
Posted by robinson1988
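During last weekend's tuning class a friend brought a SQL statement and asked me to optimize it on the spot. I was too busy at the time, so I asked our instructor 七年 to take care of it. The slow SQL was:
with temp as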
(select sgd.detail_id id,
wmsys.wm_concat(distinct(sg.gp_name)) groupnames,
wmsys.wm_concat(distinct(su.user_name)) usernames
from sgd
left join sg
on sg.id = sgd.gp_id
left join sug
on sg.id = sug.gp_id
left join su
on sug.user_id = su.id
group by sgd.detail_id)
select zh.id,
zh.id detailid,
zh.name detailname,
zh.p_level hospitallevel,
zh.type hospitaltype,
dza.name region,
temp.groupnames,
temp.usernames,
(case
when gd.gp_id is null then
0
else
1
end) isalloted
from zh
left join dza
on zh.area_id = dza.id
left join temp
on zh.id = temp.id
left join (select gp_id, detail_id from sys_gp_detail where gp_Id = :0) gd
on zh.id = gd.detail_id
order by length(id), zh.id asc
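This SQL returns 20,779 rows and takes 4 minutes 32 seconds to run.
The execution plan consists entirely of HASH JOINs, so I won't paste it here.
Here is how I reasoned about it:
1. The SQL returns 20,779 rows in the end, and the final part of the statement has no GROUP BY; it is only table joins, and they are outer joins.
2. So I can conclude that zh has roughly 20,779 rows, because it is the driving table of the outer joins.
3. I can also conclude that none of the tables in the SQL is very large, because the final result is only 20,779 rows and there is no GROUP BY at the outermost level.
4. Which raises the question: if the tables are all small, why does it run for 4 minutes 32 seconds?
When I run into an odd problem like this, I like to take the SQL apart, and I like to start with the subqueries. You can guess what comes next.
We run the SQL inside the WITH clause on its own, and it turns out that this part alone takes 1 to 2 minutes.
Here is the WITH subquery by itself: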
select sgd.detail_id id,
wmsys.wm_concat(distinct(sg.gp_name)) groupnames,
wmsys.wm_concat(distinct(su.user_name)) usernames
from sgd
left join sg
on sg.id = sgd.gp_id
left join sug
on sg.id = sug.gp_id
left join su
on sug.user_id = su.id
group by sgd.detail_id
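The only unusual thing in this subquery is the two calls to wmsys.wm_concat, which collapses the rows of each group into one delimited string.
With those two columns commented out, the SQL finishes instantly.
That essentially pinpoints the problem: the performance issue comes from the wmsys.wm_concat string aggregation.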
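The isolation step just described can be reproduced roughly as follows. This is only a sketch: it assumes SQL*Plus as the client (the timing, plan and statistics shown in this post look like its autotrace output) and a user who is allowed to use autotrace. The table names are the abbreviated ones from the subquery above.

```sql
-- SQL*Plus settings (assumed client): report elapsed time, plan and
-- statistics without printing the result rows.
set timing on
set autotrace traceonly

-- Same subquery, with the two wm_concat columns commented out.
select sgd.detail_id id
       -- , wmsys.wm_concat(distinct(sg.gp_name))   groupnames
       -- , wmsys.wm_concat(distinct(su.user_name)) usernames
  from sgd
  left join sg
    on sg.id = sgd.gp_id
  left join sug
    on sg.id = sug.gp_id
  left join su
    on sug.user_id = su.id
 group by sgd.detail_id;
```

Restoring one commented column at a time and comparing the elapsed time and the db block gets figure shows which expression is responsible.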
So I split the SQL above and ran one part of it on its own:
select sgd.detail_id id, wmsys.wm_concat(distinct(sg.gp_name)) groupnames
from sys_gp_detail sgd
left join sys_gp sg on sg.id = sgd.gp_id
group by sgd.detail_id
Elapsed: 00:00:58.04
Execution Plan
----------------------------------------------------------
Plan hash value: 3491823204
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20584 | 824K| | 1308 (8)| 00:00:06 |
| 1 | SORT GROUP BY | | 20584 | 824K| 15M| 1308 (8)| 00:00:06 |
|* 2 | HASH JOIN RIGHT OUTER| | 313K| 12M| | 449 (6)| 00:00:02 |
| 3 | TABLE ACCESS FULL | SYS_GP | 3 | 69 | | 3 (0)| 00:00:01 |
| 4 | TABLE ACCESS FULL | SYS_GP_DETAIL | 313K| 5518K| | 438 (5)| 00:00:02 |
------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("SG"."ID"(+)="SGD"."GP_ID")
Statistics
----------------------------------------------------------
1 recursive calls
249348 db block gets
44447 consistent gets
0 physical reads
0 redo size
9993548 bytes sent via SQL*Net to client
6067828 bytes received via SQL*Net from client
83118 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
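Note that this SELECT statement produced db block gets. Normally only DML statements produce db block gets.
For a SELECT, apart from delayed block cleanout or a temporary table materialized from a WITH clause, I can't think of anything else that would produce db block gets.
So this SQL is slow because of the db block gets; I trust nobody disagrees with that.
If you want to become a so-called technical expert, you have to read the official documentation from cover to cover. Whether you work with Oracle, MySQL, Hadoop, Java, or anything else, you must read the official documentation.
Starting with Oracle 11g and Oracle 10.2.0.5, wmsys.wm_concat returns a CLOB; before that it returned a VARCHAR2.
That is why so many db block gets are produced, presumably because every CLOB result has to be built as a temporary LOB. Once I knew the cause, I rewrote the SQL into an equivalent form that uses the LISTAGG analytic function in place of wmsys.wm_concat.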
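One way to check this explanation on your own system is sketched below. It assumes a release in which wmsys.wm_concat still exists (it is an undocumented function and is not present in every release) and a session that is allowed to query V$TEMPORARY_LOBS. The view name wm_concat_type_check is made up for this illustration.

```sql
-- Hypothetical helper view, used only to inspect the return type.
create or replace view wm_concat_type_check as
select wmsys.wm_concat(dummy) as c from dual;

-- SQL*Plus command: column C should be reported as CLOB on the releases
-- named above, and as VARCHAR2 on older ones.
describe wm_concat_type_check

-- Temporary LOBs currently held by this session; run this right after
-- the slow query. Non-zero counts would be consistent with the CLOB theory.
select sid, cache_lobs, nocache_lobs, abstract_lobs
  from v$temporary_lobs
 where sid = sys_context('userenv', 'sid');

drop view wm_concat_type_check;
```

The temporary LOB counters may already have dropped back to zero by the time you look, so treat a zero here as inconclusive rather than as a refutation.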
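wmsys.wm_concat supports DISTINCT, but the LISTAGG analytic function does not (DISTINCT support only arrived in Oracle 19c), so the rewrite has to remove duplicates first and then aggregate the rows: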
select detail_id, listagg(gp_name, ',') within group(order by null)
from (select sgd.detail_id, sg.gp_name
from sys_gp_detail sgd
left join sys_gp sg on sg.id = sgd.gp_id
group by sgd.detail_id, sg.gp_name)
group by detail_id;
Elapsed: 00:00:01.12
Execution Plan
----------------------------------------------------------
Plan hash value: 147456425
--------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
--------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 20584 | 1547K| | 1467 (7)| 00:00:07 |
| 1 | SORT GROUP BY | | 20584 | 1547K| | 1467 (7)| 00:00:07 |
| 2 | VIEW | VM_NWVW_0 | 43666 | 3283K| | 1467 (7)| 00:00:07 |
| 3 | HASH GROUP BY | | 43666 | 1748K| 15M| 1467 (7)| 00:00:07 |
|* 4 | HASH JOIN RIGHT OUTER| | 313K| 12M| | 449 (6)| 00:00:02 |
| 5 | TABLE ACCESS FULL | SYS_GP | 3 | 69 | | 3 (0)| 00:00:01 |
| 6 | TABLE ACCESS FULL | SYS_GP_DETAIL | 313K| 5518K| | 438 (5)| 00:00:02 |
--------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
4 - access("SG"."ID"(+)="SGD"."GP_ID")
Statistics
----------------------------------------------------------
1 recursive calls
0 db block gets
2775 consistent gets
0 physical reads
0 redo size
450516 bytes sent via SQL*Net to client
15595 bytes received via SQL*Net from client
1387 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
20779 rows processed
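As you can see, after the equivalent rewrite the SQL finishes in about one second, where it previously needed 58 seconds. That takes care of one wmsys.wm_concat; the other one still has to be rewritten, and the approach is the same: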
select detail_id, listagg(user_name, ',') within group(order by null)
from (select sgd.detail_id, su.user_name
from sgd
left join sg on sg.id = sgd.gp_id
left join sug on sg.id = sug.gp_id
left join su on sug.user_id = su.id
group by sgd.detail_id, su.user_name)
group by detail_id;
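Finally, join the two rewritten result sets to get an equivalent of the WITH subquery, then substitute it for the WITH subquery in the original SQL, and the whole statement finishes almost instantly.
The equivalent WITH subquery is: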
select a.detail_id id , a.groupnames, b.usernames
from (select detail_id, listagg(gp_name, ',') within group(order by null) groupnames
from (select sgd.detail_id, sg.gp_name
from sys_gp_detail sgd
left join sys_gp sg on sg.id = sgd.gp_id
group by sgd.detail_id, sg.gp_name)
group by detail_id) a,
(select detail_id, listagg(user_name, ',') within group(order by null) usernames
from (select sgd.detail_id, su.user_name
from sgd
left join sg on sg.id = sgd.gp_id
left join sug on sg.id = sug.gp_id
left join su on sug.user_id = su.id
group by sgd.detail_id, su.user_name)
group by detail_id) b
where a.detail_id = b.detail_id;
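For completeness, here is a sketch of what the original statement looks like once the rewritten subquery is dropped into the WITH clause. The post does not show this final form, so treat it as my assembly of the pieces above, not the author's exact text. The outer query is copied unchanged from the slow SQL at the top, and the table names are kept exactly as they appear in the post, abbreviated ones included.

```sql
with temp as
 (select a.detail_id id, a.groupnames, b.usernames
    from (select detail_id,
                 listagg(gp_name, ',') within group(order by null) groupnames
            from (select sgd.detail_id, sg.gp_name
                    from sys_gp_detail sgd
                    left join sys_gp sg on sg.id = sgd.gp_id
                   group by sgd.detail_id, sg.gp_name)
           group by detail_id) a,
         (select detail_id,
                 listagg(user_name, ',') within group(order by null) usernames
            from (select sgd.detail_id, su.user_name
                    from sgd
                    left join sg on sg.id = sgd.gp_id
                    left join sug on sg.id = sug.gp_id
                    left join su on sug.user_id = su.id
                   group by sgd.detail_id, su.user_name)
           group by detail_id) b
   where a.detail_id = b.detail_id)
-- Outer query: unchanged from the original statement.
select zh.id,
       zh.id detailid,
       zh.name detailname,
       zh.p_level hospitallevel,
       zh.type hospitaltype,
       dza.name region,
       temp.groupnames,
       temp.usernames,
       (case
         when gd.gp_id is null then
          0
         else
          1
       end) isalloted
  from zh
  left join dza
    on zh.area_id = dza.id
  left join temp
    on zh.id = temp.id
  left join (select gp_id, detail_id from sys_gp_detail where gp_Id = :0) gd
    on zh.id = gd.detail_id
 order by length(id), zh.id asc
```

A reasonable sanity check is that it still returns 20,779 rows and that groupnames and usernames hold the same values as before for a few sampled ids. The order of names inside each list can differ, and the columns are now VARCHAR2 instead of CLOB.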
Closing note: 道森学院 recommends that developers use listagg instead of wmsys.wm_concat wherever possible in future development.
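If you are on Oracle 19c or later, the manual deduplication step should no longer be needed, because LISTAGG accepts DISTINCT from that release on. The sketch below assumes such a release and reuses the table names from the first rewrite. It was not part of the original tuning session.

```sql
-- Assumes Oracle 19c or later, where LISTAGG accepts DISTINCT.
select sgd.detail_id,
       listagg(distinct sg.gp_name, ',')
         within group(order by sg.gp_name) groupnames
  from sys_gp_detail sgd
  left join sys_gp sg on sg.id = sgd.gp_id
 group by sgd.detail_id;
```

Keep in mind that LISTAGG returns a VARCHAR2, so a group whose concatenated names exceed the maximum string length raises ORA-01489, which the CLOB-returning wm_concat did not. From Oracle 12.2 on, the ON OVERFLOW TRUNCATE clause of LISTAGG is one way to handle that case.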