Optimizing a Join on A.COLUMN LIKE B.COLUMN%
Posted by robinson1988
Here is a SQL statement that takes about 10 seconds to run:
SQL> select a0.id,
2 a1.room_no,
3 a1.user_name,
4 a1.user_no,
5 row_number() over(partition by a0.id order by a1.room_enter_time desc) as fn
6 from vid_attachment a0
7 inner join vid_room_log a1
8 on a0.file_name like a1.room_md5 || '%'
9 where a0.room_no is null
10 and a1.room_md5 is not null;
no rows selected
Elapsed: 00:00:10.53
Execution Plan
----------------------------------------------------------
Plan hash value: 374412539
----------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 728K| 146M| | 116K (1)| 00:23:16 |
| 1 | WINDOW SORT | | 728K| 146M| 162M| 116K (1)| 00:23:16 |
| 2 | NESTED LOOPS | | 728K| 146M| | 82835 (1)| 00:16:35 |
|* 3 | TABLE ACCESS FULL| VID_ATTACHMENT | 592 | 74000 | | 384 (1)| 00:00:05 |
|* 4 | TABLE ACCESS FULL| VID_ROOM_LOG | 1231 | 103K| | 139 (0)| 00:00:02 |
----------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
3 - filter("A0"."ROOM_NO" IS NULL)
4 - filter("A1"."ROOM_MD5" IS NOT NULL AND "A0"."FILE_NAME" LIKE
"A1"."ROOM_MD5"||'%')
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
305333 consistent gets
1320 physical reads
0 redo size
524 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
0 rows processed
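The join condition between the two tables in this SQL is a0.file_name like a1.room_md5 || '%'.
Variable-length fuzzy matches such as LIKE, INSTR and SUBSTR are not equality predicates, so the join can only use nested loops (NL); a hash join is not possible.
In the execution plan, Id=3 (VID_ATTACHMENT) is left with 30091 rows after filtering: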
SQL> select count(*) from VID_ATTACHMENT where room_no is not null;
COUNT(*)
----------
30091
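VID_ROOM_LOG is the driven (inner) table of the nested loops join and it is read with a full table scan, so it gets scanned 30091 times. That is why the SQL takes 10 seconds.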
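To make the cost of that plan concrete, here is a minimal Python sketch of a nested loops join on a prefix match. It is only a model of the idea, not what Oracle executes internally; the sample strings and the function name nested_loop_like_join are made up for illustration.

```python
# Hypothetical sample data; these are not rows from the real tables.
attachments = ["abc123_video.mp4", "ffee00_audio.mp3", "zzz999_doc.pdf"]
room_logs = ["abc123", "ffee00", "0a0a0a", "abc1"]


def nested_loop_like_join(outer_rows, inner_rows):
    """Join on outer LIKE inner || '%' the way a nested loops join would."""
    matches = []
    inner_scans = 0
    comparisons = 0
    for file_name in outer_rows:
        inner_scans += 1  # one full pass over the inner table per outer row
        for room_md5 in inner_rows:
            comparisons += 1
            if file_name.startswith(room_md5):  # file_name LIKE room_md5 || '%'
                matches.append((file_name, room_md5))
    return matches, inner_scans, comparisons


matches, inner_scans, comparisons = nested_loop_like_join(attachments, room_logs)
print(matches)
print(inner_scans, comparisons)
```

The number of inner scans equals the number of outer rows, and the number of comparisons is the product of the two row counts, which is what makes a full scan on the driven table so expensive.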
Now rewrite the SQL into an equivalent form:
SQL> select a0.id,
2 a1.room_no,
3 a1.user_name,
4 a1.user_no,
5 row_number() over(partition by a0.id order by a1.room_enter_time desc) as fn
6 from (select a.*, b.min_len
7 from vid_attachment a,
8 (select min(length(room_md5)) min_len from vid_room_log) b) a0
9 inner join (select a.*, min(length(room_md5)) over() min_len
10 from vid_room_log a) a1
11 on a0.file_name like a1.room_md5 || '%'
12 and substr(a0.file_name, 1, a0.min_len) =
13 substr(a1.room_md5, 1, a1.min_len)
14 where a0.room_no is null
15 and a1.room_md5 is not null;
no rows selected
Elapsed: 00:00:00.07
Execution Plan
----------------------------------------------------------
Plan hash value: 413666598
----------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time |
----------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7288 | 2142K| | 1053 (1)| 00:00:13 |
| 1 | WINDOW SORT | | 7288 | 2142K| 2344K| 1053 (1)| 00:00:13 |
|* 2 | HASH JOIN | | 7288 | 2142K| | 577 (1)| 00:00:07 |
| 3 | NESTED LOOPS | | 592 | 81696 | | 435 (1)| 00:00:06 |
| 4 | VIEW | | 1 | 13 | | 51 (0)| 00:00:01 |
| 5 | SORT AGGREGATE | | 1 | 39 | | | |
| 6 | INDEX FAST FULL SCAN| IDX_ROOMMD5 | 24623 | 937K| | 51 (0)| 00:00:01 |
|* 7 | TABLE ACCESS FULL | VID_ATTACHMENT | 592 | 74000 | | 384 (1)| 00:00:05 |
|* 8 | VIEW | | 24623 | 3919K| | 141 (1)| 00:00:02 |
| 9 | WINDOW BUFFER | | 24623 | 2067K| | 141 (1)| 00:00:02 |
| 10 | TABLE ACCESS FULL | VID_ROOM_LOG | 24623 | 2067K| | 141 (1)| 00:00:02 |
----------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access(SUBSTR("A"."FILE_NAME",1,INTERNAL_FUNCTION("B"."MIN_LEN"))=SUBSTR("A1"."ROOM_M
D5",1,INTERNAL_FUNCTION("A1"."MIN_LEN")))
filter("A"."FILE_NAME" LIKE "A1"."ROOM_MD5"||'%')
7 - filter("A"."ROOM_NO" IS NULL)
8 - filter("A1"."ROOM_MD5" IS NOT NULL)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2017 consistent gets
0 physical reads
0 redo size
524 bytes sent via SQL*Net to client
405 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
0 rows processed
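On top of the original join condition a0.file_name like a1.room_md5 || '%', we add
substr(a0.file_name, 1, a0.min_len) = substr(a1.room_md5, 1, a1.min_len)
Because min_len is the length of the shortest room_md5, any file_name that starts with room_md5 must also agree with it on the first min_len characters, so the extra predicate does not change the result. It is an equality, which lets the two tables be joined with a hash join, and the SQL now finishes almost instantly.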
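The same idea can be sketched in Python: hash one side on its first min_len characters, probe with the first min_len characters of the other side, and keep the original prefix check as a filter. This is an illustrative model under the assumption that all values are non-null strings; the function names and sample data are hypothetical and not taken from the original tables.

```python
from collections import defaultdict


def naive_like_join(file_names, room_md5s):
    """Reference result: every pair where file_name LIKE room_md5 || '%'."""
    return [(f, r) for f in file_names for r in room_md5s if f.startswith(r)]


def prefix_hash_join(file_names, room_md5s):
    """Hash join on the first min_len characters, then filter with the LIKE."""
    if not room_md5s:
        return []
    # min(length(room_md5)) in the rewritten SQL
    min_len = min(len(r) for r in room_md5s)

    # Build phase: bucket room_md5 values by substr(room_md5, 1, min_len).
    buckets = defaultdict(list)
    for r in room_md5s:
        buckets[r[:min_len]].append(r)

    # Probe phase: look up substr(file_name, 1, min_len), then apply the LIKE.
    result = []
    for f in file_names:
        for r in buckets.get(f[:min_len], []):
            if f.startswith(r):
                result.append((f, r))
    return result


# Hypothetical sample data.
files = ["abc123_video.mp4", "abd999_audio.mp3", "ab", "xyz000.txt"]
rooms = ["abc123", "abc", "abd9", "qqq"]

hashed = prefix_hash_join(files, rooms)
naive = naive_like_join(files, rooms)
print(sorted(hashed) == sorted(naive))
```

Each probe only touches the bucket that shares its prefix, so the work no longer grows with the product of the two row counts. How much it helps depends on how selective the first min_len characters are; if the shortest room_md5 is very short, most rows land in a few buckets.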
If the SQL were instead:
select a0.id,
a1.room_no,
a1.user_name,
a1.user_no,
row_number() over(partition by a0.id order by a1.room_enter_time desc) as fn
from vid_attachment a0
inner join vid_room_log a1
on a0.file_name like '%' || a1.room_md5 || '%'
where a0.room_no is null
and a1.room_md5 is not null;
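then there is no fix and the SQL cannot be optimized this way: room_md5 may appear at any position inside file_name, so no equality on a fixed-position substring can be derived from the condition.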
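A small counterexample shows why the prefix trick does not carry over to a contains-style match. The values below are made up for illustration.

```python
# Hypothetical values: room_md5 occurs in the middle of file_name.
file_name = "2020_abc123_video.mp4"
room_md5 = "abc123"
min_len = len(room_md5)

# file_name LIKE '%' || room_md5 || '%'
contains_match = room_md5 in file_name

# substr(file_name, 1, min_len) = substr(room_md5, 1, min_len)
prefix_keys_equal = file_name[:min_len] == room_md5[:min_len]

print(contains_match, prefix_keys_equal)
```

The pair satisfies the LIKE but fails the prefix equality, so adding that equality to the join would silently drop a valid row.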
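One last point: relational databases are fundamentally built for joining on equality, not for fuzzy joins, so fuzzy joins should be ruled out when the tables are designed.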