Oracle Developer Performance Course, Lesson 4 (How to Create Indexes): Lab
Posted dingdingfish
Overview
This lab follows the lab guide in DevGym.
Setting Up the Environment
First create the bricks table, then gather statistics at the end:
create table bricks (
brick_id integer not null,
colour varchar2(10) not null,
shape varchar2(10) not null,
weight integer not null,
colour_mixed_case varchar2(10) not null,
insert_datetime date not null,
junk varchar2(1000) not null
);
exec dbms_random.seed ( 0 );
insert into bricks
with rws as (
select level x,
case ceil ( level / 250 )
when 4 then 'red'
when 1 then 'blue'
when 2 then 'green'
when 3 then 'yellow'
end colour
from dual
connect by level <= 1000
)
select rownum,
colour,
case mod ( rownum, 4 )
when 0 then 'cube'
when 1 then 'cylinder'
when 2 then 'pyramid'
when 3 then 'prism'
end shape,
round ( dbms_random.value ( 1, 10 ) ),
case mod ( rownum, 3 )
when 0 then upper ( colour )
when 1 then lower ( colour )
when 2 then initcap ( colour )
end mixed_case,
date'2020-01-01' + ( rownum / 12 ),
rpad ( chr ( mod ( rownum, 26 ) + 65 ), 1000, 'x' )
from rws;
commit;
exec dbms_stats.gather_table_stats ( null, 'bricks' ) ;
Check the data; the bricks table has 1000 rows:
SQL> select count(*) from bricks;
COUNT(*)
___________
1000
SQL>
SELECT
brick_id,
colour,
shape,
weight,
colour_mixed_case,
insert_datetime
FROM
bricks where rownum <= 9;
BRICK_ID COLOUR SHAPE WEIGHT COLOUR_MIXED_CASE INSERT_DATETIME
___________ _________ ___________ _________ ____________________ __________________
13 blue cylinder 5 blue 02-JAN-20
14 blue pyramid 8 Blue 02-JAN-20
15 blue prism 8 BLUE 02-JAN-20
16 blue cube 5 blue 02-JAN-20
17 blue cylinder 4 Blue 02-JAN-20
18 blue pyramid 3 BLUE 02-JAN-20
19 blue prism 9 blue 02-JAN-20
20 blue cube 2 Blue 02-JAN-20
21 blue cylinder 8 BLUE 02-JAN-20
9 rows selected.
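The junk column is 1000 characters long: a single letter derived from rownum, right-padded with 'x'. It is not shown in the result set.
colour takes one of 4 colours, assigned in blocks of 250 rows; shape cycles through 4 shapes; weight is a random integer from 1 to 10. colour_mixed_case holds the colour in one of 3 case variants (upper, lower, initcap).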
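The distribution described above can be checked directly. The following is a minimal sketch that uses only the bricks table created earlier; the exact weight counts depend on dbms_random, so no output is shown.

```sql
-- Rows per colour (the insert assigns colours in blocks of 250 rows)
select colour, count(*)
from   bricks
group  by colour
order  by colour;

-- Rows per shape (shape is derived from mod(rownum, 4))
select shape, count(*)
from   bricks
group  by shape
order  by shape;

-- Range of the random weight values
select min(weight), max(weight)
from   bricks;

-- Case variants stored for each colour
select colour, colour_mixed_case, count(*)
from   bricks
group  by colour, colour_mixed_case
order  by colour, colour_mixed_case;
```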
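Introduction
The most common data access methods in a database are:
- Full table scan
- Index access
This article compares the two.
Full Table Scan
Because the table has no indexes yet, every SQL statement against it uses a full table scan. This shows up as TABLE ACCESS FULL in the plan, together with a filter entry in the Predicate Information section: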
SQL>
select /*+ gather_plan_statistics */count(*) from bricks where colour = 'red';
COUNT(*)
___________
250
SQL>
select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
__________________________________________________________________________________________
SQL_ID d65kqrh09hm1g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */count(*) from bricks where colour
= 'red'
Plan hash value: 1774413877
---------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 190 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 190 |
|* 2 | TABLE ACCESS FULL| BRICKS | 1 | 250 | 250 |00:00:00.01 | 190 |
---------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("COLOUR"='red')
20 rows selected.
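This means the database reads every row and every data block in the table (up to the high water mark), then applies the WHERE clause and returns the rows for which the condition is true. For a search that returns only a few rows, inspecting every row is clearly a huge waste.
Creating an index on the columns in the query's WHERE clause lets the database read only the rows that match the search condition. This reduces the work needed to execute the query.
For how a full table scan works, see here.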
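The plans in this article come from running the query with the gather_plan_statistics hint and then calling dbms_xplan.display_cursor, which reports actual rows and buffers. To preview the plan the optimizer would choose without running the query, EXPLAIN PLAN can be used instead. A minimal sketch, assuming the default plan table is available:

```sql
-- Ask the optimizer for its plan without executing the query
explain plan for
  select count(*)
  from   bricks
  where  colour = 'red';

-- Show the plan that was just explained
select * from table(dbms_xplan.display);
```

This output contains estimates only; columns such as A-Rows and Buffers appear only in display_cursor output for a statement that has actually run.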
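Creating an Index
A database index stores the values of the indexed columns, with pointers to the table rows that hold those values. The standard index in Oracle Database is a B-tree index. With it, the database can search the index for entries matching the WHERE clause, which can greatly reduce the work required.
Creating an index needs 3 things:
- The index name
- The table to index
- A comma-separated list of the columns to index
For example: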
SQL> create index brick_colour_i on bricks ( colour );
Index BRICK_COLOUR_I created.
Run the query again:
SQL> select /*+ gather_plan_statistics */count(*) from bricks where colour = 'red';
COUNT(*)
___________
250
SQL> select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
__________________________________________________________________________________________________________
SQL_ID d65kqrh09hm1g, child number 0
-------------------------------------
select /*+ gather_plan_statistics */count(*) from bricks where colour
= 'red'
Plan hash value: 2801761771
-------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads |
-------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.01 | 2 | 4 |
| 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 2 | 4 |
|* 2 | INDEX RANGE SCAN| BRICK_COLOUR_I | 1 | 250 | 250 |00:00:00.01 | 2 | 4 |
-------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLOUR"='red')
20 rows selected.
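The INDEX RANGE SCAN in the execution plan and the access entry in Predicate Information show that the index was used this time. Buffers dropped from 190 to 2, a huge saving.
Viewing Indexes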
SQL>
select * from user_indexes where table_name = 'BRICKS';
SQL>
select index_name, column_name, column_position
from user_ind_columns
where table_name = 'BRICKS'
order by index_name, column_position;
INDEX_NAME COLUMN_NAME COLUMN_POSITION
_________________ ______________ __________________
BRICK_COLOUR_I COLOUR 1
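Because select * from user_indexes returns a very wide row, it can help to pick out a few columns. The sketch below lists some of the standard USER_INDEXES columns; the statistics columns (blevel, leaf_blocks, distinct_keys, clustering_factor) are only populated once statistics exist for the index.

```sql
select index_name,
       index_type,         -- e.g. NORMAL for a B-tree index
       uniqueness,         -- UNIQUE or NONUNIQUE
       status,
       blevel,             -- depth of the B-tree from root to leaf blocks
       leaf_blocks,
       distinct_keys,
       clustering_factor
from   user_indexes
where  table_name = 'BRICKS'
order  by index_name;
```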
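Composite Indexes
A composite index is an index on multiple columns.
So far we have only a single-column index, yet it still helps the two-column query below (Buffers is 46, compared with 190 for the full table scan). The database first uses the index to find the 250 red rows (access), then checks weight on each of those table rows (filter).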
select /*+ gather_plan_statistics */*
from bricks
where colour = 'red'
and weight = 1;
select *
from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
___________________________________________________________________________________________________________________
SQL_ID 5c4qmbrdjdnj9, child number 0
-------------------------------------
select /*+ gather_plan_statistics */* from bricks where colour =
'red' and weight = 1
Plan hash value: 2278089145
----------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 46 |
|* 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BRICKS | 1 | 25 | 14 |00:00:00.01 | 46 |
|* 2 | INDEX RANGE SCAN | BRICK_COLOUR_I | 1 | 250 | 250 |00:00:00.01 | 3 |
----------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("WEIGHT"=1)
2 - access("COLOUR"='red')
21 rows selected.
Clearly, a composite index can improve efficiency further.
create index brick_colour_weight_i on bricks ( colour, weight ) ;
Run the query again; Buffers drops from 46 to 14:
select /*+ gather_plan_statistics indexed */ *
from bricks
where colour = 'red'
and weight = 1;
select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
__________________________________________________________________________________________________________________________
SQL_ID 2v44h4xyb20hn, child number 0
-------------------------------------
select /*+ gather_plan_statistics indexed */* from bricks where
colour = 'red' and weight = 1
Plan hash value: 338984659
-----------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
-----------------------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 14 |00:00:00.01 | 14 |
| 1 | TABLE ACCESS BY INDEX ROWID BATCHED| BRICKS | 1 | 25 | 14 |00:00:00.01 | 14 |
|* 2 | INDEX RANGE SCAN | BRICK_COLOUR_WEIGHT_I | 1 | 25 | 14 |00:00:00.01 | 3 |
-----------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLOUR"='red' AND "WEIGHT"=1)
20 rows selected.
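Index-Only Scans
Every column referenced by the SQL below is in the composite index, so the database answers the query from the index alone, without accessing the table: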
select /*+ gather_plan_statistics */weight, count(*)
from bricks
where colour = 'red'
group by weight;
select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
___________________________________________________________________________________________________________
SQL_ID 27z1m28bucygk, child number 0
-------------------------------------
select /*+ gather_plan_statistics */weight, count(*) from bricks
where colour = 'red' group by weight
Plan hash value: 2875908788
--------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
--------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 10 |00:00:00.01 | 3 |
| 1 | SORT GROUP BY NOSORT| | 1 | 10 | 10 |00:00:00.01 | 3 |
|* 2 | INDEX RANGE SCAN | BRICK_COLOUR_WEIGHT_I | 1 | 250 | 250 |00:00:00.01 | 3 |
--------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("COLOUR"='red')
20 rows selected.
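Indexes and NULLs
A B-tree index does not store entries in which every indexed column is NULL. For a composite index to be guaranteed to cover every row, at least one of its columns must therefore be non-null.
For example, change the table definition so that both columns allow NULLs: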
alter table bricks
modify ( colour null, weight null );
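A row with NULL in both colour and weight would not appear in the index. Once both columns are nullable, a query that has to account for such rows (for example, one that groups by weight with no filter on colour) can no longer be answered from the index alone.
We will not try it here; the following restores the original table definition: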
alter table bricks modify weight not null;
alter table bricks modify colour not null;
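For readers who do want to try it, here is a minimal sketch. It assumes the table and the brick_colour_weight_i index from the earlier steps. The query deliberately has no predicate on colour, so nothing rules out rows where both indexed columns are NULL; the plan you see may differ by version and statistics.

```sql
-- Allow NULLs in both indexed columns
alter table bricks
  modify ( colour null, weight null );

-- No predicate excludes rows with colour and weight both NULL,
-- so the index cannot be assumed to contain every row
select /*+ gather_plan_statistics */weight, count(*)
from   bricks
group  by weight;

select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));

-- Restore the original definition
alter table bricks modify weight not null;
alter table bricks modify colour not null;
```

Compare the Operation column in the two states: with the NOT NULL constraints in place the optimizer is free to read only the index, while with nullable columns it has to get the rows some other way, typically a full table scan.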
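Column Order in Composite Indexes
The order of the columns in an index has a big effect on how effective it is, and on whether the optimizer can use it at all!
The database searches the columns in an index from left to right. For the best results, use the index's leading columns in the WHERE clause.
For an index on columns (A, B, C), the WHERE clause can search the index efficiently when it filters on:
- A, B, C
- A, B
- A
The example below shows what happens when the SQL does not filter on the leading column. The database cannot search the index; it reads the whole index instead (INDEX FAST FULL SCAN) and applies the condition as a filter, as shown in Predicate Information: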
select /*+ gather_plan_statistics */colour, count(*)
from bricks
where weight = 1
group by colour;
select *
from table(dbms_xplan.display_cursor(format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
____________________________________________________________________________________________________________
SQL_ID 60u3dax0arpc7, child number 0
-------------------------------------
select /*+ gather_plan_statistics */colour, count(*) from bricks
where weight = 1 group by colour
Plan hash value: 4237521759
---------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
---------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | | 4 |00:00:00.01 | 7 |
| 1 | HASH GROUP BY | | 1 | 4 | 4 |00:00:00.01 | 7 |
|* 2 | INDEX FAST FULL SCAN| BRICK_COLOUR_WEIGHT_I | 1 | 100 | 72 |00:00:00.01 | 7 |
---------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("WEIGHT"=1)
20 rows selected.
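To see the effect of column order, the same query can be rerun against an index that leads with weight. This is a sketch only: the index name brick_weight_colour_i is made up for the illustration, and the optimizer decides whether to use the index. The index is dropped at the end to avoid leaving an extra one behind.

```sql
-- Hypothetical index with weight as the leading column
create index brick_weight_colour_i on bricks ( weight, colour );

select /*+ gather_plan_statistics */colour, count(*)
from   bricks
where  weight = 1
group  by colour;

select * from table(dbms_xplan.display_cursor( format => 'IOSTATS LAST'));

-- Clean up the extra index
drop index brick_weight_colour_i;
```

If the optimizer picks the new index, Predicate Information should show weight = 1 as an access predicate rather than a filter.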
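Although an index can be the fastest way to read data, creating one index per query leads to an explosion in the number of indexes. This increases storage requirements and makes it harder for the optimizer to choose the best index for each query.
Keep the number of indexes you create to a minimum. Reserve indexes for critical queries!
Unique Indexes
Creating a unique index prevents duplicate values: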
create unique index brick_brick_id_u
on bricks ( brick_id );
insert into bricks
values ( 1, 'red', 'cylinder', 1, 'RED', sysdate, 'stuff' );
Error starting at line : 1 in command -
insert into bricks
values ( 1, 'red', 'cylinder', 1, 'RED', sysdate, 'stuff' )
Error report -
ORA-00001: unique constraint (SSB.BRICK_BRICK_ID_U) violated
When a query uses equality conditions on all the indexed columns, the optimizer can also use a unique index to perform an INDEX UNIQUE SCAN, for example:
select /*+ gather_plan_statistics brick_id */* from bricks where brick_id = 1;
select * from table(dbms_xplan.display_cursor(format => 'IOSTATS LAST'));
PLAN_TABLE_OUTPUT
_____________________________________________________________________________________________________________
SQL_ID fabm3y8smx39f, child number 0
-------------------------------------
select /*+ gather_plan_statistics brick_id */* from bricks where
brick_id = 1
Plan hash value: 4143599005
----------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers |
----------------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | |