Oracle Avoid Wasteful Join Back?

[Posted]: 2021-09-17 13:54:55

[Question]:

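Suppose we have three tables (A, B, and C), defined in the contrived example below, where A and B are related to C (C holds foreign keys to both). Suppose we want values from all three tables, with predicates on A and B. Oracle can only join two row sets at a time, so we get a join order like ((A -> C) -> B). This means we spend I/O fetching rows from C that we end up discarding when we join to B (and apply B's predicate).

How can we avoid this "wasted" I/O on table C?

Star transformations are great, but they only kick in when the optimizer decides the cost justifies one. In other words, we are not guaranteed to get a star transformation. That may sound like exactly what one would want, but the optimizer is working from poor row estimates (see the example below, which is off by a factor of 10). As a result, the optimizer chooses not to use a star transformation in a case where it would have been beneficial.

We cannot hand-write the query in star-transformation form, because the SQL is generated by a BI reporting tool.

Perhaps my question is how to "force" the optimizer to use a star transformation without hand-writing the query in that form. Or perhaps my question is how to get better row estimates, so that we can be more confident the optimizer will invoke a star transformation. Or perhaps (quite likely) there is some other cool Oracle feature I don't know about yet that offers a solution.

Oracle 12.1 Enterprise Edition (but upgrading to 19.1 in a few months). Thanks in advance.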

drop table join_back_c;
drop table join_back_a;
drop table join_back_b;

create table join_back_a
  as
  with "D" as (select 1 from dual connect by level <= 1000)
    select rownum                  a_code
           , rpad('x',100)         a_name
      from "D"
;

create unique index IX_join_back_a_code on join_back_a(a_code); 
alter table join_back_a add constraint PK_dan_join_back_a primary key (a_code);

create table join_back_b
  as
  with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
    select  rownum                b_id
          , mod(rownum, 10)       b_group
    from "D", "D"
   where rownum <= 100000 --100k
;

create unique index IX_join_back_b_id on join_back_b(b_id);   
create index IX_join_back_b_group on join_back_b(b_group); 
alter table join_back_b add constraint PK_dan_join_back_b primary key (b_id);

create table join_back_c
  as
  with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
    select  rownum                              c_id
          , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
          , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
    from "D", "D"
   where rownum <= 1000000 -- 1M
;

create index IR_join_back_c_a_code on join_back_c(a_code);
create index IR_join_back_c_b_id on join_back_c(b_id);  

exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_C');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_A');
exec dbms_stats.gather_table_stats('DATA','JOIN_BACK_B');
select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  HASH JOIN                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   NESTED LOOPS                 |                       |       |       |            |          |
|   3 |    NESTED LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   TABLE ACCESS FULL            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)
select count(*) 
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;  -- about 100 rows

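Join order: ((A -> C) -> B)

The estimated row count for A -> C (step 3) is accurate at about 1k.

The estimate for step 8 is also accurate.

However, the join to B (step 1) only reduces the 1k row set from step 3 further. In this case, B's predicate cuts the (A -> C) row set to 1/10 of its size. That means we accessed 1,000 rows from C only to throw away 900 of them.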
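The gap between the step 1 estimate (1001 rows) and the roughly 100 rows actually returned can be checked by collecting row-source statistics at run time. This is a minimal sketch, assuming the session has access to the cursor statistics views that DBMS_XPLAN.DISPLAY_CURSOR reads, and that SERVEROUTPUT is off in SQL*Plus (otherwise the "last" cursor is not the query of interest).

```sql
-- Run the original query with run-time row-source statistics collected.
select /*+ gather_plan_statistics */ *
  from join_back_a a
       join join_back_c c
         on a.a_code = c.a_code
       join join_back_b b
         on b.b_id = c.b_id
 where a.a_code = 1
   and b.b_group = 1;

-- Show estimated rows (E-Rows) next to actual rows (A-Rows)
-- for the last statement executed in this session.
select *
  from table(dbms_xplan.display_cursor(format => 'ALLSTATS LAST'));
```

Comparing E-Rows with A-Rows line by line shows which plan step the misestimate comes from, which is more direct than inferring it from the final row count.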

select /*+ star_transformation */
       *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1
;
--------------------------------------------------------------------------------------------------------
| Id  | Operation                      | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
--------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT               |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|*  1 |  HASH JOIN                     |                       |  1001 |   124K|   983   (2)| 00:00:01 |
|   2 |   NESTED LOOPS                 |                       |       |       |            |          |
|   3 |    NESTED LOOPS                |                       |  1001 |   116K|   839   (1)| 00:00:01 |
|   4 |     TABLE ACCESS BY INDEX ROWID| JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|*  5 |      INDEX RANGE SCAN          | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|*  6 |     INDEX RANGE SCAN           | IR_JOIN_BACK_C_A_CODE |  1001 |       |     4   (0)| 00:00:01 |
|   7 |    TABLE ACCESS BY INDEX ROWID | JOIN_BACK_C           |  1001 | 14014 |   837   (1)| 00:00:01 |
|*  8 |   TABLE ACCESS FULL            | JOIN_BACK_B           | 10000 | 80000 |   143   (5)| 00:00:01 |
--------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("B"."B_ID"="C"."B_ID")
   5 - access("A"."A_CODE"=1)
   6 - access("C"."A_CODE"=1)
   8 - filter("B"."B_GROUP"=1)

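I am looking for an execution path similar to the one below. Although the plan estimates about 10M rows, the actual row count for this query stays at around 100. However, we cannot control the generated SQL to this degree. This is what I meant above by hand-writing the query in star-transformation form.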

select *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where C.rowid in ( select C1.rowid 
                      from join_back_C "C1"
                           join join_back_a "A1"
                                on C1.a_code = A1.a_code
                     where A1.a_code = 1
                    intersect
                    select C2.rowid 
                      from join_back_C "C2"
                           join join_back_b "B1"
                                on C2.b_id = B1.b_id
                     where B1.b_group = 1                  
                  )
;
---------------------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name                  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
---------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|*  1 |  HASH JOIN                    |                       |  9928K|  1316M|       |  4649  (17)| 00:00:01 |
|   2 |   TABLE ACCESS FULL           | JOIN_BACK_A           |  1000 |   102K|       |    16   (0)| 00:00:01 |
|*  3 |   HASH JOIN                   |                       |  9928K|   321M|       |  4320  (11)| 00:00:01 |
|   4 |    TABLE ACCESS FULL          | JOIN_BACK_B           |   100K|   781K|       |   142   (5)| 00:00:01 |
|   5 |    NESTED LOOPS               |                       |    10M|   248M|       |  3858   (3)| 00:00:01 |
|   6 |     VIEW                      | VW_NSO_1              |  1001 | 12012 |       |  2855   (4)| 00:00:01 |
|   7 |      INTERSECTION             |                       |       |       |       |            |          |
|   8 |       SORT UNIQUE             |                       |  1001 | 18018 |       |            |          |
|   9 |        NESTED LOOPS           |                       |  1001 | 18018 |       |     5   (0)| 00:00:01 |
|* 10 |         INDEX RANGE SCAN      | IX_JOIN_BACK_A_CODE   |     1 |     4 |       |     1   (0)| 00:00:01 |
|* 11 |         INDEX RANGE SCAN      | IR_JOIN_BACK_C_A_CODE |  1001 | 14014 |       |     4   (0)| 00:00:01 |
|  12 |       SORT UNIQUE             |                       | 99191 |  2131K|  3120K|            |          |
|* 13 |        HASH JOIN              |                       | 99191 |  2131K|       |  1789   (5)| 00:00:01 |
|* 14 |         TABLE ACCESS FULL     | JOIN_BACK_B           | 10000 | 80000 |       |   143   (5)| 00:00:01 |
|  15 |         INDEX FAST FULL SCAN  | IR_JOIN_BACK_C_B_ID   |  1000K|    13M|       |  1614   (3)| 00:00:01 |
|  16 |     TABLE ACCESS BY USER ROWID| JOIN_BACK_C           | 10000 |   136K|       |     1   (0)| 00:00:01 |
---------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("A"."A_CODE"="C"."A_CODE")
   3 - access("B"."B_ID"="C"."B_ID")
  10 - access("A1"."A_CODE"=1)
  11 - access("C1"."A_CODE"=1)
  13 - access("C2"."B_ID"="B1"."B_ID")
  14 - filter("B1"."B_GROUP"=1)

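I tried converting the two foreign key indexes on table C to bitmap indexes: no luck. I also tried a composite index on C(a_code, b_id): again, no luck. Besides, a composite index is not desirable, because our real table C has many foreign keys (some surrogate keys and some natural keys).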

[Answer 1]:

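Star transformations seem to have a Goldilocks zone for predicate selectivity, and your predicates are either too selective or not selective enough.

According to the Data Warehousing Guide, section 4.5.2.5, "How Oracle Chooses to Use Star Transformation":

"If the query requires accessing a large percentage of the rows in the fact table, it might be better to use a full table scan and not use the transformations. However, if the constraining predicates on the dimension tables are sufficiently selective that only a small portion of the fact table must be retrieved, the plan based on the transformation will probably be superior."

The predicate a.a_code = 1 is an equality condition on a primary key. Reading a unique index is almost always the fastest possible operation, and Oracle will choose that path whenever it can. The predicate b.b_group = 1, on the other hand, selects 10% of the rows, which is full-table-scan territory and not an operation you want to run repeatedly in a subquery.

In your example, when I comment out the unique index and the primary key: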
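The two selectivities discussed above can be measured directly against the test tables. This is only a sketch using the table and column names from the question; it shows what fraction of each table a predicate touches, not what the optimizer estimates.

```sql
-- Fraction of JOIN_BACK_B rows that satisfy b_group = 1.
select count(*)                                  total_rows
     , count(case when b_group = 1 then 1 end)   matching_rows
     , round(100 * count(case when b_group = 1 then 1 end) / count(*), 2) pct
  from join_back_b;

-- Fraction of JOIN_BACK_C (the fact table) reachable through a_code = 1.
select count(*)                                  total_rows
     , count(case when a_code = 1 then 1 end)    matching_rows
     , round(100 * count(case when a_code = 1 then 1 end) / count(*), 2) pct
  from join_back_c;
```

With the original mod(rownum, 10) definition, the first query should report 10% for JOIN_BACK_B, which is the figure the paragraph above refers to.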

--create unique index IX_join_back_a_code on join_back_a(a_code); 
--alter table join_back_a add constraint PK_dan_join_back_a primary key (a_code);

and change the 10% selectivity:

      , mod(rownum, 10)       b_group

to 0.1% selectivity:

      , mod(rownum, 1000)       b_group

I can get a star transformation on my 19c database:

alter session set star_transformation_enabled=true;

explain plan for
select /*+ star_transformation */ *
  from join_back_a "A"
       join join_back_c "C"
         on A.a_code = C.a_code
       join join_back_b "B"
         on B.b_id = C.b_id
where a.a_code = 1
      and b.b_group = 1;

select * from table(dbms_xplan.display);
Plan hash value: 3923125903

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                                | Name                       | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                         |                            |     1 |   153 |   126   (1)| 00:00:01 |
|   1 |  TEMP TABLE TRANSFORMATION               |                            |       |       |            |          |
|   2 |   LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9D66A0_377EE48 |       |       |            |          |
|   3 |    TABLE ACCESS BY INDEX ROWID BATCHED   | JOIN_BACK_B                |   100 |   900 |   101   (0)| 00:00:01 |
|*  4 |     INDEX RANGE SCAN                     | IX_JOIN_BACK_B_GROUP       |   100 |       |     1   (0)| 00:00:01 |
|*  5 |   HASH JOIN                              |                            |     1 |   153 |    25   (4)| 00:00:01 |
|*  6 |    VIEW                                  | VW_ST_D5F377AC             |     1 |    39 |    16   (7)| 00:00:01 |
|   7 |     NESTED LOOPS                         |                            |     1 |    28 |    14   (8)| 00:00:01 |
|   8 |      BITMAP CONVERSION TO ROWIDS         |                            |       |    13 |    14   (8)| 00:00:01 |
|   9 |       BITMAP AND                         |                            |       |       |            |          |
|  10 |        BITMAP CONVERSION FROM ROWIDS     |                            |       |       |            |          |
|* 11 |         INDEX RANGE SCAN                 | IR_JOIN_BACK_C_A_CODE      |       |       |     5   (0)| 00:00:01 |
|  12 |        BITMAP MERGE                      |                            |       |       |            |          |
|  13 |         BITMAP KEY ITERATION             |                            |       |       |            |          |
|  14 |          TABLE ACCESS FULL               | SYS_TEMP_0FD9D66A0_377EE48 |   100 |   500 |     2   (0)| 00:00:01 |
|  15 |          BITMAP CONVERSION FROM ROWIDS   |                            |       |       |            |          |
|* 16 |           INDEX RANGE SCAN               | IR_JOIN_BACK_C_B_ID        |       |       |     3   (0)| 00:00:01 |
|  17 |      TABLE ACCESS BY USER ROWID          | JOIN_BACK_C                |     1 |    14 |     2   (0)| 00:00:01 |
|  18 |    MERGE JOIN CARTESIAN                  |                            |   100 | 11400 |     9   (0)| 00:00:01 |
|* 19 |     TABLE ACCESS FULL                    | JOIN_BACK_A                |     1 |   105 |     7   (0)| 00:00:01 |
|  20 |     BUFFER SORT                          |                            |   100 |   900 |     2   (0)| 00:00:01 |
|  21 |      TABLE ACCESS FULL                   | SYS_TEMP_0FD9D66A0_377EE48 |   100 |   900 |     2   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   4 - access("B"."B_GROUP"=1)
   5 - access("C0"="ITEM_2" AND "A"."A_CODE"="ITEM_1")
   6 - filter("ITEM_1"=1)
  11 - access("C"."A_CODE"=1)
  16 - access("C"."B_ID"="C0")
  19 - filter("A"."A_CODE"=1)
 
Note
-----
   - star transformation used for this statement

I'm not sure this answer will help you improve your query, but hopefully it at least helps explain why the query isn't behaving the way you want.

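[Answer 2]:

In addition to Jon Heller's answer:

"The predicate b.b_group = 1, on the other hand, selects 10% of the rows, which is full-table-scan territory and not an operation you want to run repeatedly in a subquery."

mod(rownum, 10) b_group does not just give 10% selectivity. It also means that, in your test case, every table block contains dozens of such rows: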

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                                177
SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_b where b_group=1;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                                177

SQL> select min(cnt), max(cnt), avg(cnt)
  2  from (
  3    select dbms_rowid.rowid_block_number(rowid) block_n, count(*) cnt
  4    from join_back_b
  5    where b_group=1
  6    group by dbms_rowid.rowid_block_number(rowid)
  7  );

  MIN(CNT)   MAX(CNT)   AVG(CNT)
---------- ---------- ----------
        49         62 56.4971751

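That gives us 10,000 rows from B, and the b.b_id = c.b_id predicate in turn gives roughly 10% selectivity on JOIN_BACK_C, which also means that every block of JOIN_BACK_C contains dozens of the required rows: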

SQL> select count(distinct dbms_rowid.rowid_block_number(rowid)) from join_back_c;

COUNT(DISTINCTDBMS_ROWID.ROWID_BLOCK_NUMBER(ROWID))
---------------------------------------------------
                                               2597

select min(cnt), max(cnt), avg(cnt), count(distinct block_n)
from (
  select dbms_rowid.rowid_block_number(c.rowid) block_n, count(*) cnt 
  from join_back_b b
       join join_back_c c
       on b.b_id=c.b_id
  where b_group=1 
  group by dbms_rowid.rowid_block_number(c.rowid)
);

  MIN(CNT)   MAX(CNT)   AVG(CNT) COUNT(DISTINCTBLOCK_N)
---------- ---------- ---------- ----------------------
         8         57 38.6334232                   2597

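On top of that, join_back_c.a_code = 1 also has poor selectivity: ~1/1000 means about 1,000 rows in random blocks, while the table contains only ~2,500 blocks. So you may need to read up to 1/2.5 =~ 40% of the table's blocks. Obviously, multiblock reads are the better choice here.

But to come back to the main question: yes, I understand your problem. Sometimes it is best to split one row source across 2 different access paths, and the CBO often cannot do that. There is a standard approach for such cases: rewrite the query and repeat the row source twice, as in the example further down.

First, slightly modified test data for better selectivity and less I/O: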
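The 40% figure is an upper bound that assumes each matching row sits in its own block. The number of blocks actually touched can be counted with the same DBMS_ROWID technique used above. This sketch assumes, as the earlier queries do, that the table lives in a single data file, so block numbers alone identify blocks.

```sql
-- Blocks of JOIN_BACK_C holding at least one row with a_code = 1,
-- compared with the total number of blocks that hold rows.
select count(distinct case when a_code = 1
                           then dbms_rowid.rowid_block_number(rowid)
                      end)                                  blocks_with_match
     , count(distinct dbms_rowid.rowid_block_number(rowid)) total_blocks
  from join_back_c;
```

If blocks_with_match is a large share of total_blocks, index-driven single-block reads for this predicate will visit much of the table anyway.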

create table join_back_b
  as
  with "D" as (select /*+ materialize */ 1 from dual connect by level <= 320)
    select  rownum                b_id
          , mod(rownum, 1000)     b_group
    from "D", "D"
   where rownum <= 100000 --100k
   order by b_group
;

and with padding added (to make the rows larger):

create table join_back_c
  as
  with "D" as (select /*+ materialize */ level from dual connect by level <= 3200)
    select  rownum                              c_id
          , trunc(dbms_random.value(1, 1000))   a_code     --table a FK
          , trunc(dbms_random.value(1, 100000)) b_id       --table b FK
          , rpad('x',100,'x') padding
    from "D", "D"
   where rownum <= 1000000 -- 1M
;

Example:

with
 ac as (
  select c.rowid rid
        ,a.*
  from join_back_a A
       join join_back_c C
         on A.a_code = C.a_code
  where a.a_code = 1
 )
,bc as (
  select c.rowid rid
        ,b.*
  from join_back_b B
       join join_back_c C
         on b.b_id = c.b_id
  where b.b_group = 1
)
select--+ no_adaptive_plan NO_ELIMINATE_JOIN(c) no_merge(ac) no_merge(bc) 
   *
  from ac 
       join bc on ac.rid=bc.rid
       join join_back_c C
         on bc.rid = c.rowid;

Plan:

Plan hash value: 3065703407

-----------------------------------------------------------------------------------------------------------------
| Id  | Operation                               | Name                  | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                        |                       |     1 |   230 |   209   (0)| 00:00:01 |
|   1 |  NESTED LOOPS                           |                       |     1 |   230 |   209   (0)| 00:00:01 |
|*  2 |   HASH JOIN                             |                       |     1 |   115 |   208   (0)| 00:00:01 |
|   3 |    VIEW                                 |                       |   992 | 37696 |   202   (0)| 00:00:01 |
|   4 |     NESTED LOOPS                        |                       |   992 | 25792 |   202   (0)| 00:00:01 |
|   5 |      TABLE ACCESS BY INDEX ROWID BATCHED| JOIN_BACK_B           |   100 |   900 |     2   (0)| 00:00:01 |
|*  6 |       INDEX RANGE SCAN                  | IX_JOIN_BACK_B_GROUP  |   100 |       |     1   (0)| 00:00:01 |
|*  7 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_B_ID   |    10 |   170 |     2   (0)| 00:00:01 |
|   8 |    VIEW                                 |                       |  1001 | 77077 |     6   (0)| 00:00:01 |
|   9 |     NESTED LOOPS                        |                       |  1001 |   118K|     6   (0)| 00:00:01 |
|  10 |      TABLE ACCESS BY INDEX ROWID        | JOIN_BACK_A           |     1 |   105 |     2   (0)| 00:00:01 |
|* 11 |       INDEX UNIQUE SCAN                 | IX_JOIN_BACK_A_CODE   |     1 |       |     1   (0)| 00:00:01 |
|* 12 |      INDEX RANGE SCAN                   | IR_JOIN_BACK_C_A_CODE |  1001 | 16016 |     4   (0)| 00:00:01 |
|  13 |   TABLE ACCESS BY USER ROWID            | JOIN_BACK_C           |     1 |   115 |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - access("AC"."RID"="BC"."RID")
   6 - access("B"."B_GROUP"=1)
   7 - access("B"."B_ID"="C"."B_ID")
  11 - access("A"."A_CODE"=1)
  12 - access("C"."A_CODE"=1)
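Before relying on a rewrite like this, it may be worth confirming that it returns the same number of rows as the original three-table join. The sketch below assumes a_code and b_id are unique in JOIN_BACK_A and JOIN_BACK_B (as the primary keys in the test case guarantee), so each row of C matches at most one row on each side and intersecting on C's rowid cannot add or drop rows.

```sql
-- Row count of the original query.
select count(*) original_cnt
  from join_back_a a
       join join_back_c c
         on a.a_code = c.a_code
       join join_back_b b
         on b.b_id = c.b_id
 where a.a_code = 1
   and b.b_group = 1;

-- Row count of the rewrite: intersect the two rowid sets of C.
with
 ac as (
  select c.rowid rid
    from join_back_a a
         join join_back_c c
           on a.a_code = c.a_code
   where a.a_code = 1
 )
,bc as (
  select c.rowid rid
    from join_back_b b
         join join_back_c c
           on b.b_id = c.b_id
   where b.b_group = 1
 )
select count(*) rewrite_cnt
  from ac
       join bc on ac.rid = bc.rid;
```

The two counts should match. A mismatch would suggest duplicate keys in one of the dimension tables.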
