Analyzing the Execution Plan of a Left Join SQL Statement in an RDBMS

Posted by ChavinKing


1. Test data:

SQL> select * from t1;
  a | b  | c
---+----+---
  1 | 10 | 1
  2 | 20 | 2
  3 | 30 | 3
  4 | 40 | 4
  5 | 50 | 5
  6 | 60 | 6
(6 rows)

SQL> select * from t2;
  a | b  | d
---+----+---
  1 | 10 | 1
  2 | 20 | 2
  3 | 30 | 3
(3 rows)


2. The example SQL to analyze:

select *
from (
select * from t1 where c >= 2
) t1 left join (
select * from t2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1
where t1.b < 50
;
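
To reproduce the expected result without any of the three database servers, here is a minimal sketch that loads the test data into an in-memory SQLite database through Python's sqlite3 module and runs the example statement. SQLite and the integer column types are assumptions made for illustration only; the original post does not show the table DDL.

```python
import sqlite3

# Assumption: integer columns; the original post does not show the DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (a integer, b integer, c integer);
    create table t2 (a integer, b integer, d integer);
    insert into t1 values (1,10,1),(2,20,2),(3,30,3),(4,40,4),(5,50,5),(6,60,6);
    insert into t2 values (1,10,1),(2,20,2),(3,30,3);
""")

# The example statement from this section, unchanged.
example_sql = """
    select *
    from (
    select * from t1 where c >= 2
    ) t1 left join (
    select * from t2 where b < 30
    ) t2 on t1.a = t2.a
    and t2.d > 1
    where t1.b < 50
"""

# Sort on t1.a in Python because the statement itself has no ORDER BY.
rows = sorted(conn.execute(example_sql).fetchall(), key=lambda r: r[0])
for row in rows:
    print(row)
```

Under these assumptions the query returns three rows, for a = 2, 3 and 4. Only the a = 2 row carries values from t2; in the other two rows the t2 columns come back as None (SQL NULL).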


3. Result and execution plan in Oracle:

SQL> select *
from (
select * from t1 where c >= 2
) t1 left join (
select * from t2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1
where t1.b < 50
;

         A          B          C          A          B          D
---------- ---------- ---------- ---------- ---------- ----------
         2         20          2          2         20          2
         3         30          3
         4         40          4


Execution Plan
----------------------------------------------------------
Plan hash value: 1823443478

---------------------------------------------------------------------------
| Id  | Operation          | Name | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |      |     3 |   234 |     7  (15)| 00:00:01 |
|*  1 |  HASH JOIN OUTER   |      |     3 |   234 |     7  (15)| 00:00:01 |
|*  2 |   TABLE ACCESS FULL| T1   |     3 |   117 |     3   (0)| 00:00:01 |
|*  3 |   TABLE ACCESS FULL| T2   |     1 |    39 |     3   (0)| 00:00:01 |
---------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("T1"."A"="T2"."A"(+))
   2 - filter("T1"."B"<50 AND "C">=2)
   3 - filter("T2"."D"(+)>1 AND "B"(+)<30)

Note
-----
    - dynamic sampling used for this statement (level=2)


Statistics
----------------------------------------------------------
       0  recursive calls
       0  db block gets
       7  consistent gets
       0  physical reads
       0  redo size
     926  bytes sent via SQL*Net to client
     523  bytes received via SQL*Net from client
       2  SQL*Net roundtrips to/from client
       0  sorts (memory)
       0  sorts (disk)
       3  rows processed

     
4. Result and execution plan in PostgreSQL:
      
postgres=# select *
postgres-# from (
postgres(# select * from t1 where c >= 2
postgres(# ) t1 left join (
postgres(# select * from t2 where b < 30
postgres(# ) t2 on t1.a = t2.a
postgres-# and t2.d > 1
postgres-# where t1.b < 50
postgres-# ;
  a | b  | c | a | b  | d
---+----+---+---+----+---
  2 | 20 | 2 | 2 | 20 | 2
  3 | 30 | 3 |   |    | 
  4 | 40 | 4 |   |    | 
(3 rows)


postgres=# explain analyze select *
postgres-# from (
postgres(# select * from t1 where c >= 2
postgres(# ) t1 left join (
postgres(# select * from t2 where b < 30
postgres(# ) t2 on t1.a = t2.a
postgres-# and t2.d > 1
postgres-# where t1.b < 50
postgres-# ;
                                                  QUERY PLAN                                                
------------------------------------------------------------------------------------------------------------
  Hash Left Join  (cost=37.04..85.88 rows=197 width=24) (actual time=0.020..0.027 rows=3 loops=1)
    Hash Cond: ("outer".a = "inner".a)
    ->  Seq Scan on t1  (cost=0.00..36.55 rows=197 width=12) (actual time=0.005..0.008 rows=3 loops=1)
          Filter: ((c >= 2) AND (b < 50))
    ->  Hash  (cost=36.55..36.55 rows=197 width=12) (actual time=0.006..0.006 rows=1 loops=1)
          ->  Seq Scan on t2  (cost=0.00..36.55 rows=197 width=12) (actual time=0.002..0.003 rows=1 loops=1)
                Filter: ((b < 30) AND (d > 1))
  Total runtime: 0.052 ms
(8 rows)
      

5. Result and execution plan in MySQL:
      
mysql> select *
     -> from (
     -> select * from t1 where c >= 2
     -> ) t1 left join (
     -> select * from t2 where b < 30
     -> ) t2 on t1.a = t2.a
     -> and t2.d > 1
     -> where t1.b < 50
     -> ;
+---+----+---+------+------+------+
| a | b  | c | a    | b    | d    |
+---+----+---+------+------+------+
| 2 | 20 | 2 |    2 |   20 |    2 |
| 3 | 30 | 3 | NULL | NULL | NULL |
| 4 | 40 | 4 | NULL | NULL | NULL |
+---+----+---+------+------+------+
3 rows in set (0.05 sec)

mysql> explain select *
     -> from (
     -> select * from t1 where c >= 2
     -> ) t1 left join (
     -> select * from t2 where b < 30
     -> ) t2 on t1.a = t2.a
     -> and t2.d > 1
     -> where t1.b < 50
     -> ;
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table      | type | possible_keys | key         | key_len | ref  | rows | Extra       |
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
|  1 | PRIMARY     | <derived2> | ALL  | NULL          | NULL        | NULL    | NULL |    6 | Using where |
|  1 | PRIMARY     | <derived3> | ref  | <auto_key0>   | <auto_key0> | 8       | t1.a |    1 | Using where |
|  3 | DERIVED     | t2         | ALL  | NULL          | NULL        | NULL    | NULL |    3 | Using where |
|  2 | DERIVED     | t1         | ALL  | NULL          | NULL        | NULL    | NULL |    6 | Using where |
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
4 rows in set (0.00 sec)


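6. Analysis of the execution plans above:

The steps below use the predicate notation of the Oracle plan; the PostgreSQL plan has the same shape. The MySQL plan differs in form: it materializes both derived tables and looks up <derived3> through an auto-generated key (<auto_key0>) on t1.a.

1) Full-scan the left table T1, applying the T1 subquery condition "C">=2 and the WHERE condition "T1"."B"<50 together as one filter, i.e. filter("T1"."B"<50 AND "C">=2). Call the intermediate result tmp1.
2) Full-scan the right table T2, applying the T2 subquery condition "B"(+)<30 and the ON-clause condition "T2"."D"(+)>1 together as one filter, i.e. filter("T2"."D"(+)>1 AND "B"(+)<30). Call the intermediate result tmp2.
3) Join tmp1 and tmp2 with a Hash Left Join on the join condition access("T1"."A"="T2"."A"(+)). Every row of the left intermediate result is returned; for left rows with no match, the right-side columns are set to NULL. Call the result tmp3.
4) Return the result set.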
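
The four steps above can be traced by hand on the test data. The following is a minimal pure-Python sketch of that trace, not any engine's actual implementation. The names tmp1, tmp2 and tmp3 follow the analysis; hash_table is a hypothetical helper, and building the hash table on the right side mirrors the Hash node in the PostgreSQL plan (a real engine is free to choose the other side).

```python
# Test data from section 1: t1 rows are (a, b, c), t2 rows are (a, b, d).
t1 = [(1, 10, 1), (2, 20, 2), (3, 30, 3), (4, 40, 4), (5, 50, 5), (6, 60, 6)]
t2 = [(1, 10, 1), (2, 20, 2), (3, 30, 3)]

# Step 1: scan t1 with the combined filter (b < 50 AND c >= 2).
tmp1 = [row for row in t1 if row[1] < 50 and row[2] >= 2]

# Step 2: scan t2 with the combined filter (d > 1 AND b < 30).
tmp2 = [row for row in t2 if row[2] > 1 and row[1] < 30]

# Step 3: hash left join on column a. Build a hash table on the filtered
# right side, then probe it with every row of the filtered left side.
hash_table = {}
for row in tmp2:
    hash_table.setdefault(row[0], []).append(row)

tmp3 = []
for left in tmp1:
    matches = hash_table.get(left[0])
    if matches:
        # One output row per matching right row.
        tmp3.extend(left + right for right in matches)
    else:
        # No match: keep the left row and pad the right columns with NULL.
        tmp3.append(left + (None, None, None))

# Step 4: return the result set.
for row in tmp3:
    print(row)
```

Here tmp1 keeps the rows with a = 2, 3, 4 and tmp2 keeps only the row with a = 2, so the join yields one matched row and two NULL-padded rows, the same three rows all three databases return.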


7. Some more complex SQL statements, left for interested readers to study:

1) Test data

create table tmp1 as
select a,b,c,a as e from t1;

create table tmp2 as
select a,b,d,a as e from t2;

2) Example SQL

select *
from (
select * from tmp1 where c >= 1
) t1 left join (
select * from tmp2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1 and t1.e >= 2
where t1.b < 50
;


select *
from (
select * from tmp1 where c >= 1
) t1 left join (
select * from tmp2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1 and t1.e >= 2
where t1.b < 50 and t2.e <= 3
;
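
As a starting point for that study, here is a minimal sketch that runs both statements against an in-memory SQLite database through Python's sqlite3 module. SQLite, the integer column types, and the variable names sql_on_only and sql_with_where are assumptions for illustration; the plans chosen by Oracle, PostgreSQL or MySQL still have to be inspected on those systems.

```python
import sqlite3

# Assumption: integer columns; the original post does not show the DDL.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table t1 (a integer, b integer, c integer);
    create table t2 (a integer, b integer, d integer);
    insert into t1 values (1,10,1),(2,20,2),(3,30,3),(4,40,4),(5,50,5),(6,60,6);
    insert into t2 values (1,10,1),(2,20,2),(3,30,3);
    create table tmp1 as select a,b,c,a as e from t1;
    create table tmp2 as select a,b,d,a as e from t2;
""")

# First statement: the extra left-side condition t1.e >= 2 sits in ON.
sql_on_only = """
    select *
    from (
    select * from tmp1 where c >= 1
    ) t1 left join (
    select * from tmp2 where b < 30
    ) t2 on t1.a = t2.a
    and t2.d > 1 and t1.e >= 2
    where t1.b < 50
"""

# Second statement: same join, plus the right-side condition t2.e <= 3 in WHERE.
sql_with_where = sql_on_only + " and t2.e <= 3"

# Sort on t1.a in Python because the statements have no ORDER BY.
rows_on = sorted(conn.execute(sql_on_only).fetchall(), key=lambda r: r[0])
rows_where = sorted(conn.execute(sql_with_where).fetchall(), key=lambda r: r[0])

print("ON only:   ", rows_on)
print("with WHERE:", rows_where)
```

In this sketch the first statement returns four rows (a = 1 to 4): the condition t1.e >= 2 in the ON clause only decides whether a left row finds a match and never removes the left row itself. The second statement returns only the a = 2 row, because t2.e <= 3 in the WHERE clause is evaluated after the join and rejects every NULL-padded row.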
