Oracle SQL: how to optimize an uncorrelated IN subquery

Posted 2023-04-14

【Title】Oracle SQL how to optimize IN uncorrelated subquery 【Posted】2017-12-20 13:16:51 【Question】:

I have the following query, which performs poorly:

select 
    distinct 
    u.uuid,
    u.user_name,
    u.key

    from request req 
         join int_user u on u.uuid = req.user_uuid
         join int_right r on r.uuid = req.right_uuid

    where r.uuid in (
            select r2.uuid from int_right r2
                    where 
                            (
                                lower(r2.right_name) like '%keyword%'
                                or lower(r2.right_key) like '%keyword%'
                            )

                    )

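The subquery is uncorrelated. It usually returns a few rows, sometimes only one. What I don't understand is this: if I take the subquery, execute it on its own, and then paste the resulting list into the outer query as a static IN list, the query performs very well. Execution time drops from 3-6 s to 0.05 s.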

r.uuid in ('value1', 'value2', 'value3')

How can I tell Oracle to execute my subquery first and then apply its result set to the outer query?

A few notes:

The request table is huge, about 7 million rows.
The int_right table has about 10K rows.
The int_user table has about 100K rows.

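Judging by the execution plan, Oracle seems to do a full scan of every table, and the cost and cardinality on the request table are very large. Interestingly, even when my subquery returns a single row for a given search term, the query is still slow. But if I replace the IN operator with equals (=), the query becomes very fast and cheap. In that case Oracle seems to do a full scan only on the int_right table, and a unique or range scan on the other tables.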

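I also tried other variants of this query, such as adding the condition directly to the outer query, using EXISTS, or using a correlated subquery, but it is still slow in every case.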

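【Comments】:

Sorry for the typo. Corrected, thanks.

Why do you need the subquery? I think you can put the subquery's where clause into the outer query's where clause.

That is true, but in most cases I get the same performance if I use the where clause in the outer query. If the search term is long (20-30 characters), the subquery performs better.

You need to provide the execution plans for a short and a long search term, with and without the subquery, so we can see whether something there is causing the slowness.

【Answer 1】: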

Why do you need the subquery?

The same condition can be applied in two different ways:

In the join

select 
    distinct 
    u.uuid,
    u.user_name,
    u.key

    from request req 
         join int_user u on u.uuid = req.user_uuid
         join int_right r on r.uuid = req.right_uuid 
         And (lower(r.right_name) like '%keyword%' or lower(r.right_key) like '%keyword%')

In the where clause

select 
    distinct 
    u.uuid,
    u.user_name,
    u.key

    from request req 
         join int_user u on u.uuid = req.user_uuid
         join int_right r on r.uuid = req.right_uuid 
    Where (lower(r.right_name) like '%keyword%' or lower(r.right_key) like '%keyword%')

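I'm not 100% sure which one will be faster, but both should make the query faster. As far as I understand, the join version will be the faster of the two...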

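【Comments】:

Thanks for the reply, but I have already tried this and it did not improve performance. With a short search term I get the same performance, but with a longer one (20-30 characters) the subquery performs better, and I can't explain why.

@DimaSendrea Are there any indexes on the join criteria? If it helps, also try where exists. [dba-oracle.com/t_exists_clause_vs_in_clause.htm]

【Answer 2】: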

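You generally can't tune a SQL statement just by looking at its text (unless the code has a fundamental flaw, such as a missing join condition). For Oracle, one of the most effective approaches is:

1) Execute the problematic statement with the following hint added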

select /*+ gather_plan_statistics */ ... <rest of query>

2) Run the following to get the execution plan metrics

select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'))

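This way you will:

a) see the real execution plan that was used,

b) get the estimated and actual row counts for each step of the plan. A difference between the estimate and the actual value is usually where you need to focus, because that is where the optimizer most likely did not have enough information, or accurate enough information, to work with.

For example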

SQL> select /*+ gather_plan_statistics */ count(dname)
  2  from  scott.emp e, scott.dept d
  3  where e.sal <= 1500
  4  and  d.deptno = e.deptno;

COUNT(DNAME)
------------
           7

1 row selected.

SQL> select * from table(dbms_xplan.display_cursor(null,null,'ALLSTATS LAST'));

PLAN_TABLE_OUTPUT
---------------------------------------------------------------------------------------------------
SQL_ID  c1cb4s8b141h8, child number 0
-------------------------------------
select /*+ gather_plan_statistics */ count(dname) from  scott.emp e,
scott.dept d where e.sal <= 1500 and  d.deptno = e.deptno

Plan hash value: 3037575695

---------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
---------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT              |         |      1 |        |      1 |00:00:00.01 |       9 |
|   1 |  SORT AGGREGATE               |         |      1 |      1 |      1 |00:00:00.01 |       9 |
|   2 |   MERGE JOIN                  |         |      1 |      3 |      7 |00:00:00.01 |       9 |
|   3 |    TABLE ACCESS BY INDEX ROWID| DEPT    |      1 |      4 |      4 |00:00:00.01 |       2 |
|   4 |     INDEX FULL SCAN           | DEPT_PK |      1 |      4 |      4 |00:00:00.01 |       1 |
|*  5 |    SORT JOIN                  |         |      4 |      3 |      7 |00:00:00.01 |       7 |
|*  6 |     TABLE ACCESS FULL         | EMP     |      1 |      3 |      7 |00:00:00.01 |       7 |
---------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   5 - access("D"."DEPTNO"="E"."DEPTNO")
       filter("D"."DEPTNO"="E"."DEPTNO")
   6 - filter("E"."SAL"<=1500)

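At Id 6 of the plan (the full scan of EMP), you can see that the optimizer estimated 3 rows (E-Rows) but actually got 7 (A-Rows). Large differences like this point to the areas that need investigating.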
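
Applied to the query from the question, the two steps might look like the sketch below. The table and column names are taken from the question, and 'keyword' stands in for a real search term. The statement has to be run to completion in the same session right before the display_cursor call. In SQL*Plus it is also usually necessary to have serveroutput switched off, otherwise the "last" cursor reported may not be the query itself.

```sql
-- Step 1: run the slow statement once with the hint, so that Oracle
-- records the actual row counts for every step of the plan.
-- 'keyword' is a placeholder for the real search term.
select /*+ gather_plan_statistics */
    distinct
    u.uuid,
    u.user_name,
    u.key
from request req
     join int_user u on u.uuid = req.user_uuid
     join int_right r on r.uuid = req.right_uuid
where r.uuid in (
        select r2.uuid
        from int_right r2
        where lower(r2.right_name) like '%keyword%'
           or lower(r2.right_key) like '%keyword%'
      );

-- Step 2: immediately afterwards, in the same session, show the plan of
-- the last executed statement with estimated (E-Rows) and actual (A-Rows) counts.
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));
```

Running the same pair for the static IN list variant (r.uuid in ('value1', 'value2', 'value3')) gives a second plan to compare against. The step where E-Rows and A-Rows diverge most between the two plans is a reasonable first place to look.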
