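Conditional Join Performance Optimization - Oracle SQL
【Posted】: 2019-09-11 12:32:46
【Question】: I need to look at an object in our database, which can be any type of transaction that my small construction company handles. Then, depending on the transaction type, I need to determine the due date and the promised date. There are fifteen different transaction types, but I mainly care about four or five: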
SELECT
datatable.ID_Number,
datatable.Object_Type,
CASE
WHEN Object_Type = 'AA' THEN (SELECT PO_DUE_DATE FROM tblPO WHERE datatable.ID_Number = tblPO.PO_ID)
WHEN Object_Type = 'AB' THEN (SELECT PROD_DUE_DATE FROM tblPROD WHERE datatable.ID_Number = tblPROD.PROD_ID)
WHEN Object_Type = 'AC' THEN (SELECT PLAN_DUE_DATE FROM tblPLAN WHERE datatable.ID_Number = tblPLAN.PLAN_ID)
WHEN Object_Type = 'BN' THEN (SELECT NEED_DUE_DATE FROM tblPURCHASE WHERE datatable.ID_Number = tblPURCHASE.PURCHASE_ID)
ELSE TO_DATE(NULL) END AS Object_Due_Date,
CASE
WHEN Object_Type = 'AA' THEN (SELECT PO_PROM_DATE FROM tblPO WHERE datatable.ID_Number = tblPO.PO_ID)
WHEN Object_Type = 'AB' THEN (SELECT PROD_PROM_DATE FROM tblPROD WHERE datatable.ID_Number = tblPROD.PROD_ID)
WHEN Object_Type = 'AC' THEN (SELECT PLAN_PROM_DATE FROM tblPLAN WHERE datatable.ID_Number = tblPLAN.PLAN_ID)
WHEN Object_Type = 'BN' THEN (SELECT NEED_PROM_DATE FROM tblPURCHASE WHERE datatable.ID_Number = tblPURCHASE.PURCHASE_ID)
ELSE TO_DATE(NULL) END AS Object_Promised_Date
FROM
datatable
WHERE
( other filtering criteria )
This gives me output like this:
| ID_Number | Object_Type | Object_Due_Date | Object_Promised_Date |
|:---------:|:-----------:|:---------------:|:--------------------:|
| 1 | AA | 11/26/2018 | 10/18/2018 |
| 2 | AB | 5/12/2018 | 3/31/2018 |
| 3 | AA | 6/15/2018 | 9/18/2018 |
| 4 | AA | 1/24/2018 | 10/2/2018 |
| 5 | ZZ | 10/27/2018 | 6/11/2018 |
| 7 | BN | 1/23/2018 | 7/2/2018 |
| 8 | AC | 4/3/2018 | 8/3/2018 |
| 9 | BN | 12/1/2018 | 8/16/2018 |
| 10 | BN | 1/10/2018 | 10/6/2018 |
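And it works great! The problem is that datatable has about 20 million records, and these dates can change, so I need to refresh the report every so often (once or twice a week). It takes 8-9 hours to run and update because, for every record, I am conditionally joining to another table.
How can I make this query run more efficiently? I know I could LEFT JOIN the tables, but I don't know how to populate a single column value with a date depending on Object_Type, rather than ending up with n columns such as Type_AA_Due_Date, Type_AB_Due_Date, and so on.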
【Comments】:
【Answer 1】: You could also try this approach:
SELECT
datatable.ID_Number,
datatable.Object_Type,
COALESCE (tblPO.PO_DUE_DATE, tblPROD.PROD_DUE_DATE, tblPLAN.PLAN_DUE_DATE,
tblPURCHASE.NEED_DUE_DATE) AS Object_Due_Date,
COALESCE (tblPO.PO_PROM_DATE, tblPROD.PROD_PROM_DATE, tblPLAN.PLAN_PROM_DATE,
tblPURCHASE.NEED_PROM_DATE) AS Object_Promised_Date
FROM
datatable
LEFT JOIN tblPO
ON datatable.ID_Number = tblPO.PO_ID
AND datatable.Object_Type = 'AA'
LEFT JOIN tblPROD
ON datatable.ID_Number = tblPROD.PROD_ID
AND datatable.Object_Type = 'AB'
LEFT JOIN tblPLAN
ON datatable.ID_Number = tblPLAN.PLAN_ID
AND datatable.Object_Type = 'AC'
LEFT JOIN tblPURCHASE
ON datatable.ID_Number = tblPURCHASE.PURCHASE_ID
AND datatable.Object_Type = 'BN'
WHERE
( other filtering criteria )
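【Comments】:
What does the FROM in LEFT JOIN PO_DUE_DATE FROM tblPO mean? I normally use the word JOIN with column names.
That was a mistake - fixed.
Tony, nice approach. Oddly, COALESCE seems to give me some very strange results, possibly because the fields are not NULL but empty, filled with spaces or junk byte data.
【Answer 2】: Try this: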
SELECT
datatable.ID_Number,
datatable.Object_Type,
tblPO.PO_DUE_DATE Object_Due_Date,
tblPO.PO_PROM_DATE Object_Promised_Date
FROM
datatable
join tblPO on datatable.ID_Number = tblPO.PO_ID
WHERE
Object_Type = 'AA'
AND ( other filtering criteria )
union all
SELECT
datatable.ID_Number,
datatable.Object_Type,
tblPROD.PROD_DUE_DATE,
tblPROD.PROD_PROM_DATE
FROM
datatable
join tblPROD on datatable.ID_Number = tblPROD.PROD_ID
WHERE
Object_Type = 'AB'
AND ( other filtering criteria )
union all
-- ... and likewise for tables tblPLAN and tblPURCHASE
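The two remaining branches would presumably follow the same pattern as the 'AA' and 'AB' branches. The sketch below assumes the table and column names given in the question. Note that, unlike the original CASE expression, inner joins combined with UNION ALL return no row at all for object types outside the four listed, or for IDs with no match in the lookup table.

```sql
-- Branch for Object_Type 'AC' (tblPLAN)
SELECT
    datatable.ID_Number,
    datatable.Object_Type,
    tblPLAN.PLAN_DUE_DATE  AS Object_Due_Date,
    tblPLAN.PLAN_PROM_DATE AS Object_Promised_Date
FROM datatable
JOIN tblPLAN ON datatable.ID_Number = tblPLAN.PLAN_ID
WHERE datatable.Object_Type = 'AC'
  -- AND ( other filtering criteria )
UNION ALL
-- Branch for Object_Type 'BN' (tblPURCHASE)
SELECT
    datatable.ID_Number,
    datatable.Object_Type,
    tblPURCHASE.NEED_DUE_DATE,
    tblPURCHASE.NEED_PROM_DATE
FROM datatable
JOIN tblPURCHASE ON datatable.ID_Number = tblPURCHASE.PURCHASE_ID
WHERE datatable.Object_Type = 'BN'
  -- AND ( other filtering criteria )
;
```

If rows of other types still have to appear with NULL dates, one more branch selecting them from datatable with NULL date columns would be needed.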
【Comments】:
【Answer 3】: How about something like this?
SELECT
dt.ID_Number,
dt.Object_Type,
CASE
WHEN Object_Type = 'AA' THEN po.PO_DUE_DATE
WHEN Object_Type = 'AB' THEN pd.PROD_DUE_DATE
WHEN Object_Type = 'AC' THEN pl.PLAN_DUE_DATE
WHEN Object_Type = 'BN' THEN pc.NEED_DUE_DATE
ELSE TO_DATE(NULL) END AS Object_Due_Date,
CASE
WHEN Object_Type = 'AA' THEN po.PO_PROM_DATE
WHEN Object_Type = 'AB' THEN pd.PROD_PROM_DATE
WHEN Object_Type = 'AC' THEN pl.PLAN_PROM_DATE
WHEN Object_Type = 'BN' THEN pc.NEED_PROM_DATE
ELSE TO_DATE(NULL) END AS Object_Promised_Date
FROM
datatable dt
LEFT JOIN tblPO po ON dt.ID_Number = po.PO_ID
LEFT JOIN tblPROD pd ON dt.ID_Number = pd.PROD_ID
LEFT JOIN tblPLAN pl ON dt.ID_Number = pl.PLAN_ID
LEFT JOIN tblPURCHASE pc ON dt.ID_Number = pc.PURCHASE_ID
WHERE
( other filtering criteria )
I just replaced your conditional SELECTs with LEFT JOINs. You can run EXPLAIN PLAN on this query and on your original query to see whether anything changes.
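For readers who have not used it before, this is roughly how a plan can be captured and displayed in Oracle. The query is shortened to a single join purely for illustration; substitute the full statement you want to inspect.

```sql
-- Store the optimizer's plan for the statement in PLAN_TABLE.
-- The statement itself is not executed.
EXPLAIN PLAN FOR
SELECT dt.ID_Number,
       dt.Object_Type,
       po.PO_DUE_DATE
FROM datatable dt
LEFT JOIN tblPO po ON dt.ID_Number = po.PO_ID;

-- Show the most recently explained plan in readable form.
SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```

Running the same two steps for the original query gives a second plan to compare: look at the join methods, the access paths (full scan versus index) and the estimated row counts.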
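Other thoughts
If you don't already have an index defined on datatable.ID_Number, you should create one, since it is accessed in all of the SELECTs / JOINs.
If your other JOIN columns (PO_ID, PROD_ID, ...) are not already indexed, consider creating indexes on them.
If matching rows are guaranteed to exist in the LEFT JOIN tables, change the joins to INNER JOIN ... this may speed things up.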
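As a sketch of the indexing suggestions, the statements below first check which join columns are already indexed and then create the missing indexes. The index names (ix_...) are made up for illustration, and the table names in the dictionary query assume the tables were created with unquoted identifiers, which Oracle stores in upper case.

```sql
-- List existing indexes on the join columns before creating anything.
SELECT table_name, index_name, column_name, column_position
FROM user_ind_columns
WHERE table_name IN ('DATATABLE', 'TBLPO', 'TBLPROD', 'TBLPLAN', 'TBLPURCHASE')
ORDER BY table_name, index_name, column_position;

-- Create only the indexes that are missing (hypothetical names).
CREATE INDEX ix_datatable_id   ON datatable   (ID_Number);
CREATE INDEX ix_tblpo_id       ON tblPO       (PO_ID);
CREATE INDEX ix_tblprod_id     ON tblPROD     (PROD_ID);
CREATE INDEX ix_tblplan_id     ON tblPLAN     (PLAN_ID);
CREATE INDEX ix_tblpurchase_id ON tblPURCHASE (PURCHASE_ID);
```

A column that is already a primary key or has a unique constraint is indexed already, and Oracle rejects a second index on the same column list with ORA-01408, so skip those.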
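【Comments】:
Could you explain why this is a better solution?
I would say that "definite" joins are usually better than "conditional" ones, because the optimizer can build a more accurate plan without having to guess. That said ... you are the best judge :) Try running both queries with EXPLAIN PLAN and see whether there is any difference or improvement.
Just for my own curiosity, could you test it? Is there any performance improvement?
I tested it for speed and it is faster (9 hours -> 4 hours), but I did not use EXPLAIN PLAN because I don't know how.
Sweet!! Just put the text EXPLAIN PLAN FOR in front of your query (i.e. EXPLAIN PLAN FOR SELECT dt.ID_Number ... ) and run it, then run SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY) to see the optimizer's query plan. You can get the plan for the old query the same way and compare the two to see what is being done differently. Maybe try fiddling with indexes to see if you can get it even faster! Or you could also move some of your WHERE (other filtering criteria) into your joins.
【Answer 4】:
Similar to Needle's answer, but I like to combine similar data from different sources into a single view. In this case I am using an inline view, but if you are going to use it in many places, it may be worth creating an actual view that exposes the due date and promised date for the different transaction types.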
SELECT
datatable.ID_Number,
datatable.Object_Type,
type_lookups.Object_Due_Date,
type_lookups.Object_Promised_Date
FROM
datatable
LEFT JOIN (
select 'AA' as Object_Type,
PO_DUE_DATE as Object_Due_Date,
PO_PROM_DATE as Object_Promised_Date,
PO_ID as ID
from tblPO
union all
select 'AB' as Object_Type,
PROD_DUE_DATE as Object_Due_Date,
PROD_PROM_DATE as Object_Promised_Date,
PROD_ID as ID
from tblPROD
union all
select 'AC' as Object_Type,
PLAN_DUE_DATE as Object_Due_Date,
PLAN_PROM_DATE as Object_Promised_Date,
PLAN_ID as ID
from tblPLAN
union all
select 'BN' as Object_Type,
NEED_DUE_DATE as Object_Due_Date,
NEED_PROM_DATE as Object_Promised_Date,
PURCHASE_ID as ID
from tblPURCHASE
) type_lookups
ON type_lookups.object_type = datatable.Object_Type
AND type_lookups.ID = datatable.ID_Number
WHERE
( other filtering criteria )
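If the lookup is needed in several reports, the inline view could be promoted to a real view along these lines. The view name v_type_lookups is hypothetical; everything else is taken from the query above.

```sql
-- One row per (Object_Type, ID) with that transaction's dates.
CREATE OR REPLACE VIEW v_type_lookups AS
select 'AA' as Object_Type, PO_DUE_DATE as Object_Due_Date,
       PO_PROM_DATE as Object_Promised_Date, PO_ID as ID
from tblPO
union all
select 'AB', PROD_DUE_DATE, PROD_PROM_DATE, PROD_ID
from tblPROD
union all
select 'AC', PLAN_DUE_DATE, PLAN_PROM_DATE, PLAN_ID
from tblPLAN
union all
select 'BN', NEED_DUE_DATE, NEED_PROM_DATE, PURCHASE_ID
from tblPURCHASE;

-- The report query then joins to the view like any other table.
SELECT d.ID_Number,
       d.Object_Type,
       v.Object_Due_Date,
       v.Object_Promised_Date
FROM datatable d
LEFT JOIN v_type_lookups v
       ON v.Object_Type = d.Object_Type
      AND v.ID = d.ID_Number;
```

Supporting a new transaction type then means adding one more UNION ALL branch to the view, with no change to the queries that use it.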
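【Comments】:
Could you add any explanation of why this is better than my current approach?
Scalar subqueries (your current approach) are invoked once per row. If you have a small number of distinct values, result caching helps, but in general, if your query returns many rows, I think you should try to move your logic into joins.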