Query takes too long to run, how to optimize it?

Posted: 2020-12-04 05:51:12

Question:

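Query structure: a helper SELECT in a WITH clause that uses "TOP 1 TRANSACTION_DATE" to pick the most recent entries, followed by a number of joins. It takes far too long to run. What am I doing wrong?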

CREATE VIEW [IRWSMCMaterialization].[FactInventoryItemOnHandDailyView] AS
WITH TempTBLFactIvnItmDaily AS (
SELECT TOP 20
     ITEM_NUMBER AS [InventoryItemNumber]
    ,CAST(FORMAT(TRANSACTION_DATE, 'yyyyMMdd') AS INT) AS [DateKey]
    ,BRANCH_PLANT_FHK AS [BranchPlantKey]
    ,BRANCH_PLANT_CODE AS [BranchPlantCode]
    ,CAST(QUANTITY_ON_HAND AS BIGINT)  AS [QuantityOnHand]
    ,TRANSACTION_DATE AS [Date]
    ,WAREHOUSE_LOCATION_FHK AS [WarehouseLocationKey]
    ,WAREHOUSE_LOCATION_CODE AS [WarehouseLocationCode]
    ,WAREHOUSE_LOT_NUMBER_CODE  AS [WarehouseLotNumber]
    ,WAREHOUSE_LOT_NUMBER_FHK AS [WarehouseLotNumberKey]
    ,UNIT_OF_MEASURE AS [UnitOfMeasureName]
    ,UNIT_OF_MEASURE_PHK AS [UnitOfMeasureKey]
    
  FROM dbo.RS_INV_ITEM_ON_HAND
-- below is where clause, choose only most recent entry
WHERE TRANSACTION_DATE = (SELECT TOP 1 TRANSACTION_DATE FROM dbo.RS_INV_ITEM_ON_HAND ORDER BY TRANSACTION_DATE DESC)
)

SELECT [InventoryItemNumber],
                [DateKey],
                [Date],
                [BranchPlantCode] AS [BP],
                [WarehouseLocationCode] AS [Location],
                [QuantityOnHand],
                [UnitOfMeasureName] AS [UoM],
                CASE [WarehouseLotNumber]
                 WHEN 'Not Assigned' THEN NULL
                ELSE [WarehouseLotNumber]
                  END
                AS [Lot]
FROM TempTBLFactIvnItmDaily iioh
JOIN DWH.DimBranchPlant bp ON  iioh.BranchPlantKey = bp.BRANCH_PLANT_PHK
JOIN DWH.DimWarehouseLocation wloc ON iioh.WarehouseLocationKey = wloc.WAREHOUSE_LOCATION_PHK
JOIN DWH.DimWarehouseLotNumber wlot ON iioh.WarehouseLotNumberKey = wlot.WarehouseLotNumber_PHK
JOIN DWH.DimUnitOfMeasure uom ON CAST(iioh.UnitOfMeasureKey AS VARCHAR(100)) = uom.UNIT_OF_MEASURE_PHK
where bp.BRANCH_PLANT_CODE = '96100' 
    AND iioh.QuantityOnHand > 0
    AND (wloc.WAREHOUSE_LOCATION_CODE like '6000W01%' OR wloc.WAREHOUSE_LOCATION_CODE like 'BL%')
GO

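Comments:

The first step in performance tuning is to check the execution plan. If you need help with it, post it here, although you will get more specialized help at dba.stackexchange.com.

I think you should use TOP 20 when querying the view.

@GhufranAtaie, that's the thing: I tried TOP 1 and TOP 3, and it still takes a long time to run.

Answer 1: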

There are a lot of things here that don't look right. First, your base query should be much simpler. Something like this:

-- Columns are referenced by their names in dbo.RS_INV_ITEM_ON_HAND,
-- not by the aliases defined in the question's CTE.
-- @TRANSACTION_DATE stands for the date you want to report on.
SELECT iioh.ITEM_NUMBER AS [InventoryItemNumber],
       CAST(FORMAT(iioh.TRANSACTION_DATE, 'yyyyMMdd') AS INT) AS [DateKey],
       iioh.TRANSACTION_DATE AS [Date],
       iioh.BRANCH_PLANT_CODE AS [BP],
       iioh.WAREHOUSE_LOCATION_CODE AS [Location],
       CAST(iioh.QUANTITY_ON_HAND AS BIGINT) AS [QuantityOnHand],
       iioh.UNIT_OF_MEASURE AS [UoM],
       NULLIF(iioh.WAREHOUSE_LOT_NUMBER_CODE, 'Not Assigned') AS [Lot]
FROM dbo.RS_INV_ITEM_ON_HAND iioh
JOIN DWH.DimBranchPlant bp
    ON iioh.BRANCH_PLANT_FHK = bp.BRANCH_PLANT_PHK
JOIN DWH.DimWarehouseLocation wloc
    ON iioh.WAREHOUSE_LOCATION_FHK = wloc.WAREHOUSE_LOCATION_PHK
JOIN DWH.DimUnitOfMeasure uom
    ON CAST(iioh.UNIT_OF_MEASURE_PHK AS VARCHAR(100)) = uom.UNIT_OF_MEASURE_PHK
WHERE bp.BRANCH_PLANT_CODE = '96100'
    AND iioh.QUANTITY_ON_HAND > 0
    AND (wloc.WAREHOUSE_LOCATION_CODE LIKE '6000W01%' OR wloc.WAREHOUSE_LOCATION_CODE LIKE 'BL%')
    AND iioh.TRANSACTION_DATE = @TRANSACTION_DATE

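For example, you are joining DWH.DimWarehouseLotNumber but not pulling any columns from it. Do you really need it? Likewise, the view does not return several of the other columns, so why query them?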

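Beyond that, you filter by date first and only then by the other fields, so the TOP 20 rows you pick may be filtered out by the later conditions. Is that the behavior you want?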

Also, do you really need this cast?

ON CAST(iioh.UnitOfMeasureKey AS VARCHAR(100)) = uom.UNIT_OF_MEASURE_PHK

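Performance-wise, it is better to use CONVERT than FORMAT. Also, why not store/materialize TRANSACTION_DATE as an INT (for example with a persisted computed column, or simply on CRUD) instead of computing this value on every read?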
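A minimal sketch of both ideas, assuming TRANSACTION_DATE is a date or datetime column and that you are allowed to alter dbo.RS_INV_ITEM_ON_HAND. The column name DATE_KEY is hypothetical; style 112 is the built-in yyyymmdd style of CONVERT.

```sql
-- Same integer key as the FORMAT version, computed with CONVERT (style 112 = yyyymmdd).
SELECT TOP 20
       TRANSACTION_DATE,
       CAST(CONVERT(char(8), TRANSACTION_DATE, 112) AS int) AS [DateKey]
FROM dbo.RS_INV_ITEM_ON_HAND;
GO

-- Hypothetical persisted computed column: the key is calculated once on write
-- and stored with the row, so reads no longer pay for the conversion.
ALTER TABLE dbo.RS_INV_ITEM_ON_HAND
    ADD DATE_KEY AS CAST(CONVERT(char(8), TRANSACTION_DATE, 112) AS int) PERSISTED;
GO
```

Once the column exists, the view could select DATE_KEY directly in place of the CAST(FORMAT(...)) expression.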

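The LIKE filter on the location code can be improved performance-wise, too. Why not add a new column WareHouseLocationCodeType and set the same value for every location that satisfies this condition: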

(wloc.WAREHOUSE_LOCATION_CODE like '6000W01%' OR wloc.WAREHOUSE_LOCATION_CODE like 'BL%')

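Then you can filter by this column in the view, since this condition is clearly important to you. You can also create a filtered index on this column to improve performance further.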
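One way this could look, as a sketch only. It assumes DWH.DimWarehouseLocation is a regular table you can alter; the value 1 and the index name are made up for illustration.

```sql
-- Add the classification column suggested above.
ALTER TABLE DWH.DimWarehouseLocation
    ADD WareHouseLocationCodeType tinyint NULL;
GO

-- Tag every location that matches the original LIKE conditions (1 is an arbitrary marker value).
UPDATE DWH.DimWarehouseLocation
SET WareHouseLocationCodeType = 1
WHERE WAREHOUSE_LOCATION_CODE LIKE '6000W01%'
   OR WAREHOUSE_LOCATION_CODE LIKE 'BL%';
GO

-- Filtered index that contains only the tagged locations.
CREATE NONCLUSTERED INDEX IX_DimWarehouseLocation_CodeType
    ON DWH.DimWarehouseLocation (WAREHOUSE_LOCATION_PHK)
    INCLUDE (WAREHOUSE_LOCATION_CODE)
    WHERE WareHouseLocationCodeType = 1;
GO
```

The view's predicate would then become wloc.WareHouseLocationCodeType = 1. New or changed locations need the same tagging logic applied whenever they are loaded, otherwise the column drifts out of sync with the codes.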

You may also want to create an inline table-valued function instead of a view and pass the date in as a parameter:

CREATE OR ALTER FUNCTION [IRWSMCMaterialization].[FactInventoryItemOnHandDailyView] 
(
    @TRANSACTION_DATE datetime
)
RETURNS TABLE
AS
RETURN
(
    -- Columns are referenced by their names in dbo.RS_INV_ITEM_ON_HAND,
    -- not by the aliases defined in the question's CTE.
    SELECT iioh.ITEM_NUMBER AS [InventoryItemNumber],
           CAST(FORMAT(iioh.TRANSACTION_DATE, 'yyyyMMdd') AS INT) AS [DateKey],
           iioh.TRANSACTION_DATE AS [Date],
           iioh.BRANCH_PLANT_CODE AS [BP],
           iioh.WAREHOUSE_LOCATION_CODE AS [Location],
           CAST(iioh.QUANTITY_ON_HAND AS BIGINT) AS [QuantityOnHand],
           iioh.UNIT_OF_MEASURE AS [UoM],
           NULLIF(iioh.WAREHOUSE_LOT_NUMBER_CODE, 'Not Assigned') AS [Lot],
           iioh.TRANSACTION_DATE
    FROM dbo.RS_INV_ITEM_ON_HAND iioh
    JOIN DWH.DimBranchPlant bp
        ON iioh.BRANCH_PLANT_FHK = bp.BRANCH_PLANT_PHK
    JOIN DWH.DimWarehouseLocation wloc
        ON iioh.WAREHOUSE_LOCATION_FHK = wloc.WAREHOUSE_LOCATION_PHK
    JOIN DWH.DimUnitOfMeasure uom
        ON CAST(iioh.UNIT_OF_MEASURE_PHK AS VARCHAR(100)) = uom.UNIT_OF_MEASURE_PHK
    WHERE bp.BRANCH_PLANT_CODE = '96100'
        AND iioh.QUANTITY_ON_HAND > 0
        AND (wloc.WAREHOUSE_LOCATION_CODE LIKE '6000W01%' OR wloc.WAREHOUSE_LOCATION_CODE LIKE 'BL%')
        AND iioh.TRANSACTION_DATE = @TRANSACTION_DATE
)

Then call it like this:

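SELECT TOP 20 *
FROM [IRWSMCMaterialization].[FactInventoryItemOnHandDailyView] ('2020-12-04')
-- Order by the returned column; the parameter @TRANSACTION_DATE is not visible outside the function.
ORDER BY TRANSACTION_DATE DESC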
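Since the original view always reports on the most recent date, the caller could look that date up once and pass it in. This is a sketch under the assumption that the function above has been created; the variable name @LatestDate is made up.

```sql
-- Find the most recent transaction date once, instead of inside the view's WHERE clause.
DECLARE @LatestDate datetime =
    (SELECT MAX(TRANSACTION_DATE) FROM dbo.RS_INV_ITEM_ON_HAND);

-- Pass it to the inline function as an ordinary parameter.
SELECT TOP 20 *
FROM [IRWSMCMaterialization].[FactInventoryItemOnHandDailyView] (@LatestDate);
```

With an index whose leading column is TRANSACTION_DATE, the MAX lookup can typically be answered from the end of the index without scanning the table.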

Answer 2:

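Query optimization is a science in its own right these days. If you want to find the bottleneck in your query, you can follow these steps: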

As a first step, enable statistics with the following commands:

SET STATISTICS TIME ON;
SET STATISTICS IO ON;

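Once you have executed these commands in a query window, run your query in that same window. When it finishes, switch to the Messages tab, where you will see a lot of useful information such as execution time, parse and compile time, and, probably most interesting, the I/O reads.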
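Put together, a measurement run against the view from the question might look like the sketch below. The settings apply to the current session only, so everything has to run in the same query window.

```sql
SET STATISTICS TIME ON;
SET STATISTICS IO ON;

-- The query under test: the view from the question.
SELECT *
FROM [IRWSMCMaterialization].[FactInventoryItemOnHandDailyView];

-- Switch the extra output off again when you are done measuring.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The Messages tab then lists one line of scan and read counts per table touched by the query, plus CPU and elapsed time.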

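As a second step, try to understand which table has a large number of reads. For example, if you expect 10 rows from the query but see 10k or 100k logical reads on some table, something is wrong: producing those 10 rows required reading 10k pages from that table. Most likely an index is missing on that table, so try to work out which index you need.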

If you have static values in your WHERE clause, like the ones below, consider a filtered index:

bp.BRANCH_PLANT_CODE = '96100' AND iioh.QuantityOnHand > 0
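As an illustration, a filtered index for these two conditions could be sketched as follows. A filtered index can only reference columns of its own table, so this uses the BRANCH_PLANT_CODE column that the question's view reads from dbo.RS_INV_ITEM_ON_HAND. The index name and the choice of key and included columns are assumptions, not something the thread confirms.

```sql
-- Hypothetical filtered index covering only the rows the view can ever return.
CREATE NONCLUSTERED INDEX IX_RS_INV_ITEM_ON_HAND_BP96100
    ON dbo.RS_INV_ITEM_ON_HAND (TRANSACTION_DATE, WAREHOUSE_LOCATION_FHK)
    INCLUDE (ITEM_NUMBER, QUANTITY_ON_HAND, UNIT_OF_MEASURE, WAREHOUSE_LOT_NUMBER_CODE)
    WHERE BRANCH_PLANT_CODE = '96100'
      AND QUANTITY_ON_HAND > 0;
```

The optimizer only considers such an index when the query's own predicates match the index filter, so the view would need to filter on iioh.BRANCH_PLANT_CODE rather than on bp.BRANCH_PLANT_CODE.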

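Not always, but in some cases, casting an indexed column or wrapping it in another function in a join or WHERE clause, as shown below, can defeat your index: even if you have an index on that column, the query optimizer will not use it when executing the query: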

CAST(iioh.UnitOfMeasureKey AS VARCHAR(100))
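One possible way around this is to leave the large table's column untouched and convert the small dimension side instead. The sketch below assumes the key in dbo.RS_INV_ITEM_ON_HAND is an int and that the dimension stores the same numbers as text; neither is confirmed in the thread, and the date literal is only an example.

```sql
-- The fact-table column is compared as-is, so an index on it stays usable;
-- the conversion runs against the much smaller dimension table.
SELECT iioh.ITEM_NUMBER,
       uom.UNIT_OF_MEASURE_PHK
FROM dbo.RS_INV_ITEM_ON_HAND iioh
JOIN DWH.DimUnitOfMeasure uom
    ON iioh.UNIT_OF_MEASURE_PHK = TRY_CAST(uom.UNIT_OF_MEASURE_PHK AS int)
WHERE iioh.TRANSACTION_DATE = '20201204';
```

TRY_CAST returns NULL for values that are not valid integers, so such dimension rows simply do not match instead of raising a conversion error. The cleaner long-term fix is to give both key columns the same data type.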

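Finally, if your query contains an OR logical operator, try running each side of the OR on its own and compare the performance. This operator can really kill your query. Here is an example: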

AND (wloc.WAREHOUSE_LOCATION_CODE like '6000W01%' OR wloc.WAREHOUSE_LOCATION_CODE like 'BL%')
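A common way to test this is to split the OR into two queries combined with UNION ALL, sketched here against the dimension table alone. Because one prefix starts with '6' and the other with 'B', no row can match both branches, so UNION ALL does not introduce duplicates.

```sql
-- Branch 1: locations starting with 6000W01.
SELECT wloc.WAREHOUSE_LOCATION_PHK, wloc.WAREHOUSE_LOCATION_CODE
FROM DWH.DimWarehouseLocation wloc
WHERE wloc.WAREHOUSE_LOCATION_CODE LIKE '6000W01%'

UNION ALL

-- Branch 2: locations starting with BL.
SELECT wloc.WAREHOUSE_LOCATION_PHK, wloc.WAREHOUSE_LOCATION_CODE
FROM DWH.DimWarehouseLocation wloc
WHERE wloc.WAREHOUSE_LOCATION_CODE LIKE 'BL%';
```

Each branch is a plain prefix search, which can use an index seek on WAREHOUSE_LOCATION_CODE if such an index exists. Running the branches one at a time with the statistics settings switched on shows which side is the expensive one.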

Once you are sure that none of these is the problem, you can dig deeper.
