How to improve the performance of a DISTINCT query in SQL Server
【Posted】: 2018-09-21 06:11:45
【Question】: I have a query that currently takes 10 seconds to return only 324 records. Is there any way to improve its performance? P.S. I am very new to SQL Server.
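What I have tried:
I already use SET NOCOUNT ON in the stored procedure, because I read that it improves performance, and I have even used an alias for every table. Please tell me what else I can do to improve its performance.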
DECLARE @vRequestedBy VARCHAR(2000) = CASE WHEN @RequestedBy <> '' THEN @RequestedBy END,
        @vJobType NVARCHAR(2000) = CASE WHEN @JobType <> '' THEN @JobType END;

SELECT distinct
    ts.JobID,
    dbo.TSP_CAT_Category.Category,
    ts.JobType,
    dbo.TSP_TSR_JobStatus.JobStatus,
    dbo.wsm_Contact.Name "ContactName",
    ts.Created,
    wb.Name AS BuildingName,
    ts.Contact,
    ts.CreatedBy,
    ts.ContactEmail,
    dbo.wsm_Contact.TradingAs,
    --wsm_Contact_User.UserId "RequestedByUserId",
    c2.Name "RequestedByUser",
    dbo.wsm_Contact.ContactID
FROM
    dbo.TSP_TSR_Job ts
    LEFT OUTER JOIN
        dbo.wsm_Ref_Buildings wb ON ts.BuildingID = wb.BuildingId
    LEFT OUTER JOIN
        dbo.wsm_Contact ON ts.TenancyID = dbo.wsm_Contact.ContactID
    LEFT OUTER JOIN
        dbo.TSP_TSR_JobStatus ON ts.JobStatusID = dbo.TSP_TSR_JobStatus.JobStatusID
    LEFT OUTER JOIN
        dbo.TSP_CAT_Category ON ts.CategoryID = dbo.TSP_CAT_Category.CategoryID
    LEFT OUTER JOIN
        dbo.wsm_Contact_User ON UserID = ts.ContactEmail COLLATE SQL_Latin1_General_CP1_CI_AS
    LEFT OUTER JOIN
        wsm_Contact c2 ON c2.ContactID = wsm_Contact_User.ContactID
WHERE
    -- JobId criteria
    (@JobID = 0 OR JobID = @JobId)
    AND (@TenancyId = '0' OR TenancyId in (select Item from Split_fn(@TenancyID,',')))
    AND (@TradingAs = '0' OR wsm_Contact.ContactID in (select Item from Split_fn(@TradingAs,',') ))
    --RequestedBy
    AND (@vRequestedBy IS NULL OR @vRequestedBy = '0' OR ts.ContactEmail in (Select distinct Email from dbo.wsm_Contact WHere Email in (select Item from Split_fn(@vRequestedBy,',')) ))
    -- Job Category
    AND (@CategoryId = '0' OR ts.CategoryID in (select Item from Split_fn(@CategoryId,',') ))
    -- Contact Id (always filter on this, enough security?!)
    AND ts.BuildingID IN (SELECT distinct b.BuildingId
                          FROM
                              wsm_ContactSite s
                              INNER JOIN
                                  wsm_Contact c ON c.ContactID = s.ContactID
                              INNER JOIN
                                  wsm_Ref_Buildings b ON b.SiteId = s.SiteID
                          WHERE
                              c.ContactID = @ContactUserId)
    AND wsm_Contact.FloorID IN (SELECT t.FloorID
                                FROM wsm_Contact_Tenancy t
                                WHERE t.ContactID = @ContactUserId)
    AND wsm_Contact.OCCPSTAT NOT IN ('I', 'P')
    AND (@vJobType IS NULL OR ts.JobType in (select Item from Split_fn(@vJobType,',')))
    AND (ts.Created between @CreatedFrom and DATEADD(DD,1,@CreatedTo))
ORDER BY
    JobID
Statistics:
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server parse and compile time:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
SQL Server Execution Times:
CPU time = 0 ms, elapsed time = 0 ms.
Table 'TSP_CAT_Category'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'wsm_Contact_Tenancy'. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'wsm_Contact'. Scan count 2, logical reads 3822, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'wsm_ContactSite'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'wsm_Ref_Buildings'. Scan count 3, logical reads 2811, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 2, logical reads 341364, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#AFEC4F2F'. Scan count 2, logical reads 524444, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TSP_TSR_Job'. Scan count 3, logical reads 58210, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'wsm_Contact_User'. Scan count 2, logical reads 2300, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TSP_TSR_JobStatus'. Scan count 2, logical reads 650, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '1159564537'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Workfile'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#BB5E01DB'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#BA69DDA2'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#B1D497A1'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table '#B0E07368'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
SQL Server Execution Times:
CPU time = 8391 ms, elapsed time = 5792 ms.
SQL Server Execution Times:
CPU time = 8391 ms, elapsed time = 5793 ms.
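For anyone who wants to reproduce figures in this shape: output like the above comes from the STATISTICS IO and STATISTICS TIME session settings. A minimal sketch follows; the query against sys.objects is only a placeholder standing in for the real stored procedure call.

```sql
-- Turn on per-table I/O counters and CPU/elapsed timings for this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Placeholder workload (an assumption); replace with the query being tuned.
SELECT COUNT(*) AS ObjectCount
FROM sys.objects;

-- Turn the settings off again so later statements are not affected.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

In SQL Server Management Studio the counters appear on the Messages tab rather than in the result grid.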
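【Comments】:
Your query is very large, and I am not sure anyone can tune the whole thing from a Stack Overflow page. You may want to start by reading up on indexing and tuning.
First: can you provide the execution plan? Second: where does the table 'Worktable' come into play? Is a view involved that accesses 'Worktable'? Your query, at least, does not reference any Worktable. Third: ts.ContactEmail COLLATE SQL_Latin1_General_CP1_CI_AS may perform very badly, because it can force a conversion. Please make sure all the data uses the same collation.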
@JosefBiehler brentozar.com/pastetheplan/?id=ry0ay7Gtm
It could also be a parameter sniffing problem. Can you show the declaration of your stored procedure?
Have you read this: sqlperformance.com/2012/10/t-sql-queries/sp_prefix
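The collation concern raised in the comments can be checked directly in the catalog. The sketch below assumes the two joined columns are dbo.TSP_TSR_Job.ContactEmail and dbo.wsm_Contact_User.UserID, as the join in the question suggests; it returns no rows if those objects do not exist under these names.

```sql
-- List the collation of the two columns compared in the join.
-- Table and column names are taken from the question; adjust if they differ.
SELECT  OBJECT_NAME(c.object_id) AS TableName,
        c.name                   AS ColumnName,
        c.collation_name         AS CollationName
FROM    sys.columns AS c
WHERE  (c.object_id = OBJECT_ID(N'dbo.TSP_TSR_Job')      AND c.name = N'ContactEmail')
   OR  (c.object_id = OBJECT_ID(N'dbo.wsm_Contact_User') AND c.name = N'UserID');
```

If the two collations differ, changing one column to match the other (ALTER TABLE ... ALTER COLUMN ... COLLATE ...) would remove the need for the COLLATE clause in the join. That change touches any index or constraint on the column, so it needs testing first.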
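【Answer 1】:
- Add a clustered index to wsm_Ref_Buildings
- Turn every IN into EXISTS
- Add OPTION (RECOMPILE), since you really do need all of those ORs
- Replace the body of Split_fn with something from the internet (there are cleaner examples of this kind of code, and SQL Server ships with a STRING_SPLIT function)
- Make sure Split_fn is DETERMINISTIC (this requires the WITH SCHEMABINDING option)
- Try to get rid of all the DISTINCTs
- Fix all the warnings about indexes and missing statistics (or at least some of them), for example on the TSP_CAT_Category table
- Paste the new actual execution plan (not an estimated one)
- Maybe, one day, convert all PKs from strings to integers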
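Two of the points above, the built-in splitter and the recompile hint, can be sketched together. This is an illustration under assumptions, not the original procedure: @Job is a hypothetical stand-in for dbo.TSP_TSR_Job, the #Category temp table is an extra step that the answer does not mention, and STRING_SPLIT needs SQL Server 2016 or later with database compatibility level 130 or higher.

```sql
-- Hypothetical stand-ins for the procedure parameter and the job table.
DECLARE @CategoryId VARCHAR(2000) = '2,3';

DECLARE @Job TABLE (JobID INT PRIMARY KEY, CategoryID VARCHAR(20));
INSERT INTO @Job (JobID, CategoryID)
VALUES (1, '1'), (2, '2'), (3, '3'), (4, '2');

-- Split the list once with the built-in STRING_SPLIT instead of a custom function.
CREATE TABLE #Category (Item VARCHAR(20) NOT NULL);
INSERT INTO #Category (Item)
SELECT LTRIM(RTRIM(value))
FROM STRING_SPLIT(@CategoryId, ',');

-- '0' keeps the question's meaning of "no filter on category".
SELECT j.JobID, j.CategoryID
FROM @Job AS j
WHERE (@CategoryId = '0'
       OR EXISTS (SELECT 1 FROM #Category AS c WHERE c.Item = j.CategoryID))
ORDER BY j.JobID
OPTION (RECOMPILE);  -- compile the plan for the current parameter values

DROP TABLE #Category;
-- Returns JobID 2, 3 and 4.
```

With OPTION (RECOMPILE) the optimizer sees the actual value of @CategoryId on each execution, so it can discard the branch of the OR that does not apply, at the cost of compiling on every call.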
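The DISTINCTs in your IN subqueries only add pointless sorting; they have no effect on the query logic or the output. The topmost DISTINCT patches (or still fails to patch) a poorly designed query: duplicate rows, if there are any, are produced by incorrectly defined joins, and it is the joins that must be fixed (for example with OUTER APPLY (SELECT TOP 1 ...)).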
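The OUTER APPLY (SELECT TOP 1 ...) idea can be sketched as follows. The table variables and their rows are invented for illustration, and which join actually produces duplicates in the real query is not known from the question; the join on email is only a plausible candidate.

```sql
-- Hypothetical miniature versions of the job and contact-user tables.
DECLARE @Job TABLE (JobID INT PRIMARY KEY, ContactEmail VARCHAR(100));
DECLARE @ContactUser TABLE (UserID VARCHAR(100), ContactID INT);

INSERT INTO @Job (JobID, ContactEmail)
VALUES (1, 'a@example.com'), (2, 'b@example.com');

-- Two user rows share one email, so a plain LEFT JOIN would return job 1 twice.
INSERT INTO @ContactUser (UserID, ContactID)
VALUES ('a@example.com', 10), ('a@example.com', 11);

-- OUTER APPLY with TOP (1) returns at most one match per job, so no DISTINCT is needed.
SELECT j.JobID, j.ContactEmail, u.ContactID
FROM @Job AS j
OUTER APPLY (
    SELECT TOP (1) cu.ContactID
    FROM @ContactUser AS cu
    WHERE cu.UserID = j.ContactEmail
    ORDER BY cu.ContactID          -- makes the choice of row deterministic
) AS u
ORDER BY j.JobID;
-- Returns (1, 'a@example.com', 10) and (2, 'b@example.com', NULL).
```

Like a LEFT JOIN, OUTER APPLY keeps jobs that have no match; unlike a LEFT JOIN, it cannot multiply them.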
Update
EXISTS example:
WHERE ts.CategoryID in (select Item from Split_fn(@CategoryId,',') )
-->>
WHERE EXISTS (select 1 from Split_fn(@CategoryId,',') s WHERE s.Item = ts.CategoryID)
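The same rewrite with the optional-parameter guard from the question kept in place might look like the sketch below. @TenancyItems is a hypothetical stand-in for the rows that Split_fn(@TenancyID, ',') would return, and @Job stands in for dbo.TSP_TSR_Job.

```sql
DECLARE @TenancyId VARCHAR(2000) = '7,9';

-- Stand-in for the output of Split_fn(@TenancyId, ',').
DECLARE @TenancyItems TABLE (Item VARCHAR(20));
INSERT INTO @TenancyItems (Item) VALUES ('7'), ('9');

DECLARE @Job TABLE (JobID INT PRIMARY KEY, TenancyId VARCHAR(20));
INSERT INTO @Job (JobID, TenancyId) VALUES (1, '7'), (2, '8'), (3, '9');

-- Invalid:  ts.TenancyId EXISTS (SELECT Item FROM ...)   EXISTS takes no left operand.
-- Valid:    the comparison moves inside the subquery as a correlation predicate.
SELECT ts.JobID
FROM @Job AS ts
WHERE (@TenancyId = '0'
       OR EXISTS (SELECT 1 FROM @TenancyItems AS s WHERE s.Item = ts.TenancyId))
ORDER BY ts.JobID;
-- Returns JobID 1 and 3.
```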
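【Comments】:
What should I use in place of DISTINCT?
TenancyId EXISTS (select Item from Split_fn(@TenancyID,','))) fails with the error "An expression of non-boolean type specified in a context where a condition is expected".
Well, that is not valid syntax or usage for EXISTS. docs.microsoft.com/en-us/sql/t-sql/language-elements/…
Added an example to the answer, @nikhiljain
【Answer 2】: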
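Ivan Starostin has a good set of suggestions that I won't repeat here, but I would like you to consider why you felt the need to use "distinct" in the first place.
select distinct adds time and effort to a query, and it is not a good fit for "wide queries" (queries with many columns), especially when the query involves several joined tables. Joining tables together is common and necessary, but don't forget that joins often have the effect of multiplying the number of rows. So if you are getting too many rows, reconsider the joins before relying on select distinct as a cure-all.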
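The row-multiplying effect of a join is easy to see on a small scale. The sketch below uses invented table variables and data, not the real schema: it compares row counts before and after a join, then lists the join keys responsible for the extra rows.

```sql
-- Hypothetical miniature tables: 2 jobs, and 2 user rows sharing one email.
DECLARE @Job TABLE (JobID INT PRIMARY KEY, ContactEmail VARCHAR(100));
DECLARE @ContactUser TABLE (UserID VARCHAR(100), ContactID INT);

INSERT INTO @Job (JobID, ContactEmail)
VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO @ContactUser (UserID, ContactID)
VALUES ('a@example.com', 10), ('a@example.com', 11);

-- Row count of the base table: 2.
SELECT COUNT(*) AS JobRows
FROM @Job;

-- Row count after the join: 3, because job 1 matches two user rows.
SELECT COUNT(*) AS JoinedRows
FROM @Job AS j
LEFT OUTER JOIN @ContactUser AS cu ON cu.UserID = j.ContactEmail;

-- Join keys that appear more than once are the source of the extra rows.
SELECT cu.UserID, COUNT(*) AS Matches
FROM @ContactUser AS cu
GROUP BY cu.UserID
HAVING COUNT(*) > 1;
```

Running the same three checks against each joined table in the real query would show which join, if any, makes DISTINCT look necessary.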
For example, does this need "distinct"?
SELECT
      ts.JobID
    , ts.Contact
    , ts.ContactEmail
    , ts.Created
    , ts.CreatedBy
    , ts.JobType
FROM dbo.TSP_TSR_Job ts
WHERE (ts.Created BETWEEN @CreatedFrom AND DATEADD(DD, 1, @CreatedTo))
    AND (@JobID = 0 OR ts.JobID = @JobId)
    AND (@TenancyId = '0' OR ts.TenancyId IN (
            SELECT
                Item
            FROM Split_fn(@TenancyID, ',')
        )
    )
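If, as I suspect, "distinct" is not needed there, use that query as a subquery and then add the remaining tables. You can also bring in the lookup tables for category and status without adding more rows, for example: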
SELECT
      ts.JobID
    , ts.Contact
    , ts.ContactEmail
    , ts.Created
    , ts.CreatedBy
    , ts.JobType
    , dbo.TSP_CAT_Category.Category
    , dbo.TSP_TSR_JobStatus.JobStatus
FROM dbo.TSP_TSR_Job ts
LEFT OUTER JOIN dbo.TSP_TSR_JobStatus ON ts.JobStatusID = dbo.TSP_TSR_JobStatus.JobStatusID
LEFT OUTER JOIN dbo.TSP_CAT_Category ON ts.CategoryID = dbo.TSP_CAT_Category.CategoryID
WHERE (ts.Created BETWEEN @CreatedFrom AND DATEADD(DD, 1, @CreatedTo))
    AND (@JobID = 0 OR ts.JobID = @JobId)
    AND (@TenancyId = '0' OR ts.TenancyId IN (
            SELECT
                Item
            FROM Split_fn(@TenancyID, ',')
        )
    )
    -- Job Category
    AND (@CategoryId = '0'
        OR ts.CategoryID IN (
            SELECT
                Item
            FROM Split_fn(@CategoryId, ',')
        )
    )
    AND (@vJobType IS NULL
        OR ts.JobType IN (
            SELECT
                Item
            FROM Split_fn(@vJobType, ',')
        )
    )
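If "distinct" is not needed, make this a subquery (a "derived table" or a "common table expression") and then try adding each additional join one at a time (that is, add the join, the WHERE clause filters that relate to it, and the selected columns from that table). If unwanted duplicates start to appear in the results after one of those joins, you will know where the duplication comes from. You may need a completely different approach for that table to solve it (for example, joining to a subquery that uses row_number() to pick only the "latest" contact).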
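The row_number() approach described above can be sketched with two common table expressions. Everything here is illustrative: the table variables stand in for the real tables, and the Modified column used to define "latest" is an assumption, since the question does not show which column would play that role.

```sql
-- Hypothetical miniature tables; Modified is an assumed "last changed" column.
DECLARE @Job TABLE (JobID INT PRIMARY KEY, ContactEmail VARCHAR(100));
DECLARE @ContactUser TABLE (UserID VARCHAR(100), ContactID INT, Modified DATETIME);

INSERT INTO @Job (JobID, ContactEmail)
VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO @ContactUser (UserID, ContactID, Modified)
VALUES ('a@example.com', 10, '20180101'),
       ('a@example.com', 11, '20180601'),
       ('b@example.com', 20, '20180301');

WITH Jobs AS (
    -- The duplicate-free core query goes here.
    SELECT j.JobID, j.ContactEmail
    FROM @Job AS j
),
LatestUser AS (
    -- Number the user rows per email, newest first.
    SELECT cu.UserID,
           cu.ContactID,
           ROW_NUMBER() OVER (PARTITION BY cu.UserID ORDER BY cu.Modified DESC) AS rn
    FROM @ContactUser AS cu
)
SELECT j.JobID, lu.ContactID
FROM Jobs AS j
LEFT OUTER JOIN LatestUser AS lu
       ON lu.UserID = j.ContactEmail
      AND lu.rn = 1                 -- keep only the newest row per email
ORDER BY j.JobID;
-- Returns (1, 11) and (2, 20).
```

Because the filter rn = 1 leaves at most one row per email, the join can no longer multiply the job rows, and the outer query needs no DISTINCT.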
【Answer 3】: You can create aliases for wsm_Contact, TSP_TSR_JobStatus, TSP_CAT_Category and wsm_Contact_User, and use those aliases throughout the query.