SQL Query Performance Optimization on half million records

Posted: 2021-03-10 14:48:43

Question:

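I am working with a database that has five tables, and I created an SP that gives an overview of all five. All 5 tables together hold fewer than half a million records, yet the SP takes 20 - 50 seconds to return the counts. I have already created indexes, but still no luck.

Below is my SP: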

ALTER procedure [dbo].[sp_TableCounts]

@TotalRecords   int output,
@TodayRecords   int output,

@New    int output,
@Modified   int output,
@Deleted    int output,
@DeletedToday   int output,

@TotalError int output,

@Table1NewRecords   int output,
@Table1ModifiedRecords  int output,
@Table1DeletedRecords   int output,
@Table1ErrorRecords int output,

@Table2NewRecords   int output,
@Table2ModifiedRecords  int output,
@Table2DeletedRecords   int output,
@Table2ErrorRecords int output,

@Table3NewRecords   int output,
@Table3ModifiedRecords  int output,
@Table3DeletedRecords   int output,
@Table3ErrorRecords int output,

@Table4NewRecords   int output,
@Table4ModifiedRecords  int output,
@Table4DeletedRecords   int output,
@Table4ErrorRecords int output,

@Table5NewRecords   int output,
@Table5ModifiedRecords  int output,
@Table5DeletedRecords   int output,
@Table5ErrorRecords int output

as
begin

SELECT @TotalRecords =
(select count(1) from Table1)
+
(select count(1) from Table2)
+
(select count(1) from Table3)
+
(select count(1) from Table4)
+
(select count(1) from Table5)



SELECT @TodayRecords =
(select count(1) from Table1 where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table2 where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table3 where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table4 where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table5  where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))


SELECT @New =
(select count(1) from Table1 where RecordStatus = 1)
+
(select count(1) from Table2 where RecordStatus = 1)
+
(select count(1) from Table3 where RecordStatus = 1)
+
(select count(1) from Table4 where RecordStatus = 1)
+
(select count(1) from Table5  where RecordStatus = 1)

SELECT @Modified =
(select count(1) from Table1 where RecordStatus = 2)
+
(select count(1) from Table2 where RecordStatus = 2)
+
(select count(1) from Table3 where RecordStatus = 2)
+
(select count(1) from Table4 where RecordStatus = 2)
+
(select count(1) from Table5  where RecordStatus = 2)

SELECT @Deleted =
(select count(1) from Table1 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')))
+
(select count(1) from Table2 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')))
+
(select count(1) from Table3 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')))
+
(select count(1) from Table4 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')))
+
(select count(1) from Table5  where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))) 

SELECT @DeletedToday = 
(select count(1) from Table1 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')) and Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table2 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')) and Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table3 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')) and Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table4 where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')) and Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))
+
(select count(1) from Table5  where (RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B')) and Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))


SELECT @TotalError =
(select count(1) from Table1 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))
+
(select count(1) from Table2 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))
+
(select count(1) from Table3 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))
+
(select count(1) from Table4 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))
+
(select count(1) from Table5  where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))


SELECT @Table1NewRecords = (select count(1) from Table1 where RecordStatus = 1)
SELECT @Table1ModifiedRecords = (select count(1) from Table1 where RecordStatus = 2)
SELECT @Table1DeletedRecords = (select count(1) from Table1 where RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))
SELECT @Table1ErrorRecords = (select count(1) from Table1 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))

SELECT @Table2NewRecords = (select count(1) from Table2 where RecordStatus = 1)
SELECT @Table2ModifiedRecords = (select count(1) from Table2 where RecordStatus = 2)
SELECT @Table2DeletedRecords = (select count(1) from Table2 where RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))
SELECT @Table2ErrorRecords = (select count(1) from Table2 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))

SELECT @Table3NewRecords = (select count(1) from Table3 where RecordStatus = 1)
SELECT @Table3ModifiedRecords = (select count(1) from Table3 where RecordStatus = 2)
SELECT @Table3DeletedRecords = (select count(1) from Table3 where RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))
SELECT @Table3ErrorRecords = (select count(1) from Table3 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))

SELECT @Table4NewRecords = (select count(1) from Table4 where RecordStatus = 1)
SELECT @Table4ModifiedRecords = (select count(1) from Table4 where RecordStatus = 2)
SELECT @Table4DeletedRecords = (select count(1) from Table4 where RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))
SELECT @Table4ErrorRecords = (select count(1) from Table4 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))

SELECT @Table5NewRecords = (select count(1) from Table5 where RecordStatus = 1)
SELECT @Table5ModifiedRecords = (select count(1) from Table5 where RecordStatus = 2)
SELECT @Table5DeletedRecords = (select count(1) from Table5 where RecordStatus = 3 or ([Status] = 'A')  or ([Status] = 'B'))
SELECT @Table5ErrorRecords = (select count(1) from Table5 where RecordStatus = 4 and ([Status] != 'A')  and ([Status] != 'B'))

end

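Any suggestions on how to structure the query better? I think this structure is poor.

Update

Also, I cannot use a large number of indexes, because the tables take a huge number of update/insert/delete transactions per second, and having many indexes would affect the SLA.

Comments: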

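Which version of SQL Server? Run select @@version

Microsoft SQL Server 2017 Enterprise Edition on Windows Server 2019 Standard 10.0

This query has many problems, including scanning all the data multiple times. where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103)) is simply a bug. What is RecTimestamp? Unless it is a date-related type, that is a serious design error. Applying a function to a field means that no index built on the original field values can be used.

Maybe you should first figure out which part of the query takes the longest, and then focus on that.

You need to break it down into smaller chunks. You are needlessly scanning each table multiple times, so you can get a 5x improvement right away by using "sum()" and "case()". Have you checked the execution plan to see whether your indexes are being used?

Answer 1: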

There are many improvements you can make here. One of the biggest is to replace each block of 5 repeated selects with something like this:

select
    @Table1NewRecords      = sum(case when RecordStatus = 1 then 1 else 0 end),
    @Table1ModifiedRecords = sum(case when RecordStatus = 2 then 1 else 0 end),
    @Table1DeletedRecords  = sum(case when RecordStatus = 3 or [Status] in ('A','B') then 1 else 0 end),
    @Table1ErrorRecords    = sum(case when RecordStatus = 4 and [Status] != 'A' and [Status] != 'B' then 1 else 0 end)
from Table1
-- keep 'A'/'B' rows whose RecordStatus falls outside 1..4; the original deleted count included them
where RecordStatus between 1 and 4
   or [Status] in ('A','B')

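As shown above, wherever you repeatedly scan the same table to get different counts, you can hit the table just once and use conditional sums to count the relevant rows.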
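The same idea extends to the cross-table totals: once each table has been scanned once, the combined figures are plain additions and need no further scans. A minimal sketch for two of the five tables, with local variables standing in for the procedure's output parameters; @Table1Total and @Table2Total are hypothetical helpers that are not part of the original procedure:

```sql
-- Sketch only: local variables stand in for the procedure's output parameters.
DECLARE @Table1NewRecords int, @Table1ErrorRecords int,
        @Table2NewRecords int, @Table2ErrorRecords int,
        @Table1Total int, @Table2Total int,          -- hypothetical helpers
        @TotalRecords int, @New int, @TotalError int;

-- One scan of Table1. ISNULL guards the empty-table case, where SUM
-- returns NULL while the original count(1) returned 0.
SELECT
    @Table1Total        = COUNT(*),
    @Table1NewRecords   = ISNULL(SUM(CASE WHEN RecordStatus = 1 THEN 1 ELSE 0 END), 0),
    @Table1ErrorRecords = ISNULL(SUM(CASE WHEN RecordStatus = 4 AND [Status] <> 'A' AND [Status] <> 'B'
                                          THEN 1 ELSE 0 END), 0)
FROM Table1;

-- One scan of Table2, same pattern.
SELECT
    @Table2Total        = COUNT(*),
    @Table2NewRecords   = ISNULL(SUM(CASE WHEN RecordStatus = 1 THEN 1 ELSE 0 END), 0),
    @Table2ErrorRecords = ISNULL(SUM(CASE WHEN RecordStatus = 4 AND [Status] <> 'A' AND [Status] <> 'B'
                                          THEN 1 ELSE 0 END), 0)
FROM Table2;

-- Cross-table totals by addition, with no extra table access.
SELECT
    @TotalRecords = @Table1Total + @Table2Total,
    @New          = @Table1NewRecords + @Table2NewRecords,
    @TotalError   = @Table1ErrorRecords + @Table2ErrorRecords;
```

Done for all five tables, this would bring the procedure down to one scan per table, where the original touches each table many times over.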

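You should also consider adding the function on RecTimestamp as a persisted column and indexing it, because where Convert(date,CONVERT(datetime,RecTimestamp,120),103) is not sargable and forces a scan of every table. That is because SQL Server first has to perform the calculation on every row before it can know whether the row qualifies.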
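A rough sketch of that suggestion for one table. It assumes RecTimestamp is a character column holding values in the 'yyyy-mm-dd hh:mi:ss' layout (style 120), which the question does not confirm; RecDate and IX_Table1_RecDate are hypothetical names. The statements are separated with GO (the SSMS/sqlcmd batch separator) so that each batch compiles against the schema left by the previous one:

```sql
-- Assumption: RecTimestamp is a string in style-120 format.
-- Style 120 is a deterministic CONVERT style, which PERSISTED requires.
ALTER TABLE Table1
    ADD RecDate AS CONVERT(date, CONVERT(datetime, RecTimestamp, 120)) PERSISTED;
GO

-- One narrow index on the stored date.
CREATE INDEX IX_Table1_RecDate ON Table1 (RecDate);
GO

-- The filter now compares the stored column directly, so a seek is possible.
DECLARE @Today date = CAST(GETDATE() AS date);

SELECT COUNT(*)
FROM Table1
WHERE RecDate = @Today;
```

Both the persisted column and its index are maintained on every insert and update, so on tables with heavy write traffic this cost has to be weighed against the read-side gain.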

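Also, since you are on SQL Server 2017, batch mode can improve the performance of these kinds of queries. You will need to look at the execution plan and check the execution mode of the various operators.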
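One way to check this is to capture the actual plan and the I/O and timing figures for a single count. In SQL Server 2017 the optimizer only considers batch mode when the query touches a table that has a columnstore index, so on plain rowstore tables the operators are expected to report row mode. A minimal sketch using one of the counts from the question:

```sql
-- Report logical reads, CPU/elapsed time and the actual plan for the statement below.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;
SET STATISTICS XML ON;   -- returns the actual execution plan as XML

SELECT COUNT(*)
FROM Table1
WHERE RecordStatus = 1;

SET STATISTICS XML OFF;
SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

In the returned plan, each operator shows its actual execution mode (Row or Batch) and whether it is a scan or a seek. Running this against each block of the procedure in turn also shows which part accounts for most of the 20 - 50 seconds.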

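Comments:

I cannot use a large number of indexes, because the tables take a huge number of update/insert/delete transactions every second, and too many indexes would affect the SLA.

Answer 2: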

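Because of the data type conversions applied to the filtered column in the where clause predicates, some of these queries cannot use an index even if one exists.

For example, something like "where Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103))" will cause a full table scan even if there is an index on "RecTimestamp".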

Comments:

How can I overcome this? Any suggestions?

The first example in @SQLpro's answer should solve that part of the problem.

Answer 3:

Two major changes can be made to speed up the query.

    First, as KrsitoferA said, replace

    Convert(date,CONVERT(datetime,RecTimestamp,120),103) = Convert(date,GETDATE(),103)

...with:

WHERE  RecTimestamp >= CAST(GETDATE() AS DATE) 
  AND  RecTimestamp < CAST(DATEADD(day, 1, GETDATE()) AS DATE) 

This is sargable and can use an index.
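As an illustration, here is how that range could look inside one of the today counts. The type of RecTimestamp is never stated in the question, so two variants are sketched. Variant A assumes a datetime-family column. Variant B assumes a character column in the 'yyyy-mm-dd hh:mi:ss' layout, where comparing against a date value would implicitly convert the column and lose the seek, so the boundaries are passed as strings in the same layout instead. All variable names are hypothetical:

```sql
-- Compute the day boundaries once.
DECLARE @DayStart datetime = CAST(CAST(GETDATE() AS date) AS datetime);
DECLARE @DayEnd   datetime = DATEADD(day, 1, @DayStart);

-- Variant A (assumption: RecTimestamp is datetime/datetime2).
SELECT COUNT(*)
FROM Table1
WHERE RecTimestamp >= @DayStart
  AND RecTimestamp <  @DayEnd;

-- Variant B (assumption: RecTimestamp is a string in style-120 format).
-- Fixed-width, zero-padded timestamps sort the same as text and as dates.
DECLARE @DayStartText char(19) = CONVERT(char(19), @DayStart, 120);
DECLARE @DayEndText   char(19) = CONVERT(char(19), @DayEnd, 120);

SELECT COUNT(*)
FROM Table1
WHERE RecTimestamp >= @DayStartText
  AND RecTimestamp <  @DayEndText;
```

Either way the column itself is left untouched in the predicate, which is what allows an index on RecTimestamp to be used for a range seek.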

Second, add some persisted computed columns to Table1..5:

ALTER TABLE Table1 
   ADD STATUS_4_AND_A_OR_B 
   AS CAST(CASE WHEN RecordStatus = 3 or [Status] = 'A'  or [Status] = 'B' THEN 1 ELSE 0 END AS BIT) PERSISTED;

ALTER TABLE Table1 
   ADD STATUS_4_NOT_A_NOT_B 
   AS CAST(CASE WHEN RecordStatus = 4 and [Status] <> 'A' and [Status] <> 'B' THEN 1 ELSE 0 END AS BIT) PERSISTED;

    Then create the indexes:

    X1: (RecordStatus)
    X2: (STATUS_4_AND_A_OR_B)
    X3: (STATUS_4_AND_A_OR_B, RecTimestamp)
    X4: (STATUS_4_NOT_A_NOT_B)
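Written out for Table1, the index list above might look like the following sketch. It assumes the two computed columns have already been added and that RecTimestamp has a type that is allowed as an index key; the index names X1 to X4 are taken from the list, everything else is illustrative:

```sql
-- Index on the raw status, for the new/modified counts.
CREATE INDEX X1 ON Table1 (RecordStatus);

-- Index on the "deleted" flag.
CREATE INDEX X2 ON Table1 (STATUS_4_AND_A_OR_B);

-- "Deleted" flag plus timestamp, for the deleted-today count.
CREATE INDEX X3 ON Table1 (STATUS_4_AND_A_OR_B, RecTimestamp);

-- Index on the "error" flag.
CREATE INDEX X4 ON Table1 (STATUS_4_NOT_A_NOT_B);
GO

-- Example counts that can be answered from the indexes alone.
SELECT COUNT(*) FROM Table1 WHERE STATUS_4_AND_A_OR_B = 1;
SELECT COUNT(*) FROM Table1 WHERE STATUS_4_NOT_A_NOT_B = 1;
```

Since X3 starts with the same column as X2, queries that filter only on STATUS_4_AND_A_OR_B can also be served by X3, so X2 could probably be dropped where the number of indexes is a concern.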

Then enjoy.

Comments:

I don't know why I can't get code formatting for the third part of my post... I tried several times...
