SQL Server: cleaning old data takes too long

【Posted】: 2020-07-21 08:29:55

【Question】:

I am trying to clean old data out of my SQL Server database (around 5000 entries in each table), but it takes too long (more than an hour) because I loop one CURSOR inside another.

BEGIN
    DECLARE @UserId int
    DECLARE @productNum varchar(50)

    DECLARE user_ids CURSOR FOR SELECT id
    FROM Users
    WHERE productId IN (SELECT ap.id 
                        FROM Account AS a, AccountProduct AS ap 
                        WHERE a.id = ap.accountId 
                          AND a.name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX'))
    
    DECLARE product_cur CURSOR FOR 
        SELECT ap.id 
        FROM Account AS a, AccountProduct AS ap
        WHERE a.id = ap.accountId 
          AND a.name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX') 
    
    OPEN user_ids

    FETCH NEXT FROM user_ids INTO @UserId

    WHILE @@FETCH_STATUS = 0
    BEGIN
        OPEN product_cur 

        FETCH NEXT FROM product_cur INTO @productNum

        WHILE @@FETCH_STATUS = 0
        BEGIN
            DELETE FROM UserRole 
            WHERE userId = @UserId 
              AND productId = (SELECT id 
                               FROM AccountProduct 
                               WHERE number = @productNum)
                
            DELETE FROM AccountProduct 
            WHERE number = @productNum

            FETCH NEXT FROM product_cur INTO @productNum
        END
        CLOSE product_cur 

        DELETE FROM Users 
        WHERE id = @UserId 
          AND accountId IN (SELECT id FROM Account 
                            WHERE name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX'))

        FETCH NEXT FROM user_ids INTO @UserId
    END

    CLOSE user_ids
    DEALLOCATE user_ids
    DEALLOCATE product_cur 
END

Do you know a better way to accomplish this task?

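【Comments】:

Bad habits to kick: using old-style JOINs. The old-style comma-separated list of tables was replaced by proper ANSI JOIN syntax in the ANSI-92 SQL standard (more than 25 years ago), and its use is discouraged.

Not using a CURSOR would be a start, especially a cursor inside a cursor. SQL is a set-based language, so you should use a set-based solution.

Is the above the complete SQL? For example, you reference the cursor room_cur, but you never declare it.

@Larnu, that was my mistake: room_cur is product_cur. I have edited the question. Sorry for the error.

【Answer 1】: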

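This is too long for a comment, but it should be enough to put you on the right track. The SQL above does not appear to be complete, however, so I cannot give you SQL with identical behavior (for example, you reference the cursor room_cur in a FETCH statement, but no cursor room_cur is declared in the SQL).

SQL is a set-based language, and it excels at set-based solutions. A cursor is not a set-based solution; it is an iterative task, and iteration is something SQL Server is bad at. This is by design. SQL is not a programming language, so writing it like one results in poor performance.

For a DELETE statement, you simply treat it like any other statement: the DELETE removes rows from the named table that appear in the dataset returned by the FROM clause. For the DELETE on Users, this (probably) means you want something like this: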

DELETE U
FROM dbo.Users U
     JOIN dbo.Account A ON U.accountId = A.id
WHERE A.[name] IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX');

This DELETEs only those rows in dbo.Users for which a joined row is found in dbo.Account whose name value appears in the IN clause.
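The same set-based idea can be extended to the other tables in the question. The following is only a minimal sketch: it assumes that UserRole.productId and Users.productId both reference AccountProduct.id, which the question suggests but does not confirm, and the @Products table variable is a hypothetical helper introduced here for illustration. Child rows are deleted before their parents in case foreign keys exist.

```sql
SET XACT_ABORT ON;  -- any run-time error rolls back the whole transaction

BEGIN TRANSACTION;

-- Hypothetical helper: capture the target product ids once
DECLARE @Products TABLE (id int PRIMARY KEY);

INSERT INTO @Products (id)
SELECT ap.id
FROM dbo.AccountProduct AS ap
     JOIN dbo.Account AS a ON a.id = ap.accountId
WHERE a.[name] IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX');

-- Assumption: UserRole.productId references AccountProduct.id
DELETE ur
FROM dbo.UserRole AS ur
     JOIN @Products AS p ON ur.productId = p.id;

-- Assumption: Users.productId references AccountProduct.id
DELETE u
FROM dbo.Users AS u
     JOIN @Products AS p ON u.productId = p.id;

-- Finally remove the products themselves
DELETE ap
FROM dbo.AccountProduct AS ap
     JOIN @Products AS p ON ap.id = p.id;

COMMIT TRANSACTION;
```

Each statement touches its table once instead of once per cursor row, which is where most of the time in the nested-cursor version is likely to go.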


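【Answer 2】:

You can commit frequently while deleting. I suggest deleting one user ID at a time, each in its own transaction, and committing after each one.

Go for batch-based deletion.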

DECLARE @UserIdsToDelete TABLE (RowNo int, UserId int)
DECLARE @ProductsToDelete TABLE (RowNo int, ProductId int)

-- Collect the users to delete, numbered so the loop below can walk them
INSERT INTO @UserIdsToDelete
SELECT ROW_NUMBER() OVER (ORDER BY id) AS RowNo, id
FROM Users
WHERE productId IN (SELECT ap.id
                    FROM Account AS a
                         JOIN AccountProduct AS ap ON a.id = ap.accountId
                    WHERE a.name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX'))

-- Collect the products to delete
INSERT INTO @ProductsToDelete
SELECT ROW_NUMBER() OVER (ORDER BY ap.id) AS RowNo, ap.id
FROM Account AS a
     JOIN AccountProduct AS ap ON a.id = ap.accountId
WHERE a.name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX')

DECLARE @UserIdForDeletion int
DECLARE @RowNoForDeletion int = 1

SET @UserIdForDeletion = (SELECT UserId
                          FROM @UserIdsToDelete
                          WHERE RowNo = @RowNoForDeletion)

-- Deletion of Users, one transaction per user
WHILE (@UserIdForDeletion IS NOT NULL)
BEGIN
    BEGIN TRY
        SET XACT_ABORT ON

        BEGIN TRANSACTION

        DELETE FROM UserRole
        WHERE userId = @UserIdForDeletion
          AND productId IN (SELECT ProductId FROM @ProductsToDelete)

        DELETE FROM Users
        WHERE id = @UserIdForDeletion
          AND accountId IN (SELECT id FROM Account
                            WHERE name IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX'))

        COMMIT TRANSACTION;
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0
            ROLLBACK TRANSACTION;

        THROW;
    END CATCH

    -- Advance to the next user; the subquery yields NULL past the last row
    SET @RowNoForDeletion += 1;
    SET @UserIdForDeletion = (SELECT UserId
                              FROM @UserIdsToDelete
                              WHERE RowNo = @RowNoForDeletion)
END

-- Delete the account products
BEGIN TRY
    SET XACT_ABORT ON

    BEGIN TRANSACTION

    -- As in the question, number is matched against the collected ap.id values;
    -- switch to id if that is the actual key
    DELETE FROM AccountProduct
    WHERE number IN (SELECT ProductId FROM @ProductsToDelete)

    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF XACT_STATE() <> 0
        ROLLBACK TRANSACTION;

    THROW;
END CATCH


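【Answer 3】:

Don't use a cursor, as it will certainly hurt the query's performance. Why not perform the deletes in smaller batches to shorten the query's execution time? See: How to delete large data of table in SQL without log?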
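One possible shape for batched deletion is sketched below for the UserRole table only. The batch size of 1000 and the join on UserRole.productId = AccountProduct.id are assumptions made for illustration, not values from the question. Outside an explicit transaction each DELETE commits on its own, so every batch is a short transaction.

```sql
DECLARE @BatchSize int = 1000;  -- assumed value; tune for your workload
DECLARE @Rows int = 1;

-- Repeat until a pass deletes nothing
WHILE @Rows > 0
BEGIN
    -- Remove at most @BatchSize matching rows per pass
    DELETE TOP (@BatchSize) ur
    FROM dbo.UserRole AS ur
         JOIN dbo.AccountProduct AS ap ON ur.productId = ap.id
         JOIN dbo.Account AS a ON a.id = ap.accountId
    WHERE a.[name] IN ('XXXXXX', 'XXXXXX', 'XXXXXX', 'XXXXXX');

    -- Capture the count immediately; @@ROWCOUNT is reset by the next statement
    SET @Rows = @@ROWCOUNT;
END;
```

With only about 5000 rows per table, as in the question, a single set-based DELETE is probably enough; batching tends to pay off mainly on much larger tables.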
