Is there any way to speed up a complex query with 3x UNION ALL on the same large table?

Posted: 2015-03-25 17:44:09

Question:

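I am optimizing a procedure that currently takes more than 23 minutes to execute from SQL. I tried reindexing my tables, but there was no sign of improvement.

I could not find a way to make CROSS APPLY or UNPIVOT work for us.

Is there anything I can do to improve the speed of this procedure?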

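    -- (the procedure header is not shown in the original post; these are its parameters)
    @ClientId int,
    @FormParentId int = NULL,
    @FormQueryMode varchar(20) = 'changelog'
    /* ad hoc test harness, commented out; closed by the comment terminator after DROP TABLE
    declare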
    @ClientId int,
    @FormParentId INT,
    @FormQueryMode varchar(20)

select
        @ClientId = 11,
        @FormParentId = 277719,
        @FormQueryMode = NULL

DROP TABLE #History
*/
    CREATE TABLE #History
    (
        FormHistRecId bigint IDENTITY(1,1) NOT NULL,
        FormParentId int,
        EffDate datetime,
        Updated datetime NULL,
        Change varchar(50),
        VersionNote varchar(1000),
        VersionUserId varchar(50)
    )

    INSERT INTO #History (FormParentId, EffDate, Updated, Change, VersionNote, VersionUserId)
    SELECT ParentId, EffDate, Updated, Change, VersionNote, VersionUserId
    FROM (
        SELECT f.ClientId, CASE WHEN f.MainFormId IS NULL THEN f.ParentId ELSE (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) END AS ParentId, f.EffDate, f.Updated, 'Forms' as Change, 
            f.VersionNote, f.VersionUserId
        FROM Forms f
        WHERE Status IN (0,1,2) --negatives are 'temporary', and should be ignored if they aren't already cleaned out
        UNION ALL/*
        SELECT f.ClientId, (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) AS ParentId, f.EffDate, f.Updated, 'Attachment_Forms' as Change, 
            f.VersionNote, f.VersionUserId
        FROM Forms f
        WHERE Status IN (0,1,2) --negatives are 'temporary', and should be ignored if they aren't already cleaned out
        UNION ALL*/
        SELECT f.ClientId, CASE WHEN f.MainFormId IS NULL THEN f.ParentId ELSE (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) END, h.EffDate, h.Updated, 'Holders' as Change, 'Holder Updated' as VersionNote, h.VersionUserId
        FROM Forms f
        LEFT OUTER JOIN Forms f_exp ON f_exp.FormId=(
            SELECT TOP 1 FormId
            FROM Forms 
            WHERE ParentId=f.ParentId 
            AND ( EffDate > f.EffDate OR (EffDate=f.EffDate AND Updated > f.Updated))
            ORDER BY EffDate ASC, Updated ASC
        )
        --check for holder updates within the span this Forms record's selection implies
        INNER JOIN Holders h ON h.ParentId=f.HolderParentId
                AND h.EffDate >= f.EffDate --must be effective after-or-with the form
                AND h.EffDate >= f.Updated --must be entered after-or-with the form
                AND (f_exp.FormId IS NULL OR h.EffDate < f_exp.EffDate) --must be effective before next form becomes effective
                AND (f_exp.FormId IS NULL OR h.EffDate < f_exp.Updated) --must be entered before next form is entered
        UNION ALL
        SELECT f.ClientId, CASE WHEN f.MainFormId IS NULL THEN f.ParentId ELSE (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) END, ct.Updated as EffDate, NULL AS Updated, 'ClientTemplates' as Change
            ,ct.VersionNote, ct.VersionUserId
        FROM Forms f
        LEFT OUTER JOIN Forms f_exp ON f_exp.FormId=(
            SELECT TOP 1 FormId
            FROM Forms 
            WHERE ParentId=f.ParentId 
            AND ( EffDate > f.EffDate OR (EffDate=f.EffDate AND Updated > f.Updated))
            ORDER BY EffDate ASC, Updated ASC
        )
        INNER JOIN ClientTemplates ct ON ct.ParentId=f.TemplateParentId
                AND ct.Updated >= f.EffDate
                --AND ct.Updated >= f.Updated
                AND (f_exp.FormId IS NULL OR ct.Updated < f_exp.EffDate)
                --AND (f_exp.FormId IS NULL OR ct.Updated < f_exp.Updated)
        UNION ALL
        SELECT f.ClientId, CASE WHEN f.MainFormId IS NULL THEN f.ParentId ELSE (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) END, mt.Updated as EffDate, NULL AS Updated, 'MasterTemplates' as Change
            ,mt.VersionNote, mt.VersionUserId
        FROM Forms f
        LEFT OUTER JOIN Forms f_exp ON f_exp.FormId=(
            SELECT TOP 1 FormId
            FROM Forms 
            WHERE ParentId=f.ParentId 
            AND ( EffDate > f.EffDate OR (EffDate=f.EffDate AND Updated > f.Updated))
            ORDER BY EffDate ASC, Updated ASC
        )
        INNER JOIN ClientTemplates ct ON ct.ParentId=f.TemplateParentId
                AND ct.Updated >= f.EffDate
                --AND ct.Updated >= f.Updated
                AND (f_exp.FormId IS NULL OR ct.Updated < f_exp.EffDate)
                --AND (f_exp.FormId IS NULL OR ct.Updated < f_exp.Updated)
        LEFT OUTER JOIN ClientTemplates ct_exp ON ct_exp.TemplateId=(
            SELECT TOP 1 TemplateId
            FROM ClientTemplates 
            WHERE ParentId=ct.ParentId 
            AND Updated > ct.Updated    --(no eff date)
            ORDER BY Updated ASC
        )
        INNER JOIN MasterTemplates mt ON mt.ParentId=ct.MasterParentId
                AND mt.Updated >= ct.Updated
                --AND mt.Updated >= ct.Updated
                AND (ct_exp.TemplateId IS NULL OR mt.Updated < ct_exp.Updated)
        UNION ALL
        SELECT (SELECT TOP 1 ClientId FROM Forms WHERE ParentId=ctblsel.FormParentId), 
            (SELECT
            CASE WHEN f.MainFormId IS NULL THEN f.ParentId ELSE (SELECT ParentId FROM Forms WHERE FormId=f.MainFormId) END
            FROM Forms f WHERE FormId=(
                SELECT TOP 1 FormId
                FROM Forms
                WHERE ParentId=ctblsel.FormParentId
                AND ( EffDate > ctblsel.EffDate OR (EffDate=ctblsel.EffDate AND Updated > ctblsel.Updated))
                ORDER BY EffDate DESC, Updated DESC
                ) 
            )AS FormParentId,
            --ctblsel.FormParentId, 
            ctblsel.EffDate, ctblsel.Updated, 'CTblEntrySelection' as Change
            ,'Client Table Entry Selected' as VersionNote, ctblsel.VersionUserId
        FROM CTblEntrySelection ctblsel
    ) dt
    WHERE ClientId=@ClientId
    AND (@FormParentId IS NULL OR ParentId=@FormParentId)

    DECLARE CTblCur CURSOR FOR
    SELECT ClientTableId
    FROM ClientTables
    WHERE ClientId=@ClientId
    OPEN CTblCur
    DECLARE @ClientTableId int
    DECLARE @q nvarchar(MAX), @p nvarchar(100), @tblname nvarchar(50)
    SET @p =
'@ClientId int,
@FormParentId int,
@ClientTableId int'
    FETCH NEXT FROM CTblCur INTO @ClientTableId
    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @tblname = '[ClientTable_'+CONVERT(nvarchar,@ClientTableId)+']'
        SET @q = 
'INSERT INTO #History (FormParentId, EffDate, Updated, Change, VersionNote, VersionUserId)
SELECT ctblsel.FormParentId, ctbl.Updated as EffDate, NULL AS Updated, ''ClientTable'' as Change
    ,ctbl.VersionNote, ctbl.VersionUserId
FROM CTblEntrySelection ctblsel
LEFT OUTER JOIN CTblEntrySelection ctblsel_exp ON ctblsel_exp.RecId = (
        SELECT TOP 1 RecId
        FROM CTblEntrySelection
        WHERE FormParentId=ctblsel.FormParentId
        AND ClientTableId=ctblsel.ClientTableId
        AND ( EffDate > ctblsel.EffDate OR (EffDate=ctblsel.EffDate AND Updated > ctblsel.Updated))
        ORDER BY EffDate ASC, Updated ASC
)
INNER JOIN '+@tblname+' ctbl ON ctbl.ParentId=ctblsel.CTblEntryParentId
        AND ctbl.Updated >= ctblsel.EffDate --must be effective after-or-with the form
        --AND ctbl.Updated >= ctblsel.Updated --must be entered after-or-with the form
        AND (ctblsel_exp.RecId IS NULL OR ctbl.Updated < ctblsel_exp.EffDate) --must be effective before next form becomes effective
        --AND (ctblsel_exp.RecId IS NULL OR ctbl.Updated < ctblsel_exp.Updated) --must be entered before next form is entered
WHERE (SELECT TOP 1 ClientId FROM Forms WHERE ParentId=ctblsel.FormParentId)=@ClientId
AND ctblsel.ClientTableId=@ClientTableId'
        IF @FormParentId IS NOT NULL
            SET @q = @q + '
AND ctblsel.FormParentId=@FormParentId'
        print @q
        EXEC sp_ExecuteSql @q, @p, @ClientId, @FormParentId, @ClientTableId
        FETCH NEXT FROM CTblCur INTO @ClientTableId
    END
    CLOSE CTblCur
    DEALLOCATE CTblCur

    IF @FormQueryMode = 'issuancelog'
    BEGIN
        INSERT INTO #History (FormParentId, EffDate, Updated, Change, VersionNote, VersionUserId)
        SELECT l.FormParentId, l.EffDate, l.Updated, 'Issuance', h.VersionNote, l.UserId
        FROM FormIssuanceLog l
        LEFT OUTER JOIN #History h ON h.FormHistRecId=(
            SELECT TOP 1 FormHistRecId
            FROM #History
            WHERE FormParentId=l.FormParentId
            AND EffDate <= l.EffDate
            AND COALESCE(Updated,EffDate) <= l.Updated --if #history's Updated is null, means it's a change that cannot be anachronistic; updated = effective.
            ORDER BY EffDate DESC, Updated DESC
        )
        WHERE (
            (@FormParentId IS NOT NULL AND l.FormParentId=@FormParentId)
            OR
            (@FormParentId IS NULL AND l.FormParentId IN ( SELECT ParentId FROM Forms WHERE ClientId=@ClientId ))
        )
        AND l.ConfirmIssued=1

        DELETE FROM #History WHERE Change<>'Issuance'
    END

    SELECT *,
        LastName + ', ' + FirstName + ' (' + Username + ')' as VersionAuthor
    FROM (
        SELECT 
            'Courier' FormType,
            dt2.FormParentId,
            NULL as LegacyFormId,
            dt2.EffDate,
            dt2.Updated,
            dt2.Change,
            dt2.VersionNote,
            dt2.VersionUserId
            ,Forms.FormId, ClientTemplates.Name as CTName, Users.FirstName, Users.LastName, Users.Username
            ,'Courier_'+CONVERT(nvarchar,dt2.FormParentId)+'_'+CONVERT(nvarchar,dt2.EffDate,127)+COALESCE('_'+CONVERT(nvarchar,dt2.Updated,127),'') as PK
            ,CASE Change
                WHEN 'MasterTemplates' THEN 'Master Form (blank PDF)'
                WHEN 'ClientTemplates' THEN 'Template'
                WHEN 'ClientTable' THEN 'Client Table'
                WHEN 'Forms' THEN 'Form'
                WHEN 'Holders' THEN 'Holder Address Record'
                WHEN 'CTblEntrySelection' THEN 'Client Table Selection'
                WHEN 'Issuance' THEN 'Form Issuance'
                ELSE '-----'
            END ChangeText
            ,'pg_IssuanceLogPrompt' as Interface
            ,CASE @FormQueryMode WHEN 'issuancelog' THEN 'ShowHistForm_IssMode' ELSE 'ShowHistForm' END as Command
            ,CONVERT(varchar(50),dt2.FormParentId) as Param1
            ,CONVERT(varchar(50),dt2.EffDate,126) Param2
            ,CONVERT(varchar(50),dt2.Updated,126) Param3
            ,'Courier.frm' AS HTTPTarget
            ,Forms.Status FormStatus
            ,ie.Name AS EditionName
        FROM (
            SELECT *, ROW_NUMBER() OVER (PARTITION BY FormParentId, EffDate, Updated ORDER BY ChangeDominance ASC ) rn
            FROM (
                SELECT *,
                    CASE Change 
                        WHEN 'MasterTemplates' THEN 1
                        WHEN 'ClientTemplates' THEN 2
                        WHEN 'ClientTable' THEN 3
                        WHEN 'Forms' THEN 4
                        WHEN 'Holders' THEN 5
                        WHEN 'CTblEntrySelection' THEN 6
                        ELSE 99
                    END as ChangeDominance  --largely applies to change log; issuance log will get 99, but it only has one value anyway
                FROM #History
            ) dt
        ) dt2
        LEFT OUTER JOIN Forms ON Forms.FormId=(
            SELECT TOP 1 FormId
            FROM Forms
            WHERE ParentId=dt2.FormParentId
            AND EffDate <= dt2.EffDate
            AND Updated <= COALESCE(dt2.Updated, dt2.EffDate) --if #history's Updated is null, means it's a change that cannot be anachronistic; updated = effective.
            AND [Status] >= 0
            ORDER BY EffDate DESC, Updated DESC
        )
        LEFT OUTER JOIN Users ON Users.UserId=dt2.VersionUserId
        LEFT OUTER JOIN ClientTemplates ON ClientTemplates.TemplateId=(
            SELECT TOP 1 TemplateId
            FROM ClientTemplates
            WHERE ParentId=Forms.TemplateParentId
            AND Updated <= dt2.EffDate
            --AND Updated <= dt2.Updated
            ORDER BY Updated DESC
        )
        LEFT OUTER JOIN dbo.IssuanceEditions ie ON ie.EditionId=(
            SELECT TOP 1 EditionId
            FROM dbo.IssuanceEditions ie2
            WHERE ie2.ClientId=(SELECT DISTINCT ClientId FROM dbo.Forms WHERE ParentId=dt2.FormParentId)
            AND ie2.Updated <= dt2.EffDate
            ORDER BY ie2.Updated DESC
        )           
        WHERE rn=1
        UNION ALL
        SELECT 
            'LegacyPDF' as FormType,
            NULL as FormParentId,
            LegacyFormId,
            EffDate,
            NULL as Updated,
            'Imported File' as Change,
            HistDescription as VersionNote,
            NULL as VersionUserId
            ,NULL as FormId, NULL as CTName, NULL as FirstName, NULL as LastName, NULL as Username
            ,'LegacyPDF_'+CONVERT(nvarchar,LegacyFormId) as PK
            ,'Imported File' as ChangeText
            ,'ShowLegacyForm' as Interface
            ,'' as Command
            ,CONVERT(varchar(50),LegacyFormId) as Param1
            ,NULL Param2
            ,NULL Param3
            ,'Courier.dwnl' AS HTTPTarget
            ,NULL FormStatus
            ,NULL EditionName
        FROM LegacyForms
        WHERE ClientId=@ClientId
        AND (@FormParentId IS NULL OR FormParentId=@FormParentId)
    ) dt
    ORDER BY EffDate DESC, Updated DESC

Answer 1:

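The cursor jumps out as something you probably want to eliminate. This is such a huge query, with so many parts, that it is nearly impossible to see where it can be improved, or whether it can be improved at all.

Look at the query plan the batch generates. If any queries in the batch take up something like 80% of the time, those are the ones to examine very closely. I expect most of the time is spent in the sp_executesql statement, which should be changed, if possible, into a join that SQL Server can optimize.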
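One way to find which statements dominate the 23 minutes is to turn on per-statement timing and then rank the cached statements by elapsed time. The sketch below is only an illustration: the procedure name `dbo.usp_GetFormHistory` is hypothetical (the post does not show the real name), the parameter values are the ones from the commented-out test harness in the question, and reading `sys.dm_exec_query_stats` assumes the login has `VIEW SERVER STATE` permission.

```sql
-- Report CPU/elapsed time and logical reads for every statement in the batch.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Hypothetical procedure name; substitute the real one.
EXEC dbo.usp_GetFormHistory
    @ClientId = 11,
    @FormParentId = 277719,
    @FormQueryMode = NULL;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;

-- Rank cached statements by total elapsed time (reported in microseconds).
-- The dynamic SQL run through sp_executesql shows up here as its own statements.
SELECT TOP (10)
    qs.total_elapsed_time / 1000 AS total_elapsed_ms,
    qs.execution_count,
    SUBSTRING(st.text,
              (qs.statement_start_offset / 2) + 1,
              ((CASE qs.statement_end_offset
                    WHEN -1 THEN DATALENGTH(st.text)
                    ELSE qs.statement_end_offset
                END - qs.statement_start_offset) / 2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
ORDER BY qs.total_elapsed_time DESC;
```

If the dynamic `INSERT INTO #History` statement tops the list with a high `execution_count`, that would support the guess that the cursor loop is where the time goes.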

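Comments:

Thanks for the suggestion, I will try it. This is an old query that used to run fine for the web program, but it now causes many problems.

What probably happened (a wild guess here...) is that the number of clients increased, so the cursor executes many more times...

That is correct. There are now more than 3,000 client tables. To keep one client running the program, I had to spin up and create a new database.

If you want to do relational operations, you probably want a single table with an extra clientid column. It will require rewriting some code, but in the medium to long term your life will be much easier.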
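The single-table idea from the last comment could look roughly like the sketch below. It assumes the `[ClientTable_<id>]` tables have been merged into one table, here called `dbo.ClientTableEntries` (a hypothetical name), with an extra `ClientTableId` column. Only the column names come from the question; the column types and the index are guesses. With that table in place, the `CTblCur` cursor and the `sp_ExecuteSql` loop collapse into one set-based statement that the optimizer can plan as a whole.

```sql
-- One-time schema change (hypothetical table; column types are assumptions).
CREATE TABLE dbo.ClientTableEntries
(
    ClientTableId int           NOT NULL,  -- which former ClientTable_<id> the row came from
    ParentId      int           NOT NULL,
    Updated       datetime      NOT NULL,
    VersionNote   varchar(1000) NULL,
    VersionUserId varchar(50)   NULL
);

CREATE CLUSTERED INDEX IX_ClientTableEntries
    ON dbo.ClientTableEntries (ClientTableId, ParentId, Updated);

-- Inside the procedure, in place of the cursor loop.
-- #History, @ClientId and @FormParentId are the ones defined in the question.
INSERT INTO #History (FormParentId, EffDate, Updated, Change, VersionNote, VersionUserId)
SELECT ctblsel.FormParentId, ctbl.Updated AS EffDate, NULL AS Updated, 'ClientTable' AS Change,
       ctbl.VersionNote, ctbl.VersionUserId
FROM CTblEntrySelection ctblsel
INNER JOIN ClientTables ctab
        ON ctab.ClientTableId = ctblsel.ClientTableId
       AND ctab.ClientId = @ClientId   -- replaces the cursor's WHERE ClientId=@ClientId
LEFT OUTER JOIN CTblEntrySelection ctblsel_exp ON ctblsel_exp.RecId = (
        SELECT TOP 1 RecId
        FROM CTblEntrySelection
        WHERE FormParentId = ctblsel.FormParentId
        AND ClientTableId = ctblsel.ClientTableId
        AND (EffDate > ctblsel.EffDate OR (EffDate = ctblsel.EffDate AND Updated > ctblsel.Updated))
        ORDER BY EffDate ASC, Updated ASC
)
INNER JOIN dbo.ClientTableEntries ctbl
        ON ctbl.ClientTableId = ctblsel.ClientTableId   -- replaces the dynamic table name
       AND ctbl.ParentId = ctblsel.CTblEntryParentId
       AND ctbl.Updated >= ctblsel.EffDate
       AND (ctblsel_exp.RecId IS NULL OR ctbl.Updated < ctblsel_exp.EffDate)
WHERE (SELECT TOP 1 ClientId FROM Forms WHERE ParentId = ctblsel.FormParentId) = @ClientId
AND (@FormParentId IS NULL OR ctblsel.FormParentId = @FormParentId);
```

The join conditions are copied from the dynamic SQL in the question so that the result should match the loop's output, provided the merged table holds the same rows as the per-client tables. Compare the two result sets on a test copy before switching over.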
