Messy SQL Query Needs to be more Efficient

【Posted】: 2021-02-19 16:07:35 【Question】:

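I need help with this data pull. Its efficiency/performance is terrible, and I don't know enough SQL to make it better. I'm on a project that requires me to learn SQL quickly, but given the time frame I'm facing, I've come to you, the professionals. Any ideas from the pros on how to make this more efficient?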

SELECT
d.[Date] AS [Date],
LEFT(CONVERT(VARCHAR,d.[Date],112),6) AS [YearMo],
FORMAT(d.[Date],'MMMM') AS [Month],
YEAR(d.[Date]) AS [Year],
e.[MbrNo] AS [Member ID],
e.[Mkt_State] AS [Mkt State],
e.[Mkt_Segment] AS [Mkt Segment],

COALESCE(e.[Individual_Premium_Amt],0) AS [Individual Premium],
COALESCE(e.[Total_Premium_Amt],0) AS [Total Premium],

COALESCE(v.[Inpatient_Pd],0) AS [Inpatient Pd],
COALESCE(v.[Outpatient_Pd],0) AS [Outpatient Pd],
COALESCE(v.[Professional_Pd],0) AS [Professional Pd],
COALESCE(v.[Other_Pd],0) AS [Other Pd],
COALESCE(v.[Med_Pd],0) AS [Med Pd],
COALESCE(SUM(v.[Med_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])),0) AS [Total Med Pd YTD],
COALESCE(v.[Med_Allowed],0) AS [Med Allowed],
COALESCE(v.[Rx_Pd],0) AS [Rx Pd],
COALESCE(SUM(v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])),0) AS [Total RX Pd YTD],
COALESCE(v.[Rx_Allowed],0) AS [Rx Allowed],
COALESCE(v.[Med_RX_Pd],0) AS [Med Rx Pd],
COALESCE(SUM(v.[Med_Pd] + v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])
    ORDER BY e.[MbrNo],LEFT(CONVERT(VARCHAR,d.[Date],112),6) ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),0) AS [RS Med Rx Pd],
COALESCE(SUM(v.[Med_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date]))
    + SUM(v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])),0) AS [Total Pd YTD],
COALESCE(v.[Med_RX_Allowed],0) AS [Med Rx Allowed],

CASE
    WHEN ((SUM(v.[Med_Pd] + v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date]))) > rr.[Recover_Threshold])
    THEN ((SUM(v.[Med_Pd] + v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])) - rr.[Recover_Threshold]))
    ELSE 0 
END AS [Recoverable Amt],

SUM(1.0) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])) AS [MM (Yearly)],

CASE
    WHEN ((SUM(v.[Med_Pd] + v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date]))) > rr.[Recover_Threshold])
    THEN ((SUM(v.[Med_Pd] + v.[Rx_Pd]) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date])) - rr.[Recover_Threshold]))
    / SUM(1.0) OVER (PARTITION BY e.[MbrNo],YEAR(d.[Date]))
    ELSE 0 
END AS [Recoverable Amtv2], 


COALESCE(rr.[Members],0) AS [Members],
COALESCE(rr.[MM],0) AS [MM],
COALESCE(rr.[Rx_Rebates],0) AS [Rx Rebates],
COALESCE(rr.[RA_PMPM],0) AS [RA PMPM],
COALESCE(rr.[RA_Payable],0) AS [RA Payable],
COALESCE(rr.[Pd_Threshold],0) AS [SR Pd Threshold],
COALESCE(rr.[Recover_Threshold],0) AS [SR Recover Threshold],
COALESCE(rr.[CF_Inpatient_PMPM],0) AS [CF IP PMPM],
COALESCE(rr.[CF_Outpatient_PMPM],0) AS [CF OP PMPM],
COALESCE(rr.[CF_Professional_PMPM],0) AS [CF PROF PMPM],
COALESCE(rr.[CF_RX_PMPM],0) AS [CF RX PMPM],
COALESCE(rr.[CF_Med_PMPM],0) AS [CF Med PMPM],
COALESCE(SUM(rr.[CF_RX_PMPM])+(rr.[CF_Med_PMPM]),0) AS [CF Med_Rx PMPM]


FROM -- Date Scaffold - Each month starting 20190101 to the current GETDATE() month
    (SELECT
        DATEADD(MONTH,number,'20190101') AS [Date],
        EOMONTH(DATEADD(MONTH,number,'20190101')) AS [EOM Date]
        FROM MASTER..[spt_values]
        WHERE TYPE='P'
            AND DATEADD(MONTH,number,'20190101') <= GETDATE()
    ) AS d


INNER JOIN -- Join Med enrollment for each month to the date scaffold, creating the membermonths format
    (SELECT 
        e.*
        FROM [SomeDB].[dbo].[sometable] AS e
        WHERE [benefitType]=930700000
            AND e.[LOB]='Commercial'
            AND e.[Segment_Cancelled]<>'Yes'
            AND e.[Mbr_Status]<>'Pending Binder Payment'
            AND e.[MbrNo]<>0
    ) AS e
        ON e.[Start_Date]<=d.[Date] AND e.[End_Date]>=d.[EOM Date]


LEFT JOIN
    (SELECT
        c.[YEARMO],
        c.[MbrNo],
        SUM(CASE WHEN c.[UTILGRP]='INPATIENT' THEN c.[Pd] ELSE 0 END) AS [Inpatient_Pd],
        SUM(CASE WHEN c.[UTILGRP]='OUTPATIENT' THEN c.[Pd] ELSE 0 END) AS [Outpatient_Pd],
        SUM(CASE WHEN c.[UTILGRP]='PROFESSIONAL' THEN c.[Pd] ELSE 0 END) AS [Professional_Pd],
        SUM(CASE WHEN c.[UTILGRP]='OTHER' THEN c.[Pd] ELSE 0 END) AS [Other_Pd],
        SUM(CASE WHEN c.[CLAIMTYPE]='Med' THEN c.[Pd] ELSE 0 END) AS [Med_Pd],
        SUM(CASE WHEN c.[CLAIMTYPE]='Med' THEN c.[ALLOWED] ELSE 0 END) AS [Med_Allowed],
        SUM(CASE WHEN c.[CLAIMTYPE]='Pharmacy' THEN c.[Pd] ELSE 0 END) AS [Rx_Pd],
        SUM(CASE WHEN c.[CLAIMTYPE]='Pharmacy' THEN c.[ALLOWED] ELSE 0 END) AS [Rx_Allowed],
        SUM(CASE WHEN c.[CLAIMTYPE]='Med' OR c.[CLAIMTYPE]='Pharmacy' THEN c.[Pd] ELSE 0 END) AS [Med_RX_Pd],
        SUM(CASE WHEN c.[CLAIMTYPE]='Med' OR c.[CLAIMTYPE]='Pharmacy' THEN c.[ALLOWED] ELSE 0 END) AS [Med_RX_Allowed]

        FROM [SomeDB].[dbo].[SomeTable] AS c

        WHERE c.[MbrNo] <> 0
            AND c.[CLAIMLINESTATUS] NOT IN ('D','V')
            AND c.[LOB]='IND'

        GROUP BY c.[YearMo],c.[MbrNo]

    ) AS v
        ON e.[MbrNo]=v.[MbrNo] AND LEFT(CONVERT(VARCHAR,d.[Date],112),6)=v.[YearMo]


LEFT JOIN [SomeDB].[dbo].[SomeTable] AS rr
    ON e.[Mkt_Segment]=rr.[Mkt_Segment] AND LEFT(CONVERT(VARCHAR,d.[Date],112),6)=rr.[YearMo]


GROUP BY d.[Date],e.[MbrNo],e.[Mkt_State],e.[Mkt_Segment],e.[Individual_Premium_Amt],e.[Total_Premium_Amt],
v.[Inpatient_Pd],v.[Outpatient_Pd],v.[Professional_Pd],v.[Other_Pd],v.[Med_Pd],v.[Med_Allowed],v.[Rx_Pd],v.[Rx_Allowed],v.[Med_RX_Pd],v.[Med_RX_Allowed],
rr.[Members],rr.[MM],rr.[Rx_Rebates],rr.[RA_PMPM],rr.[RA_Payable],rr.[Pd_Threshold],rr.[Recover_Threshold],rr.[CF_Inpatient_PMPM],rr.[CF_Outpatient_PMPM],rr.[CF_Professional_PMPM],rr.[CF_RX_PMPM],rr.[CF_Med_PMPM]

【Comments】:

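You need to show us the table and index definitions, as well as the row count for each table. Maybe your tables are poorly defined. Maybe the indexes were not created correctly. Maybe you don't have an index on a column you thought you had. Without seeing the table and index definitions, we can't tell. We need the row counts because they affect the query plan. If you know how to run EXPLAIN or get an execution plan, put the results in the question as well. If you have no indexes, visit use-the-index-luke.com to learn why they matter.

Please tag your question with the DBMS you are using (Oracle, MySQL, SQL Server, etc.).

You should select YEAR(d.Date) directly in the date "scaffold", rather than LEFT(CONVERT(VARCHAR,d.[Date],112),6)=v.[YearMo]. The scaffold should also have a TOP(calculate number of months). Review the columns you select and the tables you join, and remove any that have no business need.

【Answer 1】: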

I would replace all of the <>0 predicates (which are usually not SARGable) with SARGable >0 predicates.
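
As a rough illustration of this suggestion, together with the comment about computing the year and year-month once in the date scaffold, here is a minimal sketch. It is not a rewrite of the full query. The table and column names come from the question; the CTE name MonthScaffold and the variable @StartDate are hypothetical. It assumes that [MbrNo] is never negative (otherwise >0 is not equivalent to <>0) and that [YearMo] in the other tables is stored as a six-character yyyymm string.

```sql
-- Sketch only: MonthScaffold and @StartDate are illustrative names.
DECLARE @StartDate date = '20190101';

WITH MonthScaffold AS (
    -- TOP limits the scaffold to the months from @StartDate through the current month
    SELECT TOP (DATEDIFF(MONTH, @StartDate, GETDATE()) + 1)
        DATEADD(MONTH, sv.number, @StartDate)                        AS [Date],
        EOMONTH(DATEADD(MONTH, sv.number, @StartDate))               AS [EOM Date],
        YEAR(DATEADD(MONTH, sv.number, @StartDate))                  AS [Year],
        -- style 112 is yyyymmdd; char(6) keeps only yyyymm
        CONVERT(char(6), DATEADD(MONTH, sv.number, @StartDate), 112) AS [YearMo]
    FROM master..spt_values AS sv
    WHERE sv.type = 'P'
    ORDER BY sv.number
)
SELECT d.[Date], d.[Year], d.[YearMo], e.[MbrNo]
FROM MonthScaffold AS d
INNER JOIN [SomeDB].[dbo].[sometable] AS e
    ON e.[Start_Date] <= d.[Date]
   AND e.[End_Date]   >= d.[EOM Date]
WHERE e.[MbrNo] > 0;   -- replaces e.[MbrNo] <> 0; assumes no negative member numbers
```

With [Year] and [YearMo] available as plain columns, the later joins and window partitions could reference d.[YearMo] and d.[Year] instead of repeating LEFT(CONVERT(VARCHAR,d.[Date],112),6) and YEAR(d.[Date]). Whether this actually helps depends on the indexes and row counts, which the question does not show.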
