How to Optimize Query with Lots of Aggregates
【Title】: How to Optimize Query with Lots of Aggregates
【Posted】: 2011-07-19 18:47:31
【Question】: How can I optimize this query? Right now it runs too slowly, at around 10 s. Full details below:
SELECT ProjectName,
Actuals_YTD,
Rem_Forecast,
Total_Forecast,
Approved_Budget,
Variance,
Variance_Percentage,
ProjectComments,
VersionType,
ModifiedDate
FROM (SELECT pd.ProjectId,
pd.ProjectName,
SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) <= '06/01/2011' THEN feb.USDactualamount ELSE 0.0 END) AS Actuals_YTD,
SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) > '06/01/2011' THEN feb.forecastusd ELSE 0.0 END) AS Rem_Forecast,
((SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) <= '06/01/2011' THEN feb.USDactualamount ELSE 0.0 END)) + (SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) > '06/01/2011' then feb.forecastusd else 0.0 end))) AS Total_Forecast,
SUM(COALESCE((feb.REVISEDPLANUSD),0)) AS Approved_Budget,
((SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) <= '06/01/2011' THEN feb.USDactualamount ELSE 0.0 END)) + (SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) > '06/01/2011' then feb.forecastusd else 0.0 end))) - ((SUM(COALESCE((feb.REVISEDPLANUSD),0)))) AS Variance,
CASE WHEN (SUM(COALESCE((feb.REVISEDPLANUSD),0))) = 0 THEN NULL ELSE ((((((SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) <= '06/01/2011' THEN feb.USDactualamount else 0.0 end)) + (SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(projectmonth) > '06/01/2011' then feb.forecastusd else 0.0 end)))) - (SUM(COALESCE((feb.REVISEDPLANUSD),0)))) / (SUM(COALESCE((feb.REVISEDPLANUSD),0)))) * 100) END AS Variance_Percentage,
pd.ProjectAux1,
pd.ProjectComments,
pd.VersionType,
MAX(base.ModifiedDate) AS ModifiedDate
FROM rpd.ProjectDetail pd INNER JOIN rpd.FundSource fs ON pd.FundSourceId = fs.FundSourceId
INNER JOIN rpd.Baseline base ON pd.ProjectId = base.ProjectId
INNER JOIN rpd.FundEntityBaseline feb ON feb.BaselineId = base.BaselineId
GROUP BY pd.ProjectAux1, pd.ProjectId, pd.ProjectName, pd.ProjectComments, pd.VersionType)
WHERE VersionType Like '%Text%' WITH UR
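Here are the schemas of the 3 tables (FundSource is left out because it only has about 200 rows, which I consider negligible).
Schema:
Rows:
FundEntityBaseline: 354603
Baseline: 80208
ProjectDetail: 1813
Indexes on ProjectDetail:
1 primary key index (ProjectId)
1 foreign key index (FundSourceId)
1 index covering the SELECT/GROUP BY columns (ProjectAux1, ProjectId, ProjectName, ProjectComments, VersionType)
1 index on (VersionType, ProjectName)
Indexes on Baseline:
1 primary key index (BaselineId)
1 foreign key index (ProjectId)
1 index on (ProjectTeamId, ProjectMonth)
1 index on ProjectMonth only
Indexes on FundEntityBaseline:
1 primary key index (FundEntityBaselineId)
1 foreign key index (BaselineId)
Latest access plan: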
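【Comments】:
Can you show the source of the PROJECTMONTH_TO_DATE function/procedure?
【Answer 1】: Move the WHERE clause (WHERE VersionType LIKE '%Text%') up a line so that it sits inside the inner SQL statement. As written, your query first performs all the possible joins and only then filters that full set with the WHERE clause.
So your statement would look something like this: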
WHERE pd.VersionType Like '%Text%'
GROUP BY .....
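Applied to the full query from the question, the suggestion might look like the sketch below. It is a sketch under assumptions, not a tested rewrite: it assumes the unqualified projectmonth in the original refers to base.ProjectMonth, and the derived-table alias t is a hypothetical name added for illustration. Because VersionType is one of the GROUP BY columns, filtering before grouping should return the same rows as filtering afterwards. The sketch also computes each SUM once in the inner query and derives Total_Forecast, Variance and Variance_Percentage from those columns in the outer query, which shortens the statement; whether either change alters the access plan depends on the optimizer.

```sql
-- Sketch only: same tables, columns and literals as the query in the question.
-- "t" is a hypothetical alias; it does not appear in the original.
SELECT ProjectName,
       Actuals_YTD,
       Rem_Forecast,
       Actuals_YTD + Rem_Forecast AS Total_Forecast,
       Approved_Budget,
       Actuals_YTD + Rem_Forecast - Approved_Budget AS Variance,
       -- Same guard as the original: no percentage when the budget is zero.
       CASE WHEN Approved_Budget = 0 THEN NULL
            ELSE (Actuals_YTD + Rem_Forecast - Approved_Budget) / Approved_Budget * 100
       END AS Variance_Percentage,
       ProjectComments,
       VersionType,
       ModifiedDate
FROM (SELECT pd.ProjectId,
             pd.ProjectName,
             -- Each aggregate is written once here and reused by name above.
             SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) <= '06/01/2011'
                      THEN feb.USDactualamount ELSE 0.0 END) AS Actuals_YTD,
             SUM(CASE WHEN RPD.PROJECTMONTH_TO_DATE(base.ProjectMonth) > '06/01/2011'
                      THEN feb.forecastusd ELSE 0.0 END) AS Rem_Forecast,
             SUM(COALESCE(feb.REVISEDPLANUSD, 0)) AS Approved_Budget,
             pd.ProjectAux1,
             pd.ProjectComments,
             pd.VersionType,
             MAX(base.ModifiedDate) AS ModifiedDate
      FROM rpd.ProjectDetail pd
      INNER JOIN rpd.FundSource fs ON pd.FundSourceId = fs.FundSourceId
      INNER JOIN rpd.Baseline base ON pd.ProjectId = base.ProjectId
      INNER JOIN rpd.FundEntityBaseline feb ON feb.BaselineId = base.BaselineId
      -- Filter moved inside, ahead of the GROUP BY.
      WHERE pd.VersionType LIKE '%Text%'
      GROUP BY pd.ProjectAux1, pd.ProjectId, pd.ProjectName,
               pd.ProjectComments, pd.VersionType) AS t
WITH UR
```

Comparing the access plan before and after the change is the simplest way to see whether the filter placement made any difference.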
【Discussion】:
【Answer 2】: Put (i.e. recreate) your indexes in a tablespace with a 32K page size, if it is not already configured that way.
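As a rough illustration of what a 32K tablespace involves in DB2, the statements below create a buffer pool and a tablespace with that page size. The names BP32K and TS_IDX_32K and the buffer pool size are hypothetical, and the sketch assumes the database uses automatic storage. How existing indexes are then placed in the tablespace depends on the DB2 version and table type (for example, through the INDEX IN clause of CREATE TABLE), so that step is left out here.

```sql
-- Hypothetical names: BP32K and TS_IDX_32K do not appear in the question.
-- A 32K tablespace needs a buffer pool with a matching page size.
CREATE BUFFERPOOL BP32K SIZE 1000 PAGESIZE 32K;

-- Assumes automatic storage; without it, a MANAGED BY DATABASE clause
-- with explicit containers would be needed instead.
CREATE TABLESPACE TS_IDX_32K
    PAGESIZE 32K
    MANAGED BY AUTOMATIC STORAGE
    BUFFERPOOL BP32K;
```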
【Discussion】: