Running Total For dataset grouped by month, year

Posted: 2017-01-27 20:21:54

Question:

I am trying to create a running total using OVER (PARTITION BY).

My original query:

SELECT DATEPART(MONTH, t.received_date) AS [Month],
DATEPART(YEAR, t.received_date) AS [Year],
SUM(rdai.number_of_pages) AS [Count]
FROM dbo.request_document_additonal_information AS [rdai]
INNER JOIN #TempRequestIDs AS [t]
    ON rdai.request_id = t.id
GROUP BY DATEPART(MONTH, t.received_date),
DATEPART(YEAR, t.received_date)
ORDER BY Year,
Month;

Result:

Month  Year  Count
10     2015  1202342
11     2015  1059471
12     2015  1142629
1      2016  1081412
2      2016  1181385
3      2016  1334966

My goal is to create a running subtotal for each month, and I tried this:

SELECT DATEPART(MONTH, t.received_date) AS [Month],
DATEPART(YEAR, t.received_date) AS [Year],
SUM(rdai.number_of_pages) AS [Count]
,SUM(rdai.number_of_pages) OVER (PARTITION BY DATEPART(MONTH, t.received_date), DATEPART(YEAR, t.received_date)
                                ORDER BY DATEPART(MONTH, t.received_date), DATEPART(YEAR, t.received_date)
                                RANGE UNBOUNDED PRECEDING
                               ) as [RunningTotal]
FROM dbo.request_document_additonal_information AS [rdai]
INNER JOIN #TempRequestIDs AS [t]
    ON rdai.request_id = t.id
GROUP BY DATEPART(MONTH, t.received_date),
DATEPART(YEAR, t.received_date)
ORDER BY Year,
Month;

But the error returned states:

Column 'dbo.request_document_additonal_information.number_of_pages' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

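If I add rdai.number_of_pages to the GROUP BY, the running total is returned, but the values are the same in every column.

Could I get some help with where I am going wrong in using this window function?

Thanks,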

Answer 1:

One option is to nest your original query:

-- The derived table A holds the monthly totals; the outer query accumulates them.
SELECT A.*,
       RunningTotal = SUM(A.[Count]) OVER (ORDER BY A.[Year], A.[Month])
FROM (
        SELECT DATEPART(MONTH, t.received_date) AS [Month],
               DATEPART(YEAR, t.received_date) AS [Year],
               SUM(rdai.number_of_pages) AS [Count]
        FROM dbo.request_document_additonal_information AS [rdai]
        INNER JOIN #TempRequestIDs AS [t]
            ON rdai.request_id = t.id
        GROUP BY DATEPART(MONTH, t.received_date),
                 DATEPART(YEAR, t.received_date)
     ) AS A
ORDER BY A.[Year], A.[Month];
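
A possible alternative to nesting, offered as a sketch rather than as part of the original answer: window functions are evaluated after GROUP BY, so the argument of SUM ... OVER has to be a grouped expression itself. That is why the bare rdai.number_of_pages in the question raised the error, and partitioning by both month and year would in any case restart the window for every group. Wrapping the aggregate as SUM(SUM(...)) OVER (...) should give the running total in a single level. The table variable @pages and its rows are made up for illustration, and ORDER BY inside an aggregate's OVER clause assumes SQL Server 2012 or later.

```sql
-- Hypothetical sample data standing in for the joined tables in the question.
DECLARE @pages TABLE (received_date date, number_of_pages int);

INSERT INTO @pages (received_date, number_of_pages)
VALUES ('2015-12-05', 10), ('2015-12-20', 5),
       ('2016-01-03', 7),  ('2016-01-15', 3),
       ('2016-02-01', 4);

SELECT DATEPART(MONTH, received_date) AS [Month],
       DATEPART(YEAR, received_date)  AS [Year],
       SUM(number_of_pages)           AS [Count],
       -- Inner SUM is the GROUP BY aggregate; outer SUM ... OVER accumulates it.
       SUM(SUM(number_of_pages)) OVER (
           ORDER BY DATEPART(YEAR, received_date),
                    DATEPART(MONTH, received_date)
           ROWS UNBOUNDED PRECEDING
       ) AS [RunningTotal]
FROM @pages
GROUP BY DATEPART(MONTH, received_date),
         DATEPART(YEAR, received_date)
ORDER BY [Year], [Month];

-- Month  Year  Count  RunningTotal
-- 12     2015  15     15
-- 1      2016  10     25
-- 2      2016  4      29
```

Note that no PARTITION BY is used here, so the total keeps accumulating across the year boundary.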

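Comments:

Thanks John - this works well, but it resets when the year changes. Is there any way around that?

@MISNole Sure. Sorry, I just assumed you wanted it to reset. See the updated answer... I removed the PARTITION BY and changed the ORDER BY to Year, Month.

Thanks - I don't need the reset in this case, but I'll save both examples for future use. I appreciate it.

@MISNole Glad it helped.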
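
For anyone comparing the two behaviours mentioned in the comments, here is a small sketch that shows them side by side. It assumes SQL Server 2012 or later and uses a hypothetical CTE named monthly, filled with the monthly counts from the question's result set, in place of the real tables. The column name RunningTotalPerYear is also just an illustrative choice.

```sql
-- Hypothetical stand-in for the grouped result; the numbers are the
-- monthly counts shown in the question.
WITH monthly AS (
    SELECT v.[Month], v.[Year], v.[Count]
    FROM (VALUES
            (10, 2015, 1202342),
            (11, 2015, 1059471),
            (12, 2015, 1142629),
            (1,  2016, 1081412),
            (2,  2016, 1181385),
            (3,  2016, 1334966)
         ) AS v([Month], [Year], [Count])
)
SELECT [Month], [Year], [Count],
       -- No PARTITION BY: keeps accumulating across years.
       SUM([Count]) OVER (ORDER BY [Year], [Month]
                          ROWS UNBOUNDED PRECEDING) AS RunningTotal,
       -- PARTITION BY [Year]: restarts at the first month of each year.
       SUM([Count]) OVER (PARTITION BY [Year]
                          ORDER BY [Month]
                          ROWS UNBOUNDED PRECEDING) AS RunningTotalPerYear
FROM monthly
ORDER BY [Year], [Month];

-- Month  Year  Count    RunningTotal  RunningTotalPerYear
-- 10     2015  1202342  1202342       1202342
-- 11     2015  1059471  2261813       2261813
-- 12     2015  1142629  3404442       3404442
-- 1      2016  1081412  4485854       1081412
-- 2      2016  1181385  5667239       2262797
-- 3      2016  1334966  7002205       3597763
```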
