SQL: Rolling sum in the last 30 days by groups
【Posted】: 2016-10-27 23:21:09
【Question】: I have a table like the following:
date, custid, sales
2015-01-01, 01, 100
2015-01-10, 01, 200
2015-02-05, 01, 300
2015-03-02, 01, 400
2015-03-03, 01, 500
2015-01-01, 02, 100
2015-01-10, 02, 200
2015-02-05, 02, 300
2015-03-02, 02, 400
2015-03-03, 02, 500
...
How can I generate the rolling sum of sales over the last 30 days, by date and custid?
The expected output is:
date, custid, running_30_day_sales
2015-01-01, 01, 100
2015-01-10, 01, 300 --(100+200)
2015-02-05, 01, 500 --(200+300)
2015-03-02, 01, 700 --(300+400)
2015-03-03, 01, 1200 -- (300+400+500)
2015-01-01, 02, 100
2015-01-10, 02, 300 --(100+200)
2015-02-05, 02, 500 --(200+300)
2015-03-02, 02, 700 --(300+400)
2015-03-03, 02, 1200 -- (300+400+500)
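【Question comments】:
【Answer 1】: This is one way to do it, using a self join. Each date is joined to the earlier dates of the same custid for which datediff is > 0 and <= 30; the row's own sales are then added to the sum of the matched rows.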
select a1.custid, a1.dt, a1.sales+sum(coalesce(a2.sales,0)) total
from atable a1
left join atable a2 on a1.custid=a2.custid
and datediff(day,a2.dt,a1.dt)<=30 and datediff(day,a2.dt,a1.dt)>0
group by a1.custid,a1.dt,a1.sales
order by 1,2
Sample Demo in Postgres
To understand it better, look at the result of the self join using this query:
select a1.*,a2.*
from atable a1
left join atable a2 on a1.custid=a2.custid
and datediff(day,a2.dt,a1.dt)<=30 and datediff(day,a2.dt,a1.dt)>0
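The self join can be checked against the sample data from the question. The following is a minimal sketch using Python's built-in sqlite3 module, not the answerer's own demo. SQLite has no datediff, so a julianday() difference is assumed here as a stand-in, and the helper name rolling_self_join is hypothetical.

```python
import sqlite3

# Sample rows from the question; table and column names (atable, dt) follow the answer.
SALES = [("2015-01-01", 100), ("2015-01-10", 200), ("2015-02-05", 300),
         ("2015-03-02", 400), ("2015-03-03", 500)]
ROWS = [(dt, cust, sales) for cust in ("01", "02") for dt, sales in SALES]

# Same shape as the answer's query; julianday(a1.dt) - julianday(a2.dt)
# plays the role of datediff(day, a2.dt, a1.dt) (an assumption for SQLite).
QUERY = """
SELECT a1.custid, a1.dt, a1.sales + SUM(COALESCE(a2.sales, 0)) AS total
FROM atable a1
LEFT JOIN atable a2
  ON a1.custid = a2.custid
 AND julianday(a1.dt) - julianday(a2.dt) > 0
 AND julianday(a1.dt) - julianday(a2.dt) <= 30
GROUP BY a1.custid, a1.dt, a1.sales
ORDER BY 1, 2
"""

def rolling_self_join(rows):
    """Load rows into an in-memory table and return (custid, dt, total) tuples."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE atable (dt TEXT, custid TEXT, sales INTEGER)")
    conn.executemany("INSERT INTO atable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result

if __name__ == "__main__":
    for row in rolling_self_join(ROWS):
        print(row)
```

Note that grouping by a1.sales relies on there being one row per (custid, dt), as in the sample data.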
【Discussion】:
【Answer 2】: Here is a trick that uses a cumulative sum:
with t as (
select custid, date, sales from atable
union all
-- negated copy dated 30 days later, so each sale drops out of the running total
select custid, date + interval '30 day', -sales from atable
)
select custid, date,
sum(sum(sales)) over (partition by custid order by date rows between unbounded preceding and current row) as sales_30day
from t
group by custid, date;
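To see why a cumulative sum can act as a rolling window, it may help to trace the idea outside SQL. The sketch below assumes each sale is entered once with a positive sign on its own date and once with a negative sign 30 days later; the function name rolling_by_events and the SAMPLE constant are hypothetical. Under this assumption a sale that is exactly 30 days old has already left the total, which differs slightly from the <= 30 condition in the self join.

```python
from collections import defaultdict
from datetime import date, timedelta
from itertools import accumulate

# Sample data from the question, repeated for both customers.
_SALES = [(date(2015, 1, 1), 100), (date(2015, 1, 10), 200), (date(2015, 2, 5), 300),
          (date(2015, 3, 2), 400), (date(2015, 3, 3), 500)]
SAMPLE = [(d, cust, s) for cust in ("01", "02") for d, s in _SALES]

def rolling_by_events(rows, days=30):
    """Return {custid: [(date, running_total), ...]} over all dates where the total changes."""
    deltas = defaultdict(lambda: defaultdict(int))
    for d, cust, sales in rows:
        deltas[cust][d] += sales                         # sale enters the window
        deltas[cust][d + timedelta(days=days)] -= sales  # and leaves it `days` later
    out = {}
    for cust, by_date in deltas.items():
        dates = sorted(by_date)
        totals = accumulate(by_date[d] for d in dates)   # cumulative sum per customer
        out[cust] = list(zip(dates, totals))
    return out

if __name__ == "__main__":
    for d, total in rolling_by_events(SAMPLE)["01"]:
        print(d, total)
```

The output contains a row for every date on which the total changes, including the expiry dates that are not in the table; keeping only the original dates requires one more filtering step.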
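【Discussion】:
I think this generates rows that are not in the table.
Hi Gordon, thanks for the reply. Could you add some comments? Also, when I run the query on Redshift, I get the error "ERROR: 42601: Aggregate window functions with an ORDER BY clause require a frame clause".
@vip . . . Yes, you are right. This will have all the dates on which the value changes. If the OP only wants the dates in the original data, that is easy to filter with another join.
【Answer 3】: You can also find it with window functions, like this: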
SELECT custid, dt::date,
-- this frame also looks 2 days ahead; use CURRENT ROW as the upper bound for a strictly trailing 30-day sum
SUM(sales) OVER (partition by custid ORDER BY dt::date
RANGE BETWEEN INTERVAL '30 days' PRECEDING AND INTERVAL '2 days' FOLLOWING) as sum_of_sales,
MIN(sales) OVER (partition by custid ORDER BY dt::date
RANGE BETWEEN INTERVAL '30 days' PRECEDING AND CURRENT ROW) as minimum,
MAX(sales) OVER (partition by custid ORDER BY dt::date
RANGE BETWEEN INTERVAL '2 days' PRECEDING AND INTERVAL '2 days' FOLLOWING) as maximum
FROM atable
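What a frame of 30 days PRECEDING to CURRENT ROW selects for each row can be mimicked with a sliding window. This is only an illustrative sketch, not how any database implements RANGE frames. It assumes one row per (custid, date), as in the sample data, so same-date peer rows are not handled, and the name trailing_range_sum is hypothetical.

```python
from collections import defaultdict, deque
from datetime import date, timedelta

# Sample data from the question, repeated for both customers.
_SALES = [(date(2015, 1, 1), 100), (date(2015, 1, 10), 200), (date(2015, 2, 5), 300),
          (date(2015, 3, 2), 400), (date(2015, 3, 3), 500)]
SAMPLE = [(d, cust, s) for cust in ("01", "02") for d, s in _SALES]

def trailing_range_sum(rows, days=30):
    """Return (date, custid, sum of sales dated within [date - days, date]) per row."""
    by_cust = defaultdict(list)
    for d, cust, sales in rows:
        by_cust[cust].append((d, sales))          # PARTITION BY custid
    result = []
    for cust in sorted(by_cust):
        window = deque()                          # rows currently inside the frame
        running = 0
        for d, sales in sorted(by_cust[cust]):    # ORDER BY date
            window.append((d, sales))
            running += sales
            # Drop rows that are more than `days` older than the current row.
            while window[0][0] < d - timedelta(days=days):
                running -= window.popleft()[1]
            result.append((d, cust, running))
    return result

if __name__ == "__main__":
    for row in trailing_range_sum(SAMPLE):
        print(row)
```

Because each row enters and leaves the deque once, the pass is linear in the number of rows per customer after sorting.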
【Discussion】: