Calculating a running total of when a value changes over a partition
【Posted】: 2020-06-18 17:05:07
【Question】: I can't figure out how to write a window function that solves my problem. I'm new to window functions, but I believe one can be written to do what I need.
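Problem statement: I want to compute a transfer sequence that shows when each person changed location over time, based on the corresponding location ID.
Sample data (Table1)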
+----------+------------+-----------+---------+
| PersonID | LocationID | Date | Time |
+----------+------------+-----------+---------+
| 12 | A | 6/17/2020 | 12:00PM |
+----------+------------+-----------+---------+
| 12 | A | 6/18/2020 | 1:00PM |
+----------+------------+-----------+---------+
| 12 | B | 6/18/2020 | 6:00AM |
+----------+------------+-----------+---------+
| 12 | C | 6/19/2020 | 3:00PM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 8:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 11:00AM |
+----------+------------+-----------+---------+
| 13 | A | 6/16/2020 | 12:00AM |
+----------+------------+-----------+---------+
| 13 | B | 6/16/2020 | 4:00PM |
+----------+------------+-----------+---------+
Expected result
+----------+------------+-----------+---------+-------------------+
| PersonID | LocationID | Date | Time | Transfer Sequence |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/17/2020 | 12:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | A | 6/18/2020 | 1:00PM | 1 |
+----------+------------+-----------+---------+-------------------+
| 12 | B | 6/18/2020 | 6:00AM | 2 |
+----------+------------+-----------+---------+-------------------+
| 12 | C | 6/19/2020 | 3:00PM | 3 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 8:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 11:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | A | 6/16/2020 | 12:00AM | 1 |
+----------+------------+-----------+---------+-------------------+
| 13 | B | 6/16/2020 | 4:00PM | 2 |
+----------+------------+-----------+---------+-------------------+
What I tried
SELECT
    [t1].[PersonID]
    ,[t1].[LocationID]
    ,[t1].[Date]
    ,[t1].[Time]
    ,DENSE_RANK() OVER (
        PARTITION BY [t1].[PersonID], [t1].[LocationID]
        ORDER BY [t1].[Date] ASC, [t1].[Time] ASC
    ) AS [Transfer Sequence]
FROM Table1 [t1]
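Unfortunately, I believe DENSE_RANK() is assigning ranks regardless of whether the value of LocationID has changed. I need a function that adds one to the sequence only when LocationID changes.
Any help would be greatly appreciated.
Thanks!
【Comments】:
【Answer 1】: You want to put "adjacent" rows in the same group. A plain window function can't do that for you; we need to use a gaps-and-islands technique: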
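select
    t.*,
    -- running count of location changes per person; the first row has a
    -- NULL lagLocationID, so the comparison is not true and it counts as 1
    sum(case when locationID = lagLocationID then 0 else 1 end)
        over(partition by personID order by date, time)
        as transfer_sequence
from (
    select
        t.*,
        -- previous location of the same person, in chronological order
        lag(locationID)
            over(partition by personID order by date, time)
            as lagLocationID
    from mytable t
) t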
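The idea is to compute a window sum that increments every time locationID changes.
Note that this correctly handles the case where a person goes back to a location they have been to before: the comparison is only against the immediately preceding row, so a return visit counts as a new change.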
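To see the technique end to end, here is a minimal sketch that runs the same LAG-plus-running-SUM query against an in-memory SQLite database through Python's `sqlite3` module. SQLite stands in for SQL Server here only so the example is self-contained (it needs SQLite 3.25 or later for window functions). The table name `visits`, the single `ts` column, and the rows themselves are assumptions made up for illustration; they are not the asker's data. Person 12 returns to location A at the end, to exercise the return-visit case.

```python
import sqlite3

# Hypothetical rows: (person_id, location_id, ts).
# ISO-8601 strings sort chronologically, so ORDER BY ts is unambiguous.
ROWS = [
    (12, "A", "2020-06-17 12:00"),
    (12, "A", "2020-06-18 06:00"),
    (12, "B", "2020-06-18 13:00"),
    (12, "A", "2020-06-19 15:00"),  # back to a previously visited location
    (13, "A", "2020-06-16 08:00"),
    (13, "A", "2020-06-16 11:00"),
    (13, "B", "2020-06-16 16:00"),
]

QUERY = """
SELECT person_id, location_id, ts,
       -- running count of rows whose location differs from the previous row
       SUM(CASE WHEN location_id = lag_location_id THEN 0 ELSE 1 END)
           OVER (PARTITION BY person_id ORDER BY ts) AS transfer_sequence
FROM (
    SELECT t.*,
           LAG(location_id)
               OVER (PARTITION BY person_id ORDER BY ts) AS lag_location_id
    FROM visits AS t
) AS flagged
ORDER BY person_id, ts
"""


def transfer_sequences(rows):
    """Load rows into a throwaway table and return the query result."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE visits (person_id INTEGER, location_id TEXT, ts TEXT)"
    )
    conn.executemany("INSERT INTO visits VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in transfer_sequences(ROWS):
    print(row)
```

The last row for person 12 gets sequence 3 rather than 1, because the sum counts changes between neighbouring rows and does not care whether a location has been seen before.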
【Comments】:
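【Answer 2】: What I did (and I'm sure it's not the best way) was create a second table with PersonID, LocationID, Date, Time, and an empty field for the transfer sequence (sequence), and then a cursor: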
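-- TRANSACTION is a reserved word, so the cursor is named transfer_cursor.
-- The ORDER BY matters: the loop assumes rows arrive per person, oldest first.
DECLARE transfer_cursor CURSOR LOCAL STATIC
FOR SELECT PersonID, LocationID, [Date], [Time] FROM table1
    ORDER BY PersonID, [Date], [Time];
Then the loop:
-- Column types below are assumed; adjust them to the real table definition.
DECLARE @person INT, @location VARCHAR(10), @date DATE, @time TIME;
DECLARE @person_saved INT = NULL, @location_saved VARCHAR(10) = NULL, @count INT = 0;
OPEN transfer_cursor;
FETCH NEXT FROM transfer_cursor INTO @person, @location, @date, @time;
WHILE @@FETCH_STATUS = 0
BEGIN
    IF @person_saved IS NULL OR @person_saved <> @person -- changing PersonID, reset count
    BEGIN
        SET @count = 0;
        SET @person_saved = @person;
        SET @location_saved = NULL; -- forget the previous person's location
    END
    IF @location_saved IS NULL OR @location_saved <> @location -- changing location, add to count
    BEGIN
        SET @count = @count + 1;
        SET @location_saved = @location;
    END
    UPDATE table1 SET sequence = @count
    WHERE PersonID = @person AND LocationID = @location AND [Date] = @date AND [Time] = @time;
    FETCH NEXT FROM transfer_cursor INTO @person, @location, @date, @time;
END
CLOSE transfer_cursor;
DEALLOCATE transfer_cursor;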
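For comparison, the same row-by-row logic can be sketched outside the database as a single pass over sorted rows. This is only an illustration of what the loop computes, not a replacement for the cursor; the function name `number_transfers` and the sample rows are hypothetical. It assumes each row is a `(person_id, location_id, ts)` tuple whose `ts` sorts chronologically.

```python
from operator import itemgetter

# Hypothetical rows: (person_id, location_id, ts), deliberately out of order.
ROWS = [
    (13, "A", "2020-06-16 08:00"),
    (12, "A", "2020-06-19 15:00"),  # person 12 ends back at A
    (12, "A", "2020-06-17 12:00"),
    (12, "B", "2020-06-18 13:00"),
    (13, "B", "2020-06-16 16:00"),
]


def number_transfers(rows):
    """Return (person, location, ts, sequence) tuples, one per input row."""
    numbered = []
    prev_person = None
    prev_location = None
    count = 0
    # Process each person's rows together, oldest first.
    for person, location, ts in sorted(rows, key=itemgetter(0, 2)):
        if person != prev_person:
            # New person: restart the counter and forget the last location.
            count = 0
            prev_person = person
            prev_location = None
        if location != prev_location:
            # Location changed (or first row for this person).
            count += 1
            prev_location = location
        numbered.append((person, location, ts, count))
    return numbered


for row in number_transfers(ROWS):
    print(row)
```

Resetting the remembered location together with the counter matters here: person 12's last location and person 13's first location are both A, and without the reset person 13 would start at 0 instead of 1.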
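【Comments】:
Is the cursor approach more efficient than using a subquery for this kind of problem?
How long it takes depends on how many rows the table has. For me, the usual advantage of a cursor is debugging: a stored procedure with a cursor is easier to debug than a single SQL statement. Of course, if the SQL from @GMB works, it's a lot less to write :)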