How to get minimum date by each section for same id different times? - SQL Server
【Posted】: 2017-06-01 12:15:53
【Question】: I use SQL Server 2012. I have a table that holds a daily history of delay sums, like this:
SET DATEFORMAT YMD
GO
CREATE TABLE [dbo].[testsum](
[CID] [int],
[HDATE] [date],
[DELAYSUM] [numeric](16, 2)
)
GO
INSERT [dbo].[testsum] ([CID], [HDATE], [DELAYSUM]) VALUES
(223,'2016-10-16',15503.80)
,(223,'2016-10-17',15493.82)
,(223,'2016-10-18',15489.25)
,(223,'2016-10-19',15417.08)
,(427,'2016-10-01',10375.89)
,(427,'2016-10-02',10375.89)
,(427,'2016-10-03',10385.91)
,(427,'2016-10-16',8448.57)
,(427,'2016-10-17',8443.13)
,(427,'2016-10-18',8440.64)
,(427,'2016-10-19',8401.31)
,(427,'2016-10-20',8411.20)
,(427,'2016-10-21',8414.58)
,(427,'2016-10-22',8414.58)
,(427,'2016-10-23',8414.58)
,(427,'2016-10-24',8401.23)
,(427,'2016-10-25',8393.92)
,(427,'2016-10-26',8379.14)
,(427,'2016-10-27',8374.57)
,(427,'2016-10-28',8358.67)
,(427,'2016-10-29',8358.67)
,(427,'2016-10-30',8358.67)
,(427,'2016-10-31',8346.61)
,(541,'2016-10-05',900.44)
,(541,'2016-10-06',832.84)
,(541,'2016-10-11',637.54)
,(541,'2016-10-15',413.89)
,(541,'2016-10-16',413.89)
,(541,'2016-10-17',413.89)
,(541,'2016-10-18',1728.12)
,(541,'2016-10-22',265.27)
,(541,'2016-10-23',265.27)
,(541,'2016-10-24',265.27)
,(541,'2016-10-25',787.10)
,(541,'2016-10-26',1222.29)
Sample data for 3 ids in October:
CID HDATE DELAYSUM
----------- ---------- ---------------------------------------
223 2016-10-16 15503.80
223 2016-10-17 15493.82
223 2016-10-18 15489.25
223 2016-10-19 15417.08
427 2016-10-01 10375.89
427 2016-10-02 10375.89
427 2016-10-03 10385.91
427 2016-10-16 8448.57
427 2016-10-17 8443.13
427 2016-10-18 8440.64
427 2016-10-19 8401.31
427 2016-10-20 8411.20
427 2016-10-21 8414.58
427 2016-10-22 8414.58
427 2016-10-23 8414.58
427 2016-10-24 8401.23
427 2016-10-25 8393.92
427 2016-10-26 8379.14
427 2016-10-27 8374.57
427 2016-10-28 8358.67
427 2016-10-29 8358.67
427 2016-10-30 8358.67
427 2016-10-31 8346.61
541 2016-10-05 900.44
541 2016-10-06 832.84
541 2016-10-11 637.54
541 2016-10-15 413.89
541 2016-10-16 413.89
541 2016-10-17 413.89
541 2016-10-18 1728.12
541 2016-10-22 265.27
541 2016-10-23 265.27
541 2016-10-24 265.27
541 2016-10-25 787.10
541 2016-10-26 1222.29
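Required output: for each id (CID), the minimum date of every date segment, the DELAYSUM on that date, and the end date of the segment. Segments are separated by a gap of one or more days:
CID HDATE DELAYSUM END_DATE
----------- ---------- --------------------------------------- ----------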
223 2016-10-16 15503.80 2016-10-19
427 2016-10-01 10375.89 2016-10-03
427 2016-10-16 8448.57 2016-10-31
541 2016-10-05 900.44 2016-10-06
541 2016-10-11 637.54 2016-10-11
541 2016-10-15 413.89 2016-10-18
541 2016-10-22 265.27 2016-10-26
Thanks for any help with this task. Sorry for my English.
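【Answer 1】: One way to solve this is to subtract a per-CID row number from the date. Within a run of consecutive dates, the date and the row number both advance by one per row, so the result stays constant and identifies the segment:
select cid, min(hdate) as hdate, max(hdate) as end_date, min(delaysum) as min_delaysum
from (select t.*,
             -- date minus per-cid row number: constant within a run of consecutive dates
             dateadd(day, -row_number() over (partition by cid order by hdate), hdate) as grp
      from testsum t
     ) t
group by cid, grp;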
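Edit:
Looking more closely, you seem to want the first value of each segment, not the minimum. SQL Server does not (yet) offer first_value() as an aggregate function, only as a window function. So:
select cid, min(hdate) as hdate, max(hdate) as end_date, min(first_delaysum) as delaysum
from (select t.*,
             -- delaysum of the earliest row in each segment, repeated on every row of that segment
             first_value(delaysum) over (partition by cid, grp order by hdate) as first_delaysum
      from (select t.*,
                   -- date minus per-cid row number: constant within a run of consecutive dates
                   dateadd(day, -row_number() over (partition by cid order by hdate), hdate) as grp
            from testsum t
           ) t
     ) t
group by cid, grp;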
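For comparison, the segments can also be marked explicitly: flag every row whose previous date within the same CID is not exactly one day earlier, then take a running total of those flags as the segment number. The sketch below is an alternative formulation, not part of the answer above. It assumes the testsum table from the question and SQL Server 2012 or later (needed for LAG and for a windowed SUM with ORDER BY); the names flagged, numbered, is_start and seg are illustrative.

```sql
-- Alternative sketch: mark segment starts with LAG, number segments with a running SUM.
WITH flagged AS (
    SELECT t.cid, t.hdate, t.delaysum,
           -- 1 when the previous date of this cid is missing or not exactly one day earlier
           CASE WHEN DATEDIFF(DAY,
                              LAG(t.hdate) OVER (PARTITION BY t.cid ORDER BY t.hdate),
                              t.hdate) = 1
                THEN 0 ELSE 1 END AS is_start
    FROM testsum t
),
numbered AS (
    SELECT f.cid, f.hdate, f.delaysum, f.is_start,
           -- running count of segment starts = segment number within the cid
           SUM(f.is_start) OVER (PARTITION BY f.cid ORDER BY f.hdate
                                 ROWS UNBOUNDED PRECEDING) AS seg
    FROM flagged f
)
SELECT n.cid,
       MIN(n.hdate) AS hdate,
       -- each segment has exactly one start row, so this picks its delaysum
       MIN(CASE WHEN n.is_start = 1 THEN n.delaysum END) AS delaysum,
       MAX(n.hdate) AS end_date
FROM numbered n
GROUP BY n.cid, n.seg
ORDER BY n.cid, MIN(n.hdate);
```

This form makes the gap rule explicit: changing the `= 1` comparison to, say, `<= 3` would treat gaps of up to three days as part of the same segment.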
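【Answer 2】: The key here is to put the rows of each cid into the same group whenever the date difference between consecutive rows is 1. This query uses the expression dateadd(day,-row_number() over (partition by cid order by hdate),hdate) to do that. Run the inner query to see how the groups are assigned.
After that, the window functions min, max and first_value return the minimum hdate, the maximum hdate and the first delaysum of each previously assigned group within each cid.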
SELECT DISTINCT cid,
min(hdate) over (partition BY cid, grp) AS hdate,
first_value(delaysum) over (partition BY cid, grp ORDER BY hdate) AS delaysum,
max(hdate) over (partition BY cid, grp) AS end_date
FROM (SELECT t.* ,
dateadd(DAY,-row_number() over (partition BY cid ORDER BY hdate),hdate) AS grp
FROM testsum t ) x
ORDER BY cid,hdate
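To see how the groups are assigned, the inner query can be run on its own. This is a minimal sketch against the testsum table from the question, restricted to CID 541 for brevity; the rn column is added here only for illustration.

```sql
-- Inspect the grouping key: rows in a run of consecutive dates share the same grp value.
SELECT t.cid,
       t.hdate,
       ROW_NUMBER() OVER (PARTITION BY t.cid ORDER BY t.hdate) AS rn,  -- illustrative helper column
       DATEADD(DAY,
               -ROW_NUMBER() OVER (PARTITION BY t.cid ORDER BY t.hdate),
               t.hdate) AS grp
FROM testsum t
WHERE t.cid = 541
ORDER BY t.hdate;
-- With the sample data from the question:
--   2016-10-05 and 2016-10-06  -> grp = 2016-10-04
--   2016-10-11                 -> grp = 2016-10-08
--   2016-10-15 .. 2016-10-18   -> grp = 2016-10-11
--   2016-10-22 .. 2016-10-26   -> grp = 2016-10-14
```

The grp value itself has no meaning as a date; it only needs to be equal within a segment and different between segments of the same cid.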