Detect consecutive date ranges using SQL
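【Title】: Detect consecutive dates ranges using SQL 【Posted】: 2013-12-05 14:05:39 【Question】: I want to populate a calendar object that needs start and end date information. I have a single column containing a series of dates. Some of the dates are consecutive (one day apart) and others are not.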
InfoDate
2013-12-04 consecutive date [StartDate]
2013-12-05 consecutive date
2013-12-06 consecutive date [EndDate]
2013-12-09 [startDate]
2013-12-10 [EndDate]
2014-01-01 [startDate]
2014-01-02
2014-01-03 [EndDate]
2014-01-06 [startDate]
2014-01-07 [EndDate]
2014-01-29 [startDate]
2014-01-30
2014-01-31 [EndDate]
2014-02-03 [startDate]
2014-02-04 [EndDate]
I want to select the start and end date of each consecutive date range (the first and last date in each block).
StartDate EndDate
2013-12-04 2013-12-06
2013-12-09 2013-12-10
2014-01-01 2014-01-03
2014-01-06 2014-01-07
2014-01-29 2014-01-31
2014-02-03 2014-02-04
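I want to solve this using SQL only.
【Comments】:
What do the blank lines in the second listing mean? Do you really need to solve this in SQL? It looks hard to express in SQL (at least in standard SQL); the obvious algorithm is essentially sequential and is easy to write in a procedural language. If SQL really is required, I would use a stored procedure.
【Answer 1】: No joins or recursive CTEs are needed. The standard gaps-and-islands solution is to group by (value minus row_number), because that difference is constant within a consecutive sequence. The start and end dates are then simply the MIN() and MAX() of each group.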
WITH t AS (
SELECT InfoDate d,ROW_NUMBER() OVER(ORDER BY InfoDate) i
FROM @d
GROUP BY InfoDate
)
SELECT MIN(d) AS StartDate,MAX(d) AS EndDate
FROM t
GROUP BY DATEDIFF(day,i,d)
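To see why (value minus row number) stays constant inside a run, the same idea can be traced outside the database. The following is a minimal Python sketch, not the answer's own code: the function name date_ranges and the sample list are made up for illustration, and it assumes the input is a plain list of dates.

```python
from datetime import date, timedelta
from itertools import groupby

def date_ranges(dates):
    """Collapse dates into (start, end) pairs, one pair per consecutive run."""
    # Deduplicate and sort, mirroring GROUP BY InfoDate plus the ORDER BY.
    ordered = sorted(set(dates))
    # Number the rows from 1 (as ROW_NUMBER() does) and subtract that many
    # days; every date in the same consecutive run lands on the same key.
    keyed = [(d - timedelta(days=i), d) for i, d in enumerate(ordered, start=1)]
    ranges = []
    for _, run in groupby(keyed, key=lambda pair: pair[0]):
        members = [d for _, d in run]
        ranges.append((members[0], members[-1]))  # MIN and MAX of the group
    return ranges

# Hypothetical sample taken from the first rows of the question.
info_dates = [date(2013, 12, 4), date(2013, 12, 5), date(2013, 12, 6),
              date(2013, 12, 9), date(2013, 12, 10)]
result = date_ranges(info_dates)
for start, end in result:
    print(start, end)
```

The key can be a shifted date or a plain integer such as d.toordinal() - i; either form splits the rows into the same groups, because only equality between keys matters.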
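【Comments】:
A very clever solution. Thanks!
I think the GROUP BY should be changed to: GROUP BY DATEADD(day,-i,d)
@BennyBechDk GROUP BY DATEDIFF(day,i,d) and GROUP BY DATEADD(day,-i,d) will produce the same groups.
You may have been downvoted for saying "no CTE needed" and then using a CTE! But of course you could inline the CTE as the derived table t in the final SELECT, so you are still correct...
Hi TommCatt, sorry, it does not work for INPUT given in StartDate and EndDate form.
【Answer 2】: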
Here you go..
;WITH CTEDATES
AS
(
SELECT ROW_NUMBER() OVER (ORDER BY Infodate asc ) AS ROWNUMBER,infodate FROM YourTableName
),
CTEDATES1
AS
(
SELECT ROWNUMBER, infodate, 1 as groupid FROM CTEDATES WHERE ROWNUMBER=1
UNION ALL
SELECT a.ROWNUMBER, a.infodate,case datediff(d, b.infodate,a.infodate) when 1 then b.groupid else b.groupid+1 end as gap FROM CTEDATES A INNER JOIN CTEDATES1 B ON A.ROWNUMBER-1 = B.ROWNUMBER
)
select min(infodate) as startdate, max(infodate) as enddate from CTEDATES1 group by groupid
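The recursive CTE above walks the rows in date order and carries a group id that increases whenever the step from the previous row is not exactly one day. A rough Python sketch of that running-counter idea follows; the function names assign_group_ids and ranges_from_groups and the sample dates are illustrative assumptions, not part of the answer.

```python
from datetime import date

def assign_group_ids(dates):
    """Return (date, group_id) pairs; the id increases after every gap."""
    ordered = sorted(dates)
    labelled = []
    group_id = 1
    for index, current in enumerate(ordered):
        # Same test as the CASE expression: only a one-day step keeps the id.
        if index > 0 and (current - ordered[index - 1]).days != 1:
            group_id += 1
        labelled.append((current, group_id))
    return labelled

def ranges_from_groups(labelled):
    """Take MIN and MAX per group id, like the final GROUP BY groupid."""
    bounds = {}
    for current, group_id in labelled:
        start, end = bounds.get(group_id, (current, current))
        bounds[group_id] = (min(start, current), max(end, current))
    return [bounds[g] for g in sorted(bounds)]

# Hypothetical sample taken from the last rows of the question.
sample = [date(2014, 1, 29), date(2014, 1, 30), date(2014, 1, 31),
          date(2014, 2, 3), date(2014, 2, 4)]
labelled = assign_group_ids(sample)
ranges = ranges_from_groups(labelled)
print(ranges)
```

A plain loop has no recursion depth to worry about, whereas in SQL Server a recursive CTE stops at 100 levels by default unless a MAXRECURSION hint is supplied, which matters for longer date lists.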
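【Comments】:
You should use OVER (ORDER BY Infodate) rather than OVER (ORDER BY (SELECT 1)). Also, change min(mydate) to min(infodate). Other than that, this is a good answer.
【Answer 3】:
I inserted the values into a table named #consec and then ran the following: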
select t1.*
,t2.infodate as binfod
into #temp1
from #consec t1
left join #consec t2 on dateadd(DAY,1,t1.infodate)=t2.infodate
select t1.*
,t2.infodate as binfod
into #temp2
from #consec t1
left join #consec t2 on dateadd(DAY,1,t2.infodate)=t1.infodate
;with cte as(
select infodate, ROW_NUMBER() over(order by infodate asc) as seq from #temp1
where binfod is null
),
cte2 as(
select infodate, ROW_NUMBER() over(order by infodate asc) as seq from #temp2
where binfod is null
)
select t2.infodate as [start_date]
,t1.infodate as [end_date] from cte t1
left join cte2 t2 on t1.seq=t2.seq
As long as your date periods do not overlap, that should do the job for you.
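The two temp tables above isolate the dates with no following day (range ends) and the dates with no preceding day (range starts), and the final join matches the n-th start with the n-th end. Here is a small Python sketch of that pairing, assuming distinct dates held in a list; the name ranges_by_pairing and the sample data are hypothetical.

```python
from datetime import date, timedelta

def ranges_by_pairing(dates):
    """Pair the n-th range start with the n-th range end."""
    present = set(dates)
    one_day = timedelta(days=1)
    # A start has no predecessor in the set; an end has no successor.
    starts = sorted(d for d in present if d - one_day not in present)
    ends = sorted(d for d in present if d + one_day not in present)
    # Each run contributes exactly one start and one end, so both lists
    # have equal length and line up position by position.
    return list(zip(starts, ends))

# Hypothetical sample taken from the middle rows of the question.
sample = [date(2014, 1, 1), date(2014, 1, 2), date(2014, 1, 3),
          date(2014, 1, 6), date(2014, 1, 7)]
pairs = ranges_by_pairing(sample)
print(pairs)
```

A date that stands alone appears in both lists, so it comes out as a range whose start and end are the same day.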
【Comments】:
【Answer 4】: Here is my sample test data:
--required output
-- 01 - 03
-- 08 - 09
-- 12 - 14
DECLARE @maxRN int;
WITH #tmp AS (
SELECT CAST('2013-01-01' AS date) DT
UNION ALL SELECT CAST('2013-01-02' AS date)
UNION ALL SELECT CAST('2013-01-03' AS date)
UNION ALL SELECT CAST('2013-01-05' AS date)
UNION ALL SELECT CAST('2013-01-08' AS date)
UNION ALL SELECT CAST('2013-01-09' AS date)
UNION ALL SELECT CAST('2013-01-12' AS date)
UNION ALL SELECT CAST('2013-01-13' AS date)
UNION ALL SELECT CAST('2013-01-14' AS date)
),
#numbered AS (
SELECT 0 RN, CAST('1900-01-01' AS date) DT
UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY DT) RN, DT
FROM #tmp
)
SELECT * INTO #tmpTable FROM #numbered;
SELECT @maxRN = MAX(RN) FROM #tmpTable;
INSERT INTO #tmpTable
SELECT @maxRN + 1, CAST('2100-01-01' AS date);
WITH #paired AS (
SELECT
ROW_NUMBER() OVER(ORDER BY TStart.DT) RN, TStart.DT DTS, TEnd.DT DTE
FROM #tmpTable TStart
INNER JOIN #tmpTable TEnd
ON TStart.RN = TEnd.RN - 1
AND DATEDIFF(dd,TStart.DT,TEnd.DT) > 1
)
SELECT TS.DTE, TE.DTs
FROM #paired TS
INNER JOIN #paired TE ON TS.RN = TE.RN -1
AND TS.DTE <> TE.DTs -- you could remove this filter if you want to have start and end on the same date
DROP TABLE #tmpTable
Replace the #tmp data with your actual table.
【Comments】:
【Answer 5】: You can do it like this; here is the sqlfiddle
select
min(ndate) as start_date,
max(ndate) as end_date
from
(select
ndate,
dateadd(day, -row_number() over (order by ndate), ndate) as rnk
from dates
) t
group by
rnk
【Comments】:
【Answer 6】: Another simple solution that works here is -
with tmp as
(
select
datefield
, dateadd('day',-row_number() over(order by datefield asc),datefield) as date_group
from your_table -- placeholder name: substitute your actual table
)
select
min(datefield) as start_date
, max(datefield) as end_date
from tmp
group by date_group
【Comments】:
【Answer 7】:
SELECT InfoDate ,
CASE
WHEN TRUNC(InfoDate - 1) = TRUNC(lag(InfoDate,1,InfoDate) over (order by InfoDate))
THEN NULL
ELSE InfoDate
END STARTDATE,
CASE
WHEN TRUNC(InfoDate + 1) = TRUNC(lead(InfoDate,1,InfoDate) over (order by InfoDate))
THEN NULL
ELSE InfoDate
END ENDDATE
FROM TABLE;
【Comments】: