Group rows by contiguous date ranges for groups of values

Posted 2023-03-31

Question (asked 2014-12-08 22:55:39):

Consider some table T, ordered by Col1, Col2, Date1, Date2:

Col1    Col2    Date1         Date2          rate
ABC     123     11/4/2014     11/5/2014      -90
ABC     123     11/4/2014     11/6/2014      -55
ABC     123     11/4/2014     11/7/2014      -90
ABC     123     11/4/2014     11/10/2014     -90

I want to group the data so that changes are easy to audit and repetition is reduced, giving:

Col1    Col2    Date1         start_Date2    end_Date2      rate
ABC     123     11/4/2014     11/5/2014      11/5/2014      -90
ABC     123     11/4/2014     11/6/2014      11/6/2014      -55
ABC     123     11/4/2014     11/7/2014      11/10/2014     -90

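I could do this easily if I had another column that numbered the rows 1 2 3 3 (all that matters is that the numbers differ between groups) and then used GROUP BY on that column.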

My attempt at a query:

SELECT *, DENSE_RANK() OVER (ORDER BY rate) island
FROM T
ORDER BY Date2

does not give what I am looking for:

Col1    Col2    Date1         Date2          rate     island
ABC     123     11/4/2014     11/5/2014      -90      1
ABC     123     11/4/2014     11/6/2014      -55      2
ABC     123     11/4/2014     11/7/2014      -90      1
ABC     123     11/4/2014     11/10/2014     -90      1

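I want the query to recognize that the second run of -90 values should be treated as a new group, because it appears after a group with a different rate.

The [gaps-and-islands] SQL tag has been very helpful, but I cannot work out how to handle the rate reverting to a previous value. How should I modify my query?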

Comments on the question:

You might be interested in this answer to "Solving 'Gaps and Islands' with row_number() and dense_rank()?"

Answer 1:

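You can use the difference of two row_number() values to identify the groups. Within a run of consecutive rows that share the same rate, that difference is constant.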

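select col1, col2, date1,
       min(date2) as start_Date2, max(date2) as end_Date2, rate
from (select t.*,
             -- position within (col1, col2, date1) minus position among rows
             -- of the same rate: constant along each unbroken run of a rate
             (row_number() over (partition by col1, col2, date1 order by date2) -
              row_number() over (partition by col1, col2, date1, rate order by date2)
             ) as grp
      from T t  -- T is the table from the question
     ) t
group by col1, col2, date1, rate, grp
order by col1, col2, date1, start_Date2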
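
To see why the difference stays constant within a run, the sketch below recomputes both row numbers for the sample rows from the question. It is written in Python rather than SQL so that it runs without a database; the `find_islands` function and the tuple layout are illustrative assumptions, not part of the original answer.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (col1, col2, date1, date2, rate)
rows = [
    ("ABC", 123, date(2014, 11, 4), date(2014, 11, 5), -90),
    ("ABC", 123, date(2014, 11, 4), date(2014, 11, 6), -55),
    ("ABC", 123, date(2014, 11, 4), date(2014, 11, 7), -90),
    ("ABC", 123, date(2014, 11, 4), date(2014, 11, 10), -90),
]


def find_islands(rows):
    """Group rows using the row_number() difference idea (hypothetical helper)."""
    rn_all = defaultdict(int)   # row number per (col1, col2, date1)
    rn_rate = defaultdict(int)  # row number per (col1, col2, date1, rate)
    islands = {}                # (col1, col2, date1, rate, grp) -> (start, end)

    # Sorting by partition key, then date2, mimics ORDER BY date2 in each window.
    for col1, col2, date1, date2, rate in sorted(rows, key=lambda r: r[:4]):
        outer = (col1, col2, date1)
        inner = (col1, col2, date1, rate)
        rn_all[outer] += 1
        rn_rate[inner] += 1
        # Both counters advance together while the rate repeats, so the
        # difference only changes after a different rate has intervened.
        grp = rn_all[outer] - rn_rate[inner]
        key = inner + (grp,)
        start, end = islands.get(key, (date2, date2))
        islands[key] = (min(start, date2), max(end, date2))

    # Return (col1, col2, date1, start_date2, end_date2, rate), by start date.
    ordered = sorted(islands.items(), key=lambda kv: kv[1][0])
    return [(k[0], k[1], k[2], start, end, k[3]) for k, (start, end) in ordered]


islands = find_islands(rows)
```

For the four sample rows the differences are 0, 1, 1, 1, and because `rate` is part of the grouping key the -55 row and the second -90 run stay separate, giving three islands with the last one spanning 11/7/2014 to 11/10/2014. Note that rows are merged when they are adjacent in Date2 order, not when their dates are consecutive calendar days.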

Comments on the answer:

For an explanation of this technique, you might be interested in this answer to "Solving 'Gaps and Islands' with row_number() and dense_rank()?"
