Adding an indicator column to a table based on having two consecutive days within a group

Posted 2023-03-31

【Title】: Adding indicator column to table based on having two consecutive days within group
【Posted】: 2020-07-26 21:22:40
【Question】:

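I need to add logic that flags the first of two consecutive days as 1 and the second as 0, grouped by the column test. If a test (for example a) has three consecutive days, the third day should start again at 1, and so on.

A sample table is shown below; new col is the column I need.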

|---------------------|------------------|---------------------|
|      test           |     test_date    |      new col        |
|---------------------|------------------|---------------------|
|      a              |     1/1/2020     |      1              |
|---------------------|------------------|---------------------|
|      a              |     1/2/2020     |      0              |
|---------------------|------------------|---------------------|
|      a              |     1/3/2020     |      1              |
|---------------------|------------------|---------------------|
|      b              |     1/1/2020     |      1              |
|---------------------|------------------|---------------------|
|      b              |     1/2/2020     |      0              |
|---------------------|------------------|---------------------|
|      b              |     1/15/2020    |      1              |
|---------------------|------------------|---------------------|

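Since this looks like a gaps-and-islands problem, I think a window-function approach should get me there.

I tried something like the following to identify the consecutive runs, but I am stuck on the indicator column.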

Select 
test, 
test_date,
grp_var = dateadd(day, 
                 -row_number() over (partition by test order by test_date), test_date)    
from 
my_table

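【Comments】:

【Answer 1】:

This is indeed a gaps-and-islands problem. I suggest subtracting row_number() (in days) from the date to generate the groups: within a run of consecutive dates that difference is constant, so it identifies each island. Then number the rows inside each island and take the result modulo 2: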

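select
    test,
    test_date, 
    -- position within the island, modulo 2: 1, 0, 1, 0, ...
    row_number() over(  
        -- dateadd(day, -rn, test_date) is constant across consecutive dates
        partition by test, dateadd(day, -rn, test_date)
        order by test_date
    ) % 2 new_col
from (
    select 
        t.*, 
        -- sequence number of each date within its test
        row_number() over(partition by test order by test_date) rn
    from mytable t
) t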

Demo on DB Fiddle:

test | test_date  | new_col
:--- | :--------- | ------:
a    | 2020-01-01 |       1
a    | 2020-01-02 |       0
a    | 2020-01-03 |       1
b    | 2020-01-01 |       1
b    | 2020-01-02 |       0
b    | 2020-01-15 |       1
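To make the island logic easier to follow outside SQL, here is a minimal Python sketch of the same idea. It is an illustration only, not part of the original answer: the function name add_indicator and the tuple layout are hypothetical, and it assumes each test has at most one row per date (with duplicate dates the adjacent-key shortcut used here would no longer match the SQL partitioning).

```python
from datetime import date, timedelta
from itertools import groupby


def add_indicator(rows):
    """Return (test, test_date, new_col) tuples for (test, test_date) rows.

    Hypothetical helper mirroring the query: assumes distinct dates per test.
    """
    result = []
    # Sort by test, then date, like "partition by test order by test_date".
    for test, group in groupby(sorted(rows), key=lambda r: r[0]):
        island_key = None
        position = 0
        for rn, (_, test_date) in enumerate(group, start=1):
            # Same role as dateadd(day, -rn, test_date): constant within
            # a run of consecutive dates, different once there is a gap.
            key = test_date - timedelta(days=rn)
            if key != island_key:
                island_key = key
                position = 0  # a new island starts here
            position += 1
            # Odd positions in the island get 1, even positions get 0.
            result.append((test, test_date, position % 2))
    return result


sample = [
    ("a", date(2020, 1, 1)),
    ("a", date(2020, 1, 2)),
    ("a", date(2020, 1, 3)),
    ("b", date(2020, 1, 1)),
    ("b", date(2020, 1, 2)),
    ("b", date(2020, 1, 15)),
]

for test, test_date, new_col in add_indicator(sample):
    print(test, test_date.isoformat(), new_col)
```

On the sample rows from the question, the new_col values come out in the same order as in the demo table. The gap between 2020-01-02 and 2020-01-15 for test b changes the key, so the indicator restarts at 1.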

【Comments】:
