Find range of dates based on a column

Posted: 2020-06-09 20:03:30

Question:

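I want to group the data below by ITEM and by each change of STATUS. For the sample below I expect 3 rows, because the status switches back to its earlier value.

Currently I use MIN(FROM_DT) and MAX(TO_DT), but that returns only two rows, because the STATUS column contains only two distinct values.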
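The collapse to two rows can be reproduced on a cut-down copy of the data. This is only a sketch: it assumes SQLite (through Python's `sqlite3` module) rather than SQL Server, stores the dates as ISO-8601 text, keeps six of the sample rows, and uses a lowercase table name `ah_temp` and a helper `naive_ranges` that are made up for the illustration.

```python
import sqlite3

# Six rows cut down from the sample data (assumption: dates kept as ISO text,
# which compares correctly as plain strings).
ROWS = [
    ("ITEM1", "2020-05-14 22:59:17.623", "2020-05-15 17:48:34.000", 0),
    ("ITEM1", "2020-05-15 17:48:35.000", "2020-05-15 22:57:15.923", 0),
    ("ITEM1", "2020-05-15 22:59:11.933", "2020-05-16 22:54:31.750", 1),
    ("ITEM1", "2020-05-16 22:56:26.793", "2020-05-18 22:55:01.050", 1),
    ("ITEM1", "2020-05-18 23:00:23.103", "2020-05-21 22:55:24.400", 0),
    ("ITEM1", "2020-06-08 22:58:27.710", "9999-12-31 00:00:00.000", 0),
]


def naive_ranges(rows):
    """Group by item and status only, as in the MIN/MAX attempt."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ah_temp (item TEXT, from_dt TEXT, to_dt TEXT, excl INTEGER)"
    )
    con.executemany("INSERT INTO ah_temp VALUES (?, ?, ?, ?)", rows)
    result = con.execute(
        "SELECT item, MIN(from_dt), MAX(to_dt), excl "
        "FROM ah_temp GROUP BY item, excl ORDER BY excl"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in naive_ranges(ROWS):
        print(row)
```

Both runs of status 0 end up in one group, so the status 0 range swallows the status 1 period in the middle.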

Expected result:

ITEM    FROM_DT     TO_DT       STATUS
ITEM1   02/01/2020  15/05/2020  0
ITEM1   15/05/2020  18/05/2020  1
ITEM1   18/05/2020  31/12/9999  0

Sample data:

CREATE TABLE [dbo].[AH_TEMP](
    [ITEM] [varchar](24) NULL,
    [FROM_DT] [datetime] NULL,
    [TO_DT] [datetime] NULL,
    [EXCL] [bit] NULL
) ON [PRIMARY]
GO

INSERT INTO AH_TEMP 
VALUES  
    ('ITEM1','2020-01-02 22:57:01.200','2020-01-07 22:54:52.930','0'),
    ('ITEM1','2020-01-07 22:57:21.950','2020-02-03 22:54:30.500','0'),
    ('ITEM1','2020-02-03 22:58:21.180','2020-03-02 22:54:27.253','0'),
    ('ITEM1','2020-03-02 22:56:30.737','2020-03-18 17:21:23.390','0'),
    ('ITEM1','2020-03-18 17:21:23.403','2020-03-19 09:05:38.060','0'),
    ('ITEM1','2020-03-19 09:05:38.063','2020-03-19 13:57:03.567','0'),
    ('ITEM1','2020-03-19 13:57:03.570','2020-03-19 23:01:41.403','0'),
    ('ITEM1','2020-03-19 23:03:49.900','2020-03-20 23:02:25.437','0'),
    ('ITEM1','2020-03-20 23:04:53.610','2020-04-01 22:59:39.220','0'),
    ('ITEM1','2020-04-01 23:01:45.620','2020-05-01 22:59:09.153','0'),
    ('ITEM1','2020-05-01 23:01:11.980','2020-05-14 14:30:21.930','0'),
    ('ITEM1','2020-05-14 14:30:21.930','2020-05-14 22:57:24.753','0'),
    ('ITEM1','2020-05-14 22:59:17.623','2020-05-15 17:48:34.000','0'),
    ('ITEM1','2020-05-15 17:48:35.000','2020-05-15 22:57:15.923','0'),
    ('ITEM1','2020-05-15 22:59:11.933','2020-05-16 22:54:31.750','1'),
    ('ITEM1','2020-05-16 22:56:26.793','2020-05-18 22:55:01.050','1'),
    ('ITEM1','2020-05-18 23:00:23.103','2020-05-21 22:55:24.400','0'),
    ('ITEM1','2020-05-21 22:57:01.723','2020-06-01 23:00:21.823','0'),
    ('ITEM1','2020-06-01 23:03:12.467','2020-06-08 22:55:20.393','0'),
    ('ITEM1','2020-06-08 22:58:27.710','9999-12-31 00:00:00.000','0');

Which returns:

+-------+-------------------------+-------------------------+--------+
| ITEM  |         FROM_DT         |          TO_DT          | STATUS |
+-------+-------------------------+-------------------------+--------+
| ITEM1 | 2020-01-02 22:57:01.200 | 2020-01-07 22:54:52.930 |      0 |
| ITEM1 | 2020-01-07 22:57:21.950 | 2020-02-03 22:54:30.500 |      0 |
| ITEM1 | 2020-02-03 22:58:21.180 | 2020-03-02 22:54:27.253 |      0 |
| ITEM1 | 2020-03-02 22:56:30.737 | 2020-03-18 17:21:23.390 |      0 |
| ITEM1 | 2020-03-18 17:21:23.403 | 2020-03-19 09:05:38.060 |      0 |
| ITEM1 | 2020-03-19 09:05:38.063 | 2020-03-19 13:57:03.567 |      0 |
| ITEM1 | 2020-03-19 13:57:03.570 | 2020-03-19 23:01:41.403 |      0 |
| ITEM1 | 2020-03-19 23:03:49.900 | 2020-03-20 23:02:25.437 |      0 |
| ITEM1 | 2020-03-20 23:04:53.610 | 2020-04-01 22:59:39.220 |      0 |
| ITEM1 | 2020-04-01 23:01:45.620 | 2020-05-01 22:59:09.153 |      0 |
| ITEM1 | 2020-05-01 23:01:11.980 | 2020-05-14 14:30:21.930 |      0 |
| ITEM1 | 2020-05-14 14:30:21.930 | 2020-05-14 22:57:24.753 |      0 |
| ITEM1 | 2020-05-14 22:59:17.623 | 2020-05-15 17:48:34.000 |      0 |
| ITEM1 | 2020-05-15 17:48:35.000 | 2020-05-15 22:57:15.923 |      0 |
| ITEM1 | 2020-05-15 22:59:11.933 | 2020-05-16 22:54:31.750 |      1 |
| ITEM1 | 2020-05-16 22:56:26.793 | 2020-05-18 22:55:01.050 |      1 |
| ITEM1 | 2020-05-18 23:00:23.103 | 2020-05-21 22:55:24.400 |      0 |
| ITEM1 | 2020-05-21 22:57:01.723 | 2020-06-01 23:00:21.823 |      0 |
| ITEM1 | 2020-06-01 23:03:12.467 | 2020-06-08 22:55:20.393 |      0 |
| ITEM1 | 2020-06-08 22:58:27.710 | 9999-12-31 00:00:00.000 |      0 |
+-------+-------------------------+-------------------------+--------+

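Comments:

Please edit your question to update it instead of adding comments. Also, please remove the image of the data and paste the data back as plain text, so that others can copy it for testing.

Here is the tool I use to generate tabular data: ozh.github.io/ascii-tables

@RonenAriely Thanks for the reply. Our data is ordered by the FROM_DT column, and the column that identifies the change is STATUS/EXCL.

Great work @AndyH, thanks for supplying the missing information. I have added another solution alongside the one you already have.

Thanks @RonenAriely

Answer 1: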

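Use lag to detect each change of status, then take a running sum of those changes. Every row in the same run of identical statuses gets the same running total, so grouping by that total produces the required groups.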

declare @Test table (ITEM varchar(24), FROM_DT datetime, TO_DT datetime, [STATUS] bit)

INSERT INTO @test VALUES  ('ITEM1','2020-01-02 22:57:01.200','2020-01-07 22:54:52.930','0');
INSERT INTO @test VALUES  ('ITEM1','2020-01-07 22:57:21.950','2020-02-03 22:54:30.500','0');
INSERT INTO @test VALUES  ('ITEM1','2020-02-03 22:58:21.180','2020-03-02 22:54:27.253','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-02 22:56:30.737','2020-03-18 17:21:23.390','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-18 17:21:23.403','2020-03-19 09:05:38.060','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-19 09:05:38.063','2020-03-19 13:57:03.567','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-19 13:57:03.570','2020-03-19 23:01:41.403','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-19 23:03:49.900','2020-03-20 23:02:25.437','0');
INSERT INTO @test VALUES  ('ITEM1','2020-03-20 23:04:53.610','2020-04-01 22:59:39.220','0');
INSERT INTO @test VALUES  ('ITEM1','2020-04-01 23:01:45.620','2020-05-01 22:59:09.153','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-01 23:01:11.980','2020-05-14 14:30:21.930','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-14 14:30:21.930','2020-05-14 22:57:24.753','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-14 22:59:17.623','2020-05-15 17:48:34.000','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-15 17:48:35.000','2020-05-15 22:57:15.923','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-15 22:59:11.933','2020-05-16 22:54:31.750','1');
INSERT INTO @test VALUES  ('ITEM1','2020-05-16 22:56:26.793','2020-05-18 22:55:01.050','1');
INSERT INTO @test VALUES  ('ITEM1','2020-05-18 23:00:23.103','2020-05-21 22:55:24.400','0');
INSERT INTO @test VALUES  ('ITEM1','2020-05-21 22:57:01.723','2020-06-01 23:00:21.823','0');
INSERT INTO @test VALUES  ('ITEM1','2020-06-01 23:03:12.467','2020-06-08 22:55:20.393','0');
INSERT INTO @test VALUES  ('ITEM1','2020-06-08 22:58:27.710','9999-12-31 00:00:00.000','0');

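select ITEM, min(FROM_DT) as FROM_DT, max(TO_DT) as TO_DT, [STATUS]
from (
  -- running total of status changes: rows in the same run share one GroupBy value
  select *
    , sum(case when coalesce(lag,0) <> [STATUS] then 1 else 0 end) over (order by FROM_DT, TO_DT) GroupBy
  from (
    -- status of the previous row in FROM_DT order (null for the first row)
    select *
      , lag([STATUS]) over (order by FROM_DT) lag
    from @Test
  ) X
) Y
group by ITEM, GroupBy, [STATUS]
order by ITEM, GroupBy;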

This gives:

ITEM    FROM_DT                 TO_DT                   STATUS
ITEM1   2020-01-02 22:57:01.200 2020-05-15 22:57:15.923 0
ITEM1   2020-05-15 22:59:11.933 2020-05-18 22:55:01.050 1
ITEM1   2020-05-18 23:00:23.103 9999-12-31 00:00:00.000 0

If you want to see how it works, run just the inner part:

select *
  , sum(case when coalesce(lag,0) <> [STATUS] then 1 else 0 end) over (order by FROM_DT, TO_DT) GroupBy
from (
  select *
    , lag([STATUS]) over (order by FROM_DT) lag
  from @Test
) X
order by ITEM, FROM_DT, TO_DT
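For readers without a SQL Server instance at hand, the intermediate columns can also be inspected with a small sketch in SQLite through Python's `sqlite3` module. The assumptions here are mine, not the answer's: SQLite 3.25 or later (for window functions), dates stored as ISO text, six rows cut down from the sample, and the names `test`, `prev_status`, `grp` and `trace` chosen for the illustration.

```python
import sqlite3

# Six rows cut down from the sample data (assumption: ISO text dates).
ROWS = [
    ("ITEM1", "2020-05-14 22:59:17.623", "2020-05-15 17:48:34.000", 0),
    ("ITEM1", "2020-05-15 17:48:35.000", "2020-05-15 22:57:15.923", 0),
    ("ITEM1", "2020-05-15 22:59:11.933", "2020-05-16 22:54:31.750", 1),
    ("ITEM1", "2020-05-16 22:56:26.793", "2020-05-18 22:55:01.050", 1),
    ("ITEM1", "2020-05-18 23:00:23.103", "2020-05-21 22:55:24.400", 0),
    ("ITEM1", "2020-06-08 22:58:27.710", "9999-12-31 00:00:00.000", 0),
]

# Same shape as the query above: LAG finds the previous status, and the
# running SUM counts how many status changes have happened so far.
TRACE_SQL = """
SELECT from_dt, status, prev_status,
       SUM(CASE WHEN COALESCE(prev_status, 0) <> status THEN 1 ELSE 0 END)
           OVER (ORDER BY from_dt, to_dt) AS grp
FROM (
    SELECT *, LAG(status) OVER (ORDER BY from_dt) AS prev_status
    FROM test
) AS x
ORDER BY from_dt, to_dt
"""


def trace(rows):
    """Return (from_dt, status, prev_status, grp) for every row."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE test (item TEXT, from_dt TEXT, to_dt TEXT, status INTEGER)"
    )
    con.executemany("INSERT INTO test VALUES (?, ?, ?, ?)", rows)
    out = con.execute(TRACE_SQL).fetchall()
    con.close()
    return out


if __name__ == "__main__":
    for row in trace(ROWS):
        print(row)
```

On these six rows the `grp` column reads 0, 0, 1, 1, 2, 2: it increases by one at each status change and stays flat inside a run, which is what makes it usable as a grouping key.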

Answer 2:

Thanks for posting the missing information (DDL+DML).

Please check whether the following meets your needs:

;With MyCTE as (
    SELECT 
        ITEM, FROM_DT, TO_DT, EXCL
        , MyGROUP = ROW_NUMBER() OVER (ORDER BY FROM_DT) - RANK() OVER (PARTITION BY EXCL ORDER BY FROM_DT)  
    FROM AH_TEMP
)
SELECT ITEM, MIN(FROM_DT), MAX(TO_DT), EXCL as [STATUS]
FROM MyCTE
GROUP BY ITEM, EXCL, MyGROUP
ORDER BY MIN(FROM_DT)
GO
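The query relies on the difference between two numberings: ROW_NUMBER counts every row in FROM_DT order, while RANK counts only rows with the same EXCL value, so the difference stays constant inside a run and changes once a run of the other status has passed. A sketch of this on six of the sample rows, assuming SQLite 3.25 or later through Python's `sqlite3` module, ISO text dates, and illustrative names (`ah_temp` in lowercase, `my_group`, `islands`):

```python
import sqlite3

# Six rows cut down from the sample data (assumption: ISO text dates).
ROWS = [
    ("ITEM1", "2020-05-14 22:59:17.623", "2020-05-15 17:48:34.000", 0),
    ("ITEM1", "2020-05-15 17:48:35.000", "2020-05-15 22:57:15.923", 0),
    ("ITEM1", "2020-05-15 22:59:11.933", "2020-05-16 22:54:31.750", 1),
    ("ITEM1", "2020-05-16 22:56:26.793", "2020-05-18 22:55:01.050", 1),
    ("ITEM1", "2020-05-18 23:00:23.103", "2020-05-21 22:55:24.400", 0),
    ("ITEM1", "2020-06-08 22:58:27.710", "9999-12-31 00:00:00.000", 0),
]

# Same structure as the CTE above, with my_group added to the output
# so the grouping key is visible.
ISLANDS_SQL = """
WITH my_cte AS (
    SELECT item, from_dt, to_dt, excl,
           ROW_NUMBER() OVER (ORDER BY from_dt)
         - RANK() OVER (PARTITION BY excl ORDER BY from_dt) AS my_group
    FROM ah_temp
)
SELECT item, MIN(from_dt), MAX(to_dt), excl, my_group
FROM my_cte
GROUP BY item, excl, my_group
ORDER BY MIN(from_dt)
"""


def islands(rows):
    """Return one (item, from, to, excl, my_group) tuple per run of equal status."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ah_temp (item TEXT, from_dt TEXT, to_dt TEXT, excl INTEGER)"
    )
    con.executemany("INSERT INTO ah_temp VALUES (?, ?, ?, ?)", rows)
    out = con.execute(ISLANDS_SQL).fetchall()
    con.close()
    return out


if __name__ == "__main__":
    for row in islands(ROWS):
        print(row)
```

In this cut-down data the last two runs (status 1, then status 0) happen to get the same `my_group` value. The difference is only unique per status, which is why EXCL has to stay in the GROUP BY next to `my_group`.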

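Comments:

@DaleK Why did you delete your comment? Now my thank-you comment no longer seems related to anything. It was a reply to you saying you liked the answer, and now it has no connection to any earlier message.

"Because you'd seen the comment which was my intent and its not helpful for other people in the long run. Unless a comment provides additional value to the question it will get deleted at some point in time. Its all part of leaving clean, neat questions and answers. – Dale K" In my personal opinion this is a poor practice. It makes me think ten times about why I should spend time responding.

@RonenAriely I made a small edit to add partitioning by ITEM: MyGROUP = ROW_NUMBER() OVER (PARTITION BY ITEM ORDER BY FROM_DT) - RANK() OVER (PARTITION BY ITEM, EXCL ORDER BY FROM_DT)

Makes sense @AndyH :-)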
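The effect of partitioning by ITEM only shows up once the table holds more than one item. The sketch below compares the two expressions on three made-up rows; ITEM2 and its dates are hypothetical and not part of the question's data, and the code again assumes SQLite 3.25 or later through Python's `sqlite3` module with illustrative names (`ah_temp`, `my_group`, `ranges`).

```python
import sqlite3

# Hypothetical data: ITEM1 never changes status, but an ITEM2 row with a
# different status falls between its two rows in FROM_DT order.
ROWS = [
    ("ITEM1", "2020-05-01", "2020-05-10", 0),
    ("ITEM2", "2020-05-05", "2020-05-12", 1),
    ("ITEM1", "2020-05-10", "2020-05-20", 0),
]

UNPARTITIONED = (
    "ROW_NUMBER() OVER (ORDER BY from_dt)"
    " - RANK() OVER (PARTITION BY excl ORDER BY from_dt)"
)
PARTITIONED = (
    "ROW_NUMBER() OVER (PARTITION BY item ORDER BY from_dt)"
    " - RANK() OVER (PARTITION BY item, excl ORDER BY from_dt)"
)


def ranges(rows, group_expr):
    """Collapse rows into ranges, using group_expr as the grouping key."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE ah_temp (item TEXT, from_dt TEXT, to_dt TEXT, excl INTEGER)"
    )
    con.executemany("INSERT INTO ah_temp VALUES (?, ?, ?, ?)", rows)
    sql = f"""
        WITH my_cte AS (
            SELECT item, from_dt, to_dt, excl, {group_expr} AS my_group
            FROM ah_temp
        )
        SELECT item, MIN(from_dt), MAX(to_dt), excl
        FROM my_cte
        GROUP BY item, excl, my_group
        ORDER BY item, MIN(from_dt)
    """
    out = con.execute(sql).fetchall()
    con.close()
    return out


if __name__ == "__main__":
    print(ranges(ROWS, UNPARTITIONED))  # ITEM1 is split into two ranges
    print(ranges(ROWS, PARTITIONED))    # ITEM1 stays one range
```

Without the partition, the row numbering runs across both items, so the ITEM2 row shifts the difference for ITEM1 and splits a run that never changed status. Numbering each item separately avoids that.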
