How do I calculate coverage dates during a period of time in SQL against a transactional table?

Posted: 2017-02-24 18:31:58

Question:

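I'm attempting to compile a list of date ranges like so:

Coverage Range: 10/1/2016 - 10/5/2016

Coverage Range: 10/9/2016 - 10/31/2016

for each policy in a database table. The table is transactional. There is one cancellation transaction code, but three codes can indicate that coverage has begun. In addition, codes that indicate a coverage start can sometimes occur in sequence (a start on 10/1, another start on 10/5, then a cancel on 10/14). Here is an example series of transactions from which I'd like to generate the results above: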

TransID  PolicyID  EffDate
NewBus   1         9/15/2016
Confirm  1         9/17/2016
Cancel   1         10/5/2016
Reinst   1         10/9/2016
Cancel   1         10/15/2016
Reinst   1         10/15/2016
PolExp   1         3/15/2017

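So for this data set, I want the following results for the date range 10/1 - 10/31:

Coverage Range: 10/1/2016 - 10/5/2016

Coverage Range: 10/9/2016 - 10/31/2016

Note that because the cancel and the reinstatement on 10/15 occur on the same day, I'm excluding them from the result set. I tried pairing the transactions with subqueries: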

CONVERT(varchar(10), 
        CASE WHEN overall.sPTRN_ID in (SELECT code FROM @cancelTransCodes)
        -- This is a coverage cancellation entry
            THEN -- Set coverage start date using previous paired record                    
                CASE WHEN((SELECT MAX(inn.PD_EffectiveDate) FROM PolicyData inn WHERE inn.sPTRN_ID in (SELECT code FROM @startCoverageTransCodes)
                    and inn.PD_EffectiveDate <= overall.PD_EffectiveDate
                    and inn.PD_PolicyCode = overall.PD_PolicyCode) < @sDate) THEN @sDate 
                        ELSE
                        (SELECT MAX(inn.PD_EffectiveDate) FROM PolicyData inn WHERE inn.sPTRN_ID in (SELECT code FROM @startCoverageTransCodes)
                            and inn.PD_EffectiveDate <= overall.PD_EffectiveDate
                            and inn.PD_PolicyCode = overall.PD_PolicyCode)
                    END
            ELSE -- Set coverage start date using current record
                CASE WHEN (overall.PD_EffectiveDate < @sDate) THEN @sDate ELSE overall.PD_EffectiveDate END END, 101)                   
    as [Effective_Date]

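This mostly works, except for the case I listed above. I'd rather not rewrite this query if I can help it. I have a similar expression for the expiration date: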

ISNULL(CONVERT(varchar(10),                     
        CASE WHEN overall.sPTRN_ID in (SELECT code FROM @cancelTransCodes) -- This is a coverage cancellation entry                 
            THEN  -- Set coverage end date with current record                  
                overall.PD_EffectiveDate            
            ELSE -- check for future coverage end dates             
                CASE WHEN               
                    (SELECT COUNT(*) FROM PolicyData pd WHERE pd.PD_EffectiveDate > overall.PD_EffectiveDate and pd.sPTRN_ID in (SELECT code FROM @cancelTransCodes)) > 1           
                THEN -- There are future end dates          
                    CASE WHEN((SELECT TOP 1 pd.PD_ExpirationDate FROM PolicyData pd         
                        WHERE pd.PD_PolicyCode = overall.PD_PolicyCode      
                        and pd.PD_EntryDate between @sDate and @eDate   
                        and pd.sPTRN_ID in (SELECT code FROM @cancelTransCodes))) > @eDate
                        THEN @eDate 
                        ELSE
                            (SELECT TOP 1 pd.PD_ExpirationDate FROM PolicyData pd       
                                WHERE pd.PD_PolicyCode = overall.PD_PolicyCode      
                                and pd.PD_EntryDate between @sDate and @eDate   
                                and pd.sPTRN_ID in (SELECT code FROM @cancelTransCodes))
                        END

            ELSE -- No future coverage end dates                
                CASE WHEN(overall.PD_ExpirationDate > @eDate) THEN @eDate ELSE overall.PD_ExpirationDate END            
            END             

END, 101), CONVERT(varchar(10), CASE WHEN(overall.PD_ExpirationDate > @eDate) THEN @eDate ELSE overall.PD_ExpirationDate END, 101))                 
as [Expiration_Date]

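I can't help feeling there's a simpler solution that I'm missing here. So my question is: how can I modify the portions of the query above to handle the scenario described? Or, what is a better approach? If I can simplify this, I'd love to hear how.

Here is the solution I ended up implementing. I used a simplified table that boiled all start transaction codes down to START and all cancellation transaction codes down to CANCEL. Looking at the table that way made it much easier to see how my logic affected the results. I ended up with a simplified scheme in which CASE WHEN clauses identify specific scenarios, and I built my date ranges on those. I also changed my starting point: instead of looking at a cancel and finding the related start, I reversed it (look at a start, then find the related cancel). Here is the code I implemented: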

/* Get Coverage Dates */                

    ,cast((CASE WHEN sPTRN_ID in (SELECT code FROM @startCoverageTransCodes) THEN
                         CASE WHEN (cast(overall.PD_EntryDate as date) <= @sDate) THEN @sDate 
                         WHEN (cast(overall.PD_EntryDate as date) > @sDate AND cast(overall.PD_EntryDate as date) <= @eDate) THEN overall.PD_EntryDate
                         WHEN (cast(overall.PD_EntryDate as date) > @eDate) THEN @eDate 
                         ELSE cast(overall.PD_EntryDate as date) END
                    ELSE
                         null
                    END) as date) as Effective_Date
        ,cast((CASE WHEN sPTRN_ID in (SELECT code FROM @startCoverageTransCodes) THEN
                         CASE WHEN (SELECT MIN(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM @cancelTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode) > @eDate THEN @eDate
                         ELSE ISNULL((SELECT MIN(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM @cancelTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode), @eDate) END
                    ELSE
                         CASE WHEN (SELECT MAX(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM @startCoverageTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode) > @eDate THEN @eDate
                         ELSE (SELECT MAX(p.PD_EntryDate) FROM PolicyData p WITH (NOLOCK) WHERE p.sPTRN_ID in (SELECT code FROM @startCoverageTransCodes) AND p.PD_EntryDate > overall.PD_EntryDate AND p.PD_PolicyCOde = overall.PD_PolicyCode) 
                    END END) as date) as Expiration_Date

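As you can see, I'm relying on subqueries here. I previously had much of this logic as joins, which produced extra rows I didn't need. Building the date range logic on subqueries instead sped up the stored procedure by a few seconds: execution time is now under 1 second, where before it was 2-5 seconds.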
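The "find a start, then find the related cancel" idea can also be sketched against the simplified sample rows from the question rather than the real PolicyData table. This is only an illustration under stated assumptions: the inline rows stand in for the real table, PolExp is treated as a cancel, every other code is treated as a start, and a cancel that shares its date with a start is ignored. It avoids LEAD/LAG so that it stays within SQL Server 2008 R2.

```sql
/* Sketch only: pair each START with the first effective CANCEL after it,
   then clip the resulting spans to the report window. */
declare @sDate date = '20161001';
declare @eDate date = '20161031';

;with t as (   /* stand-in for the real table: the question's sample rows */
  select TransId, PolicyId, EffDate = convert(date, EffDate)
  from (values
      ('NewBus' , 1, '20160915')
    , ('Confirm', 1, '20160917')
    , ('Cancel' , 1, '20161005')
    , ('Reinst' , 1, '20161009')
    , ('Cancel' , 1, '20161015')
    , ('Reinst' , 1, '20161015')
    , ('PolExp' , 1, '20170315')
  ) as v(TransId, PolicyId, EffDate)
)
, ev as (      /* boil the codes down to START / CANCEL (assumed mapping) */
  select PolicyId, EffDate
    , Kind = case when TransId in ('Cancel','PolExp') then 'CANCEL' else 'START' end
  from t
)
, paired as (  /* for each START, the first CANCEL not undone the same day */
  select s.PolicyId
    , StartDate = s.EffDate
    , EndDate = (
        select min(c.EffDate)
        from ev as c
        where c.PolicyId = s.PolicyId
          and c.Kind = 'CANCEL'
          and c.EffDate > s.EffDate
          and not exists (
            select 1
            from ev as r
            where r.PolicyId = c.PolicyId
              and r.Kind = 'START'
              and r.EffDate = c.EffDate)
      )
  from ev as s
  where s.Kind = 'START'
)
, spans as (   /* consecutive STARTs share an end date: keep the earliest */
  select PolicyId, StartDate = min(StartDate), EndDate
  from paired
  group by PolicyId, EndDate
)
select PolicyId
  , FromDate = case when StartDate < @sDate then @sDate else StartDate end
  , ThruDate = case when EndDate is null or EndDate > @eDate
                    then @eDate else EndDate end
from spans
where StartDate <= @eDate
  and (EndDate is null or EndDate >= @sDate)
order by PolicyId, FromDate;
```

For the sample rows this should yield 2016-10-01 to 2016-10-05 and 2016-10-09 to 2016-10-31 for policy 1. A start and cancel on the same day with no earlier coverage is an edge case this sketch does not try to settle.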

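Comments:

Which version of SQL Server are you using?

I would transform the data (possibly in a common table expression) so that all starts map to "START" and all ends map to "END" (join to a lookup table or something that defines the equivalents), eliminating events that cancel each other out on the same date. Then you can find "the first END after a START", which is easy to state in SQL (the minimum END date where there does not exist a START later than this START's date). There is a little more work to do for the ranges at the outer boundaries. Once you have all the coverage ranges for a policy, you can truncate them to the report's start and end dates.

@SqlZim, I'm using SQL Server 2008 R2. Cade Roux, that's an interesting solution. I like its simplicity. I believe I've already done the first part with the table variables I'm calling, but adding a NOT EXISTS to the WHERE clause of my subqueries would simplify it considerably...

@Cade Roux I used a modified version of your idea. Using the simplified list helped me a great deal in working out the logic, and it showed me that I was looking at the problem backwards. It was better to base my date ranges on the start of coverage and then find the corresponding CANCEL (if any). Here is the code I used (added to the question above).

Answer 1: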

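There may be a simpler solution, but I can't see one right now.

The outline of the steps is:

    1. Generate the dates for the date range. You can skip this if you have a calendar table.
    2. Transform the incoming data set as described in your question (skipping a start/cancel on the same day), and add the next EffDate to each row.
    3. Explode the data set into one row per day across each range produced in step 2.
    4. Reduce the data set to date ranges based on consecutive days of coverage.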

Test setup: http://rextester.com/GUNSO45644
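The contents of the linked test setup are not reproduced here. As a rough stand-in, a table matching the names the query relies on (t, TransId, PolicyId, EffDate) could be created and filled with the question's sample rows like this; the column types are assumptions.

```sql
/* Hypothetical test table: the name t and its column names come from the
   answer's query; the column types are assumed. */
create table t (
    TransId  varchar(10) not null
  , PolicyId int         not null
  , EffDate  date        not null
);

/* Sample transactions from the question */
insert into t (TransId, PolicyId, EffDate) values
    ('NewBus' , 1, '20160915')
  , ('Confirm', 1, '20160917')
  , ('Cancel' , 1, '20161005')
  , ('Reinst' , 1, '20161009')
  , ('Cancel' , 1, '20161015')
  , ('Reinst' , 1, '20161015')
  , ('PolExp' , 1, '20170315');
```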

/* set date range */
declare @fromdate date = '20161001'
declare @thrudate date = '20161031'
/* generate dates in range -- you can skip this if you have a calendar table */
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
  select top (datediff(day, @fromdate, @thrudate)+1) 
    [Date]=convert(date,dateadd(day,row_number() over(order by (select 1))-1, @fromdate))
  from         n as deka
    cross join n as hecto     /* 100 days   */
    cross join n as kilo      /* 2.73 years */
    cross join n as [tenK]    /* 27.3 years */
  order by [Date]
)
/* reduce test table to desired input*/
, pol as (
  select 
      Coverage = case when max(TransId) in ('Cancel','PolExp') 
                  then 0 else 1 end
    , PolicyId
    , EffDate  = case when max(TransId) in ('Cancel','PolExp') 
                  then dateadd(day,1,EffDate) else EffDate end
    , NextEffDate = oa.NextEffDate
  from t
    outer apply (
        select top 1
          NextEffDate = case 
            when i.TransId in ('Cancel','PolExp')
              then dateadd(day,1,i.EffDate) 
            else i.EffDate
            end
        from t as i
        where i.PolicyId = t.PolicyId
          and i.EffDate > t.EffDate
        order by 
            i.EffDate asc
          , case when i.TransId in ('Cancel','PolExp') then 1 else 0 end desc 
        ) as oa
  group by t.PolicyId, t.EffDate, oa.NextEffDate
)
/* explode desired input by day, add row_numbers() */
, e as (
select pol.PolicyId, pol.Coverage, d.Date
    , rn_x = row_number() over (
        partition by pol.PolicyId
        order by d.Date
        )
    , rn_y = row_number() over (
        partition by pol.PolicyId, pol.Coverage 
        order by d.date)
  from pol
    inner join dates as d
      on d.date >= pol.EffDate
     and d.date < pol.NextEffDate
)
/* reduce to date ranges where Coverage = 1 */
select 
    PolicyId
  , FromDate = convert(varchar(10),min(Date),120)
  , ThruDate = convert(varchar(10),max(Date),120)
from e
where Coverage = 1
group by PolicyId, (rn_x-rn_y);

Returns:

+----------+------------+------------+
| PolicyId |  FromDate  |  ThruDate  |
+----------+------------+------------+
|        1 | 2016-10-01 | 2016-10-05 |
|        1 | 2016-10-09 | 2016-10-31 |
+----------+------------+------------+
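
The final grouping on (rn_x - rn_y) is what collapses individual days into ranges. A minimal sketch of why that works, using made-up day numbers and coverage flags rather than real policy data: within a run of consecutive covered rows both row numbers advance together, so their difference stays constant, and it changes whenever uncovered rows intervene.

```sql
/* Illustration data only: d is a day number, coverage is 1 or 0. */
;with e as (
  select d, coverage
    , rn_x = row_number() over (order by d)
    , rn_y = row_number() over (partition by coverage order by d)
  from (values (1,1),(2,1),(3,0),(4,0),(5,1),(6,1),(7,1)) as v(d, coverage)
)
select island = rn_x - rn_y   /* constant within each consecutive run */
  , from_d = min(d)
  , thru_d = max(d)
from e
where coverage = 1
group by rn_x - rn_y
order by from_d;
```

Here days 1-2 get a difference of 0 and days 5-7 get a difference of 2, so the query returns two ranges: 1 to 2 and 5 to 7.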
