SQL Server: How to count the maximum consecutive change for each observation


[Title]: SQL Server: How to count the maximum consecutive change for each observation
[Posted]: 2018-08-25 09:47:46
[Question]:

I am using SQL Server 2012. I want to compute the maximum consecutive increase as of each observation. The table looks like this:

snapshot_date	customer_id	Number	Max_consecutive_increase_as_of_each_row
Jan-14	12342	0	0
Feb-14	12342	15	1
Mar-14	12342	45	2
Apr-14	12342	0	2
May-14	12342	15	2
Jun-14	12342	45	2
Jul-14	12342	75	3
Aug-14	12342	105	4
Sep-14	12342	135	5
Oct-14	12342	0	4
Nov-14	12342	0	3
Dec-14	12342	0	2
Jan-15	12342	0	1
Feb-15	12342	0	0
Mar-15	12342	0	0
Apr-15	12342	0	0

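Starting from each row, look back over the previous 6 rows (including the current row). Of course, some of the earliest rows only have 1 or 2 rows before them. The count is based on increases in the "Number" column. If, within those 6 rows, the consecutive runs have lengths 2 and 3 --> I want to take 3.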
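One way to read this rule, consistent with the expected column in the table above, is that a row counts as an increase when its Number is greater than the previous row's Number, and that a run only counts increases whose two rows both fall inside the 6-row window. That reading is an assumption inferred from the sample, not something the question states outright. A minimal sketch of the first step, flagging increases with LAG (available from SQL Server 2012); the table variable @sample and the column name is_increase are made up for illustration:

```sql
-- Hypothetical sample: the first six rows of the table above
declare @sample table(snapshot_date date, customer_id int, Number int);
insert into @sample values
 ('20140101',12342,0),('20140201',12342,15),('20140301',12342,45)
,('20140401',12342,0),('20140501',12342,15),('20140601',12342,45);

select snapshot_date
      ,customer_id
      ,Number
       -- 1 when Number is greater than the previous row's Number for the same customer.
       -- The first row has no previous row (LAG returns NULL), so it is flagged 0.
      ,case when Number > lag(Number) over (partition by customer_id order by snapshot_date)
            then 1
            else 0
            end as is_increase
from @sample
order by customer_id, snapshot_date;
```

On these six rows the flags come out as 0, 1, 1, 0, 1, 1. The longest run of 1s is 2, which matches the Jun-14 value in the table.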

I tried using a cursor with fetch relative -n rows, but my code does not work. Please help me solve this.

Thank you very much!

[Comments]:

Can you post your script so we can see what you have tried so far?

I tried using a cursor with fetch relative, for example: fetch relative -5 from Test1cursor into ... fetch relative -5 from Test1cursor into ... end

[Answer 1]:

This should work for you:

-- Create test data

declare @t table(snapshot_date date,customer_id int,Number int);
insert into @t values ('20140101',12342,    0   ),('20140201',12342,    15  ),('20140301',12342,    45  ),('20140401',12342,    0   ),('20140501',12342,    15  ),('20140601',12342,    45  ),('20140701',12342,    75  ),('20140801',12342,    105 ),('20140901',12342,    135 ),('20141001',12342,    0   ),('20141101',12342,    0   ),('20141201',12342,    0   ),('20150101',12342,    0   ),('20150201',12342,    0   ),('20150301',12342,    0   ),('20150401',12342,    0   );

with d as    -- Add a row number to the dataset
(
    select snapshot_date
            ,customer_id
            ,Number
            ,row_number() over (order by snapshot_date) as rn
    from @t
)
,c as        -- Use a recursive CTE to loop through the dataset and check for increases
(
    select snapshot_date
            ,customer_id
            ,Number
            ,rn
            ,0 as ConsecutiveIncreases
    from d
    where rn = 1

    union all

    select t.snapshot_date
            ,t.customer_id
            ,t.Number
            ,t.rn
            ,case when t.Number > c.Number then c.ConsecutiveIncreases + 1 else 0 end
    from d as t
        join c
            on t.rn = c.rn+1
)
-- Take the MAX consecutive increase where the current row is also an increase,
-- unless the row is not an increase, then subtract the number of non-increases
-- from the MAX consecutive increase to find the number of increases within the last 6 rows.
-- If less than 6 rows to use, just take the MAX increase.
select c.snapshot_date
        ,c.customer_id
        ,c.Number
        ,case when isnull(sum(c2.ConsecutiveIncreases),0) = 0
                then 0
              when count(c2.ConsecutiveIncreases) < 6
                then max(c2.ConsecutiveIncreases)
              else max(c2.ConsecutiveIncreases) - case when c.ConsecutiveIncreases = 0
                                                       then sum(case when c2.ConsecutiveIncreases = 0
                                                                     then 1
                                                                     else 0
                                                                     end
                                                               )
                                                       else 0
                                                       end
          end as MaxConsecutiveIncreases
from c
    left join c as c2
        on c2.rn between c.rn-5 and c.rn
group by c.snapshot_date
        ,c.customer_id
        ,c.Number
        ,c.ConsecutiveIncreases
order by 1

Output:

+---------------+-------------+--------+-------------------------+
| snapshot_date | customer_id | Number | MaxConsecutiveIncreases |
+---------------+-------------+--------+-------------------------+
| 2014-01-01    |       12342 |      0 |                       0 |
| 2014-02-01    |       12342 |     15 |                       1 |
| 2014-03-01    |       12342 |     45 |                       2 |
| 2014-04-01    |       12342 |      0 |                       2 |
| 2014-05-01    |       12342 |     15 |                       2 |
| 2014-06-01    |       12342 |     45 |                       2 |
| 2014-07-01    |       12342 |     75 |                       3 |
| 2014-08-01    |       12342 |    105 |                       4 |
| 2014-09-01    |       12342 |    135 |                       5 |
| 2014-10-01    |       12342 |      0 |                       4 |
| 2014-11-01    |       12342 |      0 |                       3 |
| 2014-12-01    |       12342 |      0 |                       2 |
| 2015-01-01    |       12342 |      0 |                       1 |
| 2015-02-01    |       12342 |      0 |                       0 |
| 2015-03-01    |       12342 |      0 |                       0 |
| 2015-04-01    |       12342 |      0 |                       0 |
+---------------+-------------+--------+-------------------------+
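For comparison, here is a sketch of a variant without recursion. It is not the answer author's method, and it has not been benchmarked. It assumes the reading that a run only counts increases between two rows that both sit inside the 6-row window, so a streak ending j rows back can contribute at most 5 - j increases. The CTE names f, g, s, w and the columns inc, grp, streak, s0 to s4 and cap are made up for illustration.

```sql
-- Same test data as in the answer above
declare @t table(snapshot_date date, customer_id int, Number int);
insert into @t values
 ('20140101',12342,0),('20140201',12342,15),('20140301',12342,45),('20140401',12342,0)
,('20140501',12342,15),('20140601',12342,45),('20140701',12342,75),('20140801',12342,105)
,('20140901',12342,135),('20141001',12342,0),('20141101',12342,0),('20141201',12342,0)
,('20150101',12342,0),('20150201',12342,0),('20150301',12342,0),('20150401',12342,0);

with f as   -- Flag rows whose Number is greater than the previous row's Number
(
    select snapshot_date, customer_id, Number
          ,case when Number > lag(Number) over (partition by customer_id order by snapshot_date)
                then 1 else 0 end as inc
    from @t
)
,g as       -- Every non-increase row starts a new group
(
    select snapshot_date, customer_id, Number, inc
          ,sum(1 - inc) over (partition by customer_id order by snapshot_date
                              rows unbounded preceding) as grp
    from f
)
,s as       -- streak = number of consecutive increases ending at this row
(
    select snapshot_date, customer_id, Number
          ,sum(inc) over (partition by customer_id, grp order by snapshot_date
                          rows unbounded preceding) as streak
    from g
)
,w as       -- Streaks of the current row and the 4 rows before it (0 when there is no such row)
(
    select snapshot_date, customer_id, Number
          ,streak as s0
          ,lag(streak,1,0) over (partition by customer_id order by snapshot_date) as s1
          ,lag(streak,2,0) over (partition by customer_id order by snapshot_date) as s2
          ,lag(streak,3,0) over (partition by customer_id order by snapshot_date) as s3
          ,lag(streak,4,0) over (partition by customer_id order by snapshot_date) as s4
    from s
)
select w.snapshot_date
      ,w.customer_id
      ,w.Number
       -- A streak ending j rows back is capped at 5 - j, the number of row pairs
       -- that still fit inside the 6-row window; take the largest capped value.
      ,(select max(case when v.streak < v.cap then v.streak else v.cap end)
        from (values (w.s0,5),(w.s1,4),(w.s2,3),(w.s3,2),(w.s4,1)) as v(streak,cap)
       ) as MaxConsecutiveIncreases
from w
order by w.customer_id, w.snapshot_date;
```

Traced by hand against the 16 sample rows, this should return the same MaxConsecutiveIncreases column as the output above. For example, at 2014-10-01 the streak of 5 that ended one row earlier is capped at 4.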

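[Discussion]:

Thanks for your answer. But I think that at snapshot_date 2014-04-01, MaxConsecutiveIncreases should be 2, because the "Number" column increased 2 times within the previous 6 rows. Please share more about your logic, because it is new to me.

One more thing, @iamdave: my table has about 1.000.000 rows, so will using a recursive CTE produce very large temporary data?

@trato I have updated my script. Because of the way you want to do this, I have not seen a way to do it in an efficient set-based manner, so it will always be slow.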
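Two practical points matter before running the recursive CTE on a table of that size. SQL Server stops a recursive CTE after 100 recursion levels by default unless the query carries an OPTION (MAXRECURSION n) hint, where 0 removes the limit. The script above also numbers all rows as a single sequence, so with several customers the numbering and the recursive join would need to be done per customer_id. A minimal sketch of the streak step with both changes follows. The second customer (99999) and its values are made up for illustration, and this says nothing about how large the temporary data gets.

```sql
-- Hypothetical sample with two customers
declare @t table(snapshot_date date, customer_id int, Number int);
insert into @t values
 ('20140101',12342,0),('20140201',12342,15),('20140301',12342,45)
,('20140101',99999,10),('20140201',99999,5),('20140301',99999,20);

with d as    -- Number the rows separately for each customer
(
    select snapshot_date
            ,customer_id
            ,Number
            ,row_number() over (partition by customer_id order by snapshot_date) as rn
    from @t
)
,c as        -- One recursion chain per customer, starting at that customer's first row
(
    select snapshot_date
            ,customer_id
            ,Number
            ,rn
            ,0 as ConsecutiveIncreases
    from d
    where rn = 1

    union all

    select t.snapshot_date
            ,t.customer_id
            ,t.Number
            ,t.rn
            ,case when t.Number > c.Number then c.ConsecutiveIncreases + 1 else 0 end
    from d as t
        join c
            on  t.customer_id = c.customer_id
            and t.rn = c.rn+1
)
select snapshot_date
        ,customer_id
        ,Number
        ,ConsecutiveIncreases
from c
order by customer_id, snapshot_date
option (maxrecursion 0);   -- 0 removes the default limit of 100 recursion levels
```

For this sample the ConsecutiveIncreases values are 0, 1, 2 for customer 12342 and 0, 0, 1 for customer 99999.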
