Grouping Result Min/Max in a breaking series ascending/descending order in SQL

[Posted]: 2016-03-19 05:43:54
[Question]:

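When a break interrupts the ordering of the data, I want to select the minimum and maximum values within each ascending/descending run.

Suppose I have data ordered by DateTime: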

LogDate      StartValue EndValue    Multiplier  DiffValue
2016-02-08   7661.25    7677.62     6.94        16.37
2016-02-09   7677.62    7693.02     6.94        15.4
2016-02-10   7693.02    7709.82     6.94        16.8
2016-02-11   7709.82    7727.08     6.94        17.26
2016-02-12   7727.08    7740.93     6.94        13.85
2016-02-13   3.02       12.22       6.94        9.2
2016-02-14   12.22      20.73       6.94        8.51
2016-02-15   20.73      37.04       6.94        16.31
2016-02-16   37.04      52.56       7           15.52
2016-02-17   52.56      67.82       7           15.26
2016-02-18   67.82      83.66       7           15.84
2016-02-19   83.66      98.77       7           15.11
2016-02-20   98.77      108.37      7           9.61

I want a result like this:

LogDateMin  LogDateMax  StartValue  EndValue    Multiplier  SumOfDiffValue
2016-02-08  2016-02-12  7661.25     7740.93     6.94        79.68
2016-02-13  2016-02-15  3.02        37.04       6.94        34.02
2016-02-16  2016-02-20  37.04       108.37      7           71.34

Here I also group the result by Multiplier and take the sum of DiffValue.

How can we do this? Please help.

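[Question comments]:

What do you mean by "if the series breaks the data order" in the example above?

Look at the example of a stopwatch: its data flows in ascending order, but if the user resets the watch, CurrValue restarts from zero or from a minimum value.

[Answer 1]: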

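If I understand correctly, a "break" means that two consecutive values differ by more than some minimum threshold.

To get the result, I used the LEAD and LAG functions to find the breaks, because they return the values from the rows before and after the current row without a self JOIN.

Then I created groups that contain only the first and last rows adjacent to each break. The result set holds the dates and values as rows, so it needs an UNPIVOT.

The final query should look like this: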

DECLARE @Threshold NUMERIC(18, 2) = 1000;

-- Pair every row with its neighbours. The defaults make the first and
-- last rows look like breaks, so they open and close a group.
WITH DeltaCte AS (
    SELECT DateTime, CurrValue,
        LAG(CurrValue, 1, CurrValue - @Threshold - 1) OVER (ORDER BY DateTime) AS PrevVal,
        LEAD(CurrValue, 1, CurrValue - @Threshold - 1) OVER (ORDER BY DateTime) AS NextVal
    FROM RawData
)
-- Keep only the rows next to a break and number them in pairs:
-- rows 1-2 become group 1, rows 3-4 become group 2, and so on.
, GroupsCTE AS (
    SELECT DateTime, CurrValue, CurrValue - PrevVal AS Delta1, CurrValue - NextVal AS Delta2,
        (ROW_NUMBER() OVER (ORDER BY DateTime) + 1) / 2 AS GroupNo
    FROM DeltaCte
    WHERE ABS(CurrValue - PrevVal) > @Threshold OR ABS(CurrValue - NextVal) > @Threshold
)
SELECT GroupNo, MIN(d) AS DateTimeMin, MAX(d) AS DateTimeMax,
    MIN(v) AS CurrValueMin, MAX(v) AS CurrValueMax
FROM GroupsCTE
UNPIVOT (v FOR nValue IN ([CurrValue])) AS P1
UNPIVOT (d FOR nDate IN ([DateTime])) AS P2
GROUP BY GroupNo

[Edit]

If a "break" means an interruption of the ascending order, the query above becomes slightly simpler:

-- The defaults make the first row look smaller than its predecessor and
-- the last row look larger than its successor, so both count as breaks.
;WITH DeltaCte AS (
    SELECT DateTime, CurrValue,
        LAG(CurrValue, 1, CurrValue + 1) OVER (ORDER BY DateTime) AS PrevVal,
        LEAD(CurrValue, 1, CurrValue - 1) OVER (ORDER BY DateTime) AS NextVal
    FROM RawData
)
-- Keep only the rows where the ascending order is interrupted, in pairs.
, GroupsCTE AS (
    SELECT DateTime, CurrValue, CurrValue - PrevVal AS Delta1, CurrValue - NextVal AS Delta2,
        (ROW_NUMBER() OVER (ORDER BY DateTime) + 1) / 2 AS GroupNo
    FROM DeltaCte
    WHERE (CurrValue - PrevVal < 0) OR (NextVal - CurrValue < 0)
)
SELECT GroupNo, MIN(d) AS DateTimeMin, MAX(d) AS DateTimeMax,
    MIN(v) AS CurrValueMin, MAX(v) AS CurrValueMax
FROM GroupsCTE
UNPIVOT (v FOR nValue IN ([CurrValue])) AS P1
UNPIVOT (d FOR nDate IN ([DateTime])) AS P2
GROUP BY GroupNo

Basically, the comparison of the delta against the threshold is replaced with a comparison of the delta against 0.
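
The queries above work on DateTime and CurrValue and do not look at Multiplier or sum DiffValue, so they do not reproduce the question's expected table on their own. The sketch below shows one alternative, the common "gaps and islands" pattern, and is not the answerer's method: a row starts a new run when its StartValue drops below the previous EndValue or when Multiplier changes, and a running sum of those flags numbers the runs. The table name `Readings` and the function `summarize_runs` are assumed names. The SQL runs here on SQLite (3.25 or later is needed for window functions) through Python's sqlite3 module; it uses only LAG and SUM ... OVER, so it should port to SQL Server with little change, although that has not been verified here.

```python
import sqlite3

# The question's sample data: LogDate, StartValue, EndValue, Multiplier, DiffValue.
ROWS = [
    ("2016-02-08", 7661.25, 7677.62, 6.94, 16.37),
    ("2016-02-09", 7677.62, 7693.02, 6.94, 15.4),
    ("2016-02-10", 7693.02, 7709.82, 6.94, 16.8),
    ("2016-02-11", 7709.82, 7727.08, 6.94, 17.26),
    ("2016-02-12", 7727.08, 7740.93, 6.94, 13.85),
    ("2016-02-13", 3.02, 12.22, 6.94, 9.2),
    ("2016-02-14", 12.22, 20.73, 6.94, 8.51),
    ("2016-02-15", 20.73, 37.04, 6.94, 16.31),
    ("2016-02-16", 37.04, 52.56, 7, 15.52),
    ("2016-02-17", 52.56, 67.82, 7, 15.26),
    ("2016-02-18", 67.82, 83.66, 7, 15.84),
    ("2016-02-19", 83.66, 98.77, 7, 15.11),
    ("2016-02-20", 98.77, 108.37, 7, 9.61),
]

QUERY = """
WITH Flagged AS (
    SELECT *,
        CASE
            -- First row: no predecessor, so it opens the first run.
            WHEN LAG(EndValue) OVER (ORDER BY LogDate) IS NULL THEN 1
            -- Reset: the series falls back below the previous end value.
            WHEN StartValue < LAG(EndValue) OVER (ORDER BY LogDate) THEN 1
            -- The question also splits runs when Multiplier changes.
            WHEN Multiplier <> LAG(Multiplier) OVER (ORDER BY LogDate) THEN 1
            ELSE 0
        END AS IsNewRun
    FROM Readings
),
Numbered AS (
    -- A running total of the flags gives every row its run number.
    SELECT *,
        SUM(IsNewRun) OVER (ORDER BY LogDate ROWS UNBOUNDED PRECEDING) AS RunNo
    FROM Flagged
)
SELECT MIN(LogDate), MAX(LogDate), MIN(StartValue), MAX(EndValue),
       MIN(Multiplier), SUM(DiffValue)
FROM Numbered
GROUP BY RunNo
ORDER BY RunNo
"""


def summarize_runs(rows):
    """Load rows into an in-memory table and return one summary tuple per run."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Readings (LogDate TEXT, StartValue REAL, EndValue REAL,"
        " Multiplier REAL, DiffValue REAL)"
    )
    conn.executemany("INSERT INTO Readings VALUES (?, ?, ?, ?, ?)", rows)
    summary = [
        # Round the floating-point sum to two decimals for display.
        (lo, hi, start, end, mult, round(total, 2))
        for lo, hi, start, end, mult, total in conn.execute(QUERY)
    ]
    conn.close()
    return summary


if __name__ == "__main__":
    for run in summarize_runs(ROWS):
        print(run)
```

MIN(StartValue) and MAX(EndValue) give the first and last values of a run only because each run is ascending; for a descending series the comparison in the reset test and the two aggregates would need to be swapped.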

[Comments]:

Thank you very much, @alexei
