Grouping Result Min/Max in a breaking series ascending/descending order in SQL
Posted: 2016-03-19 05:43:54
Question: I want to select the min and max values within each ascending/descending run of a series, starting a new run wherever the sequence breaks the data order.
Suppose I have data ordered by DateTime:
LogDate StartValue EndValue Multiplier DiffValue
2016-02-08 7661.25 7677.62 6.94 16.37
2016-02-09 7677.62 7693.02 6.94 15.4
2016-02-10 7693.02 7709.82 6.94 16.8
2016-02-11 7709.82 7727.08 6.94 17.26
2016-02-12 7727.08 7740.93 6.94 13.85
2016-02-13 3.02 12.22 6.94 9.2
2016-02-14 12.22 20.73 6.94 8.51
2016-02-15 20.73 37.04 6.94 16.31
2016-02-16 37.04 52.56 7 15.52
2016-02-17 52.56 67.82 7 15.26
2016-02-18 67.82 83.66 7 15.84
2016-02-19 83.66 98.77 7 15.11
2016-02-20 98.77 108.37 7 9.61
I want a result like this:
LogDateMin LogDateMax StartValue EndValue Multiplier SumOfDiffValue
2016-02-08 2016-02-12 7661.25 7740.93 6.94 79.68
2016-02-13 2016-02-15 3.02 37.04 6.94 34.02
2016-02-16 2016-02-20 37.04 108.37 7 71.34
Here I also group the result by Multiplier and take the sum of DiffValue.
How can we achieve this?
Please help.
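Comments:
What does "if the sequence breaks the data order" mean in the example above?
Think of a stopwatch: its data flows in ascending order, but if the user resets the watch, CurrValue restarts from zero or the minimum value.
Answer 1: If I understand correctly, a "break" means that the difference between consecutive values, ordered by time, exceeds a minimum threshold.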
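To get the result, I used the LEAD and LAG functions to find the breaks, since they provide the values after and before the current record without a self JOIN.
Then I created groups that contain only the first and last records next to each "break". The resultset contains the dates and values as rows, so an UNPIVOT is needed.
The final query should look like this: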
DECLARE @Threshold NUMERIC(18, 2) = 1000;

-- Attach the previous and next value to every row. The LAG/LEAD defaults
-- make the first and last rows count as break boundaries.
WITH DeltaCte AS (
    SELECT DateTime, CurrValue,
           LAG(CurrValue, 1, CurrValue - @Threshold - 1) OVER (ORDER BY DateTime) AS PrevVal,
           LEAD(CurrValue, 1, CurrValue - @Threshold - 1) OVER (ORDER BY DateTime) AS NextVal
    FROM RawData
),
-- Keep only the rows next to a break; each consecutive pair of them
-- (start of run, end of run) gets the same GroupNo.
GroupsCTE AS (
    SELECT DateTime, CurrValue,
           CurrValue - PrevVal AS Delta1,
           CurrValue - NextVal AS Delta2,
           (ROW_NUMBER() OVER (ORDER BY DateTime) + 1) / 2 AS GroupNo
    FROM DeltaCte
    WHERE ABS(CurrValue - PrevVal) > @Threshold
       OR ABS(CurrValue - NextVal) > @Threshold
)
SELECT GroupNo,
       MIN(d) AS DateTimeMin, MAX(d) AS DateTimeMax,
       MIN(v) AS CurrValueMin, MAX(v) AS CurrValueMax
FROM GroupsCTE
UNPIVOT (v FOR nValue IN ([CurrValue])) AS P1
UNPIVOT (d FOR nDate IN ([DateTime])) AS P2
GROUP BY GroupNo;
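To make the logic of the query above easier to follow, here is a rough Python sketch that mirrors its three steps: emulate LAG/LEAD with the same defaults, keep only the rows next to a break, and pair those rows up with (row number + 1) / 2. The names `boundary_groups` and `SAMPLE` are hypothetical, and using the question's StartValue column as CurrValue is an assumption made only for illustration.

```python
# Hypothetical sample: (DateTime, CurrValue) pairs, already sorted by date.
# The values are the StartValue column from the question (an assumption).
SAMPLE = [
    ("2016-02-08", 7661.25), ("2016-02-09", 7677.62), ("2016-02-10", 7693.02),
    ("2016-02-11", 7709.82), ("2016-02-12", 7727.08), ("2016-02-13", 3.02),
    ("2016-02-14", 12.22), ("2016-02-15", 20.73), ("2016-02-16", 37.04),
    ("2016-02-17", 52.56), ("2016-02-18", 67.82), ("2016-02-19", 83.66),
    ("2016-02-20", 98.77),
]


def boundary_groups(rows, threshold):
    """Return (group_no, min_date, max_date, min_value, max_value) per run."""
    boundaries = []
    for i, (when, value) in enumerate(rows):
        # Emulate LAG/LEAD; the default forces a break at both ends of the data.
        prev_val = rows[i - 1][1] if i > 0 else value - threshold - 1
        next_val = rows[i + 1][1] if i + 1 < len(rows) else value - threshold - 1
        # Same filter as the WHERE clause in GroupsCTE.
        if abs(value - prev_val) > threshold or abs(value - next_val) > threshold:
            boundaries.append((when, value))

    # Pair consecutive boundary rows: rows 1 and 2 -> group 1, rows 3 and 4 -> group 2.
    groups = {}
    for row_number, row in enumerate(boundaries, start=1):
        groups.setdefault((row_number + 1) // 2, []).append(row)

    return [
        (no,
         min(w for w, _ in members), max(w for w, _ in members),
         min(v for _, v in members), max(v for _, v in members))
        for no, members in sorted(groups.items())
    ]


for group in boundary_groups(SAMPLE, 1000):
    print(group)
```

One thing the sketch makes visible: a run that consists of a single row contributes only one boundary row, so the pairing step assumes every run has at least two rows.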
[Edit]
If "break" means a break in the ascending order, the query above becomes slightly simpler:
-- The LAG/LEAD defaults make the first and last rows count as break boundaries.
;WITH DeltaCte AS (
    SELECT DateTime, CurrValue,
           LAG(CurrValue, 1, CurrValue + 1) OVER (ORDER BY DateTime) AS PrevVal,
           LEAD(CurrValue, 1, CurrValue - 1) OVER (ORDER BY DateTime) AS NextVal
    FROM RawData
),
-- A row is a boundary when the value drops relative to the previous row
-- or the next row drops relative to it.
GroupsCTE AS (
    SELECT DateTime, CurrValue,
           CurrValue - PrevVal AS Delta1,
           CurrValue - NextVal AS Delta2,
           (ROW_NUMBER() OVER (ORDER BY DateTime) + 1) / 2 AS GroupNo
    FROM DeltaCte
    WHERE (CurrValue - PrevVal < 0) OR (NextVal - CurrValue < 0)
)
SELECT GroupNo,
       MIN(d) AS DateTimeMin, MAX(d) AS DateTimeMax,
       MIN(v) AS CurrValueMin, MAX(v) AS CurrValueMax
FROM GroupsCTE
UNPIVOT (v FOR nValue IN ([CurrValue])) AS P1
UNPIVOT (d FOR nDate IN ([DateTime])) AS P2
GROUP BY GroupNo;
Basically, the comparisons of the deltas against the threshold are replaced by comparisons against 0.
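The queries above return the date and value range of each run, but not the Multiplier grouping or the DiffValue sum that the question asks for. The following is a minimal sketch of one way to get there, not the answerer's method: it flags a break with LAG, turns the flags into a group number with a running SUM, and then aggregates. It runs the SQL through Python's sqlite3 module (window functions need SQLite 3.25 or later) so that it can be tried without a SQL Server instance. The break rule (StartValue lower than the previous EndValue, or a changed Multiplier) and the table layout are assumptions based on the sample data in the question.

```python
import sqlite3

# Sample rows copied from the question.
DATA = [
    ("2016-02-08", 7661.25, 7677.62, 6.94, 16.37),
    ("2016-02-09", 7677.62, 7693.02, 6.94, 15.4),
    ("2016-02-10", 7693.02, 7709.82, 6.94, 16.8),
    ("2016-02-11", 7709.82, 7727.08, 6.94, 17.26),
    ("2016-02-12", 7727.08, 7740.93, 6.94, 13.85),
    ("2016-02-13", 3.02, 12.22, 6.94, 9.2),
    ("2016-02-14", 12.22, 20.73, 6.94, 8.51),
    ("2016-02-15", 20.73, 37.04, 6.94, 16.31),
    ("2016-02-16", 37.04, 52.56, 7.0, 15.52),
    ("2016-02-17", 52.56, 67.82, 7.0, 15.26),
    ("2016-02-18", 67.82, 83.66, 7.0, 15.84),
    ("2016-02-19", 83.66, 98.77, 7.0, 15.11),
    ("2016-02-20", 98.77, 108.37, 7.0, 9.61),
]

QUERY = """
WITH Flagged AS (
    -- Assumed break rule: the value drops, or the Multiplier changes.
    -- On the first row LAG yields NULL, so the CASE falls through to 0.
    SELECT LogDate, StartValue, EndValue, Multiplier, DiffValue,
           CASE WHEN StartValue < LAG(EndValue) OVER (ORDER BY LogDate)
                  OR Multiplier <> LAG(Multiplier) OVER (ORDER BY LogDate)
                THEN 1 ELSE 0 END AS IsBreak
    FROM RawData
),
Numbered AS (
    -- The running total of break flags gives every run its own number.
    SELECT *, SUM(IsBreak) OVER (ORDER BY LogDate) AS GroupNo
    FROM Flagged
)
SELECT MIN(LogDate) AS LogDateMin, MAX(LogDate) AS LogDateMax,
       MIN(StartValue) AS StartValue, MAX(EndValue) AS EndValue,
       Multiplier, ROUND(SUM(DiffValue), 2) AS SumOfDiffValue
FROM Numbered
GROUP BY GroupNo, Multiplier
ORDER BY MIN(LogDate)
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE RawData (LogDate TEXT, StartValue REAL, EndValue REAL,"
    " Multiplier REAL, DiffValue REAL)"
)
conn.executemany("INSERT INTO RawData VALUES (?, ?, ?, ?, ?)", DATA)

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

For the sample data this should yield one row per run, matching the three rows of the desired result. LAG and SUM ... OVER (ORDER BY ...) are also available in SQL Server 2012 and later, so the query should be adaptable to T-SQL with little change. Taking MIN(StartValue) and MAX(EndValue) relies on the values ascending within each run.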
Comments:
Thanks a lot, @alexei.