Group data in intervals without 0
【Title】: Group data in intervals without 0 【Posted】: 2020-12-22 12:35:17 【Question】: I have the following table:
+---------------------+--------+----------+
| MeasureInterval     | Car_id | Distance |
+---------------------+--------+----------+
| 2020-12-15 17:00:00 |      1 |       20 |
+---------------------+--------+----------+
| 2020-12-15 17:05:00 |      1 |       30 |
+---------------------+--------+----------+
| 2020-12-15 17:10:00 |      1 |       17 |
+---------------------+--------+----------+
| 2020-12-15 17:15:00 |      1 |        0 |
+---------------------+--------+----------+
| 2020-12-15 17:20:00 |      1 |        0 |
+---------------------+--------+----------+
| 2020-12-15 17:25:00 |      1 |       10 |
+---------------------+--------+----------+
| 2020-12-15 17:30:00 |      1 |       15 |
+---------------------+--------+----------+
| 2020-12-15 17:35:00 |      1 |        0 |
+---------------------+--------+----------+
| 2020-12-15 17:40:00 |      1 |        0 |
+---------------------+--------+----------+
| 2020-12-15 17:45:00 |      1 |        0 |
+---------------------+--------+----------+
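I am trying to select the consecutive intervals during which the car was moving (ignoring the intervals where distance = 0), so the result would look like this: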
+---------------------+---------------------+--------+--------------+
| MeasureInterval_min | MeasureInterval_max | Car_id | Distance_sum |
+---------------------+---------------------+--------+--------------+
| 2020-12-15 17:00:00 | 2020-12-15 17:10:00 |      1 |           67 |
+---------------------+---------------------+--------+--------------+
| 2020-12-15 17:25:00 | 2020-12-15 17:30:00 |      1 |           25 |
+---------------------+---------------------+--------+--------------+
Any idea how to achieve this?
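【Answer 1】: This is a gaps-and-islands problem. The islands are runs of adjacent records with a non-zero distance; the gaps are the records in between, where the distance is 0.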
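To make the notion of an island concrete, here is a minimal Python sketch that finds the same runs outside the database. It is only an illustration, not part of the original answer: the `rows` list copies the sample data from the question, the function name `find_islands` is made up for this example, and the rows are assumed to be sorted by car and then by timestamp.

```python
from itertools import groupby

# Sample rows from the question: (measure_interval, car_id, distance).
rows = [
    ("2020-12-15 17:00:00", 1, 20),
    ("2020-12-15 17:05:00", 1, 30),
    ("2020-12-15 17:10:00", 1, 17),
    ("2020-12-15 17:15:00", 1, 0),
    ("2020-12-15 17:20:00", 1, 0),
    ("2020-12-15 17:25:00", 1, 10),
    ("2020-12-15 17:30:00", 1, 15),
    ("2020-12-15 17:35:00", 1, 0),
    ("2020-12-15 17:40:00", 1, 0),
    ("2020-12-15 17:45:00", 1, 0),
]


def find_islands(sorted_rows):
    """Yield (start, end, car_id, distance_sum) for each run of moving rows.

    Assumes sorted_rows is ordered by car_id, then by timestamp.
    """
    # groupby starts a new group whenever the key changes, so adjacent rows
    # of the same car in the same state (moving or stopped) stay together.
    for (car_id, moving), run in groupby(
        sorted_rows, key=lambda r: (r[1], r[2] != 0)
    ):
        if not moving:
            continue  # a gap: the car did not move in these intervals
        run = list(run)
        yield (run[0][0], run[-1][0], car_id, sum(r[2] for r in run))


result = list(find_islands(rows))
for island in result:
    print(island)
```

On the sample data this yields two islands, 17:00 to 17:10 and 17:25 to 17:30, which matches the result the question asks for.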
Here is an approach that uses the difference between two row numbers to identify the groups:
select
    min(measureinterval) as measureinterval_min,
    max(measureinterval) as measureinterval_max,
    car_id,
    sum(distance) as distance_sum
from (
    select t.*,
        -- rn1 numbers all rows of a car; rn2 numbers moving and stopped rows separately
        row_number() over(partition by car_id order by measureinterval) rn1,
        row_number() over(partition by car_id, (distance = 0) order by measureinterval) rn2
    from mytable t
) t
where distance > 0            -- keep only the rows where the car moved
group by car_id, rn1 - rn2    -- rn1 - rn2 is constant within each island
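To see why `rn1 - rn2` identifies the groups, the query can be run against the sample data. The sketch below does this with Python's built-in `sqlite3` module; it is an assumption that SQLite stands in for the asker's database (the question does not name one), and window functions require SQLite 3.25 or newer. The table name `mytable` and the column names follow the answer.

```python
import sqlite3

# Sample rows from the question: (measureinterval, car_id, distance).
rows = [
    ("2020-12-15 17:00:00", 1, 20),
    ("2020-12-15 17:05:00", 1, 30),
    ("2020-12-15 17:10:00", 1, 17),
    ("2020-12-15 17:15:00", 1, 0),
    ("2020-12-15 17:20:00", 1, 0),
    ("2020-12-15 17:25:00", 1, 10),
    ("2020-12-15 17:30:00", 1, 15),
    ("2020-12-15 17:35:00", 1, 0),
    ("2020-12-15 17:40:00", 1, 0),
    ("2020-12-15 17:45:00", 1, 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table mytable (measureinterval text, car_id integer, distance integer)"
)
conn.executemany("insert into mytable values (?, ?, ?)", rows)

# Inner query of the answer: two row numbers per record.
numbered = """
    select t.*,
           row_number() over (partition by car_id order by measureinterval) as rn1,
           row_number() over (partition by car_id, (distance = 0)
                              order by measureinterval) as rn2
    from mytable t
"""

# rn1 - rn2 for every row, in time order. Within a run of moving rows both
# counters advance together, so the difference stays the same.
diffs = [
    d
    for (d,) in conn.execute(
        f"select rn1 - rn2 from ({numbered}) n order by measureinterval"
    )
]
print(diffs)

# Full query: drop the stopped rows, then group by the difference.
islands = conn.execute(
    f"""
    select min(measureinterval), max(measureinterval), car_id, sum(distance)
    from ({numbered}) n
    where distance > 0
    group by car_id, rn1 - rn2
    order by 1
    """
).fetchall()
for island in islands:
    print(island)
```

For the moving rows the difference is 0 for the first island and 2 for the second, so each island gets its own group key. The `where distance > 0` filter matters for more than removing the gaps from the output: on other data a run of stopped rows can end up with the same difference as a run of moving rows, and without the filter (or without adding `distance = 0` to the `group by`) the two would be merged.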