How to write a query for a gaps and islands problem?
【Title】: How to write a query for a gaps and islands problem? 【Posted】: 2018-10-01 05:28:53 【Question】: This is a gaps and islands problem.
Meter_id |Realtimeclock |I_Y|I_B|I_X|
201010 |27-09-2018 00:00:00|1.0|2.0|3.0|
201010 |27-09-2018 00:30:00|1.0|2.0|3.0|
201010 |27-09-2018 01:00:00|1.0|2.0|3.0|
201010 |27-09-2018 01:30:00|1.0|2.0|3.0|
201010 |27-09-2018 02:00:00|1.0| 0 |3.0|
201010 |27-09-2018 02:30:00|1.0| 0 |0 |
201010 |27-09-2018 03:00:00|1.0|2.0|3.0|
201010 |27-09-2018 03:30:00|1.0|2.0|3.0|
201011 |27-09-2018 00:00:00|1.0|2.0|3.0|
201011 |27-09-2018 00:30:00|1.0|2.0|3.0|
201010 |28-09-2018 03:00:00|1.0|2.0|3.0|
201010 |28-09-2018 03:30:00|1.0|2.0|3.0|
201011 |28-09-2018 04:00:00|1.0| 0 |0 |
201011 |28-09-2018 00:00:00|1.0|2.0|3.0|
201011 |28-09-2018 00:30:00|1.0|2.0|3.0|
One approach uses the row-number difference method:
select * from (
    WITH cte1 AS (
        SELECT t.*, ROW_NUMBER() OVER (PARTITION BY Meter_id ORDER BY Realtimeclock) rn
        FROM yourTable t
    ),
    cte2 AS (
        SELECT t.*, ROW_NUMBER() OVER (PARTITION BY Meter_id ORDER BY Realtimeclock) rn
        FROM yourTable t
        WHERE I_B <> 0
    ),
    cte3 AS (
        SELECT t1.*,
               t1.rn - t2.rn AS diff
        FROM cte1 t1
        INNER JOIN cte2 t2
            ON t1.Meter_id = t2.Meter_id AND t1.Realtimeclock = t2.Realtimeclock
    )
    SELECT
        Meter_id,
        MIN(Realtimeclock) AS start_time,
        MAX(Realtimeclock) AS end_time,
        COUNT(I_Y) AS I_Y,
        COUNT(I_B) AS I_B,
        COUNT(I_X) AS I_X,
        ROW_NUMBER() OVER (PARTITION BY meter_id ORDER BY meter_id) AS Spell
    FROM cte3
    GROUP BY
        Meter_id,
        diff
);
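The expected output is shown below. Please let me know what changes the code needs.
Based on the I_Y, I_B and I_X values in the table above, I need day-wise spells, each with a start time and an end time, together with a count of the non-zero values. Here Meter_id 201010 has two spells on 27-09-2018, because there is a time gap between them. In the same way, the query must show every spell along with its date and timestamps.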
Meter_id |start_time |End_time |I_Y|I_B|I_X|spell
201010 |27-09-2018 00:00:00|27-09-2018 01:30:00|4 |4 |4 |1
201010 |27-09-2018 03:00:00|27-09-2018 03:30:00|2 |2 |2 |2
201011 |27-09-2018 00:00:00|27-09-2018 00:30:00|2 |2 |2 |1
201010 |28-09-2018 03:00:00|28-09-2018 03:30:00|2 |2 |2 |1
201011 |28-09-2018 00:00:00|28-09-2018 00:30:00|2 |2 |2 |1
It throws the following runtime error:
[Error] Execution (35:22): ORA-01830: date format picture ends before converting entire input string
Hi Tim,
Please look into this. It would help me a lot.
After giving trunc(realtimeclock) instead of TO_DATE(realtimeclock) ..
Thanks for the help, Tim.
【Comments on the question】:
Can the start and end times wrap around to the next day? E.g. can there be a start_time of 22:00:00 and an end_time of 02:00:00 on the following day?
【Answer 1】:
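You only need a slight modification to your current approach: add the date to the partition (in addition to meter_id). Then, in the final query, add a COUNT that keeps track of the spell number for a given meter and date.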
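-- TRUNC(Realtimeclock) strips the time of day, so each calendar day gets its
-- own partition. This assumes Realtimeclock is a DATE or TIMESTAMP column;
-- the asker reported ORA-01830 when TO_DATE(Realtimeclock) was used here.
WITH cte1 AS (
    -- Number every reading per meter and day.
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY Meter_id, TRUNC(Realtimeclock)
                              ORDER BY Realtimeclock) rn
    FROM yourTable t
),
cte2 AS (
    -- Number only the readings with a non-zero I_B.
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY Meter_id, TRUNC(Realtimeclock)
                              ORDER BY Realtimeclock) rn
    FROM yourTable t
    WHERE I_B <> 0
),
cte3 AS (
    -- The difference of the two row numbers is constant within one spell.
    SELECT t1.*,
           t1.rn - t2.rn AS diff
    FROM cte1 t1
    INNER JOIN cte2 t2
        ON t1.Meter_id = t2.Meter_id AND t1.Realtimeclock = t2.Realtimeclock
)
SELECT
    Meter_id,
    MIN(Realtimeclock) AS start_time,
    MAX(Realtimeclock) AS end_time,
    COUNT(I_Y) AS I_Y,
    COUNT(I_B) AS I_B,
    COUNT(I_X) AS I_X,
    -- Running count over the spells of one meter and day: 1, 2, ...
    COUNT(*) OVER (PARTITION BY TRUNC(Realtimeclock), Meter_id
                   ORDER BY MIN(Realtimeclock)) AS spell
FROM cte3
GROUP BY
    Meter_id,
    TRUNC(Realtimeclock),
    diff;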
Demo
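Note that this answer assumes a spell never runs from one calendar day into the next. If that can happen and you need to account for it, you should tell us what the logic is for counting such spells.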
The demo is again in SQL Server, although the query above is Oracle code and should run fine on Oracle.
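For anyone who wants to check the row-number-difference logic outside the database, here is a minimal Python sketch. It is not the query itself, only an illustration under a few assumptions: a reading counts toward a spell when I_B is non-zero (mirroring the WHERE I_B <> 0 filter), spells never cross midnight, and the helper name find_spells is hypothetical. The rows are the sample readings from the question.

```python
from collections import defaultdict
from datetime import datetime

# Sample readings from the question: (Meter_id, Realtimeclock, I_Y, I_B, I_X).
ROWS = [
    (201010, "27-09-2018 00:00:00", 1.0, 2.0, 3.0),
    (201010, "27-09-2018 00:30:00", 1.0, 2.0, 3.0),
    (201010, "27-09-2018 01:00:00", 1.0, 2.0, 3.0),
    (201010, "27-09-2018 01:30:00", 1.0, 2.0, 3.0),
    (201010, "27-09-2018 02:00:00", 1.0, 0.0, 3.0),
    (201010, "27-09-2018 02:30:00", 1.0, 0.0, 0.0),
    (201010, "27-09-2018 03:00:00", 1.0, 2.0, 3.0),
    (201010, "27-09-2018 03:30:00", 1.0, 2.0, 3.0),
    (201011, "27-09-2018 00:00:00", 1.0, 2.0, 3.0),
    (201011, "27-09-2018 00:30:00", 1.0, 2.0, 3.0),
    (201010, "28-09-2018 03:00:00", 1.0, 2.0, 3.0),
    (201010, "28-09-2018 03:30:00", 1.0, 2.0, 3.0),
    (201011, "28-09-2018 04:00:00", 1.0, 0.0, 0.0),
    (201011, "28-09-2018 00:00:00", 1.0, 2.0, 3.0),
    (201011, "28-09-2018 00:30:00", 1.0, 2.0, 3.0),
]


def find_spells(rows):
    """Return (meter, start, end, readings_in_spell, spell_no) per spell.

    Hypothetical helper: assumes a reading is part of a spell when I_B != 0.
    """
    # Partition by (meter, calendar day), i.e. the date is added to the
    # partition alongside Meter_id.
    partitions = defaultdict(list)
    for meter, clock, _i_y, i_b, _i_x in rows:
        ts = datetime.strptime(clock, "%d-%m-%Y %H:%M:%S")
        partitions[(meter, ts.date())].append((ts, i_b))

    spells = []
    for (meter, _day), readings in sorted(partitions.items()):
        readings.sort()  # ORDER BY Realtimeclock
        islands = defaultdict(list)
        rn_nonzero = 0  # row number among the non-zero readings only
        for rn_all, (ts, i_b) in enumerate(readings, start=1):
            if i_b == 0:
                continue  # a gap: rn_all advances, rn_nonzero does not
            rn_nonzero += 1
            # The difference stays constant inside one uninterrupted spell.
            islands[rn_all - rn_nonzero].append(ts)
        # The difference never decreases over time, so sorting the keys
        # yields the spells in chronological order.
        for spell_no, diff in enumerate(sorted(islands), start=1):
            stamps = islands[diff]
            spells.append((meter, stamps[0], stamps[-1], len(stamps), spell_no))
    return spells


if __name__ == "__main__":
    for spell in find_spells(ROWS):
        print(spell)
```

On the sample data this yields five spells. Meter 201010 gets two of them on 27-09-2018 (four readings, then two readings after the gap at 02:00 and 02:30), and every other meter and day gets one. The single count stands in for the three COUNT columns of the query, which are equal whenever none of the values is NULL.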