Rank rows with conditions
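【Title】: rank rows with conditions 【Posted】: 2021-03-01 09:01:27 【Question】: I have the following table, and I need to calculate a "rank index" based on the "index" column (given in the table). That is, the index should advance only when 6 hours or more have passed since the previous timestamp. The calculation should be partitioned by key1 + key2.
Any ideas?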
【Answer 1】: If you are on 11g, you can also use this, which walks the rows with a recursive CTE:
with Your_table (key1, key2, datetime, IDX) as (
select 11, 22, to_date('2021-01-01 00:00', 'yyyy-mm-dd hh24:mi'), 321 from dual union all
select 11, 22, to_date('2021-01-01 01:00', 'yyyy-mm-dd hh24:mi'), 322 from dual union all
select 11, 22, to_date('2021-01-01 02:00', 'yyyy-mm-dd hh24:mi'), 323 from dual union all
select 11, 22, to_date('2021-01-01 08:30', 'yyyy-mm-dd hh24:mi'), 324 from dual union all
select 11, 22, to_date('2021-01-01 09:00', 'yyyy-mm-dd hh24:mi'), 325 from dual union all
select 11, 22, to_date('2021-01-01 16:00', 'yyyy-mm-dd hh24:mi'), 326 from dual union all
select 11, 22, to_date('2021-01-01 17:00', 'yyyy-mm-dd hh24:mi'), 327 from dual union all
select 11, 22, to_date('2021-01-02 04:00', 'yyyy-mm-dd hh24:mi'), 328 from dual union all
---
select 999, 777, to_date('2021-01-01 00:00', 'yyyy-mm-dd hh24:mi'), 17 from dual union all
select 999, 777, to_date('2021-01-01 01:00', 'yyyy-mm-dd hh24:mi'), 18 from dual union all
select 999, 777, to_date('2021-01-22 02:00', 'yyyy-mm-dd hh24:mi'), 19 from dual union all
select 999, 777, to_date('2021-01-22 04:00', 'yyyy-mm-dd hh24:mi'), 20 from dual
)
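, temp_rws_ordered (key1, key2, datetime, IDX, rnb) as (
  -- number all rows in (key1, key2, datetime) order so the recursion can visit them one at a time
  select KEY1, KEY2, DATETIME, IDX,
         row_number() over (order by KEY1, KEY2, DATETIME) rnb
  from Your_table
), cte (key1, key2, datetime, IDX, rnb, treshold, rank_index) as (
  -- anchor member: the very first row starts the first group
  select key1, key2, datetime, IDX, rnb, datetime treshold, IDX rank_index
  from temp_rws_ordered
  where rnb = 1
  union all
  -- recursive member: compare each row with the start (treshold) of the current group,
  -- not with the immediately preceding row
  select t.key1, t.key2, t.datetime, t.IDX, t.rnb
       , case
           when t.KEY1 = c.KEY1 and t.KEY2 = c.KEY2 then
             case
               -- ">=" because the question asks for "6 hours or more"
               when (t.datetime - c.treshold)*24 >= 6 then t.datetime
               else c.treshold
             end
           else t.datetime  -- new (key1, key2) partition: reset the group start
         end treshold
       , case
           when t.KEY1 = c.KEY1 and t.KEY2 = c.KEY2 then
             case
               when (t.datetime - c.treshold)*24 >= 6 then c.rank_index + 1
               else c.rank_index
             end
           else t.IDX  -- new (key1, key2) partition: restart from its first IDX
         end rank_index
  from temp_rws_ordered t
  join cte c on (t.rnb = c.rnb + 1)
)
select KEY1, KEY2, DATETIME, IDX, RANK_INDEX
from cte
;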
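To sanity-check what the recursive CTE computes, here is a rough Python sketch of the same row-by-row walk. It is not Oracle code: the names `rank_from_group_start` and `ts` are made up for illustration, and it assumes that a gap of exactly 6 hours starts a new group, as the question's "6 hours or more" suggests. Like the CTE, it measures the gap from the start of the current group rather than from the immediately preceding row.

```python
from datetime import datetime, timedelta

SIX_HOURS = timedelta(hours=6)


def rank_from_group_start(rows, gap=SIX_HOURS):
    """Walk rows in (key1, key2, datetime) order, as the recursive CTE does.

    rows: iterable of (key1, key2, datetime, idx) tuples.
    Returns a list of (key1, key2, datetime, idx, rank_index) tuples.
    """
    result = []
    current_key = None  # (key1, key2) of the partition being walked
    threshold = None    # datetime at which the current group started
    rank_index = None
    for key1, key2, dt, idx in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        if (key1, key2) != current_key:
            # New partition: the group start and the rank restart from this row.
            current_key, threshold, rank_index = (key1, key2), dt, idx
        elif dt - threshold >= gap:
            # 6 hours or more since the group started: open the next group.
            threshold, rank_index = dt, rank_index + 1
        result.append((key1, key2, dt, idx, rank_index))
    return result


def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Same sample rows as the Your_table CTE above.
sample = [
    (11, 22, ts("2021-01-01 00:00"), 321),
    (11, 22, ts("2021-01-01 01:00"), 322),
    (11, 22, ts("2021-01-01 02:00"), 323),
    (11, 22, ts("2021-01-01 08:30"), 324),
    (11, 22, ts("2021-01-01 09:00"), 325),
    (11, 22, ts("2021-01-01 16:00"), 326),
    (11, 22, ts("2021-01-01 17:00"), 327),
    (11, 22, ts("2021-01-02 04:00"), 328),
    (999, 777, ts("2021-01-01 00:00"), 17),
    (999, 777, ts("2021-01-01 01:00"), 18),
    (999, 777, ts("2021-01-22 02:00"), 19),
    (999, 777, ts("2021-01-22 04:00"), 20),
]

for row in rank_from_group_start(sample):
    print(row)
```

On this sample the rank_index values come out as 321, 321, 321, 322, 322, 323, 323, 324 for the first partition and 17, 17, 18, 18 for the second.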
【Answer 2】: From Oracle 12c onward, you can use MATCH_RECOGNIZE:
SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  PARTITION BY key1, key2
  ORDER BY datetime
  MEASURES
    FIRST( idx ) AS rank_idx
  ALL ROWS PER MATCH
  PATTERN ( within_6_hours* last_row )
  DEFINE
    within_6_hours AS (
      NEXT( datetime ) < LAST( datetime ) + INTERVAL '6' HOUR
    )
)
Which, for your sample data:
CREATE TABLE table_name ( key1, key2, datetime, idx ) AS
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '00:00' HOUR TO MINUTE, 321 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '01:00' HOUR TO MINUTE, 322 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '02:00' HOUR TO MINUTE, 323 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '08:30' HOUR TO MINUTE, 324 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '09:00' HOUR TO MINUTE, 325 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '16:00' HOUR TO MINUTE, 326 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-01' + INTERVAL '17:00' HOUR TO MINUTE, 327 FROM DUAL UNION ALL
SELECT 11, 22, DATE '2020-01-02' + INTERVAL '04:00' HOUR TO MINUTE, 328 FROM DUAL UNION ALL
SELECT 999, 777, DATE '2020-01-01' + INTERVAL '00:00' HOUR TO MINUTE, 17 FROM DUAL UNION ALL
SELECT 999, 777, DATE '2020-01-01' + INTERVAL '01:00' HOUR TO MINUTE, 18 FROM DUAL UNION ALL
SELECT 999, 777, DATE '2020-01-22' + INTERVAL '02:00' HOUR TO MINUTE, 19 FROM DUAL UNION ALL
SELECT 999, 777, DATE '2020-01-22' + INTERVAL '04:00' HOUR TO MINUTE, 20 FROM DUAL;
Outputs:
KEY1 | KEY2 | DATETIME | RANK_IDX | IDX
---: | ---: | :------------------ | -------: | --:
11 | 22 | 2020-01-01 00:00:00 | 321 | 321
11 | 22 | 2020-01-01 01:00:00 | 321 | 322
11 | 22 | 2020-01-01 02:00:00 | 321 | 323
11 | 22 | 2020-01-01 08:30:00 | 324 | 324
11 | 22 | 2020-01-01 09:00:00 | 324 | 325
11 | 22 | 2020-01-01 16:00:00 | 326 | 326
11 | 22 | 2020-01-01 17:00:00 | 326 | 327
11 | 22 | 2020-01-02 04:00:00 | 328 | 328
999 | 777 | 2020-01-01 00:00:00 | 17 | 17
999 | 777 | 2020-01-01 01:00:00 | 17 | 18
999 | 777 | 2020-01-22 02:00:00 | 19 | 19
999 | 777 | 2020-01-22 04:00:00 | 19 | 20
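If it helps to see the grouping rule outside of SQL, the following Python sketch mimics what the MATCH_RECOGNIZE query does: within each (key1, key2) partition, a row stays in the current run while it is less than 6 hours after the row before it, and every row is labelled with the idx of the first row of its run. This is only an illustration; the function name `first_idx_per_run` and the helper `ts` are assumptions, not anything Oracle provides.

```python
from datetime import datetime, timedelta
from itertools import groupby


def first_idx_per_run(rows, gap=timedelta(hours=6)):
    """Label each row with the idx of the first row of its run.

    rows: iterable of (key1, key2, datetime, idx) tuples.
    Returns (key1, key2, datetime, rank_idx, idx) tuples, in the same
    column order as the query output.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    result = []
    for (key1, key2), part in groupby(ordered, key=lambda r: (r[0], r[1])):
        prev_dt = None
        rank_idx = None
        for _, _, dt, idx in part:
            # A run breaks when there is no previous row in the partition,
            # or when the previous row is 6 hours or more in the past.
            if prev_dt is None or dt - prev_dt >= gap:
                rank_idx = idx
            result.append((key1, key2, dt, rank_idx, idx))
            prev_dt = dt
    return result


def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# Same rows as the CREATE TABLE statement above.
sample = [
    (11, 22, ts("2020-01-01 00:00"), 321),
    (11, 22, ts("2020-01-01 01:00"), 322),
    (11, 22, ts("2020-01-01 02:00"), 323),
    (11, 22, ts("2020-01-01 08:30"), 324),
    (11, 22, ts("2020-01-01 09:00"), 325),
    (11, 22, ts("2020-01-01 16:00"), 326),
    (11, 22, ts("2020-01-01 17:00"), 327),
    (11, 22, ts("2020-01-02 04:00"), 328),
    (999, 777, ts("2020-01-01 00:00"), 17),
    (999, 777, ts("2020-01-01 01:00"), 18),
    (999, 777, ts("2020-01-22 02:00"), 19),
    (999, 777, ts("2020-01-22 04:00"), 20),
]

for row in first_idx_per_run(sample):
    print(row)
```

The rank_idx values it produces for the sample agree with the RANK_IDX column in the table above.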
You can also apply LAG twice, which will work in versions before Oracle 12:
SELECT key1,
       key2,
       datetime,
       idx,
       COALESCE(
         rank_idx,
         LAG( rank_idx ) IGNORE NULLS OVER ( PARTITION BY key1, key2 ORDER BY datetime )
       ) AS rank_idx
FROM   (
  SELECT t.*,
         CASE
         WHEN datetime
              < LAG( datetime ) OVER ( PARTITION BY key1, key2 ORDER BY datetime )
                + INTERVAL '6' HOUR
         THEN NULL
         ELSE idx
         END AS rank_idx
  FROM   table_name t
)
db<>fiddle here
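The LAG query works in two passes: the inner query keeps idx only on rows that start a new group (NULL elsewhere), and the outer query fills each NULL with the most recent non-NULL value before it. A minimal Python sketch of that mark-then-fill idea follows, with a smaller hypothetical data set; `mark_then_fill` and `ts` are illustrative names, not part of the answer.

```python
from datetime import datetime, timedelta
from itertools import groupby


def mark_then_fill(rows, gap=timedelta(hours=6)):
    """rows: iterable of (key1, key2, datetime, idx) tuples.

    Returns (key1, key2, datetime, idx, rank_idx) tuples.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    result = []
    for _, part in groupby(ordered, key=lambda r: (r[0], r[1])):
        part = list(part)
        # Pass 1 (inner query): idx on rows that start a group, None otherwise.
        marks = []
        for pos, (_, _, dt, idx) in enumerate(part):
            prev_dt = part[pos - 1][2] if pos > 0 else None
            starts_group = prev_dt is None or dt - prev_dt >= gap
            marks.append(idx if starts_group else None)
        # Pass 2 (outer query): carry the latest non-None mark forward,
        # which is what COALESCE plus LAG ... IGNORE NULLS achieves.
        last_seen = None
        for (key1, key2, dt, idx), mark in zip(part, marks):
            if mark is not None:
                last_seen = mark
            result.append((key1, key2, dt, idx, last_seen))
    return result


def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# A reduced, made-up data set: two partitions, one gap of 6.5 hours in the first.
demo = [
    (11, 22, ts("2020-01-01 00:00"), 321),
    (11, 22, ts("2020-01-01 02:00"), 323),
    (11, 22, ts("2020-01-01 08:30"), 324),
    (11, 22, ts("2020-01-01 09:00"), 325),
    (999, 777, ts("2020-01-01 00:00"), 17),
    (999, 777, ts("2020-01-01 01:00"), 18),
]

for row in mark_then_fill(demo):
    print(row)
```

Splitting the work this way keeps each pass simple: the first only looks one row back, and the second only needs to remember one value.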
【Answer 3】: I would use window functions for this:
select t.*,
       (min_index - 1 +
        sum(case when prev_datetime > datetime - interval '6' hour then 0 else 1 end) over
            (partition by key1, key2 order by datetime)
       ) as rank_index
from (select t.*,
             -- idx is the index column; INDEX is an Oracle reserved word and cannot be used unquoted
             min(idx) over (partition by key1, key2) as min_index,
             lag(datetime) over (partition by key1, key2 order by datetime) as prev_datetime
      from t
     ) t;
Here is a db<>fiddle.
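The arithmetic in this query is a running count: each row gets a flag of 1 if it starts a new group (no previous row, or a gap of 6 hours or more since the previous row) and 0 otherwise, and rank_index is the partition's smallest index minus 1 plus the running sum of those flags. The Python sketch below reproduces that calculation as a cross-check. The name `rank_by_running_count`, the helper `ts`, and the sample rows (borrowed from the sample data used in the other answers) are assumptions for illustration.

```python
from datetime import datetime, timedelta
from itertools import accumulate, groupby


def rank_by_running_count(rows, gap=timedelta(hours=6)):
    """rows: iterable of (key1, key2, datetime, idx) tuples.

    Returns (key1, key2, datetime, idx, rank_index) tuples.
    """
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    result = []
    for _, part in groupby(ordered, key=lambda r: (r[0], r[1])):
        part = list(part)
        min_index = min(r[3] for r in part)
        # 1 when the row starts a group, 0 when it is within 6 hours of the previous row.
        flags = [
            1 if pos == 0 or part[pos][2] - part[pos - 1][2] >= gap else 0
            for pos in range(len(part))
        ]
        # accumulate() yields the running sum, like SUM(...) OVER (... ORDER BY datetime).
        for (key1, key2, dt, idx), running in zip(part, accumulate(flags)):
            result.append((key1, key2, dt, idx, min_index - 1 + running))
    return result


def ts(text):
    """Hypothetical helper: parse 'YYYY-MM-DD HH:MM' into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


sample = [
    (11, 22, ts("2021-01-01 00:00"), 321),
    (11, 22, ts("2021-01-01 01:00"), 322),
    (11, 22, ts("2021-01-01 02:00"), 323),
    (11, 22, ts("2021-01-01 08:30"), 324),
    (11, 22, ts("2021-01-01 09:00"), 325),
    (11, 22, ts("2021-01-01 16:00"), 326),
    (11, 22, ts("2021-01-01 17:00"), 327),
    (11, 22, ts("2021-01-02 04:00"), 328),
    (999, 777, ts("2021-01-01 00:00"), 17),
    (999, 777, ts("2021-01-01 01:00"), 18),
    (999, 777, ts("2021-01-22 02:00"), 19),
    (999, 777, ts("2021-01-22 04:00"), 20),
]

for row in rank_by_running_count(sample):
    print(row)
```

With these rows the groups are numbered consecutively (321, 322, 323, 324 and 17, 18), whereas the MATCH_RECOGNIZE answer labels each group with the idx of its first row (321, 324, 326, 328 and 17, 19). Which of the two the question wants depends on the expected "rank index" column, which is not shown in the text.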