ROW_NUMBER() Use Case: Implementing a Zipper Table

Posted by 踏叶乘风


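Consider the following table. The event flag indicates whether a stock enters (1) or is removed from (2) a stock pool. We need to deduplicate the data: when a stock enters several times in a row, or is removed several times in a row, we keep only the date of the first event in each run.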

 

 

In the figure, the rows highlighted in yellow are the ones we do not need. (The table is named demo.)

 

 

 

Step one:

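  -- rn : position within the same SCODE, POOLID and ADJUSTMODE
  -- rn2: position within the same SCODE and POOLID, regardless of ADJUSTMODE
  select SCODE, POOLID, EFFECTIVE_DAY, ADJUSTMODE
         ,row_number() over (partition by SCODE, POOLID, ADJUSTMODE order by EFFECTIVE_DAY) as rn
         ,row_number() over (partition by SCODE, POOLID order by EFFECTIVE_DAY) as rn2
  from demo t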

This gives the following result (I have marked the entry rows in red):

 

 

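Looking at the pattern in the data, we compute RN2 - RN. Within a run of consecutive rows that share the same ADJUSTMODE, both row numbers increase by one per row, so the difference stays constant; it changes only when rows of the other mode come in between. We then number the rows again within each SCODE + POOLID + ADJUSTMODE + (RN2 - RN) group, ordered by date.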

SQL:

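-- rn2 - rn is constant within a run of consecutive rows that share the same ADJUSTMODE,
-- so rn3 restarts at 1 at the beginning of every run (grp is just a label for rn2 - rn)
select x.*, rn2 - rn as grp
       ,row_number() over (partition by scode, poolid, adjustmode, rn2 - rn order by rn) as rn3
from (
  select SCODE, POOLID, EFFECTIVE_DAY, ADJUSTMODE
         ,row_number() over (partition by SCODE, POOLID, ADJUSTMODE order by EFFECTIVE_DAY) as rn
         ,row_number() over (partition by SCODE, POOLID order by EFFECTIVE_DAY) as rn2
  from demo t
  ) x
order by 1,2,3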
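
To make the numbering easier to follow without a database, here is a minimal Python sketch that recomputes rn, rn2, their difference and rn3 for the demo rows of SCODE '00001'. It is only an illustration of the query's logic, not part of the original solution; the function name `number_rows` and the tuple layout are made up for this example.

```python
from collections import defaultdict
from datetime import date

# Demo rows for SCODE '00001', POOLID 1: (EFFECTIVE_DAY, ADJUSTMODE).
events = [
    (date(2010, 1, 1), 1),
    (date(2010, 1, 3), 1),
    (date(2010, 1, 5), 2),
    (date(2010, 1, 7), 2),
    (date(2010, 2, 1), 1),
    (date(2010, 2, 3), 1),
    (date(2010, 2, 7), 2),
]


def number_rows(rows):
    """Return (day, mode, rn, rn2, diff, rn3) tuples mimicking the SQL row numbers."""
    per_mode = defaultdict(int)   # counter behind rn  (partition includes ADJUSTMODE)
    per_group = defaultdict(int)  # counter behind rn3 (partition adds rn2 - rn)
    out = []
    # Dates are unique here, so sorting the tuples orders the rows by date.
    for rn2, (day, mode) in enumerate(sorted(rows), start=1):
        per_mode[mode] += 1
        rn = per_mode[mode]
        diff = rn2 - rn
        per_group[(mode, diff)] += 1
        out.append((day, mode, rn, rn2, diff, per_group[(mode, diff)]))
    return out


numbered = number_rows(events)
for row in numbered:
    print(row)
```

For these seven rows the difference column comes out as 0, 0, 2, 2, 2, 2, 4. The two middle runs share the value 2, which is why ADJUSTMODE has to be part of the partition as well: the pair (ADJUSTMODE, RN2 - RN) is what identifies a run.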

 

 

 

We can then see that the rows with RN3 = 1 are exactly the records we need.
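
As a cross-check, the same rows can be described in a different way: a row is kept when it is the first row for its SCODE and POOLID, or when its ADJUSTMODE differs from the previous row's. The Python sketch below applies that rule to the demo data. It is a swapped-in technique for comparison (in SQL it would correspond to a LAG-style comparison), not the row-number method used in this post, and `first_of_each_run` is a name invented for the example.

```python
from datetime import date
from itertools import groupby

# The demo rows: (SCODE, POOLID, EFFECTIVE_DAY, ADJUSTMODE).
demo = [
    ("00001", 1, date(2010, 1, 1), 1),
    ("00001", 1, date(2010, 1, 3), 1),
    ("00001", 1, date(2010, 1, 5), 2),
    ("00001", 1, date(2010, 1, 7), 2),
    ("00001", 1, date(2010, 2, 1), 1),
    ("00001", 1, date(2010, 2, 3), 1),
    ("00001", 1, date(2010, 2, 7), 2),
    ("00002", 1, date(2010, 1, 1), 1),
    ("00002", 1, date(2010, 1, 5), 2),
    ("00002", 1, date(2010, 1, 7), 2),
    ("00002", 1, date(2010, 2, 1), 1),
    ("00002", 1, date(2010, 2, 3), 1),
    ("00002", 1, date(2010, 2, 7), 2),
]


def first_of_each_run(rows):
    """Keep only the first row of every run of identical ADJUSTMODE values."""
    kept = []
    ordered = sorted(rows)  # tuple order: SCODE, POOLID, EFFECTIVE_DAY
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        previous_mode = None  # no previous row at the start of a partition
        for row in group:
            if row[3] != previous_mode:
                kept.append(row)
            previous_mode = row[3]
    return kept


kept = first_of_each_run(demo)
for row in kept:
    print(row)
```

On the demo data this keeps four rows per stock (2010-01-01, 2010-01-05, 2010-02-01 and 2010-02-07), the same rows that the RN3 = 1 filter selects.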

 

The complete code is as follows:


with demo as (
      select '00001' as SCODE,1 POOLID,date '2010-01-01' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-01-03' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-01-05' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-01-07' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-02-01' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-02-03' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00001' as SCODE,1 POOLID,date '2010-02-07' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-01-01' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-01-05' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-01-07' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-02-01' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-02-03' as EFFECTIVE_DAY,1 as ADJUSTMODE from dual
      union all
      select '00002' as SCODE,1 POOLID,date '2010-02-07' as EFFECTIVE_DAY,2 as ADJUSTMODE from dual
)
,data as (
-- grp (rn2 - rn) is constant within a run of consecutive rows with the same ADJUSTMODE;
-- rn3 = 1 marks the first row of each run
select x.*, rn2 - rn as grp
       ,row_number() over (partition by scode, poolid, adjustmode, rn2 - rn order by rn) as rn3
from (
  select SCODE, POOLID, EFFECTIVE_DAY, ADJUSTMODE
         ,row_number() over (partition by SCODE, POOLID, ADJUSTMODE order by EFFECTIVE_DAY) as rn
         ,row_number() over (partition by SCODE, POOLID order by EFFECTIVE_DAY) as rn2
  from demo t
  ) x
)
select * from data where rn3 = 1
order by 1,2,3
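
The query stops at the deduplicated event rows. If the goal is a zipper table with one row per holding period, one possible next step is to pair each entry with the removal that follows it. This step is an assumption on my part, since the post does not show it. The Python sketch below does the pairing on the eight rows the query returns for the demo data; the function name `build_intervals` and the (start_day, end_day) layout are hypothetical choices for this example, with `None` standing for a period that is still open.

```python
from datetime import date

# Deduplicated events as returned by the query for the demo data:
# (SCODE, POOLID, EFFECTIVE_DAY, ADJUSTMODE), 1 = enter, 2 = remove.
deduped = [
    ("00001", 1, date(2010, 1, 1), 1),
    ("00001", 1, date(2010, 1, 5), 2),
    ("00001", 1, date(2010, 2, 1), 1),
    ("00001", 1, date(2010, 2, 7), 2),
    ("00002", 1, date(2010, 1, 1), 1),
    ("00002", 1, date(2010, 1, 5), 2),
    ("00002", 1, date(2010, 2, 1), 1),
    ("00002", 1, date(2010, 2, 7), 2),
]


def build_intervals(rows):
    """Pair each entry with the next removal: (scode, poolid, start_day, end_day)."""
    intervals = []
    open_start = {}  # (scode, poolid) -> start day of the currently open period
    for scode, poolid, day, mode in sorted(rows):
        key = (scode, poolid)
        if mode == 1:
            # An entry opens a period unless one is already open.
            open_start.setdefault(key, day)
        elif mode == 2 and key in open_start:
            # A removal closes the open period; a removal with no open period is ignored.
            intervals.append((scode, poolid, open_start.pop(key), day))
    # Entries never followed by a removal stay open-ended.
    for (scode, poolid), start in sorted(open_start.items()):
        intervals.append((scode, poolid, start, None))
    return intervals


intervals = build_intervals(deduped)
for row in intervals:
    print(row)
```

With the demo data each stock ends up with two closed periods: 2010-01-01 to 2010-01-05 and 2010-02-01 to 2010-02-07.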

 
