Oracle SQL Group Issue


【Title】: Oracle SQL Group Issue 【Posted】: 2019-07-11 10:19:38 【Question】:

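I am trying to summarise an employee table in which an employee has multiple records for the time spent in one team. I have tried Group By, Min/Max Over Partition By, and Lead/Lag on the team name, but in every result an agent who moved away from a team and returned to it at a later date ends up grouped as a single stint in the original team, even though I am ordering by date.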

Sample data:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 02/JAN/19

John Smith    | 123123      | Team A      | Site A   | 02/JAN/19  | 03/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 04/JAN/19

John Smith    | 123123      | Team A      | Site A   | 04/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 05/JAN/19  | 06/JAN/19

When I run the sample query:

SELECT
Employee Name
,Employee ID
,Team Leader
,Location
,MIN(Start Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, Start Date) AS Starting Date
,MAX(End Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, End Date) AS End Date
FROM TABLE 1

The results are as follows:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 06/JAN/19

Can anyone help me achieve the expected result:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 03/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 04/JAN/19

John Smith    | 123123      | Team A      | Site A   | 04/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 05/JAN/19  | 06/JAN/19

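【Comments】:

Look up "gaps and islands". There are plenty of posts about it.

This is a duplicate question, although I am having trouble finding the answer that fits best here. The solution is similar to the one here.

@PonderStibbons I tested similar logic from the link you provided and it appears to work. I will verify it once more employees are added to the query. Thanks!

【Answer 1】: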

Here is one option:

The test CTE represents your data (simplified a bit). The useful code starts at line 8.
SQL> with test (ename, team, start_date, end_date) as
  2    (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  3     select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  4     select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  5     select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  6     select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
  7    ),
  8  temp as
  9    (select ename, team, start_date, end_date,
 10       row_number() over (partition by ename order by start_date) rn,
 11       row_number() over (partition by ename, team order by start_date) rna
 12     from test
 13    )
 14  select ename, team, min(start_date) start_date, max(end_date) end_date
 15  from temp
 16  group by ename, team, (rn - rna)
 17  order by 3;

ENAM T START_DATE  END_DATE
---- - ----------- -----------
John A 01/jan/2019 03/jan/2019
John B 03/jan/2019 04/jan/2019
John A 04/jan/2019 05/jan/2019
John B 05/jan/2019 06/jan/2019

SQL>
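
To see why grouping on the difference of two row numbers separates the stints, it can help to print the intermediate values. The following is only a sketch, not Oracle code: it assumes Python's built-in sqlite3 module (with SQLite 3.25 or later for window functions), stores dates as ISO strings, and numbers `rn` per employee so that the difference stays stable when several employees are present. The `grp` alias and the `numbered` and `islands` variable names are introduced here for illustration.

```python
import sqlite3

ROWS = [
    ("John", "A", "2019-01-01", "2019-01-02"),
    ("John", "A", "2019-01-02", "2019-01-03"),
    ("John", "B", "2019-01-03", "2019-01-04"),
    ("John", "A", "2019-01-04", "2019-01-05"),
    ("John", "B", "2019-01-05", "2019-01-06"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (ename text, team text, start_date text, end_date text)"
)
conn.executemany("insert into test values (?, ?, ?, ?)", ROWS)

# rn counts every row of an employee; rna counts rows within one team only.
# While the employee stays in the same team both counters advance together,
# so rn - rna is constant for each unbroken stint.
numbered = conn.execute("""
    select team, start_date,
           row_number() over (partition by ename order by start_date) as rn,
           row_number() over (partition by ename, team order by start_date) as rna
    from test
    order by start_date
""").fetchall()

for team, start_date, rn, rna in numbered:
    print(team, start_date, "rn =", rn, "rna =", rna, "rn - rna =", rn - rna)

# Grouping on the difference yields one row per stint.
islands = conn.execute("""
    with temp as (
        select ename, team, start_date, end_date,
               row_number() over (partition by ename order by start_date)
             - row_number() over (partition by ename, team order by start_date) as grp
        from test
    )
    select ename, team, min(start_date) as start_date, max(end_date) as end_date
    from temp
    group by ename, team, grp
    order by start_date
""").fetchall()

for row in islands:
    print(row)
```

For the sample rows the differences come out as 0, 0, 2, 1, 3, so the two early Team A rows share a group while the later Team A row gets its own.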

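【Discussion】:

【Answer 2】:

If you are on 12c or later, row pattern matching is a good alternative. Unlike the "gaps and islands" solutions, mine also handles overlaps. The WITH clause contains the test data; the solution starts right after it.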

with test (ename, team, start_date, end_date) as
 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
 )
select * from test
match_recognize(
  partition by ename, team order by start_date
  measures first(start_date) start_date, max(end_date) end_date
  pattern(a* b)
  -- a: the latest end date seen so far reaches the next row's start date
  define a as max(end_date) >= next(start_date)
)
order by ename, start_date;

ENAM T START_DATE       END_DATE        
---- - ---------------- ----------------
John A 2019-01-01 00:00 2019-01-03 00:00
John B 2019-01-03 00:00 2019-01-04 00:00
John A 2019-01-04 00:00 2019-01-05 00:00
John B 2019-01-05 00:00 2019-01-06 00:00
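
One way to picture merging touching or overlapping ranges is as a single ordered pass per employee and team. The sketch below is plain Python rather than Oracle, and it illustrates the general idea, not how the database evaluates the pattern: a row joins the open range when it starts on or before the latest end date seen so far. `merge_ranges` is a hypothetical helper name, and the "Jane" rows are made-up data added only to exercise the overlap case.

```python
from datetime import date
from itertools import groupby


def merge_ranges(rows):
    """Merge touching or overlapping (ename, team, start, end) rows."""
    merged = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[2]))
    for (ename, team), group in groupby(ordered, key=lambda r: (r[0], r[1])):
        current_start = current_end = None
        for _, _, start, end in group:
            if current_start is not None and start <= current_end:
                # The row touches or overlaps the open range: extend it.
                current_end = max(current_end, end)
            else:
                # A gap: close the open range (if any) and start a new one.
                if current_start is not None:
                    merged.append((ename, team, current_start, current_end))
                current_start, current_end = start, end
        merged.append((ename, team, current_start, current_end))
    return sorted(merged, key=lambda r: (r[0], r[2]))


rows = [
    ("John", "A", date(2019, 1, 1), date(2019, 1, 2)),
    ("John", "A", date(2019, 1, 2), date(2019, 1, 3)),
    ("John", "B", date(2019, 1, 3), date(2019, 1, 4)),
    ("John", "A", date(2019, 1, 4), date(2019, 1, 5)),
    ("John", "B", date(2019, 1, 5), date(2019, 1, 6)),
    # Made-up rows: the second range lies inside the first, the third overlaps it.
    ("Jane", "A", date(2019, 1, 1), date(2019, 1, 10)),
    ("Jane", "A", date(2019, 1, 2), date(2019, 1, 3)),
    ("Jane", "A", date(2019, 1, 5), date(2019, 1, 12)),
]

result = merge_ranges(rows)
for row in result:
    print(row)
```

Tracking the maximum end date, rather than only the previous row's end date, is what keeps a short range nested inside a long one from splitting the group.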

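【Discussion】:

【Answer 3】:

This looks like a form of gaps-and-islands in which the records are linked by their date ranges.

Here is one approach: it uses a left join to find where each island starts, then a cumulative sum to identify the groups, and finally aggregates: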

select employeename, employeeid, teamleader, location,
       min(startdate), max(enddate)
from (select t1.*,
             sum(case when tprev.employeeid is null  -- new group
                      then 1 else 0
                 end) over (partition by employeeid, teamleader, location
                            order by startdate
                           ) as grouping
      from table1 t1 left join
           table1 tprev
           on t1.startdate = tprev.enddate and
              t1.employeeid = tprev.employeeid and
              t1.teamleader = tprev.teamleader and
              t1.location = tprev.location
     ) t
group by employeename, employeeid, teamleader, location, grouping
order by employeeid, min(startdate);
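
The two steps (flagging island starts, then numbering groups with a running sum) can be inspected separately. What follows is a hedged sketch rather than Oracle code: it assumes Python's built-in sqlite3 module (SQLite 3.25 or later for window functions) and ISO date strings. The `is_start` and `grp` column names and the `FLAGGED_SQL`, `flagged`, and `stints` names are introduced here for illustration.

```python
import sqlite3

ROWS = [
    ("John Smith", 123123, "Team A", "Site A", "2019-01-01", "2019-01-02"),
    ("John Smith", 123123, "Team A", "Site A", "2019-01-02", "2019-01-03"),
    ("John Smith", 123123, "Team B", "Site A", "2019-01-03", "2019-01-04"),
    ("John Smith", 123123, "Team A", "Site A", "2019-01-04", "2019-01-05"),
    ("John Smith", 123123, "Team B", "Site A", "2019-01-05", "2019-01-06"),
]

conn = sqlite3.connect(":memory:")
conn.execute("""
    create table table1 (employeename text, employeeid integer, teamleader text,
                         location text, startdate text, enddate text)
""")
conn.executemany("insert into table1 values (?, ?, ?, ?, ?, ?)", ROWS)

# A row starts an island when no earlier row of the same employee, team and
# location ends exactly where this row begins. The running sum of those
# start flags then gives every island its own number within the partition.
FLAGGED_SQL = """
    select t1.employeename as employeename, t1.employeeid as employeeid,
           t1.teamleader as teamleader, t1.location as location,
           t1.startdate as startdate, t1.enddate as enddate,
           case when tprev.employeeid is null then 1 else 0 end as is_start,
           sum(case when tprev.employeeid is null then 1 else 0 end)
               over (partition by t1.employeeid, t1.teamleader, t1.location
                     order by t1.startdate) as grp
    from table1 t1
    left join table1 tprev
      on t1.startdate = tprev.enddate
     and t1.employeeid = tprev.employeeid
     and t1.teamleader = tprev.teamleader
     and t1.location = tprev.location
"""

flagged = conn.execute(
    f"select teamleader, startdate, is_start, grp from ({FLAGGED_SQL}) t "
    "order by startdate"
).fetchall()
for row in flagged:
    print(row)

stints = conn.execute(
    f"select employeename, teamleader, min(startdate), max(enddate) "
    f"from ({FLAGGED_SQL}) t "
    "group by employeename, employeeid, teamleader, location, grp "
    "order by employeeid, min(startdate)"
).fetchall()
for row in stints:
    print(row)
```

Because the running sum restarts for each team, the group number is only unique together with the partition columns, which is why they all appear in the GROUP BY.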

【Discussion】:

Thank you very much, this seems to have solved my problem and grouped the agents correctly.
