Oracle SQL Group Issue
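【Title】: Oracle SQL Group Issue 【Posted】: 2019-07-11 10:19:38 【Question】: I am trying to summarize an employee table in which an employee has multiple records for the time spent in one team. I have tried GROUP BY, MIN/MAX OVER (PARTITION BY ...) and LEAD/LAG on the team names, but in every result an agent who moved away from a team and later returned to it ends up as a single event grouped under the original team, even though I am ordering by date.
Sample data: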
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 02/JAN/19
John Smith | 123123 | Team A | Site A | 02/JAN/19 | 03/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 04/JAN/19
John Smith | 123123 | Team A | Site A | 04/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 05/JAN/19 | 06/JAN/19
When I run this sample query:
SELECT
"Employee Name"
,"Employee ID"
,"Team Leader"
,"Location"
,MIN("Start Date") OVER(PARTITION BY "Team Leader" ORDER BY "Employee ID", "Start Date") AS "Starting Date"
,MAX("End Date") OVER(PARTITION BY "Team Leader" ORDER BY "Employee ID", "End Date") AS "End Date"
FROM "TABLE 1"
the result is:
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 06/JAN/19
Can anyone help me get the expected result:
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 03/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 04/JAN/19
John Smith | 123123 | Team A | Site A | 04/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 05/JAN/19 | 06/JAN/19
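【Comments】:
Look up "gaps and islands"; there are plenty of posts about it.
This is a duplicate question, although I am having trouble finding the answer that fits best here. The solution is similar to here.
@PonderStibbons I tested similar logic from the link you provided and it seems to work. I will verify it once more employees are added to the query. Thanks!
【Answer 1】: Here is one option:
The test CTE represents your data (simplified a bit); the useful code starts at line 8.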
SQL> with test (ename, team, start_date, end_date) as
2 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
3 select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
4 select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
5 select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
6 select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
7 ),
8 temp as
9 (select ename, team, start_date, end_date,
10 row_number() over (order by start_date) rn,
11 row_number() over (partition by ename, team order by start_date) rna
12 from test
13 )
14 select ename, team, min(start_date) start_date, max(end_date) end_date
15 from temp
16 group by ename, team, (rn - rna)
17 order by 3;
ENAM T START_DATE END_DATE
---- - ----------- -----------
John A 01/jan/2019 03/jan/2019
John B 03/jan/2019 04/jan/2019
John A 04/jan/2019 05/jan/2019
John B 05/jan/2019 06/jan/2019
SQL>
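To see why grouping by (rn - rna) separates the two Team A stints, it can help to look at the intermediate values. The query below is only a sketch, not part of the answer above: it lists rn, rna and their difference (the alias grp is my own name) for the same test rows. It also partitions rn by ename, which is an assumption about what is needed once the table holds more than one employee; with a single global rn, rows of other employees could interleave and change the difference in the middle of a run.

```sql
-- Sketch: expose the intermediate row numbers behind the (rn - rna) grouping.
-- Assumption: rn is numbered per employee (partition by ename); the answer
-- above numbers all rows globally, which is fine for a single employee.
with test (ename, team, start_date, end_date) as
  (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
   select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
   select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
   select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
   select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
  )
select ename, team, start_date, end_date,
       -- position of the row among all rows of this employee
       row_number() over (partition by ename order by start_date) rn,
       -- position of the row among this employee's rows for this team
       row_number() over (partition by ename, team order by start_date) rna,
       -- constant within an unbroken run of the same team
       row_number() over (partition by ename order by start_date)
     - row_number() over (partition by ename, team order by start_date) grp
from test
order by ename, start_date;
```

For the five sample rows the difference works out to 0, 0, 2, 1, 3, so (team, difference) yields four distinct groups. Note that this technique looks only at row order, not at the dates themselves: two rows for the same team with a calendar gap between them, and no other team in between, would still land in one group.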
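【Comments】:
【Answer 2】: If you are on 12c or later, row pattern matching is a good alternative solution. Unlike the "gaps and islands" solutions, this one also handles overlaps. The WITH clause contains the test data; the solution starts after it.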
with test (ename, team, start_date, end_date) as
(select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
)
select * from test
match_recognize(
partition by ename, team order by start_date
measures first(start_date) start_date, last(end_date) end_date
pattern(a b*)
define b as start_date <= a.end_date
)
order by ename, start_date;
ENAM T START_DATE END_DATE
---- - ---------------- ----------------
John A 2019-01-01 00:00 2019-01-03 00:00
John B 2019-01-03 00:00 2019-01-04 00:00
John A 2019-01-04 00:00 2019-01-05 00:00
John B 2019-01-05 00:00 2019-01-06 00:00
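One point worth checking before relying on this pattern: as far as I understand MATCH_RECOGNIZE, a.end_date refers to the row matched as A, which is the first row of each match, so a third contiguous row for the same team would be compared against the first row's end date rather than the latest one. The sketch below is an alternative formulation, not the code from this answer: it compares the running maximum of end_date with the next row's start_date. The test rows are hypothetical and were chosen to include a chain of three contiguous Team A ranges.

```sql
-- Sketch of an alternative pattern (Oracle 12c or later, hypothetical test rows).
with test (ename, team, start_date, end_date) as
  (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
   select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
   select 'John', 'A', date '2019-01-03', date '2019-01-04' from dual union all
   select 'John', 'B', date '2019-01-04', date '2019-01-05' from dual union all
   select 'John', 'A', date '2019-01-05', date '2019-01-06' from dual
  )
select * from test
match_recognize(
  partition by ename, team order by start_date, end_date
  measures first(start_date) start_date, max(end_date) end_date
  pattern(a* b)
  -- A: the latest end_date seen so far in this match still reaches the
  --    start of the next row, so the range continues.
  -- B: left undefined, so it matches any row and closes the range.
  define a as max(end_date) >= next(start_date)
)
order by ename, start_date;
```

The intent is that the first three Team A rows collapse into one range from 2019-01-01 to 2019-01-04, while the later Team A row stays separate. Because the comparison uses the running maximum, a long range that fully contains a shorter one should be merged as well.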
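【Comments】:
【Answer 3】: This looks like a form of gaps-and-islands in which the records are linked by their date ranges.
Here is one approach. It uses a left join to find where each island starts, then a cumulative sum to identify the groups, and finally an aggregation: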
select employeename, employeeid, teamleader, location,
       min(startdate), max(enddate)
from (select t1.*,
             sum(case when tprev.employeeid is null -- no preceding row: new group
                      then 1 else 0
                 end) over (partition by t1.employeeid, t1.teamleader, t1.location
                            order by t1.startdate
                           ) as grp
      from table1 t1 left join
           table1 tprev
           on t1.startdate = tprev.enddate and
              t1.employeeid = tprev.employeeid and
              t1.teamleader = tprev.teamleader and
              t1.location = tprev.location
     ) t
group by employeename, employeeid, teamleader, location, grp
order by employeeid, min(startdate);
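The same idea can be sketched without the self-join by using LAG to look at the previous row's end date. This is a variant of my own rather than the answer's code: the names new_island, grp, first_start and last_end are made up for the illustration, and it assumes that the ranges for one employee, team and location do not overlap, so the previous row by startdate is the only possible direct predecessor. The WITH clause reproduces the sample rows from the question.

```sql
-- Sketch: flag island starts with LAG, then number the islands with a running sum.
-- Assumption: no overlapping ranges within one employee/team/location.
with table1 (employeename, employeeid, teamleader, location, startdate, enddate) as
  (select 'John Smith', 123123, 'Team A', 'Site A', date '2019-01-01', date '2019-01-02' from dual union all
   select 'John Smith', 123123, 'Team A', 'Site A', date '2019-01-02', date '2019-01-03' from dual union all
   select 'John Smith', 123123, 'Team B', 'Site A', date '2019-01-03', date '2019-01-04' from dual union all
   select 'John Smith', 123123, 'Team A', 'Site A', date '2019-01-04', date '2019-01-05' from dual union all
   select 'John Smith', 123123, 'Team B', 'Site A', date '2019-01-05', date '2019-01-06' from dual
  )
select employeename, employeeid, teamleader, location,
       min(startdate) as first_start, max(enddate) as last_end
from (select t.*,
             -- running count of island starts = island number
             sum(new_island) over (partition by employeeid, teamleader, location
                                   order by startdate) as grp
      from (select t1.*,
                   -- 0 when this row starts exactly where the previous one ended,
                   -- 1 otherwise (including the first row, where LAG is null)
                   case when lag(enddate) over (partition by employeeid, teamleader, location
                                                order by startdate) = startdate
                        then 0 else 1
                   end as new_island
            from table1 t1
           ) t
     )
group by employeename, employeeid, teamleader, location, grp
order by employeeid, first_start;
```

On the sample rows this should give the four ranges the question asks for. Unlike a row-number difference, it starts a new island whenever the dates stop being contiguous, even if the team does not change.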
【Comments】:
Thank you very much, this seems to have solved my problem and grouped the agents correctly.