SQL: How can I get the first_value ignoring nulls as an aggregate?

Posted 2023-03-31

【Asked】: 2016-02-18 18:11:41 【Question】:

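I have a table that holds a hierarchical set of settings for employees, departments, and the company. An employee can work in any department, and if they have not specified a setting, they inherit the department's. The more specific setting wins, and I want to write a query that gets the settings for a particular employee/department pair.

The settings table has an employeeid that can be null and a departmentid that can also be null. A row where both are null holds the company-wide settings. There is a unique constraint on (nvl(employeeid,0), nvl(departmentid,0)).

Sample settings data: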

employeeid    departmentid    address        phone
null          null            123 Corp Dr.   800-555-1212
10            null            1 ABC Ave.     null
null          1               2 Dept Rd.     null
null          2               3 Dept Rd.     617-555-1212
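
To try the queries on this page against the sample rows, a minimal setup sketch follows. The column types and sizes, the index name settings_uq, and the choice of a unique function-based index to express the constraint are all assumptions; the question gives only the column names and the nvl expressions.

```sql
-- Hypothetical DDL: column types and sizes are assumed, not taken from the question.
create table settings (
  employeeid   number,
  departmentid number,
  address      varchar2(100),
  phone        varchar2(20)
);

-- One way to enforce uniqueness on the nvl pair (assumes 0 is never a real ID).
create unique index settings_uq
  on settings (nvl(employeeid, 0), nvl(departmentid, 0));

-- The four sample rows from the question.
insert into settings (employeeid, departmentid, address, phone)
  values (null, null, '123 Corp Dr.', '800-555-1212');
insert into settings (employeeid, departmentid, address, phone)
  values (10, null, '1 ABC Ave.', null);
insert into settings (employeeid, departmentid, address, phone)
  values (null, 1, '2 Dept Rd.', null);
insert into settings (employeeid, departmentid, address, phone)
  values (null, 2, '3 Dept Rd.', '617-555-1212');
commit;
```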

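When I run the query for employee 10 and department 1, I should get one row: address = 1 ABC Ave., phone = 800-555-1212

For employee 10 and department 2, I should get the department's phone number instead: address = 1 ABC Ave., phone = 617-555-1212

The best I have managed so far is to use first_value with ignore nulls over the table, ordered by a precedence column I added. The problem is that first_value ... over is an analytic function, not an aggregate, so I need a separate outer query to pick out one particular precedence. This seems to me like something I ought to be able to aggregate.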

select
    address,
    phone
from (
    select
        precedence,
        first_value(address ignore nulls) over (order by precedence) address,
        first_value(phone ignore nulls) over (order by precedence) phone
    from (
        select
            1 precedence,
            settings.*
        from
            settings
        where
            settings.employeeid = ?
            and settings.departmentid is null
        union
        select
            2 precedence,
            settings.*
        from
            settings
        where
            settings.departmentid = ?
            and settings.employeeid is null
        union
        select
            3 precedence,
            settings.*
        from
            settings
        where
            settings.departmentid is null
            and settings.employeeid is null
    )
)
where
    precedence = 3

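This gets the right answer, but I feel there should be a way to roll the first_values up as aggregates in the middle query and drop the outer query, and perhaps rely only on the explicit ordering of the unions instead of introducing a precedence column, though that matters less.

I'm using Oracle 11 for this.

【Comments on the question】: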

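You can use a CASE expression in the ORDER BY of FIRST_VALUE().

Looking at your work, mine, and Alex's: do you also have to allow for the possibility of multiple rows for an employee (one with a null department, plus several rows for that employee ID with non-null department IDs)? If so, all of our precedence calculations need more work.

【Answer 1】:

You can do this with keep dense_rank and a single-level inline view, which only has to hit the table once. Instead of the unions, you can use a case expression to decide the preference level of each relevant row in the base table: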

select min(address) keep (dense_rank first
    order by case when address is null then 1 else 0 end, preference) as address,
  min(phone) keep (dense_rank first
    order by case when phone is null then 1 else 0 end, preference) as phone
from (
  select s.address, s.phone, case when s.employeeid is not null then 1
    when s.employeeid is null and s.departmentid is not null then 2
    else 3 end as preference
  from settings s
  where s.employeeid = 10
  or (s.employeeid is null and s.departmentid = 2)
  or (s.employeeid is null and s.departmentid is null)
);

ADDRESS      PHONE      
------------ ------------
1 ABC Ave.   617-555-1212

...changing the = 10 and = 2 to bind variable placeholders, of course.

Passing in employee 10 and department 1 gets:

ADDRESS      PHONE      
------------ ------------
1 ABC Ave.   800-555-1212

You could still use first_value instead:

first_value(address) over (order by case when address is null then 1 else 0 end,
  preference) as address,

...but then you have to use distinct to remove the duplicates.
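
Written out in full, the first_value variant might look like the sketch below. It reuses the inline view from the keep query and swaps the literals for :emp and :dept bind placeholders (names chosen here for illustration). Because the default window starts at the first row of the ordering, every row returned carries the same pair of values, which is why distinct is enough to collapse them to one row.

```sql
-- Sketch only: same preference logic as the keep (dense_rank first) query,
-- with analytic first_value plus distinct in place of the aggregates.
select distinct
  first_value(address) over (
    order by case when address is null then 1 else 0 end, preference) as address,
  first_value(phone) over (
    order by case when phone is null then 1 else 0 end, preference) as phone
from (
  select s.address, s.phone,
    case when s.employeeid is not null then 1
         when s.departmentid is not null then 2
         else 3 end as preference
  from settings s
  where s.employeeid = :emp
  or (s.employeeid is null and s.departmentid = :dept)
  or (s.employeeid is null and s.departmentid is null)
);
```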

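【Comments】:

I like this one, Alex. I was so busy focusing on the OP's first_value starting point that I forgot to consider the KEEP option. Yours also comes out better on cost.

【Answer 2】: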

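One option that avoids the three unions for setting the precedence is below. It still needs everything pushed down into a nested query to eliminate the nulls from first_value, but it does it all in a single pass through the table

(note that I use the Oracle colon prefix on the :emp and :dept placeholders for variable substitution):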

WITH data as (
select null emp_id, null dept_id, '123 Corp Dr.' addr,   '800-555-1212' phn from dual union all
select 10,          null,         '1 ABC Ave.',           null from dual union all
select null,        1,            '2 Dept Rd.',           null from dual union all
select null ,       2,            '3 Dept Rd.',           '617-555-1212'    from dual )
SELECT DISTINCT addr, phn
FROM (
    SELECT first_value(addr ignore nulls) over (order by precedence) as addr
         , first_value(phn ignore nulls) over (order by precedence)  as phn
    FROM (
        select CASE WHEN emp_id = :emp then 1
                    when dept_id = :dept then 2
                    WHEN emp_id is null and dept_id is null then 3 end as precedence
              , addr
              , phn       
        from data
        )
    where precedence is not null
    )
where addr is not null and phn is not null;
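
One caveat with the final filter above: if no level supplies a value for one of the columns (for example, nobody has a phone), the is not null test removes every row. A possible variation, sketched here against the settings table and column names from the question rather than the inline data, widens the window to the whole result set so that each row already carries the final values and distinct alone collapses them. The explicit frame clause is the swapped-in technique; everything else follows the query above.

```sql
-- Sketch only: the explicit frame makes first_value see every qualifying row,
-- so no outer "is not null" filter is needed.
select distinct
  first_value(address ignore nulls) over (
    order by precedence
    rows between unbounded preceding and unbounded following) as address,
  first_value(phone ignore nulls) over (
    order by precedence
    rows between unbounded preceding and unbounded following) as phone
from (
  select s.address, s.phone,
    case when s.employeeid = :emp then 1
         when s.employeeid is null and s.departmentid = :dept then 2
         when s.employeeid is null and s.departmentid is null then 3
    end as precedence
  from settings s
)
where precedence is not null;
```

With the sample data, employee 10 and department 2 should again give 1 ABC Ave. and 617-555-1212. A column that is null at every level comes back as null instead of the row disappearing.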
