SQL group by with ungrouped columns

【Title】: SQL group by with ungrouped columns 【Posted】: 2013-11-05 18:46:47 【Question】:

I have a log table with the following structure:

CREATE TABLE mytable (
    oid         integer,
    department  integer,
    cid         integer,
    status      smallint,   -- statuses: 1-open, 2-accept, 3-done
    recordtime  timestamp
);

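This table stores data about status assignments: oid is the organization and cid is the card ID. Whenever an organization updates a card (sets a new status), a row is inserted into this table. Each organization belongs to a department.

I am trying to select statistics from this table, for example the max/min accept time, the max/min done time, and the average accept/done time, grouped by department or by organization (oid).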

Here is an SQL Fiddle of the example table and my query.
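For readers without access to the fiddle, here is a minimal sketch of some sample rows. The values are hypothetical (they are not taken from the original fiddle) and assume the table above has already been created. Card 103 is deliberately left without a "done" row so that incomplete cards can be tested too.

```sql
-- Hypothetical sample data, not from the original fiddle.
-- Columns: organization, department, card, status (1-open, 2-accept, 3-done), time.
insert into mytable (oid, department, cid, status, recordtime) values
    (1, 10, 100, 1, '2013-11-01 09:00:00'),
    (1, 10, 100, 2, '2013-11-01 09:05:00'),   -- accepted after 300 s
    (1, 10, 100, 3, '2013-11-01 10:00:00'),
    (1, 10, 101, 1, '2013-11-01 11:00:00'),
    (1, 10, 101, 2, '2013-11-01 11:20:00'),   -- accepted after 1200 s
    (1, 10, 101, 3, '2013-11-01 12:00:00'),
    (2, 10, 102, 1, '2013-11-02 09:00:00'),
    (2, 10, 102, 2, '2013-11-02 09:10:00'),   -- accepted after 600 s
    (2, 10, 102, 3, '2013-11-02 09:30:00'),
    (3, 20, 103, 1, '2013-11-02 14:00:00'),
    (3, 20, 103, 2, '2013-11-02 14:01:00');   -- accepted, but never done
```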

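The question is how to get the ungrouped columns into the result: oid and cid when I group by department, and cid when I group by department and oid. In other words, when I select the grouped rows I want to know which organization (oid) and card ID (cid) a value such as the maximum accept time belongs to.

I need these columns for several further joins.

UPD: Thanks to Roman Pekar, whose answer put me on the right track. I used his second query to write my final queries.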

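First: select the average accept/done time and the max/min accept/done time per department, together with the oid and cid that have the maximum accept time in each department.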

with cte as (
    -- one row per card: pivot the three status timestamps into columns
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept,
        max(case when status=3 then recordtime end) as done
    from
        mytable
    group by oid, department, cid
    -- keep only cards that went through all three statuses
    having
        max(case when status=1 then recordtime end) is not null
        and max(case when status=2 then recordtime end) is not null
        and max(case when status=3 then recordtime end) is not null
)
select distinct on (department)
    department, oid, cid,
    ceil(extract(epoch from avg(accept - open) over (partition by department))) as avg_accept_time,
    ceil(extract(epoch from avg(done - open) over (partition by department))) as avg_done_time,
    ceil(extract(epoch from max(accept - open) over (partition by department))) as max_accept_time,
    ceil(extract(epoch from max(done - open) over (partition by department))) as max_done_time,
    ceil(extract(epoch from min(accept - open) over (partition by department))) as min_accept_time,
    ceil(extract(epoch from min(done - open) over (partition by department))) as min_done_time
from cte
-- sort by the per-card accept time so that distinct on keeps the card with the longest one;
-- max_accept_time is the same for every row of a department and cannot pick a row
order by department, accept - open desc;

Second: the same as the first, but with all of these values computed per organization (oid).

with cte as (
    -- one row per card: pivot the three status timestamps into columns
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept,
        max(case when status=3 then recordtime end) as done
    from
        mytable
    group by oid, department, cid
    -- keep only cards that went through all three statuses
    having
        max(case when status=1 then recordtime end) is not null
        and max(case when status=2 then recordtime end) is not null
        and max(case when status=3 then recordtime end) is not null
)
select distinct on (department, oid)
    department, oid, cid,
    ceil(extract(epoch from avg(accept - open) over (partition by department, oid))) as avg_accept_time,
    ceil(extract(epoch from avg(done - open) over (partition by department, oid))) as avg_done_time,
    ceil(extract(epoch from max(accept - open) over (partition by department, oid))) as max_accept_time,
    ceil(extract(epoch from max(done - open) over (partition by department, oid))) as max_done_time,
    ceil(extract(epoch from min(accept - open) over (partition by department, oid))) as min_accept_time,
    ceil(extract(epoch from min(done - open) over (partition by department, oid))) as min_done_time
from cte
-- sort by the per-card accept time so that distinct on keeps the card with the longest one
order by department, oid, accept - open desc;

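【Comments】:

What value do you expect for cid when you group by department? The minimum, the maximum, the average?

What is cid? How is it related to department?

cid is the card number; it is a foreign key to another table.

I have updated my question.

【Answer 1】: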

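I'm not sure what you want to do with the query, but it really is overcomplicated. Your first query can be written much more simply, with conditional aggregation in a CTE and no joins: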

with cte as (
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept,
        max(case when status=3 then recordtime end) as done
    from mytable
    group by oid, department, cid
)
select
    department, oid,
    extract(epoch from avg(accept - open)) as a_time,
    extract(epoch from avg(done - open)) as d_time,
    extract(epoch from max(accept - open)) as max_a_time,
    extract(epoch from max(done - open)) as max_d_time
from cte
group by department, oid
order by department, oid;

sql fiddle demo
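The `max(case when ...)` pivot above can also be spelled with the aggregate `FILTER` clause. The following is a sketch of the same query in that form, assuming PostgreSQL 9.4 or later (where `FILTER` is available) and the `mytable` definition from the question; it is an alternative spelling, not part of the original answer.

```sql
-- Assumes PostgreSQL 9.4+ and the mytable definition from the question.
with cte as (
    select
        oid, department, cid,
        -- each aggregate only sees the rows that pass its filter
        max(recordtime) filter (where status = 1) as open,
        max(recordtime) filter (where status = 2) as accept,
        max(recordtime) filter (where status = 3) as done
    from mytable
    group by oid, department, cid
)
select
    department, oid,
    extract(epoch from avg(accept - open)) as a_time,
    extract(epoch from avg(done - open)) as d_time,
    extract(epoch from max(accept - open)) as max_a_time,
    extract(epoch from max(done - open)) as max_d_time
from cte
group by department, oid
order by department, oid;
```

Both spellings return NULL for a status that a card never reached, so the interval arithmetic and the aggregates behave the same way.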

If you also want the cid that the maximum time comes from, you can use the distinct on syntax:

with cte as (
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept,
        max(case when status=3 then recordtime end) as done
    from mytable
    group by oid, department, cid
)
select distinct on (department, oid)
    department, oid, cid,
    extract(epoch from accept - open) as a_time
from cte
order by department, oid, accept - open desc;
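`distinct on` keeps the first row of each `(department, oid)` group in `order by` order, so the sort keys after the grouping columns decide which cid survives. The sketch below shows this on a tiny inline data set; the column `accept_secs` and all values are hypothetical, and the extra `cid` sort key is an assumption added here to make ties deterministic.

```sql
-- Hypothetical per-card accept times in seconds, inlined so the query runs on its own.
with cte(department, oid, cid, accept_secs) as (
    values
        (10, 1, 100,  300),
        (10, 1, 101, 1200),
        (10, 2, 102,  600),
        (10, 2, 104,  600)   -- ties with cid 102
)
select distinct on (department, oid)
    department, oid, cid, accept_secs
from cte
-- longest time first; cid breaks ties so the same row is kept on every run
order by department, oid, accept_secs desc, cid;
-- returns (10, 1, 101, 1200) and (10, 2, 102, 600)
```

Without the trailing `cid` key, either of the two tied cards of organization 2 could be returned.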

Or use the ranking function row_number():

with cte as (
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept,
        max(case when status=3 then recordtime end) as done
    from mytable
    group by oid, department, cid
), cte2 as (
    select
        department, oid, cid,
        accept, open,
        row_number() over(
              partition by department, oid
              order by accept - open desc
        ) as rn
    from cte
)
select
    department, oid, cid,
    extract(epoch from accept - open) as a_time
from cte2
where rn = 1
order by department, oid

sql fiddle demo
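Two details may matter on real data. In PostgreSQL a descending sort places NULLs first by default, so a card with no accept row would take `rn = 1`; and `row_number()` extends naturally to the top N cards per group, which `distinct on` cannot do. The sketch below covers both, assuming the `mytable` definition from the question; the name `ranked`, the limit of 2, and the `cid` tie-breaker are illustrative choices, not part of the original answer.

```sql
with cte as (
    select
        oid, department, cid,
        max(case when status=1 then recordtime end) as open,
        max(case when status=2 then recordtime end) as accept
    from mytable
    group by oid, department, cid
), ranked as (
    select
        department, oid, cid,
        accept - open as accept_time,
        row_number() over(
              partition by department, oid
              -- nulls last: cards that were never accepted must not win the ranking
              order by accept - open desc nulls last, cid
        ) as rn
    from cte
)
select
    department, oid, cid, rn,
    extract(epoch from accept_time) as a_time
from ranked
where rn <= 2                      -- hypothetical N = 2; use rn = 1 for one row per group
  and accept_time is not null      -- drop groups where no card was accepted at all
order by department, oid, rn;
```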

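【Comments】:

Thank you for your time. Your queries are really simple. I will try them on the real table later.

It works. Thanks! Can you tell me whether a with cte expression is faster than a join?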
