How do you summarize data for a single field when it is one of multiple fields in GROUP BY?

【Title】: How do you summarize data for a single field when it is one of multiple fields in GROUP BY? 【Posted】: 2017-08-07 18:42:38 【Question】:

I have a table that looks like this:

| loc_id | pilot_ind | last_form_step_completed |
|--------|-----------|--------------------------|
| 9988   | non-pilot | 1                        |
| 9988   | non-pilot | 1                        |
| 9988   | non-pilot | 2                        |
| 9988   | non-pilot | 2                        |
| 9988   | non-pilot | 2                        |
| 9988   | non-pilot | 3                        |
| 9988   | non-pilot | 3                        |
| 9988   | non-pilot | 4                        |
| 9988   | non-pilot | 4                        |
| 1122   | non-pilot | 1                        |
| 1122   | non-pilot | 2                        |
| 1122   | non-pilot | 2                        |
| 1122   | non-pilot | 2                        |
| 1122   | non-pilot | 3                        |
| 1122   | non-pilot | 4                        |
| 1122   | non-pilot | 5                        |
| 5544   | pilot     | 1                        |
| 5544   | pilot     | 1                        |
| 5544   | pilot     | 2                        |
| 5544   | pilot     | 2                        |
| 5544   | pilot     | 2                        |
| 5544   | pilot     | 3                        |
| 5544   | pilot     | 3                        |
| 5544   | pilot     | 4                        |
| 5544   | pilot     | 4                        |
| 5544   | pilot     | 5                        |
| 5544   | pilot     | 5                        |
| 5544   | pilot     | 5                        |
| 5544   | pilot     | 5                        |
| 5544   | pilot     | 5                        |
| 3344   | pilot     | 1                        |
| 3344   | pilot     | 2                        |
| 3344   | pilot     | 2                        |
| 3344   | pilot     | 3                        |
| 3344   | pilot     | 3                        |
| 3344   | pilot     | 3                        |
| 3344   | pilot     | 4                        |
| 3344   | pilot     | 4                        |
| 3344   | pilot     | 4                        |
| 3344   | pilot     | 5                        |
| 3344   | pilot     | 5                        |
| 3344   | pilot     | 5                        |
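
To try the queries in this thread locally, the table above can be recreated with a short setup script. This is only a sketch: the table name `results` comes from the query further down, while the column types (`integer`, `text`) are assumptions, since the post does not state them.

```sql
-- Column types are assumed; the post only shows the values.
create table results (
    loc_id                   integer,
    pilot_ind                text,
    last_form_step_completed integer
);

-- The 42 sample rows from the table above, grouped by loc_id and step.
insert into results (loc_id, pilot_ind, last_form_step_completed) values
    (9988, 'non-pilot', 1), (9988, 'non-pilot', 1),
    (9988, 'non-pilot', 2), (9988, 'non-pilot', 2), (9988, 'non-pilot', 2),
    (9988, 'non-pilot', 3), (9988, 'non-pilot', 3),
    (9988, 'non-pilot', 4), (9988, 'non-pilot', 4),
    (1122, 'non-pilot', 1),
    (1122, 'non-pilot', 2), (1122, 'non-pilot', 2), (1122, 'non-pilot', 2),
    (1122, 'non-pilot', 3), (1122, 'non-pilot', 4), (1122, 'non-pilot', 5),
    (5544, 'pilot', 1), (5544, 'pilot', 1),
    (5544, 'pilot', 2), (5544, 'pilot', 2), (5544, 'pilot', 2),
    (5544, 'pilot', 3), (5544, 'pilot', 3),
    (5544, 'pilot', 4), (5544, 'pilot', 4),
    (5544, 'pilot', 5), (5544, 'pilot', 5), (5544, 'pilot', 5),
    (5544, 'pilot', 5), (5544, 'pilot', 5),
    (3344, 'pilot', 1),
    (3344, 'pilot', 2), (3344, 'pilot', 2),
    (3344, 'pilot', 3), (3344, 'pilot', 3), (3344, 'pilot', 3),
    (3344, 'pilot', 4), (3344, 'pilot', 4), (3344, 'pilot', 4),
    (3344, 'pilot', 5), (3344, 'pilot', 5), (3344, 'pilot', 5);
```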

I need to summarize it like this:

| pilot_ind | last_step_compl | total_count | total_per_pilot_ind | pct_pilot_ind_total |
|-----------|-----------------|-------------|---------------------|---------------------|
| non-pilot | 1               | 3           | 16                  | 18.8%               |
| non-pilot | 2               | 6           | 16                  | 37.5%               |
| non-pilot | 3               | 3           | 16                  | 18.8%               |
| non-pilot | 4               | 3           | 16                  | 18.8%               |
| non-pilot | 5               | 1           | 16                  | 6.3%                |
| pilot     | 1               | 3           | 26                  | 11.5%               |
| pilot     | 2               | 5           | 26                  | 19.2%               |
| pilot     | 3               | 5           | 26                  | 19.2%               |
| pilot     | 4               | 5           | 26                  | 19.2%               |
| pilot     | 5               | 8           | 26                  | 30.8%               |

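I cannot get the per-group total for pilot_ind into the total_per_pilot_ind field. I tried OVER (PARTITION BY), but the results are wrong because there are two fields in the GROUP BY clause. Here is a link to the sample data and my current attempt: http://rextester.com/PCE62786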

select 
    r.pilot_ind
    ,r.last_form_step_completed
    ,count(*) total_count
    ,count(*) over ()
from
    results r
group by
    r.pilot_ind
    ,r.last_form_step_completed
order by 1,2

Edit: Is there a way to do this in a single query?

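【Comments】:

How are the values 16 and 26 in the total_per_pilot_ind column of your example calculated?

I have edited the post to include the entire table. The table and data are also available at the rextester link. total_per_pilot_ind is the count of all rows for each value of the pilot_ind field (pilot or non-pilot). By extension, pct_pilot_ind_total is calculated as (total_count / total_per_pilot_ind).

【Answer 1】: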

Use a subquery:

SELECT *,
      sum(total_count) over(partition by pilot_ind) As total_per_pilot_ind,
      round(100.0 * total_count / sum(total_count) over(partition by pilot_ind) ,1)
        as pct_pilot_ind_total
FROM (
select 
    r.pilot_ind
    ,r.last_form_step_completed
    ,count(*) total_count
from
    results r
group by
    r.pilot_ind
    ,r.last_form_step_completed
) x
order by 1,2

Demo: http://sqlfiddle.com/#!17/0f411/5
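
Regarding the question's edit about doing this in a single query: window functions are evaluated after GROUP BY, so they can take an aggregate as their argument, and the derived table can be dropped. A minimal sketch, assuming PostgreSQL (or another database with window function support) and the `results` table from the question:

```sql
select
    r.pilot_ind
    ,r.last_form_step_completed
    ,count(*) as total_count
    -- The window runs over the grouped rows, so this adds up the
    -- per-step counts within each pilot_ind value.
    ,sum(count(*)) over (partition by r.pilot_ind) as total_per_pilot_ind
    -- 100.0 forces non-integer division before rounding to one decimal.
    ,round(100.0 * count(*) / sum(count(*)) over (partition by r.pilot_ind), 1)
        as pct_pilot_ind_total
from
    results r
group by
    r.pilot_ind
    ,r.last_form_step_completed
order by 1, 2
```

This also suggests why the attempt in the question looks wrong: `count(*) over ()` in a grouped query counts the rows that remain after grouping, not the underlying table rows, and it has no `partition by` to separate pilot from non-pilot.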

【Answer 2】:

You can use a left join:

  select 
      t.pilot_ind
      , t.last_form_step_completed
      -- count the matching rows; sum() would add up the step numbers instead
      , count(a.last_form_step_completed) total_count
  from ( 
  select distinct pilot_ind, last_form_step_completed
  from results ) t 
  left join results a on t.pilot_ind = a.pilot_ind 
      and t.last_form_step_completed = a.last_form_step_completed
  group by t.pilot_ind
      , t.last_form_step_completed
(It is not clear how you get the last two columns in your example, so I have left them out.)

【Answer 3】:
select a.pilot_ind, a.last_step, a.total_count, b.total_per_pilot_ind,
-- multiply by 100.0 first so the division is not truncated to an integer
100.0 * a.total_count / b.total_per_pilot_ind as Per_tage
from
(
select pilot_ind, last_form_step_completed as last_step, count(*) as total_count
from results
group by pilot_ind, last_form_step_completed
) a,
(
-- count(*) is the number of rows per pilot_ind; sum() would add up the step numbers
select pilot_ind, count(*) as total_per_pilot_ind
from results
group by pilot_ind
) b
where a.pilot_ind=b.pilot_ind
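
The same idea of joining two aggregates can be written with common table expressions and an explicit JOIN, which some readers find easier to follow. This is a sketch rather than the answerer's own query: it assumes PostgreSQL and the `results` table from the question, and the CTE names `step_counts` and `group_totals` are made up for illustration.

```sql
with step_counts as (
    -- one row per (pilot_ind, step) with its row count
    select pilot_ind, last_form_step_completed, count(*) as total_count
    from results
    group by pilot_ind, last_form_step_completed
),
group_totals as (
    -- one row per pilot_ind with the total number of rows in that group
    select pilot_ind, count(*) as total_per_pilot_ind
    from results
    group by pilot_ind
)
select
    s.pilot_ind,
    s.last_form_step_completed,
    s.total_count,
    g.total_per_pilot_ind,
    -- 100.0 avoids integer division; round to one decimal place
    round(100.0 * s.total_count / g.total_per_pilot_ind, 1) as pct_pilot_ind_total
from step_counts s
join group_totals g on g.pilot_ind = s.pilot_ind
order by s.pilot_ind, s.last_form_step_completed;
```

Compared with the window function approach, this version scans `results` twice, but it also works on databases that lack window functions as long as they support CTEs (or the CTEs are inlined as derived tables).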
