How to pivot multiple columns in BigQuery Standard SQL

【Title】: How to pivot multiple columns in BigQuery Standard SQL 【Posted】: 2021-12-16 14:03:20 【Question】:

I want to pivot multiple metrics in a table. What is the syntax for this?

Current table

date          iso_week          iso_year          metric1         metric2
2021-12-01       48              2021              1000             500
2021-11-30       48              2021               850             300
...
2020-11-28       48              2020               800             400
2020-11-27       48              2020               950             450
...
2019-11-27       48              2019               700             350
2019-11-26       48              2019               820             380

Desired output

iso_week          metric1_thisYear        metric1_prevYear       metric1_prev2Year       metric2_thisYear        metric2_prevYear        metric2_prev2Year
48                     1850                   1750                   1520                   800                   850                   730

...

【Comments on the question】:

【Answer 1】:

Consider the approach below

select * from (
  -- Drop date so that iso_week is the only remaining grouping column.
  select * except(date)
  from your_table
)
pivot (
  sum(metric1) as metric1,
  sum(metric2) as metric2
  for iso_year in (2021, 2020, 2019)
)

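If applied to the sample data in your question, the output is one row per iso_week with the columns metric1_2021, metric2_2021, metric1_2020, metric2_2020, metric1_2019 and metric2_2019.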
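
The desired output in the question uses names such as metric1_thisYear rather than year suffixes. BigQuery's PIVOT accepts an alias on each value in the IN list, and the generated column name is the aggregate alias followed by the value alias. A minimal sketch of the same query with such aliases (your_table is the placeholder name used in the answer; the alias names are taken from the question):

```sql
-- Sketch: same PIVOT as above, with an alias on every pivot value.
-- Resulting columns: iso_week, metric1_thisYear, metric2_thisYear,
-- metric1_prevYear, metric2_prevYear, metric1_prev2Year, metric2_prev2Year.
select * from (
  select * except(date)
  from your_table  -- placeholder table name
)
pivot (
  sum(metric1) as metric1,
  sum(metric2) as metric2
  for iso_year in (2021 as thisYear, 2020 as prevYear, 2019 as prev2Year)
)
```

The columns come out grouped by year, not by metric. If the order in the question matters, replace the outer select * with an explicit column list.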

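【Comments】:

Glad it worked for you. If it really helped, consider voting up the answer as well.

Thanks, this helped with my syntax, as the BigQuery documentation is a bit lacking when it comes to multiple columns.

【Answer 2】: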

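First, you need to sum the metrics over the periods you are grouping by: year and week. Then there are two ways to produce the output you want.

In the first approach I group by year and week only once, but I need multiple SELECT subqueries to arrange the result as the desired output.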

With Datat as(
   SELECT '2021-12-01' as date, 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
   SELECT '2021-11-30' as date, 48 as iso_week, 2021 as iso_year, 850 as metric1, 300 as metric2 union all
   SELECT '2020-12-01' as date, 48 as iso_week, 2020 as iso_year, 800 as metric1, 400 as metric2 union all
   SELECT '2020-12-01' as date, 48 as iso_week, 2020 as iso_year, 950 as metric1, 450 as metric2 union all
   SELECT '2019-12-01' as date, 48 as iso_week, 2019 as iso_year, 700 as metric1, 350 as metric2 union all
   SELECT '2019-12-01' as date, 48 as iso_week, 2019 as iso_year, 820 as metric1, 380 as metric2
),
data2 as(
  -- Sum each metric once per (iso_year, iso_week).
  select iso_week, iso_year, sum(metric1) as metric1, sum(metric2) as metric2
  from Datat
  group by iso_year, iso_week
)
-- Each scalar subquery is correlated on iso_week, so it returns at most one row per week.
select w.iso_week,
  (select d2.metric1 from data2 d2 where d2.iso_year = 2021 and d2.iso_week = w.iso_week) as metric1_thisyear,
  (select d2.metric1 from data2 d2 where d2.iso_year = 2020 and d2.iso_week = w.iso_week) as metric1_prevyear,
  (select d2.metric1 from data2 d2 where d2.iso_year = 2019 and d2.iso_week = w.iso_week) as metric1_prev2year,
  (select d2.metric2 from data2 d2 where d2.iso_year = 2021 and d2.iso_week = w.iso_week) as metric2_thisyear,
  (select d2.metric2 from data2 d2 where d2.iso_year = 2020 and d2.iso_week = w.iso_week) as metric2_prevyear,
  (select d2.metric2 from data2 d2 where d2.iso_year = 2019 and d2.iso_week = w.iso_week) as metric2_prev2year
from (select distinct iso_week from data2) w
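
For comparison, the same shape can be sketched with conditional aggregation. This is a different technique from the one in this answer: every output column is a SUM that only counts rows from one year, so the table is read once and no subqueries are needed. The sample rows repeat the metric values from the question, with the date column left out because it is not used.

```sql
-- Sketch: conditional aggregation instead of scalar subqueries.
with Datat as (
  select 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
  select 48, 2021, 850, 300 union all
  select 48, 2020, 800, 400 union all
  select 48, 2020, 950, 450 union all
  select 48, 2019, 700, 350 union all
  select 48, 2019, 820, 380
)
select
  iso_week,
  -- IF(...) yields NULL for rows of other years, and SUM ignores NULLs.
  sum(if(iso_year = 2021, metric1, null)) as metric1_thisyear,
  sum(if(iso_year = 2020, metric1, null)) as metric1_prevyear,
  sum(if(iso_year = 2019, metric1, null)) as metric1_prev2year,
  sum(if(iso_year = 2021, metric2, null)) as metric2_thisyear,
  sum(if(iso_year = 2020, metric2, null)) as metric2_prevyear,
  sum(if(iso_year = 2019, metric2, null)) as metric2_prev2year
from Datat
group by iso_week
-- For these six rows: 48, 1850, 1750, 1520, 800, 850, 730
```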

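In the second approach I group only by week, because each subquery already filters on a single year. This consumes less slot time; it looks like this: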

With Datat as(
   SELECT '2021-12-01' as date, 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
   SELECT '2021-11-30' as date, 48 as iso_week, 2021 as iso_year, 850 as metric1, 300 as metric2 union all
   SELECT '2020-12-01' as date, 48 as iso_week, 2020 as iso_year, 800 as metric1, 400 as metric2 union all
   SELECT '2020-12-01' as date, 48 as iso_week, 2020 as iso_year, 950 as metric1, 450 as metric2 union all
   SELECT '2019-12-01' as date, 48 as iso_week, 2019 as iso_year, 700 as metric1, 350 as metric2 union all
   SELECT '2019-12-01' as date, 48 as iso_week, 2019 as iso_year, 820 as metric1, 380 as metric2
),
-- One CTE per year, each summed per week.
data2 as(
select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where Datat.iso_year = 2021 GROUP BY iso_week
),
data3 as(
select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where Datat.iso_year = 2020 GROUP BY iso_week
),
data4 as(
select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where Datat.iso_year = 2019 GROUP BY iso_week
)
-- Inner joins: a week appears only if it has data in all three years.
select d2.iso_week,
  d2.metric1 as metric1_thisyear, d3.metric1 as metric1_prevyear, d4.metric1 as metric1_prev2year,
  d2.metric2 as metric2_thisyear, d3.metric2 as metric2_prevyear, d4.metric2 as metric2_prev2year
from data2 d2
join data3 d3 on d2.iso_week = d3.iso_week
join data4 d4 on d3.iso_week = d4.iso_week
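
The queries above hard-code 2021, 2020 and 2019, so they need editing every year. One possible way around that, sketched here under assumptions and not taken from either answer, is to compute how many years each row lies behind the latest iso_year in the data and pivot on that offset. years_back is a hypothetical helper column; the sample rows repeat the metric values from the question.

```sql
-- Sketch: pivot on a relative year offset instead of literal years.
with Datat as (
  select 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
  select 48, 2021, 850, 300 union all
  select 48, 2020, 800, 400 union all
  select 48, 2020, 950, 450 union all
  select 48, 2019, 700, 350 union all
  select 48, 2019, 820, 380
)
select * from (
  select
    iso_week,
    metric1,
    metric2,
    -- Hypothetical helper: 0 = latest year present in the data, 1 = the year before, ...
    max(iso_year) over () - iso_year as years_back
  from Datat
)
pivot (
  sum(metric1) as metric1,
  sum(metric2) as metric2
  for years_back in (0 as thisYear, 1 as prevYear, 2 as prev2Year)
)
```

Using the latest year in the data as the reference is only one choice. Replacing max(iso_year) over () with extract(isoyear from current_date()) would anchor the offsets to the current ISO year instead.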
