How to pivot multiple columns in BigQuery Standard SQL
【Title】: How to pivot multiple columns in BigQuery Standard SQL 【Posted】: 2021-12-16 14:03:20 【Question】: I'd like to pivot multiple metrics in a table. What would the syntax be for the case below?
Current table
date iso_week iso_year metric1 metric2
2021-12-01 48 2021 1000 500
2021-11-30 48 2021 850 300
...
2020-11-28 48 2020 800 400
2020-11-27 48 2020 950 450
...
2019-11-27 48 2019 700 350
2019-11-26 48 2019 820 380
Desired output
iso_week metric1_thisYear metric1_prevYear metric1_prev2Year metric2_thisYear metric2_prevYear metric2_prev2Year
48 1850 1750 1520 800 850 730
...
【Answer 1】: Consider the approach below
select * from (
select * except(date)
from your_table
)
pivot (sum(metric1) as metric1, sum(metric2) as metric2 for iso_year in (2021, 2020, 2019))
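If applied to the sample data in your question (the week-48 rows shown), the output is
iso_week metric1_2021 metric2_2021 metric1_2020 metric2_2020 metric1_2019 metric2_2019
48 1850 800 1750 850 1520 730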
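If the output columns should carry the names from the question rather than year suffixes, the pivot values can be aliased. The following is a minimal sketch, assuming the same placeholder `your_table` as in the query above; the aliases `thisYear`, `prevYear`, and `prev2Year` are taken from the question's desired header, and the years are still hard-coded:

```sql
-- Sketch: same pivot as in the answer, with aliased pivot values.
-- `your_table` is the placeholder table name used in the answer.
select *
from (
  select * except(date)
  from your_table
)
pivot (
  sum(metric1) as metric1,
  sum(metric2) as metric2
  -- Each alias becomes the suffix of the generated column names.
  for iso_year in (2021 as thisYear, 2020 as prevYear, 2019 as prev2Year)
)
```

BigQuery names each pivoted column as the aggregate alias, an underscore, and the pivot alias, so this should yield `metric1_thisYear`, `metric2_thisYear`, `metric1_prevYear`, and so on. The columns come out grouped by year, so an outer `select` that lists them explicitly would be needed to get the exact order shown in the question.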
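【Comments】:
Glad it worked for you. If it really helped, consider voting up the answer as well.
Thanks, this helped me with the syntax, since the BigQuery documentation is a bit lacking when it comes to pivoting multiple columns.
【Answer 2】: First, you need to aggregate the metrics by the keys you are grouping on: year and week. Then there are two ways to produce the output you want.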
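In the first approach, I group by year and week only once, but I need multiple SELECTs (one scalar subquery per output column) to shape the result into the desired output.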
with Datat as (
  select '2021-12-01' as date, 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
  select '2021-11-30' as date, 48 as iso_week, 2021 as iso_year, 850 as metric1, 300 as metric2 union all
  select '2020-11-28' as date, 48 as iso_week, 2020 as iso_year, 800 as metric1, 400 as metric2 union all
  select '2020-11-27' as date, 48 as iso_week, 2020 as iso_year, 950 as metric1, 450 as metric2 union all
  select '2019-11-27' as date, 48 as iso_week, 2019 as iso_year, 700 as metric1, 350 as metric2 union all
  select '2019-11-26' as date, 48 as iso_week, 2019 as iso_year, 820 as metric1, 380 as metric2
),
-- Aggregate once per (iso_year, iso_week).
data2 as (
  select iso_week, iso_year, sum(metric1) as metric1, sum(metric2) as metric2
  from Datat
  group by iso_year, iso_week
)
-- Each scalar subquery is correlated on iso_week, so it returns at most one row even when the table holds several weeks.
select
  w.iso_week,
  (select y.metric1 from data2 y where y.iso_year = 2021 and y.iso_week = w.iso_week) as metric1_thisyear,
  (select y.metric1 from data2 y where y.iso_year = 2020 and y.iso_week = w.iso_week) as metric1_prevyear,
  (select y.metric1 from data2 y where y.iso_year = 2019 and y.iso_week = w.iso_week) as metric1_prev2year,
  (select y.metric2 from data2 y where y.iso_year = 2021 and y.iso_week = w.iso_week) as metric2_thisyear,
  (select y.metric2 from data2 y where y.iso_year = 2020 and y.iso_week = w.iso_week) as metric2_prevyear,
  (select y.metric2 from data2 y where y.iso_year = 2019 and y.iso_week = w.iso_week) as metric2_prev2year
from (select distinct iso_week from data2) as w
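In the second approach, I group only by week, because the year is already filtered inside each subquery. This consumes less slot time; it looks like this: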
with Datat as (
  select '2021-12-01' as date, 48 as iso_week, 2021 as iso_year, 1000 as metric1, 500 as metric2 union all
  select '2021-11-30' as date, 48 as iso_week, 2021 as iso_year, 850 as metric1, 300 as metric2 union all
  select '2020-11-28' as date, 48 as iso_week, 2020 as iso_year, 800 as metric1, 400 as metric2 union all
  select '2020-11-27' as date, 48 as iso_week, 2020 as iso_year, 950 as metric1, 450 as metric2 union all
  select '2019-11-27' as date, 48 as iso_week, 2019 as iso_year, 700 as metric1, 350 as metric2 union all
  select '2019-11-26' as date, 48 as iso_week, 2019 as iso_year, 820 as metric1, 380 as metric2
),
-- One CTE per year, each aggregated by week only.
data2 as (
  select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where iso_year = 2021 group by iso_week
),
data3 as (
  select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where iso_year = 2020 group by iso_week
),
data4 as (
  select iso_week, sum(metric1) as metric1, sum(metric2) as metric2 from Datat where iso_year = 2019 group by iso_week
)
-- Inner joins: a week appears only if it has data in all three years.
select
  d2.iso_week,
  d2.metric1 as metric1_thisyear, d3.metric1 as metric1_prevyear, d4.metric1 as metric1_prev2year,
  d2.metric2 as metric2_thisyear, d3.metric2 as metric2_prevyear, d4.metric2 as metric2_prev2year
from data2 d2
join data3 d3 on d2.iso_week = d3.iso_week
join data4 d4 on d3.iso_week = d4.iso_week
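Both approaches in this answer can also be collapsed into a single pass with conditional aggregation. This is a sketch rather than part of the original answer; it assumes the source table is called `your_table` (a placeholder) and hard-codes the same three years:

```sql
-- Sketch: one scan, one group by, one conditional sum per output column.
-- `your_table` is a placeholder; sum() ignores the NULLs produced by if().
select
  iso_week,
  sum(if(iso_year = 2021, metric1, null)) as metric1_thisyear,
  sum(if(iso_year = 2020, metric1, null)) as metric1_prevyear,
  sum(if(iso_year = 2019, metric1, null)) as metric1_prev2year,
  sum(if(iso_year = 2021, metric2, null)) as metric2_thisyear,
  sum(if(iso_year = 2020, metric2, null)) as metric2_prevyear,
  sum(if(iso_year = 2019, metric2, null)) as metric2_prev2year
from your_table
where iso_year in (2021, 2020, 2019)
group by iso_week
```

Unlike the inner joins in the second approach, this keeps a week even when one of the years has no rows for it; the corresponding metrics are simply `NULL`.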