BigQuery: Transform data in multiple columns into row format

【Posted】: 2019-07-19 07:30:18

【Question】:

Suppose there is the following table in BigQuery:

SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7

The real table can be as wide as roughly 100 columns.

I would like to transform this query so that I get the following result table:

SELECT "Desktop" AS Device, 24 AS Nr UNION ALL
SELECT "Desktop" AS Device, 9 AS Nr UNION ALL
SELECT "Desktop" AS Device, 28 AS Nr UNION ALL
SELECT "Desktop" AS Device, 7 AS Nr UNION ALL
SELECT "Desktop" AS Device, 98 AS Nr UNION ALL
SELECT "Desktop" AS Device, 77 AS Nr UNION ALL
SELECT "Desktop" AS Device, 59 AS Nr UNION ALL
SELECT "Mobile" AS Device, 8 AS Nr UNION ALL
SELECT "Mobile" AS Device, 43 AS Nr UNION ALL
SELECT "Mobile" AS Device, 75 AS Nr UNION ALL
Etc

Does anyone know how to achieve this?


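【Answer 1】:

Below is a solution for BigQuery Standard SQL. Its extra convenience is that it does not depend on the number or the names of the columns to unpivot.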

#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
  SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
)
-- [^,}] stops each value at the next comma or at the closing brace; Nr is returned as STRING
SELECT Device, Nr FROM raw t,
UNNEST(REGEXP_EXTRACT_ALL(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'":([^,}]*)')) Nr
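To make the trick easier to follow, here is a minimal sketch that exposes the intermediate JSON string and casts the extracted values back to integers. The CTE name as_json, the column name json_row, the reduced three-column sample, and the use of SAFE_CAST are illustrative assumptions, not part of the original answer.

```sql
#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3
),
as_json AS (
  -- Serialize every column except Device into one JSON object per row,
  -- e.g. {"col1":24,"col2":9,"col3":28}
  SELECT
    Device,
    TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))) AS json_row
  FROM raw t
)
SELECT
  Device,
  json_row,
  -- REGEXP_EXTRACT_ALL returns STRING values; SAFE_CAST yields NULL instead of an error
  SAFE_CAST(Nr AS INT64) AS Nr
FROM as_json,
UNNEST(REGEXP_EXTRACT_ALL(json_row, r'":([^,}]*)')) AS Nr
```

Because the values are pulled out of text with a regular expression, this approach suits numeric columns like the ones in the question; string values that themselves contain commas or braces would need a real JSON parser instead.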

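Update for the OP's comment: "I completely forgot to include in the requirements that the column name should also be added as a separate column."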

#standardSQL
-- raw is the same CTE as in the first query; quotes and braces are stripped before splitting
SELECT Device, SPLIT(pair, ':')[OFFSET(0)] AS col, SPLIT(pair, ':')[OFFSET(1)] AS Nr
FROM raw t,
UNNEST(SPLIT(REGEXP_REPLACE(TO_JSON_STRING((SELECT AS STRUCT * EXCEPT(Device) FROM UNNEST([t]))), r'["{}]', ''))) pair

Applied to the same sample data, the result is:

Row Device  col     Nr   
1   Desktop col1    24   
2   Desktop col2    9    
3   Desktop col3    28   
4   Desktop col4    7    
5   Desktop col5    98   
6   Desktop col6    77   
7   Desktop col7    59   
8   Mobile  col1    8    
9   Mobile  col2    43   
10  Mobile  col3    75   
11  Mobile  col4    44   
12  Mobile  col5    38   
13  Mobile  col6    31   
14  Mobile  col7    46   
15  Tablet  col1    7    
16  Tablet  col2    9    
17  Tablet  col3    34   
18  Tablet  col4    86   
19  Tablet  col5    62   
20  Tablet  col6    69   
21  Tablet  col7    74   

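【Comments】:

Thanks Mikhail, this is very handy when there are many columns. However, I completely forgot the requirement that the column names should also be added as a separate column, so that I know, for example, that the value 24 in the first row belongs to "col1". Is that possible as well?

So why did you accept the previous answer? In any case, post a new question or update this one with whatever you really need, and I will answer or update my answer accordingly. In the meantime, consider at least voting up.

Anyway, please see the update in my answer, and please don't forget to vote up.

Mikhail, to answer your question: I accepted the previous answer because it met my initial requirement. The extra requirement only came up later, when I first replied to your code. I will take your feedback into account and post a new thread next time. Anyway, thanks for the updated code, it works perfectly.

Sure, no problem, understood.

【Answer 2】: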

You can put the numeric columns into an ARRAY and use UNNEST:

with raw as (
SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
)
select Device,  Nr
from raw
left join UNNEST ([col1, col2, col3,col4,col5,col6,col7]) Nr
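When listing the columns explicitly is acceptable, as it is here, BigQuery's UNPIVOT operator (added to BigQuery after this thread was posted) is another option. The following is a minimal sketch on the same sample data, not part of the original answers; it keeps the column name in a separate column and preserves the INT64 type of the values.

```sql
#standardSQL
WITH raw AS (
  SELECT "Desktop" AS Device, 24 AS col1, 9 AS col2, 28 AS col3, 7 AS col4, 98 AS col5, 77 AS col6, 59 AS col7 UNION ALL
  SELECT "Mobile" AS Device, 8 AS col1, 43 AS col2, 75 AS col3, 44 AS col4, 38 AS col5, 31 AS col6, 46 AS col7 UNION ALL
  SELECT "Tablet" AS Device, 7 AS col1, 9 AS col2, 34 AS col3, 86 AS col4, 62 AS col5, 69 AS col6, 74 AS col7
)
SELECT Device, col, Nr
FROM raw
-- Nr receives the cell value, col receives the name of the source column
UNPIVOT (Nr FOR col IN (col1, col2, col3, col4, col5, col6, col7))
```

By default UNPIVOT drops rows whose value is NULL; adding INCLUDE NULLS after the UNPIVOT keyword keeps them. With around 100 columns the IN list gets long, which is the trade-off against the JSON-based approach that needs no column names.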
