Google BigQuery: prefix all columns of joined tables that have duplicate names
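In Google BigQuery (using #standardSQL), I need to apply a fixed prefix to all columns of each table when two tables are joined.
Here is the scenario. I have a structure like this: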
#standardSQL
WITH user AS (
SELECT "john" as name, "smith" as surname, 1 as parent
UNION ALL
SELECT "maggie" as name, "smith" as surname, 2 as parent
),
parent AS (
SELECT 1 as id, "john" as name, "doe" as surname
UNION ALL
SELECT 2 as id, "jane" as name, "smith" as surname
)
user table
+-----+--------+---------+--------+
| Row | name | surname | parent |
+-----+--------+---------+--------+
| 1 | john | smith | 1 |
| 2 | maggie | smith | 2 |
+-----+--------+---------+--------+
parent table
+-----+----+------+---------+
| Row | id | name | surname |
+-----+----+------+---------+
| 1 | 1 | john | doe |
| 2 | 2 | jane | smith |
+-----+----+------+---------+
A query like this:
SELECT u.*, p.* FROM user u JOIN parent p ON u.parent = p.id
produces the following error:
Error: Duplicate column names in the result are not supported. Found duplicate(s): name, surname
I would like to avoid aliasing every column by hand, like this:
SELECT
u.name as user_name,
u.surname as user_surname,
p.name as parent_name,
p.surname as parent_surname
FROM user u JOIN parent p ON u.parent = p.id
+-----+-----------+--------------+-------------+----------------+
| Row | user_name | user_surname | parent_name | parent_surname |
+-----+-----------+--------------+-------------+----------------+
| 1 | john | smith | john | doe |
| 2 | maggie | smith | jane | smith |
+-----+-----------+--------------+-------------+----------------+
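If the fields of a table change, I would have to edit the statement (or statements) every time to apply the prefix to the new fields. Hard-coding the column names is therefore not a suitable approach.
Is there a way, such as a query operator, to apply the prefix automatically and get the table shown above? Something like: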
SELECT u.* AS user_*, p.* AS parent_*
FROM user u JOIN parent p ON u.parent = p.id
Answer
So far, the only option I can think of is the following:
#standardSQL
WITH user AS (
SELECT "john" AS name, "smith" AS surname, 1 AS parent UNION ALL
SELECT "maggie" AS name, "smith" AS surname, 2 AS parent
), parent AS (
SELECT 1 AS id, "john" AS name, "doe" AS surname UNION ALL
SELECT 2 AS id, "jane" AS name, "smith" AS surname
)
SELECT user, parent
FROM user
JOIN parent
ON user.parent = parent.id
The result is:
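+-----+-----------+--------------+-------------+-----------+-------------+----------------+
| Row | user.name | user.surname | user.parent | parent.id | parent.name | parent.surname |
+-----+-----------+--------------+-------------+-----------+-------------+----------------+
| 1   | john      | smith        | 1           | 1         | john        | doe            |
| 2   | maggie    | smith        | 2           | 2         | jane        | smith          |
+-----+-----------+--------------+-------------+-----------+-------------+----------------+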
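This is not exactly what you asked for, but it is the closest I can get: the query wraps each row of the respective joined table in its own STRUCT. For example: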
{
"user": {"name": "john", "surname": "smith","parent": "1"},
"parent": {"id": "1","name": "john","surname": "doe"}
}
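To show how the STRUCT-wrapped result can be consumed downstream, here is a minimal sketch built on the same sample data. The `joined` CTE, the table aliases `u` and `p`, and the output names `child_name` and `parent_name` are illustrative assumptions, not part of the original answer.

```sql
#standardSQL
WITH user AS (
  SELECT "john" AS name, "smith" AS surname, 1 AS parent UNION ALL
  SELECT "maggie" AS name, "smith" AS surname, 2 AS parent
), parent AS (
  SELECT 1 AS id, "john" AS name, "doe" AS surname UNION ALL
  SELECT 2 AS id, "jane" AS name, "smith" AS surname
), joined AS (
  -- Selecting a table alias (rather than alias.*) yields one STRUCT per row.
  -- The aliases u and p are illustrative, not from the original query.
  SELECT u AS user, p AS parent
  FROM user AS u
  JOIN parent AS p
  ON u.parent = p.id
)
-- Fields are addressed as <struct>.<field>, so the duplicate names never clash.
SELECT
  user.name AS child_name,
  parent.name AS parent_name
FROM joined
WHERE parent.surname = "smith"
```

With the sample data this should keep only the maggie/jane row. Because each STRUCT is built from the whole row, a column added later to either table appears under `user` or `parent` without editing the join itself. Only the final SELECT names individual fields.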