UNION ALL or CONCATENATE Datasets in BigQuery
Posted: 2017-05-17 14:22:45
Question:
I am using the BigQuery console and need to combine 12 different datasets. They all hold the same information; only the dataset_id changes, because the date range is the same for every dataset.
I tried putting UNION ALL at the end of the first query, followed by the next query, but it does not work.
Error: SELECT list expression references hits.contentgroup.contentgroup2 which is neither grouped nor aggregated at [2:3]
Here is the query:
SELECT
hits.contentgroup.contentgroup2 CampaignGrouping,
custd.value member_PK,
'Web' Canal,
'ES' AS country_id,
SUM(hits.contentGroup.contentGroupUniqueViews2) VistasUnicas
FROM
`id_project.11773102.ga_sessions*`,
UNNEST(customdimensions) custd,
UNNEST(hits) AS hits
WHERE
1 = 1
AND PARSE_TIMESTAMP('%Y%m%d', REGEXP_EXTRACT(_table_suffix, r'.*_(.*)')) BETWEEN TIMESTAMP('2017-04-25') AND TIMESTAMP('2017-04-30')
AND custd.index=30
and hits.contentGroup.contentGroup2 <> '(not set)'
AND custd.value <> 'null'
AND hits.contentGroup.contentGroupUniqueViews2 IS NOT NULL
UNION ALL
SELECT
hits.contentgroup.contentgroup2 CampaignGrouping,
custd.value member_PK,
'Web' Canal,
'ES' AS country_id,
SUM(hits.contentGroup.contentGroupUniqueViews2) VistasUnicas
FROM
`id_project.11773102.ga_sessions*`,
UNNEST(customdimensions) custd,
UNNEST(hits) AS hits
WHERE
1 = 1
AND PARSE_TIMESTAMP('%Y%m%d', REGEXP_EXTRACT(_table_suffix, r'.*_(.*)')) BETWEEN TIMESTAMP('2017-04-25') AND TIMESTAMP('2017-04-30')
AND custd.index=30
and hits.contentGroup.contentGroup2 <> '(not set)'
AND custd.value <> 'null'
AND hits.contentGroup.contentGroupUniqueViews2 IS NOT NULL
GROUP BY
1, 2
ORDER BY 5 ASC
Thanks.
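Comments:
What error are you getting (if any)? Explain "does not work"; in other words, what exactly is not working?
Sorry @SloanThrasher, I have now included it in the question.
So, the error message seems pretty clear. Where does hits.contentgroup.contentgroup2 come from?
From the table I declared in the query. When I run the query without the union, it runs! I don't know why this happens...
Answer 1:
You need a GROUP BY in the first query of the union. Each SELECT in a UNION ALL is aggregated on its own, so the GROUP BY at the end of the statement applies only to the last SELECT, and the first one is left with SUM() next to ungrouped columns. For example: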
SELECT
hits.contentgroup.contentgroup2 CampaignGrouping,
custd.value member_PK,
'Web' Canal,
'ES' AS country_id,
SUM(hits.contentGroup.contentGroupUniqueViews2) VistasUnicas
FROM
`bigquery-aaaaa-162814.11773102.ga_sessions*`,
UNNEST(customdimensions) custd,
UNNEST(hits) AS hits
WHERE
1 = 1
AND PARSE_TIMESTAMP('%Y%m%d', REGEXP_EXTRACT(_table_suffix, r'.*_(.*)')) BETWEEN TIMESTAMP('2017-04-25') AND TIMESTAMP('2017-04-30')
AND custd.index=30
and hits.contentGroup.contentGroup2 <> '(not set)'
AND custd.value <> 'null'
AND hits.contentGroup.contentGroupUniqueViews2 IS NOT NULL
GROUP BY 1, 2
UNION ALL
SELECT ...
As a concrete example of UNION ALL and GROUP BY:
#standardSQL
WITH T AS (
SELECT 1 AS x, 'foo' AS y UNION ALL
SELECT 1, 'bar' UNION ALL
SELECT 2, 'foo'
)
SELECT x, STRING_AGG(y, ',') AS y
FROM T
GROUP BY x
UNION ALL
SELECT SUM(x), y
FROM T
GROUP BY y;
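When many datasets share one schema, an alternative worth considering is to concatenate the raw rows with UNION ALL inside a subquery and aggregate once on the outside, so that a single GROUP BY covers every branch. The following is only a minimal sketch with inline data; the names dataset_a, dataset_b, campaign, member, and views are hypothetical stand-ins and are not taken from the tables in the question.

```sql
#standardSQL
-- Hypothetical inline tables standing in for two datasets with the same schema.
WITH dataset_a AS (
  SELECT 'spring' AS campaign, 'm1' AS member, 2 AS views UNION ALL
  SELECT 'spring', 'm1', 3
),
dataset_b AS (
  SELECT 'spring' AS campaign, 'm1' AS member, 5 AS views UNION ALL
  SELECT 'summer', 'm2', 1
)
SELECT
  campaign,
  member,
  SUM(views) AS total_views
FROM (
  -- Concatenate the raw rows first; the branches themselves do not aggregate.
  SELECT campaign, member, views FROM dataset_a
  UNION ALL
  SELECT campaign, member, views FROM dataset_b
)
-- One GROUP BY over the combined rows.
GROUP BY campaign, member
ORDER BY total_views;
```

Note that this is not equivalent to grouping inside each branch: here rows with the same key from different datasets collapse into one row (spring/m1 totals 10 across both inline tables), whereas a GROUP BY in every branch returns one row per key per dataset. Which behavior is wanted depends on whether the 12 datasets should be totaled together or reported separately.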