Join table results Google BigQuery
【Title】: Join table results Google BigQuery 【Posted】: 2017-04-21 02:36:29 【Question】: I have two SQL queries:
SELECT subreddit, count(subreddit) as count
FROM [fh-bigquery:reddit_comments.all]
where author="***********" GROUP by subreddit ORDER BY count DESC;
and
SELECT subreddit, count(subreddit) as count
FROM [redditcollaborativefiltering:aggregate_comments.reddit_posts_all]
where author="***********" GROUP by subreddit ORDER BY count DESC;
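I would like to combine the results of these two queries into a single result with the same columns, but with the counts for each subreddit added together. Is there a simple way to do this?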
【Answer 1】: For BigQuery Legacy SQL (which I see you are using in your example), you can use the following:
#legacySQL
SELECT subreddit, SUM(cnt) as cnt
FROM (SELECT subreddit, COUNT(subreddit) as cnt
FROM [fh-bigquery:reddit_comments.all]
WHERE author = '***********'
GROUP BY subreddit
),
(SELECT subreddit, COUNT(subreddit) as cnt
FROM [redditcollaborativefiltering:aggregate_comments.reddit_posts_all]
WHERE author = '***********'
GROUP by subreddit
)
GROUP BY subreddit
ORDER BY cnt DESC
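As you can see here, a comma between tables or subqueries in Legacy SQL acts as UNION ALL.
Because both subqueries apply the same filter and grouping, the query above can be simplified further by unioning the two tables directly and counting once: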
#legacySQL
SELECT subreddit, COUNT(subreddit) as cnt
FROM [fh-bigquery:reddit_comments.all],
[redditcollaborativefiltering:aggregate_comments.reddit_posts_all]
WHERE author = '***********'
GROUP BY subreddit
ORDER BY cnt DESC
You can read more about Comma as UNION ALL in the BigQuery Legacy SQL documentation.
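The equivalence between the two forms above (summing per-table counts versus counting once over the unioned rows) can be checked locally. The following is only a sketch: it uses Python's built-in sqlite3 rather than BigQuery, the table names `comments` and `posts` and all rows are invented stand-ins for the two Reddit tables, and SQLite's explicit UNION ALL plays the role of the Legacy SQL comma.

```python
import sqlite3

# Hypothetical stand-ins for the two BigQuery tables; all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE comments (author TEXT, subreddit TEXT);
    CREATE TABLE posts (author TEXT, subreddit TEXT);
    INSERT INTO comments VALUES
        ('alice', 'python'), ('alice', 'python'), ('alice', 'sql'), ('bob', 'sql');
    INSERT INTO posts VALUES
        ('alice', 'python'), ('alice', 'bigquery'), ('bob', 'python');
""")

# Form 1: count per table, union the partial counts, then sum them.
sum_of_counts = conn.execute("""
    SELECT subreddit, SUM(cnt) AS cnt
    FROM (
        SELECT subreddit, COUNT(subreddit) AS cnt
        FROM comments WHERE author = 'alice' GROUP BY subreddit
        UNION ALL
        SELECT subreddit, COUNT(subreddit) AS cnt
        FROM posts WHERE author = 'alice' GROUP BY subreddit
    )
    GROUP BY subreddit
    ORDER BY cnt DESC, subreddit
""").fetchall()

# Form 2: union the raw rows first, then filter and count once.
count_of_union = conn.execute("""
    SELECT subreddit, COUNT(subreddit) AS cnt
    FROM (
        SELECT author, subreddit FROM comments
        UNION ALL
        SELECT author, subreddit FROM posts
    )
    WHERE author = 'alice'
    GROUP BY subreddit
    ORDER BY cnt DESC, subreddit
""").fetchall()

print(sum_of_counts)   # [('python', 3), ('bigquery', 1), ('sql', 1)]
print(count_of_union)  # same rows as above
```

Both queries return the same rows because COUNT is additive across a UNION ALL: no rows are deduplicated, so counting after the union equals adding the per-table counts.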
【Answer 2】: You can use UNION ALL and a second aggregation. UNION ALL is standard SQL syntax (Legacy SQL uses the comma instead), so the tables are referenced with backticks rather than square brackets:
#standardSQL
SELECT subreddit, SUM(cnt) AS cnt
FROM ((SELECT subreddit, COUNT(subreddit) AS cnt
FROM `fh-bigquery.reddit_comments.all`
WHERE author = '***********'
GROUP BY subreddit
) UNION ALL
(SELECT subreddit, COUNT(subreddit) AS cnt
FROM `redditcollaborativefiltering.aggregate_comments.reddit_posts_all`
WHERE author = '***********'
GROUP BY subreddit
)
) sc
GROUP BY subreddit
ORDER BY cnt DESC;