How to use GROUP BY command properly in Google Big Query?
【Posted】: 2015-11-23 09:57:22
【Question】: I am having trouble retrieving only the specific data I need. First, I don't know how to write a SQL query that returns the data in the output format I want (my current query can only fetch one user).
Second, I want to get the data for the 1 year before the current date. Below is the SQL query I have so far (I have to run it manually, one user at a time).
SELECT type, COUNT(*) FROM (
TABLE_DATE_RANGE([githubarchive:day.events_],
TIMESTAMP('2013-1-01'),
TIMESTAMP('2015-08-28')
)) AS events
WHERE type IN ("CommitCommentEvent","CreateEvent","DeleteEvent","DeploymentEvent","DeploymentStatusEvent","DownloadEvent","FollowEvent",
"ForkEvent","ForkApplyEvent","GistEvent","GollumEvent","IssueCommentEvent","IssuesEvent","MemberEvent","MembershipEvent","PageBuildEvent",
"PublicEvent","PullRequestEvent","PullRequestReviewCommentEvent","PushEvent","ReleaseEvent","RepositoryEvent","StatusEvent","TeamAddEvent",
"WatchEvent") AND actor.login = "datomnurdin"
GROUP BY type;
References:
https://www.githubarchive.org/
https://github.com/igrigorik/githubarchive.org
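【Comments】:
What is the problem?
I want to create a SQL query that generates the desired output above.
You already have a SQL query. Can you be specific about what your problem is?
It looks like you need to "pivot" your data. First, change your query so that actor.login is in your SELECT and GROUP BY rather than in the WHERE clause. Then search for "How to Pivot in Big Query" (sorry, I don't know the syntax myself). Hth
@Dato'MohammadNurdin Your query looks for a single actor.login, which is why it only returns one .... Remove the "AND actor.login =" condition and group by that field instead. Check my answer; it should be the query you want.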
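The suggestion in the comments (move actor.login out of WHERE and into SELECT and GROUP BY) can be tried on a toy table first. Below is a minimal Python sketch that uses the standard-library sqlite3 module instead of BigQuery; the table name `events`, the flat column `login` (standing in for actor.login), and the sample rows are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows (login, type); not taken from the GitHub Archive data.
ROWS = [
    ("alice", "PushEvent"),
    ("alice", "PushEvent"),
    ("alice", "WatchEvent"),
    ("bob", "ForkEvent"),
    ("bob", "PushEvent"),
]


def counts_per_user_and_type(rows):
    """Count events per (login, type) instead of filtering on a single login."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (login TEXT, type TEXT)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", rows)
    # login appears in SELECT and GROUP BY, not in WHERE, so every user is returned.
    result = conn.execute(
        "SELECT login, type, COUNT(*) AS n "
        "FROM events GROUP BY login, type ORDER BY login, type"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in counts_per_user_and_type(ROWS):
        print(row)
```

This returns one row per user and event type. Turning the event types into columns is the separate "pivot" step.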
【Answer 1】:
Here is how to pivot the data correctly:
SELECT actor.login,
ifnull(sum(if(type='CommitCommentEvent',1,null)),0) as CommitCommentEvent,
ifnull(sum(if(type='CreateEvent',1,null)),0) as CreateEvent,
ifnull(sum(if(type='DeleteEvent',1,null)),0) as DeleteEvent,
ifnull(sum(if(type='DeploymentEvent',1,null)),0) as DeploymentEvent,
ifnull(sum(if(type='DeploymentStatusEvent',1,null)),0) as DeploymentStatusEvent,
ifnull(sum(if(type='DownloadEvent',1,null)),0) as DownloadEvent,
ifnull(sum(if(type='FollowEvent',1,null)),0) as FollowEvent,
ifnull(sum(if(type='ForkEvent',1,null)),0) as ForkEvent,
ifnull(sum(if(type='ForkApplyEvent',1,null)),0) as ForkApplyEvent,
ifnull(sum(if(type='GistEvent',1,null)),0) as GistEvent,
ifnull(sum(if(type='GollumEvent',1,null)),0) as GollumEvent,
ifnull(sum(if(type='IssueCommentEvent',1,null)),0) as IssueCommentEvent,
ifnull(sum(if(type='IssuesEvent',1,null)),0) as IssuesEvent,
ifnull(sum(if(type='MemberEvent',1,null)),0) as MemberEvent,
ifnull(sum(if(type='MembershipEvent',1,null)),0) as MembershipEvent,
ifnull(sum(if(type='PageBuildEvent',1,null)),0) as PageBuildEvent,
ifnull(sum(if(type='PublicEvent',1,null)),0) as PublicEvent,
ifnull(sum(if(type='PullRequestEvent',1,null)),0) as PullRequestEvent,
ifnull(sum(if(type='PullRequestReviewCommentEvent',1,null)),0) as PullRequestReviewCommentEvent,
ifnull(sum(if(type='PushEvent',1,null)),0) as PushEvent,
ifnull(sum(if(type='ReleaseEvent',1,null)),0) as ReleaseEvent,
ifnull(sum(if(type='RepositoryEvent',1,null)),0) as RepositoryEvent,
ifnull(sum(if(type='StatusEvent',1,null)),0) as StatusEvent,
ifnull(sum(if(type='TeamAddEvent',1,null)),0) as TeamAddEvent,
ifnull(sum(if(type='WatchEvent',1,null)),0) as WatchEvent
FROM (
TABLE_DATE_RANGE([githubarchive:day.events_],
DATE_ADD(CURRENT_TIMESTAMP(), -1, "YEAR"),
CURRENT_TIMESTAMP()
)) AS events
WHERE type IN ("CommitCommentEvent","CreateEvent","DeleteEvent","DeploymentEvent","DeploymentStatusEvent","DownloadEvent","FollowEvent",
"ForkEvent","ForkApplyEvent","GistEvent","GollumEvent","IssueCommentEvent","IssuesEvent","MemberEvent","MembershipEvent","PageBuildEvent",
"PublicEvent","PullRequestEvent","PullRequestReviewCommentEvent","PushEvent","ReleaseEvent","RepositoryEvent","StatusEvent","TeamAddEvent",
"WatchEvent")
GROUP BY 1
limit 100
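Writing one pivot column per event type by hand is repetitive and easy to get wrong. As a rough sketch, the same legacy SQL text could be generated from a list of type names in Python. The helper name `build_pivot_query` and its parameters are hypothetical, and the script only builds a string; it does not call BigQuery.

```python
# Event type names copied from the query above.
EVENT_TYPES = [
    "CommitCommentEvent", "CreateEvent", "DeleteEvent", "DeploymentEvent",
    "DeploymentStatusEvent", "DownloadEvent", "FollowEvent", "ForkEvent",
    "ForkApplyEvent", "GistEvent", "GollumEvent", "IssueCommentEvent",
    "IssuesEvent", "MemberEvent", "MembershipEvent", "PageBuildEvent",
    "PublicEvent", "PullRequestEvent", "PullRequestReviewCommentEvent",
    "PushEvent", "ReleaseEvent", "RepositoryEvent", "StatusEvent",
    "TeamAddEvent", "WatchEvent",
]


def build_pivot_query(event_types, limit=100):
    """Build the pivot query text (hypothetical helper, for illustration only).

    The type names are interpolated directly into the SQL, so they must be
    trusted constants such as EVENT_TYPES, never user input.
    """
    # One conditional-sum column per event type.
    columns = ",\n".join(
        "  IFNULL(SUM(IF(type='{0}',1,NULL)),0) AS {0}".format(t)
        for t in event_types
    )
    type_list = ",".join('"{}"'.format(t) for t in event_types)
    return (
        "SELECT actor.login,\n"
        f"{columns}\n"
        "FROM (\n"
        "  TABLE_DATE_RANGE([githubarchive:day.events_],\n"
        '    DATE_ADD(CURRENT_TIMESTAMP(), -1, "YEAR"),\n'
        "    CURRENT_TIMESTAMP()\n"
        ")) AS events\n"
        f"WHERE type IN ({type_list})\n"
        "GROUP BY 1\n"
        f"LIMIT {limit}"
    )


if __name__ == "__main__":
    print(build_pivot_query(EVENT_TYPES))
```

The printed text can then be pasted into the BigQuery console with legacy SQL enabled.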
【Comments】:
How do I get the data for the 1 year before the current date and find only a specific location ("Malaysia")? I'm almost there...
That should go into a separate question, because I don't see any location data alongside these. AFAIK that is a privacy matter.
I see. What about the "date" part?