Recreate GA Funnel on BigQuery
[Title]: Recreate GA Funnel on BigQuery [Posted]: 2017-07-26 16:42:26 [Question]: I am trying to recreate a GA funnel (a custom report on Google360) using BigQuery. The funnel in GA uses the unique count of events fired on each page. I found this query online, and it mostly works:
SELECT
  COUNT(s0.firstHit) AS Landing_Page,
  COUNT(s1.firstHit) AS Model_Selection
FROM (
  SELECT
    s0.fullvisitorID,
    s0.firstHit,
    s1.firstHit
  FROM (
    # Begin Subquery #1 aka s0
    SELECT
      fullvisitorID,
      MIN(hits.hitNumber) AS firstHit
    FROM [64269470.ga_sessions_20170720]
    WHERE
      hits.eventInfo.eventAction IN ('landing_page')
      AND totals.visits = 1
    GROUP BY
      fullvisitorID
  ) s0
  # End Subquery #1 aka s0
  LEFT JOIN (
    # Begin Subquery #2 aka s1
    SELECT
      fullvisitorID,
      MIN(hits.hitNumber) AS firstHit
    FROM [64269470.ga_sessions_20170720]
    WHERE
      hits.eventInfo.eventAction IN ('model_selection_page')
      AND totals.visits = 1
    GROUP BY
      fullvisitorID
  ) s1
  ON
    s0.fullvisitorID = s1.fullvisitorID
)
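The query runs fine, and the Landing_Page value matches what I get in GA, but Model_Selection is about 10% higher. The gap also grows further down the funnel (for clarity, I posted only 2 steps). Any idea what I am missing here?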
[Answer 1]: This query does what you need, but written in Standard SQL:
#standardSQL
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection
FROM `64269470.ga_sessions_20170720`
That's it: 4 lines, and it is faster and cheaper.
You can also test it with mock data, for example:
#standardSQL
WITH data AS(
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('landing_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits UNION ALL
SELECT '1' AS fullvisitorid, ARRAY<STRUCT<eventInfo STRUCT<eventAction STRING > >> [STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo), STRUCT(STRUCT('model_selection_page' AS eventAction) AS eventInfo)] AS hits
)
SELECT
SUM((SELECT COUNTIF(eventInfo.eventAction = 'landing_page') FROM UNNEST(hits))) Landing_Page,
SUM((SELECT COUNTIF(eventInfo.eventAction = 'model_selection_page') FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page'))) Model_Selection
FROM data
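To make clear what the mock-data query above is counting, here is a minimal sketch that mirrors its logic in plain Python rather than SQL. The `sessions` list and the helper `event_level_funnel` are hypothetical names introduced only for illustration; each session is reduced to the list of its `eventInfo.eventAction` values.

```python
# Mock sessions mirroring the `data` CTE: one entry per row,
# with hits reduced to their event actions (illustrative structure).
sessions = [
    {"fullvisitorid": "1", "hits": ["landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "model_selection_page"]},
    {"fullvisitorid": "1", "hits": ["model_selection_page", "model_selection_page"]},
]


def event_level_funnel(sessions):
    """Count events the way the SUM(COUNTIF(...)) query does (hypothetical helper)."""
    landing = 0
    model_selection = 0
    for session in sessions:
        hits = session["hits"]
        # First column: every landing_page event in every session.
        landing += hits.count("landing_page")
        # Second column: the EXISTS clause keeps model_selection_page events
        # only for sessions with at least one landing_page event.
        # The order of the two events is not checked.
        if "landing_page" in hits:
            model_selection += hits.count("model_selection_page")
    return landing, model_selection


print(event_level_funnel(sessions))  # (4, 1)
```

The result is a count of total events, not of sessions or users: the second row contributes 2 to Landing_Page, and the fourth row contributes nothing to Model_Selection because it has no 'landing_page' event.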
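Note that building this type of report in GA can be a bit tricky, because you need to select visitors who fired the event 'landing_page' at least once and then fired the event 'model_selection_page'. Make sure you have built this report correctly in GA as well (one way might be to first build a custom report containing only customers who fired 'landing_page', and then apply a second filter for 'model_selection_page').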
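The note above talks about firing 'landing_page' and then 'model_selection_page', while the EXISTS check in the query only requires both events to appear in the same session, in any order. Whether the GA report enforces the order depends on how it was configured, so the following is only a sketch of the difference, in plain Python, assuming hits are listed in hitNumber order. The helper `reached_in_order` and the sample sessions are hypothetical.

```python
def reached_in_order(hits, first="landing_page", second="model_selection_page"):
    """Return True if `second` occurs after the first occurrence of `first`."""
    if first not in hits:
        return False
    first_index = hits.index(first)
    return second in hits[first_index + 1:]


# Illustrative sessions, each a list of event actions in hit order.
sessions = [
    ["landing_page", "model_selection_page"],  # both events, in funnel order
    ["model_selection_page", "landing_page"],  # both events, reversed
    ["model_selection_page"],                  # second step only
    ["landing_page"],                          # first step only
]

# Sessions that reach step 2 after step 1.
in_order = sum(reached_in_order(hits) for hits in sessions)
# Sessions that contain both events, regardless of order (what EXISTS checks).
any_order = sum(
    "landing_page" in hits and "model_selection_page" in hits for hits in sessions
)

print(in_order, any_order)  # 1 2
```

If the two approaches give different numbers on real data, the ordering condition is one candidate explanation for a mismatch with GA, but it is not the only possible one.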
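[Edit]:
You asked in the comments about counting at the session and user level. To count per session, you can limit the result of each subquery evaluation to 1, like so: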
SELECT
SUM((SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
SUM((SELECT 1 FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection
FROM data
For distinct user counts the idea is the same, but you have to apply a COUNT(DISTINCT) operation, like so:
SELECT
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page' LIMIT 1)) Landing_Page,
COUNT(DISTINCT(SELECT fullvisitorid FROM UNNEST(hits) WHERE EXISTS(SELECT 1 FROM UNNEST(hits) WHERE eventInfo.eventAction = 'landing_page') AND eventInfo.eventAction = 'model_selection_page' LIMIT 1)) Model_Selection
FROM data
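As a cross-check of the two queries above, this sketch reproduces the session-level and user-level counts in plain Python on the same mock data. The names `sessions`, `session_level_funnel`, and `user_level_funnel` are hypothetical and exist only for this illustration.

```python
# Mock sessions mirroring the `data` CTE (illustrative structure).
sessions = [
    {"fullvisitorid": "1", "hits": ["landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "landing_page"]},
    {"fullvisitorid": "1", "hits": ["landing_page", "model_selection_page"]},
    {"fullvisitorid": "1", "hits": ["model_selection_page", "model_selection_page"]},
]


def has_both(hits):
    """True if the session contains both funnel events, in any order."""
    return "landing_page" in hits and "model_selection_page" in hits


def session_level_funnel(sessions):
    """Each session counts at most once per step, like the LIMIT 1 subqueries."""
    landing = sum(1 for s in sessions if "landing_page" in s["hits"])
    model_selection = sum(1 for s in sessions if has_both(s["hits"]))
    return landing, model_selection


def user_level_funnel(sessions):
    """Each visitor counts at most once per step, like COUNT(DISTINCT ...)."""
    landing_users = {s["fullvisitorid"] for s in sessions if "landing_page" in s["hits"]}
    model_users = {s["fullvisitorid"] for s in sessions if has_both(s["hits"])}
    return len(landing_users), len(model_users)


print(session_level_funnel(sessions))  # (3, 1)
print(user_level_funnel(sessions))     # (1, 1)
```

On this mock data the same funnel yields (4, 1) for events, (3, 1) for sessions, and (1, 1) for users, which shows how much the chosen counting level changes the numbers.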
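[Comments]:
Hi William, thanks for your answer. That's an interesting approach you've used. Quick question: I was hoping to use this structure to distinguish between users and sessions. It looks like, in this case, what is being counted is total events. Thanks!
@Jacob I found your comment thanks to another question that referenced this one; sorry it took so long to reply. I edited my answer, and I hope it is what you were looking for. Let me know if it works :)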