Firebase Event Occurrences for New Installed Users in BigQuery

Posted: 2017-10-03 17:37:43

Problem description:

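Given each user's install date, I want to get (1) the Firebase event occurrences and (2) the distinct user count per event for Day 0 through Day 30, across all of our 200+ users. I mocked up the desired output table (for D0-D30) in a screenshot, but the code below only covers Day 0 through Day 7.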

(1) Event occurrences

SELECT
  event.name as event_name,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN event_count END) AS D0_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170802' AND _TABLE_SUFFIX < '20170803' THEN event_count END) AS D1_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170803' AND _TABLE_SUFFIX < '20170804' THEN event_count END) AS D2_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170804' AND _TABLE_SUFFIX < '20170805' THEN event_count END) AS D3_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170805' AND _TABLE_SUFFIX < '20170806' THEN event_count END) AS D4_USERS,
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170806' AND _TABLE_SUFFIX < '20170807' THEN event_count END) AS D5_USERS,  
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170807' AND _TABLE_SUFFIX < '20170808' THEN event_count END) AS D6_USERS,  
  COUNT(CASE WHEN _TABLE_SUFFIX >= '20170808' AND _TABLE_SUFFIX < '20170809' THEN event_count END) AS D7_USERS    
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY 1;

(2) Distinct user count per event

SELECT
  event.name as event_name,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN user_dim.app_info.app_instance_id END) AS D0_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170802' AND _TABLE_SUFFIX < '20170803' THEN user_dim.app_info.app_instance_id END) AS D1_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170803' AND _TABLE_SUFFIX < '20170804' THEN user_dim.app_info.app_instance_id END) AS D2_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170804' AND _TABLE_SUFFIX < '20170805' THEN user_dim.app_info.app_instance_id END) AS D3_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170805' AND _TABLE_SUFFIX < '20170806' THEN user_dim.app_info.app_instance_id END) AS D4_USERS,
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170806' AND _TABLE_SUFFIX < '20170807' THEN user_dim.app_info.app_instance_id END) AS D5_USERS,  
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170807' AND _TABLE_SUFFIX < '20170808' THEN user_dim.app_info.app_instance_id END) AS D6_USERS,  
  COUNT(DISTINCT CASE WHEN _TABLE_SUFFIX >= '20170808' AND _TABLE_SUFFIX < '20170809' THEN user_dim.app_info.app_instance_id END) AS D7_USERS    
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809'
  AND user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY 1;

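Question:

Is there a more optimal way to write this? It makes sense for a small number of columns (D0-D7), but for D0-D30 I suspect there is a better approach. Any suggestions are much appreciated!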

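Final answer after Mikhail's feedback:

I merged the two queries into a single query and then built a pivot table from the result. Remember to select "Standard SQL" in the BigQuery editor before running it.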

SELECT
  event.name AS event_name,
  _TABLE_SUFFIX as day,
  COUNT(1) as event_occurrences,
  COUNT(DISTINCT user_dim.app_info.app_instance_id) as event_unique_users
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170901' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY event_name, day
ORDER BY event_name;

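Appendix notes:

Timestamp conversion for August 1, 2017

Epoch timestamp: 1501545600
Timestamp in milliseconds: 1501545600000

Timestamp conversion for August 2, 2017

Epoch timestamp: 1501632000
Timestamp in milliseconds: 1501632000000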
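
The queries above filter on user_dim.first_open_timestamp_micros, which is in microseconds, so the literals in the WHERE clause are the epoch seconds above multiplied by 1,000,000. A minimal Python sketch of that conversion, assuming the install window is defined at midnight UTC (the helper name epoch_units is hypothetical):

```python
from datetime import datetime, timezone

def epoch_units(year, month, day):
    """Return (seconds, milliseconds, microseconds) since the Unix epoch
    for midnight UTC on the given date."""
    midnight = datetime(year, month, day, tzinfo=timezone.utc)
    seconds = int(midnight.timestamp())
    return seconds, seconds * 1_000, seconds * 1_000_000

aug1 = epoch_units(2017, 8, 1)
aug2 = epoch_units(2017, 8, 2)
print(aug1)  # seconds, milliseconds, microseconds for 2017-08-01
print(aug2)  # seconds, milliseconds, microseconds for 2017-08-02
```

Note that BETWEEN is inclusive at both ends, so the filter as written also admits a first open at exactly midnight on August 2. A half-open comparison (>= lower bound AND < upper bound) would exclude it.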

Answer 1:

Is there a more optimal way to write this?

1. One way to optimize this is to rewrite the line below

COUNT(CASE WHEN _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170802' THEN event_count END) AS D0_USERS

as this

COUNTIF(_TABLE_SUFFIX = '20170801') AS D0_USERS

:o( For the D0-D30 case you still need to write this line 31 times, but at least each line is much lighter.
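
If the wide layout is still wanted, the 31 near-identical lines do not have to be typed by hand. The following is a rough sketch that generates them as text to paste into the SELECT list. The helper name countif_columns is made up for illustration, and it assumes the install day is 2017-08-01 as in the question:

```python
from datetime import date, timedelta

def countif_columns(install_day, last_offset):
    """Build one COUNTIF line per day offset, D0 through D<last_offset>."""
    lines = []
    for offset in range(last_offset + 1):
        # _TABLE_SUFFIX of the daily table that holds this day's events
        suffix = (install_day + timedelta(days=offset)).strftime("%Y%m%d")
        lines.append(f"  COUNTIF(_TABLE_SUFFIX = '{suffix}') AS D{offset}_USERS")
    return ",\n".join(lines)

# Prints the 31 column expressions for D0 through D30.
print(countif_columns(date(2017, 8, 1), 30))
```

The _TABLE_SUFFIX range in the WHERE clause would also need widening to cover the last generated day.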

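2. Another (and the right) way is to follow best practice and separate data retrieval from data presentation.

So you can run the following to retrieve the data you need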

#standardSQL
SELECT
  event.name AS event_name,
  _TABLE_SUFFIX as day,
  COUNT(1) as users
FROM `<<project-id>>.app_events_*`, UNNEST(event_dim) AS event
WHERE
  _TABLE_SUFFIX >= '20170801' AND _TABLE_SUFFIX < '20170809' AND
  user_dim.first_open_timestamp_micros BETWEEN 1501545600000000 AND 1501632000000000
GROUP BY event_name, day   

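Then you can pivot this result with whatever tool you like.

For example, with BigQuery Mate you can get a pivot without leaving the UI.

Quick disclosure: I am the author of the BigQuery Mate Chrome extension.

Please note: I did not adjust or change your query logic. I only answered your specific question: is there a more optimal way to write this?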
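
For readers who prefer to pivot outside the BigQuery UI, here is a minimal sketch in plain Python. It assumes the query result has already been fetched as (event_name, day, users) tuples. The function name pivot_by_day and the sample rows are invented for illustration and are not real query output:

```python
from collections import defaultdict
from datetime import datetime

def pivot_by_day(rows, install_day):
    """Turn (event_name, day, users) rows into {event_name: {"D<n>": users}},
    where n is the number of days between install_day and day."""
    start = datetime.strptime(install_day, "%Y%m%d")
    table = defaultdict(dict)
    for event_name, day, users in rows:
        offset = (datetime.strptime(day, "%Y%m%d") - start).days
        table[event_name][f"D{offset}"] = users
    return dict(table)

# Made-up rows for illustration only.
sample_rows = [
    ("first_open", "20170801", 120),
    ("first_open", "20170802", 3),
    ("level_up", "20170801", 40),
    ("level_up", "20170803", 25),
]

for event_name, days in sorted(pivot_by_day(sample_rows, "20170801").items()):
    print(event_name, days)
```

Days with no rows for an event are simply absent from that event's dictionary, so a caller that wants a dense D0-D30 grid would fill the missing keys with 0.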

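Comments:

Super! Thanks again, Mikhail. I've made a note to download Mate and use it! P.S.: the "user progression model" I have been working on above essentially evolved from user retention (***.com/questions/46767982/…). I thought it would be good to share it here, in the hope that more people can benefit from it :)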
