BigQuery event analytics: join in a subselect statement

【Title】: Bigquery event analytics join in subselect statement 【Posted】: 2016-07-11 19:50:08 【Question】:

I'm trying to write a BigQuery query that returns the number of events that occurred in each session. I've been following this article:

http://developer.streak.com/2013/11/using-google-bigquery-for-event-tracking.html

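The schema is very simple: [sessionId, eventType, createdAt]. The result set I want resembles an event workflow in Google Analytics, something like [sessionId, num_event1, num_event2, ...].

The approach is to build one subquery per event type, each carrying its timestamp, and then create additional subqueries that join the results of the event subqueries together. I can run the step1, step2 and step3 subqueries on their own: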

SELECT COUNT(first_event_timestamp) AS number_first_events,
       COUNT(second_event_timestamp) AS number_second_events,
       COUNT(third_event_timestamp) AS number_third_events
FROM

(SELECT eventUid AS eventUid1,
        createdAt AS timestamp1
 FROM [events_table]
 WHERE eventType = 'first-event') step1,

 (SELECT eventUid AS eventUid2,
        createdAt AS timestamp2
  FROM [events_table]
  WHERE eventType = 'second-event') step2,

 (SELECT
    eventUid as sessionId3,
    createdAt as timestamp3         
  FROM
    [events_table]         
  WHERE
    eventType = "third_event") step3

Adding steps1_2 and steps1_2_3 is where I hit a wall. I get an error saying the table is missing a dataset name. Here is the full query:

SELECT COUNT(first_event_timestamp) AS num_first,
       COUNT(second_event_timestamp) AS num_second,
       COUNT(third_event_timestamp) AS num_third
FROM (SELECT
             sessionId
             first_event_timestamp,
             second_event_timestamp,
             third_event_timestamp
      FROM steps1_2_3
      GROUP BY sessionId),

      (SELECT
            sessionId AS sessionId1,
            createdAt AS timestamp1
         FROM
            [events_table]
         WHERE
            eventType = "first_event") step1,

      (SELECT
            eventUid AS sessionId2,
            createdAt AS timestamp2
         FROM
            [events_table]
         WHERE
            eventType = "second_event") step2,

      (SELECT
            eventUid AS sessionId3,
            createdAt AS timestamp3
         FROM
            [events_table]
         WHERE
            eventType = "third_Event") step3,

      (SELECT sessionId1,
              timestamp1,
              IF(timestamp1 < timestamp2, timestamp2, NULL) AS timestamp2
         FROM
            (SELECT sessionId1,
                    timestamp1,
                    timestamp2
               FROM step1
               LEFT JOIN step2
               ON sessionId1 = sessionId2)) steps1_2,

      (SELECT sessionId1 as sessionId,
              timestamp1 as first_event_timestamp,
              timestamp2 as second_event_timestamp,
              IF(timestamp2 < timestamp3, timestamp3, NULL) as third_event_timestamp
         FROM
            (SELECT sessionId2,
                    timestamp2,
                    timestamp3
               FROM steps1_2
               LEFT JOIN step3
               ON sessionId1 = sessionId3)) steps1_2_3

The ideal result set would look like this:

sessionId  num_first_event  num_second_event  num_third_event
S1         1                null              null
S2         2                3                 null
S3         4                5                 6

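My first question: is it possible to join the subqueries steps1_2 and steps1_2_3 at all?

Second, is there an alternative way to implement this kind of event workflow in BigQuery, other than counting timestamps?

Any tips or pointers to documentation are much appreciated. Thank you for your time and consideration.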

【Question comments】:

【Answer 1】:

How about:

SELECT
  sessionId,
  SUM(eventType = 'first-event') AS number_first_events,
  SUM(eventType = 'second-event') AS number_second_events,
  SUM(eventType = 'third-event') AS number_third_events
FROM [events_table]
GROUP BY sessionId
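
As a rough way to try this idea without a BigQuery project, the sketch below runs the same per-session conditional count against an in-memory SQLite database from Python. The sample rows are made up, and SQLite is only a stand-in here. The counts are written as `SUM(CASE WHEN ... THEN 1 ELSE 0 END)` because that form is portable, whereas `SUM(eventType = '...')` relies on the comparison result being summed as 0 or 1.

```python
import sqlite3

# Hypothetical sample rows: (sessionId, eventType, createdAt).
rows = [
    ("S1", "first-event", 1),
    ("S2", "first-event", 1),
    ("S2", "first-event", 2),
    ("S2", "second-event", 3),
    ("S3", "first-event", 1),
    ("S3", "second-event", 2),
    ("S3", "third-event", 3),
    ("S3", "third-event", 4),
]

# SQLite stands in for BigQuery; the table name mirrors the thread.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events_table (sessionId TEXT, eventType TEXT, createdAt INTEGER)"
)
conn.executemany("INSERT INTO events_table VALUES (?, ?, ?)", rows)

# One pass over the table: each SUM counts only rows of one event type.
query = """
SELECT
  sessionId,
  SUM(CASE WHEN eventType = 'first-event'  THEN 1 ELSE 0 END) AS number_first_events,
  SUM(CASE WHEN eventType = 'second-event' THEN 1 ELSE 0 END) AS number_second_events,
  SUM(CASE WHEN eventType = 'third-event'  THEN 1 ELSE 0 END) AS number_third_events
FROM events_table
GROUP BY sessionId
ORDER BY sessionId
"""
counts = conn.execute(query).fetchall()

for row in counts:
    print(row)
```

With this sample data the rows come out as ('S1', 1, 0, 0), ('S2', 2, 1, 0) and ('S3', 1, 1, 2). If empty cells are preferred over zeros, as in the ideal result set in the question, each sum could be wrapped in NULLIF(..., 0).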

【Comments】:

Did you get a chance to try this? Or are you still having problems?
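
On the first question in the thread: in legacy BigQuery SQL, an alias given to a subquery in FROM is not visible as a table name inside a sibling subquery, so `FROM step1` is read as a reference to a real table, which would be consistent with the missing dataset name error. The comma between subqueries in legacy SQL also means UNION ALL rather than a join. One way to keep the step-by-step structure is to name each step with a WITH clause, which BigQuery standard SQL supports. The sketch below shows that shape, again using Python's sqlite3 as a stand-in and made-up rows. It assumes every step keys on sessionId, and it uses CASE where the question uses IF.

```python
import sqlite3

# Hypothetical sample rows: (sessionId, eventType, createdAt).
rows = [
    ("S1", "first-event", 10),
    ("S2", "first-event", 10),
    ("S2", "second-event", 20),
    ("S3", "first-event", 10),
    ("S3", "second-event", 20),
    ("S3", "third-event", 30),
    ("S4", "second-event", 5),   # happens before the first event
    ("S4", "first-event", 10),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events_table (sessionId TEXT, eventType TEXT, createdAt INTEGER)"
)
conn.executemany("INSERT INTO events_table VALUES (?, ?, ?)", rows)

# Each named step can be referenced by the steps defined after it.
query = """
WITH
step1 AS (SELECT sessionId AS sessionId1, createdAt AS timestamp1
          FROM events_table WHERE eventType = 'first-event'),
step2 AS (SELECT sessionId AS sessionId2, createdAt AS timestamp2
          FROM events_table WHERE eventType = 'second-event'),
step3 AS (SELECT sessionId AS sessionId3, createdAt AS timestamp3
          FROM events_table WHERE eventType = 'third-event'),
steps1_2 AS (
  -- Keep the second timestamp only when it follows the first one.
  SELECT sessionId1,
         timestamp1,
         CASE WHEN timestamp1 < timestamp2 THEN timestamp2 END AS timestamp2
  FROM step1
  LEFT JOIN step2 ON sessionId1 = sessionId2),
steps1_2_3 AS (
  -- Keep the third timestamp only when it follows a kept second one.
  SELECT sessionId1 AS sessionId,
         timestamp1 AS first_event_timestamp,
         timestamp2 AS second_event_timestamp,
         CASE WHEN timestamp2 < timestamp3 THEN timestamp3 END AS third_event_timestamp
  FROM steps1_2
  LEFT JOIN step3 ON sessionId1 = sessionId3)
SELECT COUNT(first_event_timestamp),
       COUNT(second_event_timestamp),
       COUNT(third_event_timestamp)
FROM steps1_2_3
"""
funnel = conn.execute(query).fetchone()
print(funnel)
```

For these rows the result is (4, 2, 1): session S4 does not count toward the second step because its second event precedes its first. Note that a session with several events of one type yields one row per matching pair in each LEFT JOIN, so the totals count pairs rather than sessions unless the steps are deduplicated first.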
