Sessions mismatch between Google Analytics and BigQuery while unnesting hits and hits.product together

Posted: 2019-11-26 00:11:27

Question:

When I run the following query, I see a 15% discrepancy between the Google Analytics data and the BigQuery data:

SELECT
  SUM(Sessions) AS Sessions
FROM (
  SELECT
    PARSE_DATE("%Y%m%d",
      date) AS DATE,
    COUNT(DISTINCT CONCAT(fullVisitorId,"-",CAST(visitStartTime AS STRING))) AS Sessions,
    (COUNT(DISTINCT
        CASE
          WHEN totals.bounces = 1 THEN CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))
          ELSE NULL
        END ) / COUNT(DISTINCT CONCAT(fullVisitorId, CAST(visitStartTime AS STRING))))*100 AS Bounce_Rate,
    COUNT(DISTINCT hits.transaction.transactionId) AS Transactions,
    SUM(hits.transaction.transactionRevenue)/1000000 AS Revenue,
    SUM(p.productRevenue)/1000000 AS Product_Revenue,
    (COUNT(DISTINCT hits.transaction.transactionId) / COUNT(DISTINCT CONCAT(CAST(fullVisitorId AS STRING), CAST(visitStartTime AS STRING))))*100 AS Ecommerce_Conversion_Rate,
    (SUM(hits.transaction.transactionRevenue)/1000000)/COUNT(DISTINCT hits.transaction.transactionId) AS Avg_Order_Value,
    SUM(hits.item.itemQuantity) / COUNT(hits.transaction.transactionId) AS Avg_Quantity,
    device.deviceCategory AS DeviceCategory,
    channelGrouping AS DefaultChannelGrouping,
    CONCAT(trafficSource.source," / ",trafficSource.medium) AS Source_Medium
  FROM
    `[Project_ID].[Dataset].ga_sessions_2019*`,
    UNNEST(hits) AS hits,
    UNNEST(hits.product) AS p
  GROUP BY
    DATE,
    DeviceCategory,
    DefaultChannelGrouping,
    Source_Medium )
WHERE
  DATE BETWEEN "2019-11-17"
  AND "2019-11-23"

However, when I remove UNNEST(hits.product) AS p, the discrepancy is smaller. I would like to know how to UNNEST hits and hits.product together.

Answer 1:

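You are cross joining with the product array. If a hit's product array is missing or empty, the cross join produces no rows for that hit. That effectively drops the whole hit, and sometimes the whole session (when its only hit carries no product information). Use a LEFT JOIN with the product array instead, so that hits and sessions are not dropped.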

FROM `[Project_ID].[Dataset].ga_sessions_2019*` AS t
  CROSS JOIN UNNEST(t.hits) AS h
  LEFT JOIN UNNEST(h.product) AS p

Or, in short:

FROM `[Project_ID].[Dataset].ga_sessions_2019*` AS t, t.hits h LEFT JOIN h.product p
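
To see the effect in isolation, here is a minimal sketch that runs against made-up inline data rather than the real export. The `sessions` CTE, its two visitors, the timestamps, and the `productSKU` value are assumptions for illustration only; only the two join shapes come from the answer. Session B has a single hit with an empty product array, so the cross join should count 1 session while the LEFT JOIN should count 2.

```sql
-- Hypothetical inline data: two sessions, one hit each.
-- Session A's hit has one product; session B's hit has an empty product array.
WITH sessions AS (
  SELECT
    'A' AS fullVisitorId,
    1574000000 AS visitStartTime,
    [STRUCT(1 AS hitNumber,
            [STRUCT('sku1' AS productSKU)] AS product)] AS hits
  UNION ALL
  SELECT
    'B',
    1574000100,
    [STRUCT(1 AS hitNumber,
            ARRAY<STRUCT<productSKU STRING>>[] AS product)]
)

-- Cross join: the hit without products yields no rows, so session B disappears.
SELECT
  'CROSS JOIN' AS join_type,
  COUNT(DISTINCT CONCAT(fullVisitorId, '-', CAST(visitStartTime AS STRING))) AS Sessions
FROM sessions AS t
  CROSS JOIN UNNEST(t.hits) AS h
  CROSS JOIN UNNEST(h.product) AS p

UNION ALL

-- Left join: the hit is kept with p set to NULL, so session B is still counted.
SELECT
  'LEFT JOIN' AS join_type,
  COUNT(DISTINCT CONCAT(fullVisitorId, '-', CAST(visitStartTime AS STRING))) AS Sessions
FROM sessions AS t
  CROSS JOIN UNNEST(t.hits) AS h
  LEFT JOIN UNNEST(h.product) AS p
```

Keep in mind that the LEFT JOIN still returns one row per product, so hit-level sums such as SUM(hits.transaction.transactionRevenue) may be added once per product row. The COUNT(DISTINCT ...) metrics are not affected by that fan-out.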
