BigQuery LEFT JOIN is doubling-up values
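Asked: 2018-07-10 19:29:14
Question: I am trying to combine two datasets, one of sales targets and one of actual sales, by day and market (US/UK).
To do this, I use a third table that uses GENERATE_DATE_ARRAY to create a master list of the dates to report on, so that I don't get gaps on days where no target was set and no sales were reported.
I found that my sales were being counted twice, so I have reduced my data and query to a reproducible case: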
#standardSQL
WITH dates AS (
  SELECT day FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', '2018-07-09', INTERVAL 1 DAY)) AS day
),
targets AS (
  SELECT DATE '2018-07-06' AS day, 'UK' AS Market, NUMERIC '2.4' AS quantity
  UNION ALL SELECT '2018-07-06', "US", 8.4
  UNION ALL SELECT '2018-07-06', "US", 1.2
  UNION ALL SELECT '2018-07-08', "UK", 3.0
  UNION ALL SELECT '2018-07-08', "US", 10.9
),
sales AS (
  SELECT DATE '2018-07-08' AS day, 'UK' AS Market, 4 AS quantity
  UNION ALL SELECT '2018-07-06', 'US', 15
)
SELECT
  dates.day AS day,
  targets.market AS market,
  SUM(targets.quantity) AS targetQuantity,
  SUM(sales.quantity) AS quantity
FROM dates
LEFT JOIN targets
  ON dates.day = CAST(targets.day AS DATE)
LEFT JOIN sales
  ON dates.day = CAST(sales.day AS DATE) AND targets.market = sales.market
GROUP BY day, market
ORDER BY day, market
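In the results, the sales quantity reported for July 6 (row 3) is 30, even though the data has 15.
This happens when the targets data has two rows for that date and market, but I don't know how to code for this.
Thanks for your help!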
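The doubling can be made visible by running the same kind of join without SUM and GROUP BY. The following is only a minimal sketch: it keeps just the two US target rows and the one US sales row for 2018-07-06 from the sample data, and joins targets to sales directly (the dates table is left out for brevity).

```sql
#standardSQL
-- Sketch: sample rows copied from the question, restricted to 2018-07-06 / US.
WITH targets AS (
  SELECT DATE '2018-07-06' AS day, 'US' AS market, NUMERIC '8.4' AS quantity
  UNION ALL SELECT DATE '2018-07-06', 'US', NUMERIC '1.2'
),
sales AS (
  SELECT DATE '2018-07-06' AS day, 'US' AS market, 15 AS quantity
)
-- No aggregation here, so every joined row is returned as-is.
SELECT
  targets.day AS day,
  targets.market AS market,
  targets.quantity AS targetQuantity,
  sales.quantity AS quantity
FROM targets
LEFT JOIN sales
  ON targets.day = sales.day AND targets.market = sales.market
ORDER BY targetQuantity
```

The single sales row matches both target rows, so the value 15 appears once per target row. Summing that column after the join adds it twice, which is where the 30 comes from.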
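Answer 1: The query below should work. The idea is to pre-aggregate the sales and targets tables so that each has at most one row per day and market, which avoids the duplication.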
#standardSQL
WITH dates AS (
  SELECT day FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2018-07-05', '2018-07-09', INTERVAL 1 DAY)) AS day
), targets AS (
  SELECT DATE '2018-07-06' AS day, 'UK' AS Market, NUMERIC '2.4' AS quantity
  UNION ALL SELECT '2018-07-06', "US", 8.4
  UNION ALL SELECT '2018-07-06', "US", 1.2
  UNION ALL SELECT '2018-07-08', "UK", 3.0
  UNION ALL SELECT '2018-07-08', "US", 10.9
), sales AS (
  SELECT DATE '2018-07-08' AS day, 'UK' AS Market, 4 AS quantity
  UNION ALL SELECT '2018-07-06', 'US', 15
)
SELECT
  dates.day AS day,
  t.market AS market,
  targetQuantity,
  quantity
FROM dates
LEFT JOIN (SELECT day, market, SUM(quantity) AS targetQuantity FROM targets GROUP BY day, market) t
  ON dates.day = CAST(t.day AS DATE)
LEFT JOIN (SELECT day, market, SUM(quantity) AS quantity FROM sales GROUP BY day, market) s
  ON dates.day = CAST(s.day AS DATE) AND t.market = s.market
ORDER BY day, market
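Pre-aggregation matters whenever a joined table can hold more than one row per join key. As a rough check, assuming a targets table shaped like the sample data in the question, a query along these lines lists the day and market pairs that would multiply rows in a join. The column name row_count is just an illustrative choice.

```sql
#standardSQL
-- Sketch: the targets sample rows from the question, with explicit types.
WITH targets AS (
  SELECT DATE '2018-07-06' AS day, 'UK' AS market, NUMERIC '2.4' AS quantity
  UNION ALL SELECT DATE '2018-07-06', 'US', NUMERIC '8.4'
  UNION ALL SELECT DATE '2018-07-06', 'US', NUMERIC '1.2'
  UNION ALL SELECT DATE '2018-07-08', 'UK', NUMERIC '3.0'
  UNION ALL SELECT DATE '2018-07-08', 'US', NUMERIC '10.9'
)
-- Keep only the join keys that occur more than once.
SELECT day, market, COUNT(*) AS row_count
FROM targets
GROUP BY day, market
HAVING COUNT(*) > 1
```

With the sample data, the only key that occurs more than once is 2018-07-06 / US, which is the row where the sales figure was doubled.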