Google BigQuery - Subtract SUMs of a column based on values in another column
【Posted】: 2021-03-17 04:57:57
【Question】: Hello, I need a single query that returns the top 10 country values with the largest [total(import) - total(export)] for the goods_type medicines between 2019 and 2020.
A sample of the data:
year | trading_type | country | goods_type | amount
2020 | import       | ABC     | medicines  | 12345.67
2017 | import       | ABC     | medicines  | null
2019 | export       | DEF     | foods      | 987.65
2018 | export       | ABC     | foods      | 2345.6
2016 | export       | DEF     | medicines  | 120.3
2019 | export       | ABC     | medicines  | 345.67
2020 | import       | DEF     | foods      | 321.04
...  | ...          | ...     | ...        | ...
The returned data should include country, goods_type, and the value of [total(import) - total(export)].
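I came up with the query below, but I don't know whether it is right or wrong, and I am struggling to extend it to return the other columns. The Google BigQuery console reports an error along the lines of: select expression column ... not grouped or aggregated...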
SELECT country, year FROM `trading_records` T  -- <--- error here for the year
WHERE
  T.product_type = 'medicines' AND
  (T.year = 2019 OR T.year = 2020)
GROUP BY T.country
ORDER BY (
  (SELECT SUM(amount) FROM `trading_records`
   WHERE trading_type = 'import' AND country = T.country)
  -
  (SELECT SUM(amount) FROM `trading_records`
   WHERE trading_type = 'export' AND country = T.country)
) DESC
LIMIT 10;
Thanks for your help!
【Comments on the question】:
【Answer 1】: You can express this as a single query with GROUP BY, filtering, and conditional aggregation:
SELECT country,
       -- imports count as positive, exports as negative; SUM skips NULL amounts
       SUM(CASE WHEN trading_type = 'import' THEN amount ELSE -amount END) AS total
FROM data
WHERE trading_type IN ('import', 'export') AND
      goods_type = 'medicines' AND
      year >= 2019 AND
      year <= 2020
GROUP BY country
ORDER BY total DESC
LIMIT 10;
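Note that this does not include year in the SELECT, because it is aggregated away: each output row combines the 2019 and 2020 rows for a country, so there is no single year to show.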
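To check the conditional aggregation against the sample rows from the question, here is a minimal sketch that runs the same query locally. It assumes Python's built-in sqlite3 module as a stand-in for BigQuery (the CASE expression used here is standard SQL, but the two engines differ in other respects), and the helper name top_net_importers is hypothetical.

```python
import sqlite3

# Sample rows copied from the question; None stands in for the NULL amount.
ROWS = [
    (2020, "import", "ABC", "medicines", 12345.67),
    (2017, "import", "ABC", "medicines", None),
    (2019, "export", "DEF", "foods", 987.65),
    (2018, "export", "ABC", "foods", 2345.6),
    (2016, "export", "DEF", "medicines", 120.3),
    (2019, "export", "ABC", "medicines", 345.67),
    (2020, "import", "DEF", "foods", 321.04),
]

# Same shape as the query in the answer: imports add, exports subtract.
QUERY = """
SELECT country,
       SUM(CASE WHEN trading_type = 'import' THEN amount ELSE -amount END) AS total
FROM data
WHERE trading_type IN ('import', 'export')
  AND goods_type = 'medicines'
  AND year >= 2019
  AND year <= 2020
GROUP BY country
ORDER BY total DESC
LIMIT 10
"""


def top_net_importers(rows):
    """Load rows into an in-memory SQLite table (assumption: not BigQuery) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE data (year INTEGER, trading_type TEXT, "
            "country TEXT, goods_type TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO data VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for country, total in top_net_importers(ROWS):
        print(country, round(total, 2))
```

With these seven rows only ABC has medicines records in 2019 or 2020, so a single row comes back, with a total of roughly 12345.67 - 345.67.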
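【Comments】:
Thank you! Your query is very concise and effective.
【Answer 2】: I'm sure there are other ways, but this query does what you are asking for: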
WITH data AS (
  -- inline sample data so the query can be run without a real table
  SELECT 2020 AS year, "import" AS trading_type, "ABC" AS country,
         "medicines" AS goods_type, 12345.67 AS amount UNION ALL
  SELECT 2019, "import", "ABC", "medicines", null UNION ALL
  SELECT 2019, "export", "DEF", "foods", 987.65 UNION ALL
  SELECT 2018, "export", "ABC", "foods", 2345.6 UNION ALL
  SELECT 2016, "export", "DEF", "medicines", 120.3 UNION ALL
  SELECT 2019, "export", "ABC", "medicines", 345.67 UNION ALL
  SELECT 2020, "import", "DEF", "foods", 321.04)
, agg_data AS (
  -- sign each amount: imports positive, exports negative
  SELECT year,
         country,
         IF(trading_type = "import", amount, amount * -1) AS total
  FROM data
  WHERE goods_type = "medicines" AND year IN (2019, 2020)
)
SELECT country, SUM(total) AS total
FROM agg_data
GROUP BY country
ORDER BY total DESC  -- without this, LIMIT returns an arbitrary row, not the largest
LIMIT 1
You should change the final LIMIT 1 to LIMIT 10 to get the top 10.
Result: ABC 12000.0
【Comments】:
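【Answer 3】: You can try a conditional sum, for example SUM(IF(condition, true_value, false_value)). This evaluates the condition for each row. If it is true, true_value (the amount in this example) is included in the SUM; if it is false, false_value (0 here) is added instead.
This will give you what you want: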
SELECT country, goods_type,
       SUM(IF(trading_type = 'import', amount, 0))
         - SUM(IF(trading_type = 'export', amount, 0)) AS import_minus_export
FROM `trading_records`
WHERE goods_type = 'medicines' AND year IN (2019, 2020)
GROUP BY 1, 2
-- order by the difference; ORDER BY 2 would sort by goods_type instead
ORDER BY import_minus_export DESC
LIMIT 10
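To make the two conditional sums concrete, here is a rough plain-Python sketch of the same logic: filter, group by (country, goods_type), accumulate imports and exports separately, then rank by the difference. The function name import_minus_export and its parameters are hypothetical, the rows are the sample from the question, and NULL handling is simplified (a NULL amount simply contributes nothing).

```python
from collections import defaultdict

# Sample rows from the question: (year, trading_type, country, goods_type, amount).
ROWS = [
    (2020, "import", "ABC", "medicines", 12345.67),
    (2017, "import", "ABC", "medicines", None),
    (2019, "export", "DEF", "foods", 987.65),
    (2018, "export", "ABC", "foods", 2345.6),
    (2016, "export", "DEF", "medicines", 120.3),
    (2019, "export", "ABC", "medicines", 345.67),
    (2020, "import", "DEF", "foods", 321.04),
]


def import_minus_export(rows, goods_type="medicines", years=(2019, 2020), limit=10):
    """Mimic SUM(IF(import, amount, 0)) - SUM(IF(export, amount, 0)) per group."""
    imports = defaultdict(float)
    exports = defaultdict(float)
    for year, trading_type, country, goods, amount in rows:
        # WHERE goods_type = ... AND year IN (...)
        if goods != goods_type or year not in years:
            continue
        key = (country, goods)  # GROUP BY country, goods_type
        # Simplification: a NULL amount contributes nothing to either sum.
        value = amount if amount is not None else 0.0
        imports[key] += value if trading_type == "import" else 0.0
        exports[key] += value if trading_type == "export" else 0.0
    result = [(c, g, imports[(c, g)] - exports[(c, g)]) for (c, g) in imports]
    # Rank by the difference itself, largest first, then keep the top `limit`.
    result.sort(key=lambda row: row[2], reverse=True)
    return result[:limit]


if __name__ == "__main__":
    for country, goods, diff in import_minus_export(ROWS):
        print(country, goods, round(diff, 2))
```

The sort key is the third element of each tuple, the difference, which corresponds to ordering by import_minus_export rather than by one of the grouping columns.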
【Comments】: