How do I need to GROUP this query to be able to aggregate a MAX
【Title】: How do I need to GROUP this query to be able to aggregate a MAX 【Posted】: 2019-07-26 05:04:11 【Question】:
How do I need to GROUP this query (or can I PARTITION it somehow) so that I get max(sample_date_time) on line 4? It needs to be the max over all of the selected records. I am getting the error:
SELECT list references .... label_list which is neither grouped nor aggregated ...
label_list is of data type RECORD (STRUCT). The UNNEST operator takes an ARRAY and returns a table with one row for each element in the ARRAY.
I have looked at this - BigQuery standard SQL: how to group by an ARRAY field - but it does not help me. The difference in my case is that I am also selecting the ARRAY.
SELECT
  label_list,
  created_date_time,
  max(sample_date_time) AS sample_date_time_max, -- <-- HERE
  max(created_date_time) OVER (PARTITION BY sample_date_time, finger_print_hash ORDER BY sample_date_time) AS created_date_time_max,
  sample_date_time,
  station,
  (
    SELECT name
    FROM UNNEST(label_list)
    WHERE type = "CHL"
  ) AS channel,
  value
FROM my.mart
WHERE sample_date_time BETWEEN "2019-07-25 23:00:00.000000+00:00" AND "2019-07-26 04:00:00.000000+00:00"
  AND station = '[myGuid]'
  AND uom = "[myUom]"
  AND is_good_status = true
GROUP BY TO_JSON_STRING(label_list)
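To make the UNNEST behaviour described above concrete, here is a minimal sketch in BigQuery standard SQL. The CTE named mart is a hypothetical one-row stand-in for my.mart, and its literal values are assumptions borrowed from the sample data; it is not the real table.

```sql
-- Hypothetical one-row stand-in for my.mart (an assumption, not the real table).
WITH mart AS (
  SELECT
    [STRUCT('STE' AS type, 'Healthy School' AS name),
     STRUCT('STN' AS type, 'API Test Pod' AS name),
     STRUCT('CHL' AS type, 'PM25 HSPI' AS name)] AS label_list
)
-- The comma is a correlated cross join: UNNEST yields one output row
-- per element of label_list, with the struct fields exposed via the alias l.
SELECT l.type, l.name
FROM mart, UNNEST(mart.label_list) AS l;
```

Each array element becomes its own row, which is why the scalar subquery in the query above can filter on type = "CHL" to pick out a single name.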
Sample data:
Row | station | label_list.type | label_list.name | finger_print_hash | created_date_time | sample_date_time | time_zone | uom | is_good_status
1 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -7.97672E+16 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM25 HSPI
2 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -1.35959E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM10 HSPI
3 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -6.25737E+17 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | NO2 HSPI
4 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -4.68557E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | Noise Level HSPI
5 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -7.23989E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | HSI
6 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | -7.23989E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | HSI
7 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | -4.68557E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSPI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | Noise Level HSPI
8 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | -1.35959E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSPI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM10 HSPI
9 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | -6.25737E+17 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSPI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | NO2 HSPI
10 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | -7.97672E+16 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSPI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM25 HSPI
11 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | 2.57256E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | O3 HSPI
12 | 534e669069b74258b3386c482d11d139 | STE | Healthy School | 2.57256E+18 | 2019-07-26 05:15:03.097265 UTC | 2019-07-26 04:00:00 UTC | Australia/Melbourne | HSPI | TRUE
  |  | STN | Mock Station 1
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | O3 HSPI
13 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -4.68557E+18 | 2019-07-26 04:15:02.536014 UTC | 2019-07-26 03:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | Noise Level HSPI
14 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -1.35959E+18 | 2019-07-26 04:15:02.536014 UTC | 2019-07-26 03:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM10 HSPI
15 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -7.23989E+18 | 2019-07-26 04:15:02.536014 UTC | 2019-07-26 03:00:00 UTC | Australia/Victoria | HSI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | HSI
16 | 0f97ae8cec364768b2df6fa98c20adb5 | STE | Healthy School | -7.97672E+16 | 2019-07-26 04:15:02.536014 UTC | 2019-07-26 03:00:00 UTC | Australia/Victoria | HSPI | TRUE
  |  | STN | API Test Pod
  |  | RPT | HSI
  |  | INS | Calculated
  |  | CHL | PM25 HSPI
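【Comments】:
Sample data would help here.
What is UNNEST, and what is its return type? In your query it seems to yield only one value in any case ... details needed.
Can you please clarify a few things: 1) What field is STE Healthy School? 2) Why do you think you need GROUP BY TO_JSON_STRING(label_list)? What exactly are you trying to achieve here?
Never mind - see my answer.
Could you add the expected output along with the sample data? Your requirement is not clear. I am confused because the select query has both max(sample_date_time) and sample_date_time. A single label_list will have different sample_date_time values; if you do not want the max per label_list, which one do you want to select?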
【Answer 1】:
I think the query below is what you really need:
SELECT
  label_list,
  created_date_time,
  MAX(sample_date_time) OVER() AS sample_date_time_max, -- <-- HERE
  MAX(created_date_time) OVER (PARTITION BY sample_date_time, finger_print_hash ORDER BY sample_date_time) AS created_date_time_max,
  sample_date_time,
  station,
  (
    SELECT name
    FROM UNNEST(label_list)
    WHERE type = "CHL"
  ) AS channel,
  value
FROM my.mart
WHERE sample_date_time BETWEEN "2019-07-25 23:00:00.000000+00:00" AND "2019-07-26 04:00:00.000000+00:00"
  AND station = '[myGuid]'
  AND uom = "[myUom]"
  AND is_good_status = true
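And you do not need any GROUP BY here! MAX(...) OVER() is an analytic (window) function with an empty window specification, so it is evaluated over all rows that survive the WHERE clause and the same maximum is attached to every row.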
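A small, self-contained sketch of why the empty OVER() works. The CTE mart and its three literal rows are hypothetical stand-ins for my.mart, not data from the real table.

```sql
-- Hypothetical three-row stand-in for my.mart (an assumption for illustration).
WITH mart AS (
  SELECT TIMESTAMP '2019-07-26 04:00:00+00:00' AS sample_date_time, 'PM25 HSPI' AS channel
  UNION ALL
  SELECT TIMESTAMP '2019-07-26 03:00:00+00:00', 'PM25 HSPI'
  UNION ALL
  SELECT TIMESTAMP '2019-07-26 03:00:00+00:00', 'PM10 HSPI'
)
SELECT
  channel,
  sample_date_time,
  -- Empty window: the MAX is taken over every row returned by the query,
  -- so no GROUP BY is required and no rows are collapsed.
  MAX(sample_date_time) OVER () AS sample_date_time_max
FROM mart;
```

All three rows are returned unchanged, and each one carries the overall maximum (the 04:00:00 UTC timestamp) in sample_date_time_max.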
【Comments】:
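【Answer 2】: I guess you need the same TO_JSON_STRING(label_list) in the SELECT list as well.
SELECT TO_JSON_STRING(label_list) ...
...
GROUP BY TO_JSON_STRING(label_list)
This is because a bare label_list in the SELECT list is not recognized as grouped when the GROUP BY key is TO_JSON_STRING(label_list).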
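One possible way to spell this out is sketched below, under stated assumptions. The CTE mart is a hypothetical stand-in for my.mart. ANY_VALUE is swapped in here as a way to keep the original array next to its JSON key; that is not part of the answer above.

```sql
-- Hypothetical stand-in for my.mart (an assumption for illustration).
WITH mart AS (
  SELECT [STRUCT('CHL' AS type, 'PM25 HSPI' AS name)] AS label_list,
         TIMESTAMP '2019-07-26 04:00:00+00:00' AS sample_date_time
  UNION ALL
  SELECT [STRUCT('CHL' AS type, 'PM25 HSPI' AS name)],
         TIMESTAMP '2019-07-26 03:00:00+00:00'
  UNION ALL
  SELECT [STRUCT('CHL' AS type, 'PM10 HSPI' AS name)],
         TIMESTAMP '2019-07-26 03:00:00+00:00'
)
SELECT
  -- The grouping key appears in the SELECT list in its serialized form.
  TO_JSON_STRING(label_list) AS label_list_json,
  -- Assumption: ANY_VALUE returns one of the (identical) arrays in each group.
  ANY_VALUE(label_list) AS any_label_list,
  MAX(sample_date_time) AS sample_date_time_max
FROM mart
GROUP BY label_list_json;
```

Note that this yields one row and one maximum per distinct label_list, which is different from the single maximum over all selected records that the question asks for.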
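【Comments】:
I can do that and it fixes that error, and I can then group by all the other columns in the query, but then I get an error on SELECT name FROM UNNEST(label_list) WHERE type = "CHL". I cannot wrap that in TO_JSON_STRING, because it needs to remain as it is for the SELECT.
Yes, the inner select is independent, so it is out of sync with the grouping. Try adding that channel query to the group by. I do not see why you need that subquery in the select; since it is independent, you can simply join it. Something like: select all_columns except the subquery, group by json .... join subquery
How do I add the channel query to the group by? I can add "channel", but I am not sure how to add the query itself to the group by.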
Try joining the subquery instead of keeping it in the select.
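The join suggested in the comments above might look roughly like the following sketch. The CTE mart is a hypothetical stand-in for my.mart, and the comma (cross) join with UNNEST is one assumed way of expressing "join the subquery"; the commenter did not post actual SQL.

```sql
-- Hypothetical stand-in for my.mart (an assumption for illustration).
WITH mart AS (
  SELECT [STRUCT('STN' AS type, 'API Test Pod' AS name),
          STRUCT('CHL' AS type, 'PM25 HSPI' AS name)] AS label_list,
         TIMESTAMP '2019-07-26 04:00:00+00:00' AS sample_date_time
  UNION ALL
  SELECT [STRUCT('STN' AS type, 'API Test Pod' AS name),
          STRUCT('CHL' AS type, 'PM10 HSPI' AS name)],
         TIMESTAMP '2019-07-26 03:00:00+00:00'
)
SELECT
  l.name AS channel,
  MAX(m.sample_date_time) AS sample_date_time_max
FROM mart AS m,
     UNNEST(m.label_list) AS l  -- correlated cross join: one row per label
WHERE l.type = 'CHL'            -- keep only the channel label
GROUP BY channel;
```

Because channel is now an ordinary joined column, it can be listed in GROUP BY directly. A cross join drops rows whose label_list has no CHL entry, so this is only equivalent to the scalar subquery when every row has exactly one such label.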