org.apache.spark.sql.AnalysisException: expression 't2.`sum_click_passed`' is neither present in the group by, nor is it an aggregate function
[Posted]: 2021-09-18 08:43:09
[Question]: For example:
SELECT
bucket,
repeat_all_click,
sum_click_passed,
sum_imp_passed,
sum_charge,
sum_click_passed as acp,
sum_roi_cnt / sum_click_passed as shop_cvr,
sum_roi_amt / sum_charge as shop_roi,
sum_roi_pay_cnt / sum_click_passed as pay_cvr,
sum_roi_pay_amt / sum_charge as pay_roi,
1.0 * sum_cpv_all / sum_spv_all as imp_rate
FROM
(
(
SELECT
request_id,
IF (
all_click_cnt = 1,
'= 1',
IF (
all_click_cnt > 1,
'> 1',
'= 0'
)
) as repeat_all_click
FROM
a
WHERE
partition_date BETWEEN '2021-09-08'
AND '2021-09-17'
AND channel = 'HS'
AND slot_id = 5
AND (
rerank_algo = 'algo1'
OR rerank_algo = 'algo2'
)
GROUP BY
1,
2
) t1
JOIN (
SELECT
request_id,
sum(click_passed) as sum_click_passed,
sum(imp_passed) as sum_imp_passed,
sum(charge) as sum_charge,
sum(roi_cnt) as sum_roi_cnt,
sum(roi_amt) as sum_roi_amt,
sum(roi_pay_cnt) as sum_roi_pay_cnt,
sum(roi_pay_amt) as sum_roi_pay_amt,
sum(cpv_all) as sum_cpv_all,
sum(spv_all) as sum_spv_all
FROM
b
WHERE
partition_date BETWEEN '2021-09-14'
AND '2021-09-17'
AND slotid = 5
GROUP BY
request_id
) t2 ON t1.request_id = t2.request_id
JOIN (
SELECT
requestid AS request_id,
IF (strategy_path LIKE '%4-54-2612%', 'EXP', 'BASE') AS bucket
FROM
c
WHERE
(
dt BETWEEN '20210914'
AND '20210917'
AND channel = 'S'
AND (
slot_ids LIKE '%50011%'
OR slot_ids LIKE '%50020%'
)
AND (
strategy_path LIKE '%54-4%'
OR strategy_path LIKE '%54-2%'
)
)
GROUP BY
1,
2
) t3 ON t2.request_id = t3.request_id
)
GROUP BY
1,
2
ORDER BY
1,
2
User class threw exception: org.apache.spark.sql.AnalysisException: expression 't2.`sum_click_passed`' is neither present in the group by, nor is it an aggregate function. Add to group by or wrap in first() (or first_value) if you don't care which value you get.;; Sort [bucket#152 ASC NULLS FIRST, repeat_all_click#141 ASC NULLS FIRST], true +- Aggregate [bucket#152, repeat_all_click#141], [bucket#152, repeat_all_click#141,
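I'm not very familiar with HiveQL, but I would not expect this to fail in SQL.
Unfortunately it keeps failing with the error above, and I don't know how to fix it properly, because I thought sum(click_passed) as sum_click_passed should count as an aggregate function.
Can anyone help me? Thanks in advance.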
[Answer 1]: I think the syntax of the HQL statement is wrong. Remove the extra parentheses after the first FROM and before the last GROUP BY clause. The syntax should be:
SELECT ..
FROM
(SELECT... FROM T1)T1
JOIN (SELECT... )T2 ON ...
JOIN (SELECT... )T3 ON ...
GROUP BY...
ORDER BY...
Please use the query below.
SELECT
bucket,
repeat_all_click,
sum_click_passed,
sum_imp_passed,
sum_charge,
sum_click_passed as acp,
sum_roi_cnt / sum_click_passed as shop_cvr,
sum_roi_amt / sum_charge as shop_roi,
sum_roi_pay_cnt / sum_click_passed as pay_cvr,
sum_roi_pay_amt / sum_charge as pay_roi,
1.0 * sum_cpv_all / sum_spv_all as imp_rate
FROM
--( removed/commented out
(
SELECT
request_id,
IF (
all_click_cnt = 1,
'= 1',
IF (
all_click_cnt > 1,
'> 1',
'= 0'
)
) as repeat_all_click
FROM
a
WHERE
partition_date BETWEEN '2021-09-08'
AND '2021-09-17'
AND channel = 'HS'
AND slot_id = 5
AND (
rerank_algo = 'algo1'
OR rerank_algo = 'algo2'
)
GROUP BY
1,
2
) t1
JOIN (
SELECT
request_id,
sum(click_passed) as sum_click_passed,
sum(imp_passed) as sum_imp_passed,
sum(charge) as sum_charge,
sum(roi_cnt) as sum_roi_cnt,
sum(roi_amt) as sum_roi_amt,
sum(roi_pay_cnt) as sum_roi_pay_cnt,
sum(roi_pay_amt) as sum_roi_pay_amt,
sum(cpv_all) as sum_cpv_all,
sum(spv_all) as sum_spv_all
FROM
b
WHERE
partition_date BETWEEN '2021-09-14'
AND '2021-09-17'
AND slotid = 5
GROUP BY
request_id
) t2 ON t1.request_id = t2.request_id
JOIN (
SELECT
requestid AS request_id,
IF (strategy_path LIKE '%4-54-2612%', 'EXP', 'BASE') AS bucket
FROM
c
WHERE
(
dt BETWEEN '20210914'
AND '20210917'
AND channel = 'S'
AND (
slot_ids LIKE '%50011%'
OR slot_ids LIKE '%50020%'
)
AND (
strategy_path LIKE '%54-4%'
OR strategy_path LIKE '%54-2%'
)
)
GROUP BY
1,
2
) t3 ON t2.request_id = t3.request_id
--) removed
GROUP BY
1,
2
ORDER BY
1,
2
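Note that the error message itself points at the outer query: it groups by bucket and repeat_all_click but selects sum_click_passed and the other sum_* columns without aggregating them, so Spark may still reject the query once the parentheses are removed. Below is a minimal sketch of one way to handle that, assuming the per-request sums are meant to be added up within each group. The inline `joined` rows, their column subset, and their values are hypothetical stand-ins for the result of joining t1, t2 and t3.

```sql
-- Hypothetical stand-in for the joined t1/t2/t3 result: one row per request_id.
WITH joined AS (
  SELECT * FROM VALUES
    ('BASE', '= 1', 3, 1.5, 1, 20.0),
    ('BASE', '= 1', 2, 0.5, 0, 0.0),
    ('EXP',  '> 1', 4, 2.0, 2, 35.0)
  AS rows_per_request(
    bucket, repeat_all_click,
    sum_click_passed, sum_charge, sum_roi_cnt, sum_roi_amt
  )
)
SELECT
  bucket,
  repeat_all_click,
  -- Every non-grouped column is wrapped in an aggregate function.
  sum(sum_click_passed)                     AS sum_click_passed,
  sum(sum_charge)                           AS sum_charge,
  -- Ratios are computed from the group totals, not per request.
  sum(sum_roi_cnt) / sum(sum_click_passed)  AS shop_cvr,
  sum(sum_roi_amt) / sum(sum_charge)        AS shop_roi
FROM joined
GROUP BY bucket, repeat_all_click
ORDER BY bucket, repeat_all_click;
```

If any single value per group is acceptable instead, the error message's own suggestion of wrapping the column in first() would also satisfy the analyzer.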