How can I include a PERCENTILE_CONT column within a select statement without generating an error about the ORDER BY clause or aggregate function?

Posted: 2020-01-13 09:52:10

Question:

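I need to generate a specific report from some data, and I am having a lot of trouble working out the correct usage of PERCENTILE_CONT to get the results I need. I want to include a column in my query results that shows the 95th percentile of a range of values.

I have a table like this: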

customer_id sale_amount sale_date
1   265.75  2019-09-11 00:00:04.000
1   45.75   2019-09-10 01:00:04.000
1   2124.77 2019-09-10 04:00:04.000
1   66.99   2019-09-10 04:20:04.000
1   266.49  2019-09-09 11:20:04.000
1   3266.49 2019-09-08 11:20:04.000

Pretty simple.

I can run the following query without any problem:

select min(sale_amount) as minimum_sale, max(sale_amount) as maximum_sale, avg(sale_amount) as average_sale from sales where customer_id = 1;

This produces the following output:

minimum_sale    maximum_sale    average_sale
45.75           3266.49     1006.040000

What I want is a fourth column, perc_95, holding the value at the 95th percentile of sale_amount.

This gets me that value:

select distinct customer_id, percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95 from sales;

Output:

customer_id perc_95
1            2981.06
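
As a sanity check on the 2981.06 figure, here is a minimal Python sketch of the linear interpolation that PERCENTILE_CONT performs, assuming the usual definition (position p * (N - 1) in the zero-based sorted list, interpolated between the two neighbouring values). The helper name `percentile_cont` is hypothetical and simply mirrors the SQL function; this is not how SQL Server implements it internally.

```python
import math


def percentile_cont(values, p):
    """Continuous percentile by linear interpolation (illustrative helper)."""
    ordered = sorted(values)
    # Zero-based fractional position of the requested percentile.
    pos = p * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    # Interpolate between the two neighbouring sorted values.
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# sale_amount values for customer_id = 1, taken from the table above.
sale_amounts = [265.75, 45.75, 2124.77, 66.99, 266.49, 3266.49]
perc_95 = percentile_cont(sale_amounts, 0.95)
print(round(perc_95, 2))
```

With six rows the position is 0.95 * 5 = 4.75, so the result sits three quarters of the way from 2124.77 to 3266.49, which rounds to 2981.06.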

But I can't seem to combine the two. This fails:

select distinct(customer_id), min(sale_amount) as minimum_sale, max(sale_amount) as maximum_sale,
 avg(sale_amount) as average_sale, percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
  from sales where customer_id = 1;

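Output:

Column 'sales.customer_id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

I roughly understand what this error means, but I can't figure out how to deal with it in this situation.

My desired output: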

customer_id     minimum_sale      maximum_sale  average_sale    perc_95
1                   45.75         3266.49  1006.040000     2981.06


Answer 1:

Use window functions:

select distinct customer_id,
       min(sale_amount) over (partition by customer_id) as minimum_sale, 
       max(sale_amount) over (partition by customer_id) as maximum_sale,
       avg(sale_amount) over (partition by customer_id) as average_sale,
       percentile_cont(0.95) within group (order by sale_amount)  over (partition by customer_id) as perc_95
from sales
where customer_id = 1;

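It is very inconvenient that SQL Server does not support functions such as percentile_cont() as aggregate functions, which forces people to use select distinct to do the aggregation.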
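
To see why the window-function form needs select distinct, the following Python sketch mimics the two steps on the question's sample data. It is only an illustration of the row-level behaviour, not SQL Server's execution model; the variable names are made up for this example.

```python
from statistics import mean

# (customer_id, sale_amount) rows from the question's sample table.
rows = [
    (1, 265.75), (1, 45.75), (1, 2124.77),
    (1, 66.99), (1, 266.49), (1, 3266.49),
]

# PARTITION BY customer_id: collect the amounts belonging to each customer.
partitions = {}
for customer_id, amount in rows:
    partitions.setdefault(customer_id, []).append(amount)

# A windowed aggregate keeps every input row and attaches the
# partition-level result to each of them.
windowed = [
    (cid, min(partitions[cid]), max(partitions[cid]), mean(partitions[cid]))
    for cid, _ in rows
]

# DISTINCT then collapses the identical rows into one row per customer.
distinct_rows = sorted(set(windowed))
print(len(windowed), len(distinct_rows))
```

Every row in `windowed` carries the same per-customer values, so the duplicate elimination leaves a single row per customer_id, which is the shape the question asks for.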


Answer 2:

Don't use DISTINCT.

I would try this first:

select 
    min(customer_id) AS CustomerID, 
    min(sale_amount) as minimum_sale, 
    max(sale_amount) as maximum_sale,
    avg(sale_amount) as average_sale, 
    percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id) as perc_95
from sales 
where customer_id = 1;

If you get the same error message, but this time about percentile_cont, then wrap that in a min function as well:

select 
    min(customer_id) AS CustomerID, 
    min(sale_amount) as minimum_sale, 
    max(sale_amount) as maximum_sale,
    avg(sale_amount) as average_sale, 
    min(percentile_cont(0.95) WITHIN GROUP (order by sale_amount) OVER (partition by customer_id)) as perc_95
from sales 
where customer_id = 1;
