SQL - Max() value from the group is not working

Posted 2023-03-31

【Title】SQL - Max() value from the group is not working 【Posted】2020-06-30 01:02:02 【Question】:

Image1: sample data

Image2: incorrect output

Image3: desired output

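Query: I am trying to find the maximum value of the column (Median_Percentage) per Class_Name and Customer (Image1, sample data).

Problem: the query returns every customer instead of only the customer with the maximum median value (Image2, incorrect output). Max() itself is computed correctly, but the query attaches that value to all customers rather than only to the customer that holds the maximum within its Class_Name.

I only need each Class_Name with its Max(Median_Percentage), shown together with the customer that has it (Image3, desired output).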

SELECT DISTINCT
    C.Class_Name,
    C.Customer,
    C.Max_Median_Percentage
FROM (
    SELECT
        B.Class_Name,
        CASE WHEN B.Median_Percentage = MAX(B.Median_Percentage) OVER (PARTITION BY B.Class_Name ORDER BY B.Median_Percentage DESC)
            THEN B.Customer
        END AS Customer,
        MAX(B.Median_Percentage) OVER (PARTITION BY B.Class_Name ORDER BY B.Median_Percentage DESC) AS Max_Median_Percentage
    FROM (
        SELECT
            A.Class_Name,
            A.Customer,
            A.Date_Time,
            A.Median_Percentage
        FROM table1 AS A
    ) AS B
) AS C

【Question discussion】:

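Tag your question with the database you are using. I'm not sure what SQL has to do with this question, which seems clear enough on its own.

Which DBMS product are you using? "SQL" is just a query language, not the name of a specific database product. Please add a tag for the database product you are using. (Why should I tag my DBMS)

Thanks! I added a tag for the database. I also simplified the query I am working on; I hope that helps.

【Answer 1】: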

If your database does not directly support a "median" function, you can use percentile_cont():

select t.*,
       boot_time / percentile_cont(0.5) within group (order by boot_time) over (partition by classid)
from t;
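To make the ratio in that query concrete, here is a minimal Python sketch of what percentile_cont(0.5) computes: it sorts the values and interpolates linearly between the two rows around the requested position. The function name mirrors the SQL function only for readability, and the boot_time values are made up for illustration; neither comes from the original data.

```python
def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation (assumes a non-empty list)."""
    ordered = sorted(values)
    # Zero-based fractional position of the requested percentile.
    position = fraction * (len(ordered) - 1)
    lower = int(position)                      # row at or just below the position
    upper = min(lower + 1, len(ordered) - 1)   # next row, clamped to the last one
    weight = position - lower                  # how far we are between the two rows
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical boot_time values for a single classid.
boot_times = [12, 20, 30, 18]
class_median = percentile_cont(boot_times, 0.5)

# Same ratio as in the query: each boot_time divided by the class median.
ratios = [boot_time / class_median for boot_time in boot_times]
print(class_median, ratios)
```

With an even number of rows the result lies halfway between the two middle values, which is the behaviour the ntile() variants in this answer try to approximate.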

If your database has neither percentile_cont() nor percentile_disc(), you can get very close with a simple ntile():

select t.*,
       boot_time / max(case when tile = 1 then boot_time end) over (partition by classid)
from (select t.*,
             ntile(2) over (partition by classid order by boot_time) as tile
      from t
     ) t

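This is exact when a classid has an odd number of rows. With an even number of rows it is off by one position, because it returns the lower of the two middle values. That is easy to handle, but the query becomes more complicated: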

select t.*,
       (boot_time /
        (( max(case when tile_asc = 1 then boot_time end) over (partition by classid) +
           min(case when tile_desc = 1 then boot_time end) over (partition by classid)
         ) / 2
        )
       )
from (select t.*,
             ntile(2) over (partition by classid order by boot_time) as tile_asc,
             ntile(2) over (partition by classid order by boot_time desc) as tile_desc
      from t
     ) t
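The two-tile idea can be checked outside the database. The Python sketch below imitates ntile(2) on a sorted list: the ascending first tile is the lower half, the descending first tile is the upper half, and the median is taken as the average of the largest value in the lower half and the smallest value in the upper half. The function names and sample values are illustrative assumptions, not part of the answer.

```python
from statistics import median


def single_tile_median(values):
    """Largest value of the first ascending tile, as in the simple ntile(2) query."""
    ordered = sorted(values)
    first_tile_size = (len(ordered) + 1) // 2  # ntile gives the extra row to tile 1
    return max(ordered[:first_tile_size])


def two_tile_median(values):
    """Average of the top of the lower half and the bottom of the upper half."""
    ordered = sorted(values)
    n = len(ordered)
    first_tile_size = (n + 1) // 2
    lower_half = ordered[:first_tile_size]       # rows with tile_asc = 1
    upper_half = ordered[n - first_tile_size:]   # rows with tile_desc = 1
    return (max(lower_half) + min(upper_half)) / 2


odd_rows = [10, 30, 20, 50, 40]
even_rows = [10, 20, 30, 40]

for rows in (odd_rows, even_rows):
    print(len(rows), single_tile_median(rows), two_tile_median(rows), median(rows))
```

For the odd-sized list both variants agree with statistics.median. For the even-sized list the single-tile variant returns the lower middle value, while the two-tile variant returns the midpoint of the two middle values.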

【Discussion】:

Thanks for the reply. percentile_cont() does not work on my database.

【Answer 2】:

Perhaps this is useful:

Load the provided test data

val df = spark.sql(
      """
        |select Class_Name, Customer, Date_Time, Median_Percentage
        |from values
        |   ('ClassA', 'A', '6/13/20', 64550),
        |   ('ClassA', 'B', '6/6/20', 40200),
        |   ('ClassB', 'F', '6/20/20', 26800),
        |   ('ClassB', 'G', '6/20/20', 18100)
        |  T(Class_Name, Customer, Date_Time, Median_Percentage)
      """.stripMargin)
    df.show(false)
    df.printSchema()

    /**
      * +----------+--------+---------+-----------------+
      * |Class_Name|Customer|Date_Time|Median_Percentage|
      * +----------+--------+---------+-----------------+
      * |ClassA    |A       |6/13/20  |64550            |
      * |ClassA    |B       |6/6/20   |40200            |
      * |ClassB    |F       |6/20/20  |26800            |
      * |ClassB    |G       |6/20/20  |18100            |
      * +----------+--------+---------+-----------------+
      *
      * root
      * |-- Class_Name: string (nullable = false)
      * |-- Customer: string (nullable = false)
      * |-- Date_Time: string (nullable = false)
      * |-- Median_Percentage: integer (nullable = false)
      */

Find the row with the largest `Median_Percentage` for each `Class_Name`

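    // Needed outside spark-shell, which imports these automatically.
    import org.apache.spark.sql.functions.{max, struct}
    import spark.implicits._

    // Structs compare field by field, so Median_Percentage must come first.
    df.groupBy("Class_Name")
      .agg(max(struct($"Median_Percentage", $"Date_Time", $"Customer")).as("struct"))
      .selectExpr("Class_Name", "struct.Customer", "struct.Date_Time", "struct.Median_Percentage")
      .show(false)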

    /**
      * +----------+--------+---------+-----------------+
      * |Class_Name|Customer|Date_Time|Median_Percentage|
      * +----------+--------+---------+-----------------+
      * |ClassA    |A       |6/13/20  |64550            |
      * |ClassB    |F       |6/20/20  |26800            |
      * +----------+--------+---------+-----------------+
      */
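For readers who want the same result in plain SQL rather than through the DataFrame API, one common pattern is to join the table to a grouped subquery that holds each class maximum. The sketch below is not taken from either answer. It runs that query against an in-memory SQLite database through Python's sqlite3 module, only so that the example is self-contained; the table name table1 and the four rows are taken from the question and the test data above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 ("
    "Class_Name TEXT, Customer TEXT, Date_Time TEXT, Median_Percentage INTEGER)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?)",
    [
        ("ClassA", "A", "6/13/20", 64550),
        ("ClassA", "B", "6/6/20", 40200),
        ("ClassB", "F", "6/20/20", 26800),
        ("ClassB", "G", "6/20/20", 18100),
    ],
)

# Keep only the rows whose Median_Percentage equals the maximum of their class.
QUERY = """
SELECT t.Class_Name, t.Customer, t.Median_Percentage
FROM table1 AS t
JOIN (
    SELECT Class_Name, MAX(Median_Percentage) AS Max_Median_Percentage
    FROM table1
    GROUP BY Class_Name
) AS m
  ON m.Class_Name = t.Class_Name
 AND m.Max_Median_Percentage = t.Median_Percentage
ORDER BY t.Class_Name
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Unlike the struct-based aggregation, this join returns every customer that ties for the maximum within a class, so a tie-breaking rule would have to be added if exactly one row per class is required.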
