In SQL, how to select the top 2 rows for each group

Posted 2023-02-16

【Asked】: 2013-04-04 21:03:59 【Question】:

I have a table like the following:

NAME    SCORE
-----------------
willy       1
willy       2
willy       3
zoe         4
zoe         5
zoe         6

This is a sample.

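The aggregate functions used with GROUP BY only let me get the single highest score for each name. I want to query the top 2 scores for each name. How can I do that?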

My expected output is:

NAME    SCORE
-----------------
willy       2
willy       3
zoe         5
zoe         6

【Comments on the question】:

If you are using Oracle SQL, see "How do I limit the number of rows returned by an Oracle query after ordering?"

【Answer 1】:

SELECT *
FROM   test s
WHERE 
        (
            SELECT  COUNT(*) 
            FROM    test  f
            WHERE f.name = s.name AND 
                  f.score >= s.score
        ) <= 2
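
For anyone who wants to try the correlated subquery without setting up a server, here is a minimal sketch that runs the same query against an in-memory SQLite database from Python. SQLite and the variable name `top_two` are assumptions made for illustration; the answer itself does not say which database it targets.

```python
import sqlite3

# In-memory database holding the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, score INTEGER)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("willy", 1), ("willy", 2), ("willy", 3),
     ("zoe", 4), ("zoe", 5), ("zoe", 6)],
)

# For each row s, count the rows of the same name whose score is >= s.score.
# The row is kept only if at most 2 rows (s included) reach that score.
top_two = conn.execute(
    """
    SELECT name, score
    FROM   test s
    WHERE  (SELECT COUNT(*)
            FROM   test f
            WHERE  f.name = s.name AND f.score >= s.score) <= 2
    ORDER BY name, score
    """
).fetchall()

print(top_two)
```

On the sample data this returns the four rows from the expected output. Be aware that tied scores are all counted by the subquery, so a group such as 2, 2, 3 would return only the row with score 3.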

【Comments】:

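But this may cause performance problems. Is there a faster way to write this query?

This does cause fairly serious performance problems (the subselect is quadratic). It can be done in linear time; see "MySQL Query to Get Top 2" at sqlines.com/mysql/how-to/get_top_n_each_group

【Answer 2】: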

In MySQL, you can use user-defined variables to get a row number within each group:

select name, score
from
(
  SELECT name,
    score,
    (@row:=if(@prev=name, @row +1, if(@prev:= name, 1, 1))) rn
  FROM test123 t
  CROSS JOIN (select @row:=0, @prev:=null) c
  order by name, score desc 
) src
where rn <= 2
order by name, score;
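
If the variable juggling is hard to follow, the same row-numbering logic can be sketched in plain Python. This is only an illustration of what the query computes, not MySQL code; the function name `top_n_per_group` and the `Row` alias are made up for this example.

```python
from typing import Iterable, List, Tuple

Row = Tuple[str, int]  # (name, score); hypothetical alias for this sketch


def top_n_per_group(rows: Iterable[Row], n: int = 2) -> List[Row]:
    """Keep the n highest-scoring rows for each name."""
    # Same ordering as the inner query: by name, then score descending.
    ordered = sorted(rows, key=lambda r: (r[0], -r[1]))

    result: List[Row] = []
    prev = None  # plays the role of @prev
    row = 0      # plays the role of @row
    for name, score in ordered:
        # Restart the counter whenever the name changes, otherwise increment.
        row = row + 1 if name == prev else 1
        prev = name
        if row <= n:  # the outer "where rn <= 2"
            result.append((name, score))

    # Final ordering mirrors the outer "order by name, score".
    return sorted(result)


rows = [("willy", 1), ("willy", 2), ("willy", 3),
        ("zoe", 4), ("zoe", 5), ("zoe", 6)]
print(top_n_per_group(rows))
```

After the sort, each row is visited once, which is why this approach avoids the quadratic cost of the correlated subquery.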

【Comments】:

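Thanks for your solution. I'm still new to SQL; hopefully I'll understand this later :)

@waitingkuo Unfortunately, MySQL doesn't have window functions, which would let you easily assign a row number to each row in a group.

@bluefeet Thanks, this is a very good solution. Even on 30k rows it runs very fast; my previous solution using joins was very slow.

Is this safe? MySQL states that "the order of evaluation for expressions involving user variables is undefined." Does that mean @prev:=name could be evaluated before your case statement, so the case statement would be artificially true? Or am I missing something? See dev.mysql.com/doc/refman/5.5/en/user-variables.html

【Answer 3】: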

If you don't mind having an extra column, you can use the following code:

SELECT Name, Score, rnk
FROM (SELECT Name, Score,
             RANK() OVER (PARTITION BY Name ORDER BY Score DESC) AS rnk
      FROM test) ranked  -- a window result has to be filtered in an outer query
WHERE rnk < 3;

The RANK function assigns a rank within each partition, which in your case is the name.
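
One thing to watch with RANK is ties: tied rows share a rank, so filtering on the rank can return more than two rows per name, while ROW_NUMBER always numbers rows 1, 2, 3. The sketch below compares the two on data with a tie. It assumes SQLite 3.25 or later (the first release with window functions) accessed through Python's sqlite3 module; the helper `top_rows` and the tied sample data are invented for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (name TEXT, score INTEGER)")
# Sample data with a tie on score 2 (not the data from the question).
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [("willy", 1), ("willy", 2), ("willy", 2), ("willy", 3)],
)


def top_rows(window_function: str):
    """Return rows whose per-name rank/row number is below 3.

    window_function is pasted into the SQL text, so only pass trusted
    constants such as "RANK" or "ROW_NUMBER".
    """
    query = f"""
        SELECT name, score
        FROM (SELECT name, score,
                     {window_function}() OVER (
                         PARTITION BY name ORDER BY score DESC) AS rnk
              FROM test) ranked
        WHERE rnk < 3
        ORDER BY name, score DESC
    """
    return conn.execute(query).fetchall()


by_rank = top_rows("RANK")
by_row_number = top_rows("ROW_NUMBER")
print(by_rank)
print(by_row_number)
```

With RANK, both rows scoring 2 get rank 2, so three rows come back; with ROW_NUMBER only two do. Which behaviour is right depends on whether tied rows should all be kept.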

【Comments】:

I get the error message: Msg 156, Level 15, State 1, Line 5 Incorrect syntax near the keyword 'order'.

There must not be a comma after "Name".

【Answer 4】:

insert into test values('willy',1)
insert into test values('willy',2)
insert into test values('willy',3)
insert into test values('zoe',4)
insert into test values('zoe',5)
insert into test values('zoe',6)


;with temp_cte
as (
    select Name, Score,
       ROW_NUMBER() OVER (
          PARTITION BY Name
          ORDER BY Score desc
       ) row_num
    from test
)
select * from temp_cte
where row_num < 3

【Comments】:

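Nice first answer! Please test your code before posting (the semicolons after the inserts are missing). Also explain it (using comments), along with the concepts it uses, e.g. WITH and Common Table Expression.

【Answer 5】: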

For this, you can do the following:

http://www.sqlfiddle.com/#!2/ee665/4

But to get the top 2 rows, you should use an ID and then apply a limit on the ID, e.g. 0,2.

【Comments】:

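I'm afraid that's not what I expected.

Yes, I was just offering one way to do it simply. If you maintain an ID (primary key) for each row, it will work better and you will have more flexibility. Without one, you need long code for this, and it will be harder to build anything else on top of it in the future.

【Answer 6】: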

You can do it like this:

SET @num :=0, @name :='';   
SELECT name, score,
    @num := IF( @name= name, @num +1, 1 ) AS row_number,
    @name := name AS dummy
FROM test
GROUP BY name, score
HAVING row_number <=2

【Comments】:

【Answer 7】:

SELECT * FROM (   
    SELECT  VD.`cat_id` ,  
       @cat_count := IF( (@cat_id = VD.`cat_id`), @cat_count + 1, 1 ) AS 'DUMMY1', 
       @cat_id := VD.`cat_id` AS 'DUMMY2',
       @cat_count AS 'CAT_COUNT'   
     FROM videos VD   
     INNER JOIN categories CT ON CT.`cat_id` = VD.`cat_id`  
       ,(SELECT @cat_count :=1, @cat_id :=-1) AS CID  
     ORDER BY VD.`cat_id` ASC ) AS `CAT_DETAILS`
     WHERE `CAT_COUNT` < 4

------- STEP FOLLOW ----------  
1 . select * from ( 'FILTER_DATA_HERE' ) WHERE 'COLUMN_COUNT_CONDITION_HERE' 
2.  'FILTER_DATA_HERE'   
    1. pass 2 variable @cat_count=1 and  @cat_id = -1  
    2.  If (@cat_id "match" column_cat_id value)  
        Then  @cat_count = @cat_count + 1    
        ELSE @cat_count = 1      
    3. SET @cat_id = column_cat_id    

 3. 'COLUMN_COUNT_CONDITION_HERE'   
    1. count_column < count_number    

4. ' EXTRA THING '
   1. If you want to execute more than one statement inside " if stmt "
   2. IF(condition, stmt1 , stmt2 )
      1. stmt1 :- CONCAT(exp1, exp2, exp3) 
      2. stmt2 :- CONCAT(exp1, exp2, exp3) 
   3. Final "If" Stmt LIKE 
      1. IF ( condition , CONCAT(exp1, exp2, exp3) , CONCAT(exp1, exp2, exp3) )

【Comments】:

【Answer 8】:

Use this query.

select * from fruits 
where type = 'orange'  
order by price limit 2

Solution: https://www.xaprb.com/blog/2006/12/07/how-to-select-the-firstleastmax-row-per-group-in-sql/

【Comments】:
