Why do I have to use GROUP BY here?

Posted 2023-03-31

【Asked】: 2020-03-17 09:05:15 【Question】:

I am trying to solve the Rank Scores problem on LeetCode: https://leetcode.com/problems/rank-scores/ I have two solutions (MySQL). Both work.

select a.Score as Score,
(select count(distinct b.Score) from Scores as b where b.Score>=a.score) as Rank
from Scores as a
order by a.Score desc;

and

select s1.Score,count(distinct(s2.score)) Rank
from
Scores s1,Scores s2
where
s1.score<=s2.score
group by s1.Id
order by Rank

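But I am not sure why I have to use GROUP BY in solution two to make sure SQL computes the count for each score (otherwise it returns only the lowest score), while I don't have to use it in solution one.

【Comments】:

'If you use a group function in a statement containing no GROUP BY clause, it is equivalent to grouping on all rows' - dev.mysql.com/doc/refman/8.0/en/group-by-functions.html

【Answer 1】:

But I am not sure why I have to use GROUP BY in solution two to make sure SQL computes the count for each score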

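The second query works by self-joining the table on an inequality condition: for each row in alias s1, you get all rows in s2 whose score is greater than or equal to it. You then need to aggregate, so that you count how many distinct s2 scores there are for each s1 row, which gives you the rank.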
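To make the grouping concrete, here is a minimal sketch that runs the join step by step. The table definition and the four sample rows are assumptions made up for illustration; they are not from the original post. The last query repeats the subquery form for contrast: there the COUNT sits inside a correlated subquery that is evaluated once per outer row, so each evaluation is already its own single group and no GROUP BY is needed.

```sql
-- Hypothetical sample data (not from the original post).
DROP TABLE IF EXISTS Scores;
CREATE TABLE Scores (Id INT PRIMARY KEY, Score DECIMAL(3,2));
INSERT INTO Scores (Id, Score) VALUES (1, 3.50), (2, 3.65), (3, 4.00), (4, 3.65);

-- Step 1: the raw self-join. Each s1 row is paired with every s2 row
-- whose score is greater than or equal to its own.
SELECT s1.Id, s1.Score AS s1_score, s2.Score AS s2_score
FROM Scores s1
INNER JOIN Scores s2 ON s1.Score <= s2.Score
ORDER BY s1.Id, s2.Score;

-- Step 2: aggregate with no GROUP BY. All joined rows form one single
-- group, so exactly one row comes back.
SELECT COUNT(DISTINCT s2.Score) AS distinct_scores
FROM Scores s1
INNER JOIN Scores s2 ON s1.Score <= s2.Score;

-- Step 3: GROUP BY makes one group per s1 row, so the count is
-- computed separately for each row of s1.
SELECT s1.Id, s1.Score, COUNT(DISTINCT s2.Score) AS rn
FROM Scores s1
INNER JOIN Scores s2 ON s1.Score <= s2.Score
GROUP BY s1.Id, s1.Score
ORDER BY rn;

-- Contrast: the correlated subquery runs once per row of a, and each
-- run aggregates its own matching b rows as a single implicit group.
SELECT a.Score,
       (SELECT COUNT(DISTINCT b.Score)
        FROM Scores AS b
        WHERE b.Score >= a.Score) AS rn
FROM Scores AS a
ORDER BY a.Score DESC;
```

With this sample data, the second query returns a single row with the value 3, while the third and fourth return rank 1 for 4.00, rank 2 for both 3.65 rows, and rank 3 for 3.50.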

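Note: if you are running MySQL 8.0, you can do this without a join or subquery, using a window function. The problem asks for ranks with no gaps after ties, so dense_rank() is the function that matches it (rank() would skip numbers after a tie):

 select score, dense_rank() over(order by score desc) rn from scores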
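RANK() and DENSE_RANK() differ only in how they number rows after a tie, and the count(distinct ...) queries in this thread behave like DENSE_RANK(). The sketch below shows both side by side; the table definition and sample rows are assumptions made up for illustration, and the column aliases are hypothetical names.

```sql
-- Hypothetical sample data with a tie (not from the original post).
DROP TABLE IF EXISTS Scores;
CREATE TABLE Scores (Id INT PRIMARY KEY, Score DECIMAL(3,2));
INSERT INTO Scores (Id, Score) VALUES (1, 3.50), (2, 3.65), (3, 4.00), (4, 3.65);

-- Requires MySQL 8.0 or later, where window functions are available.
SELECT Score,
       RANK()       OVER (ORDER BY Score DESC) AS rank_with_gaps,
       DENSE_RANK() OVER (ORDER BY Score DESC) AS rank_no_gaps
FROM Scores
ORDER BY Score DESC;
```

For these rows, both columns give 1 for 4.00 and 2 for the two 3.65 rows; for 3.50, RANK() gives 4 while DENSE_RANK() gives 3.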

Finally: as of 2020, you should be using explicit, standard joins rather than old-school implicit joins:

select s1.score, count(distinct(s2.score)) rn
from scores s1
inner join scores s2 on s1.score <= s2.score
group by s1.id, s1.score
order by rn

【Answer 2】:

Every column in the SELECT clause that is not inside an aggregate function needs to be added to the GROUP BY clause.

For example:

GROUP BY needed:

select col1, col2, count(*) -- count is an aggregate function
from table_name
group by col1, col2

or

GROUP BY not needed:

select count(*) -- count is an aggregate function
from table_name
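The same two cases can be sketched against a Scores table like the one in the question. The table definition and rows are assumptions made up for illustration. The third query shows one refinement of the rule: MySQL 5.7.5 and later accept a non-aggregated column that is functionally dependent on the grouped columns, which is why grouping by a primary key alone can be enough.

```sql
-- Hypothetical sample data (not from the original post).
DROP TABLE IF EXISTS Scores;
CREATE TABLE Scores (Id INT PRIMARY KEY, Score DECIMAL(3,2));
INSERT INTO Scores (Id, Score) VALUES (1, 3.50), (2, 3.65), (3, 4.00), (4, 3.65);

-- GROUP BY needed: Score is selected outside an aggregate,
-- so it is listed in GROUP BY. One row per distinct score.
SELECT Score, COUNT(*) AS n
FROM Scores
GROUP BY Score;

-- GROUP BY not needed: only an aggregate is selected,
-- so the whole table is treated as one group. One row in total.
SELECT COUNT(*) AS total
FROM Scores;

-- Grouping by the primary key: Score is determined by Id, so MySQL
-- 5.7.5+ accepts it even with ONLY_FULL_GROUP_BY enabled.
SELECT Id, Score, COUNT(*) AS n
FROM Scores
GROUP BY Id;
```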


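As for your second query not working: here is code that runs without errors (note the backticks around Rank, which is a reserved word as of MySQL 8.0):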

select s1.Score
       , count(distinct(s2.score)) `Rank`
from Scores s1
join Scores s2 on s2.Score >= s1.score
group by s1.Score, s1.id
order by `Rank`;

