Max of each group in subquery that matches a condition

Posted 2023-03-31


Asked: 2016-05-03 17:42:18

Question:

I have a table like the one below. It has 10 columns, and I am interested in 4 of them: id, name, url, and ranking. Call it tableA.

id    |name    |url     |ranking
---------------------------------
1     |apple   |a1.com  |1
2     |apple   |a1.com  |2
3     |apple   |a1z.com |3
4     |orange  |o1.com  |1
5     |orange  |o1.com  |2
6     |apple   |a1.com  |4
7     |apple   |a1z.com |5
8     |orange  |o1z.com |6

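I want the rows with id 7,6,3,2 and 8,5,4. That is, for each group (apple and orange), all rows with ranking > max(ranking) - 3, where max(ranking) is taken over the rows whose url contains z.

For apple, id 7 has the highest ranking among the urls containing z, which is 5.

So I want the apple rows with ranking > 5 - 3, i.e. ranking greater than 2.

Those are the rows with id 7,6,3.

The same applies to the orange group (rows with id 8,5,4).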


Answer 1:

Hmm. You seem to want at most four records from each group, ordered by ranking:

select t.*
from (select t.*,
             row_number() over (partition by name order by ranking desc) as seqnum
      from t
     ) t
where seqnum <= 4
order by name, ranking desc;

Oops, I just remembered: Amazon Redshift doesn't support row_number() (or has that been fixed?). A cumulative count works:

select t.*
from (select t.*,
             count(*) over (partition by name order by ranking desc range between unbounded preceding and current row) as seqnum
      from t
     ) t
where seqnum <= 4
order by name, ranking desc;
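
As a rough check of the two queries above, the sketch below runs both against the sample rows from the question. It makes several assumptions that are not in the original: SQLite (3.25 or newer, for window functions) stands in for Redshift through Python's sqlite3 module, the table is named tableA rather than t, and the inner alias is renamed to s for readability.

```python
import sqlite3

# Sample rows copied from the question: (id, name, url, ranking).
ROWS = [
    (1, "apple", "a1.com", 1),
    (2, "apple", "a1.com", 2),
    (3, "apple", "a1z.com", 3),
    (4, "orange", "o1.com", 1),
    (5, "orange", "o1.com", 2),
    (6, "apple", "a1.com", 4),
    (7, "apple", "a1z.com", 5),
    (8, "orange", "o1z.com", 6),
]

# Top four rows per name, numbered with row_number().
ROW_NUMBER_SQL = """
SELECT id FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY ranking DESC) AS seqnum
    FROM tableA AS t
) AS s
WHERE seqnum <= 4
ORDER BY name, ranking DESC
"""

# Same idea, numbered with a cumulative count instead.
CUMULATIVE_COUNT_SQL = """
SELECT id FROM (
    SELECT t.*,
           COUNT(*) OVER (PARTITION BY name ORDER BY ranking DESC
                          RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS seqnum
    FROM tableA AS t
) AS s
WHERE seqnum <= 4
ORDER BY name, ranking DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, name TEXT, url TEXT, ranking INTEGER)")
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?)", ROWS)

ids_row_number = [row[0] for row in conn.execute(ROW_NUMBER_SQL)]
ids_cumulative = [row[0] for row in conn.execute(CUMULATIVE_COUNT_SQL)]
conn.close()

print(ids_row_number)
print(ids_cumulative)
```

On this data both lists come out as 7, 6, 3, 2, 8, 5, 4, because rankings are unique within each name. If two rows in a group shared a ranking, the range frame would give both the same cumulative count, so the two queries could then differ.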

Comments:

This looks good, but I want the maximum ranking among the rows whose url contains z, and then subtract 3 from that.
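
One possible way to express the rule described in this comment is to compute, per name, the maximum ranking among rows whose url contains z, and then join that back to the table. The following is only a sketch under stated assumptions, not a query taken from the thread: the table is named tableA, "contains z" is read as url LIKE '%z%', and SQLite (through Python's sqlite3 module) stands in for Redshift so the example can be run as is.

```python
import sqlite3

# Sample rows copied from the question: (id, name, url, ranking).
ROWS = [
    (1, "apple", "a1.com", 1),
    (2, "apple", "a1.com", 2),
    (3, "apple", "a1z.com", 3),
    (4, "orange", "o1.com", 1),
    (5, "orange", "o1.com", 2),
    (6, "apple", "a1.com", 4),
    (7, "apple", "a1z.com", 5),
    (8, "orange", "o1z.com", 6),
]

# Assumption: "url contains z" is interpreted as url LIKE '%z%'.
# The subquery finds, per name, the max ranking among z urls;
# the outer query keeps rows ranked above that maximum minus 3.
QUERY = """
SELECT a.id
FROM tableA AS a
JOIN (
    SELECT name, MAX(ranking) AS max_z_ranking
    FROM tableA
    WHERE url LIKE '%z%'
    GROUP BY name
) AS m ON m.name = a.name
WHERE a.ranking > m.max_z_ranking - 3
ORDER BY a.name, a.ranking DESC
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (id INTEGER, name TEXT, url TEXT, ranking INTEGER)")
conn.executemany("INSERT INTO tableA VALUES (?, ?, ?, ?)", ROWS)
matching_ids = [row[0] for row in conn.execute(QUERY)]
conn.close()

print(matching_ids)
```

On the sample data this prints [7, 6, 3, 8]. The apple rows match the worked example in the question (ranking > 5 - 3). For orange, the z maximum is 6, so only id 8 has ranking > 3; ids 4 and 5 have rankings 1 and 2, so the strict rule does not return them even though the question lists them. Because of the inner join, a name with no z url at all would drop out of the result entirely.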
