How to use window functions in Redshift SQL

Posted 2023-03-31

【Title】: How to use Window function in Redshift SQL 【Posted】: 2020-02-28 21:01:24 【Question】:

I have a table like this:

Ans_cnt | Workloadid | Alias
10 | 1 | A
10 | 1 | B
10 | 1 | C
20 | 2 | D
20 | 2 | E
20 | 2 | F

create temp table test
(ans_cnt int, workloadid int, alias varchar(2));

-- Rows D, E and F are taken from the table shown above
insert into test values
(10, 1, 'A'),
(10, 1, 'B'),
(10, 1, 'C'),
(20, 2, 'D'),
(20, 2, 'E'),
(20, 2, 'F');

I want to get a result like this:

Ans_cnt | workloadid
10 | 1
20 | 2

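That is, for workload 1 the total ans_cnt should still be 10, and for workload 2 it should still be 20; the same workload simply has several aliases assigned to it. I hope that makes sense.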

I tried a sum partitioned by workloadid, but it does not work:

select sum(ans_cnt) over (partition by workloadid) as ans_cnt from test

Please help.

【Answer 1】:

What happens if the same workload has different ans_cnt values?

For example, in this case:

Ans_cnt | Workloadid | Alias
10 | 1 | A
10 | 1 | B
20 | 1 | C
10 | 2 | D
20 | 2 | E
30 | 2 | F

My guess is that you want the highest ans_cnt for each workload.

If so, this SQL is all you need:

select workloadid, max(ans_cnt) as ans_cnt from test
group by workloadid;

This gives the following output:

Workloadid | Ans_cnt
1 | 20
2 | 30
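
For comparison, the windowed sum from the question returns one row per input row rather than one row per workload: on the six rows shown in the question it yields 30 three times and 60 three times, because a window function does not collapse the rows of its partition. If a window function is still preferred, a minimal sketch of an equivalent follows. It assumes the `test` table from the question and relies on `distinct` to remove the repeated rows.

```sql
-- Sketch only: same result as the group by query, written with a window function.
-- max() over (...) repeats the per-workload maximum on every row of the partition,
-- so distinct is needed to reduce the output to one row per workloadid.
select distinct
       workloadid
     , max(ans_cnt) over (partition by workloadid) as ans_cnt
from test;
```

On the question's original data this returns workloadid 1 with ans_cnt 10 and workloadid 2 with ans_cnt 20. The plain `group by` form is usually the simpler choice; the window form is mainly useful when other per-row columns are needed in the same select.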

Or, if you want the latest ans_cnt and your aliases are assigned in alphabetical order, you need this SQL:

select ans_cnt, workloadid
from (
       select ans_cnt, workloadid
         -- number the rows of each workload, last alias first
         , row_number() over (partition by workloadid order by alias desc) as rnk
       from test
) as t
where rnk = 1;
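
To try this approach on the example rows from this answer, a small setup sketch follows. The table name `test_variant` and the column alias `rn` are hypothetical, chosen only so they do not clash with the question's `test` table.

```sql
-- Hypothetical table holding the answer's example rows.
create temp table test_variant
(ans_cnt int, workloadid int, alias varchar(2));

insert into test_variant values
(10, 1, 'A'), (10, 1, 'B'), (20, 1, 'C'),
(10, 2, 'D'), (20, 2, 'E'), (30, 2, 'F');

-- Keep only the row with the alphabetically last alias in each workload.
select ans_cnt, workloadid
from (
       select ans_cnt, workloadid
         , row_number() over (partition by workloadid order by alias desc) as rn
       from test_variant
) as t
where rn = 1;
```

With these rows, alias C is last for workload 1 and alias F is last for workload 2, so the result is 20 | 1 and 30 | 2. It matches the max() query here only because the last alias also carries the largest ans_cnt in each workload; with other data the two queries can differ.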
