Window function: get the first and last row from each group
【Posted】: 2021-04-06 05:44:43
【Question】: I am using a window function on Presto to get the first and last row of each group. I applied ROW_NUMBER(), partitioned by the Name column and ordered by the Percent column, and I get the result below.
Current query:
SELECT Name, Price, Percent, Volume, time, date,
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Percent DESC) AS rn
FROM TABLE_NAME ORDER BY Name asc
Current output:
Name Price Percent Volume time date rn
AABB 0.015 42.55 25980719 2020-12-29 10:23:11 2020-12-29 1
AABB 0.014 33.33 22640655 2020-12-29 10:20:42 2020-12-29 2
AABB 0.014 33.33 22640655 2020-12-29 10:21:11 2020-12-29 3
AABB 0.0137 30.0 21466099 2020-12-29 10:19:20 2020-12-29 4
AABB 0.0135 28.57 20461208 2020-12-29 10:17:19 2020-12-29 5
AABB 0.013 23.81 20201208 2020-12-29 10:16:41 2020-12-29 6
AABB 0.013 23.81 19129182 2020-12-29 10:15:20 2020-12-29 7
AABB 0.0125 19.05 14513969 2020-12-29 10:07:15 2020-12-29 8
AABB 0.0125 19.05 15580088 2020-12-29 10:09:14 2020-12-29 9
AABB 0.012 14.29 14313969 2020-12-29 10:06:44 2020-12-29 10
AABB 0.012 14.29 12924448 2020-12-29 10:15:14 2020-12-29 11
ABQQ 0.025 74.83 6809380 2020-12-29 09:50:04 2020-12-29 1
ABQQ 0.024 67.83 4196759 2020-12-29 09:48:10 2020-12-29 2
ABQQ 0.0225 57.34 935554 2020-12-29 09:06:13 2020-12-29 3
ABQQ 0.0143 -5.61 1600927 2020-12-29 09:43:51 2020-12-29 4
ABQQ 0.0143 -5.61 1600927 2020-12-29 09:41:51 2020-12-29 5
ABQQ 0.0143 -5.61 1600927 2020-12-29 09:36:52 2020-12-29 6
Expected output 1: (ordered by Percent, keeping only the rows with the highest and lowest Percent values)
Name Price Percent Volume time date rn
AABB 0.015 42.55 25980719 2020-12-29 10:23:11 2020-12-29 1
AABB 0.012 14.29 12924448 2020-12-29 10:15:14 2020-12-29 11
ABQQ 0.025 74.83 6809380 2020-12-29 09:50:04 2020-12-29 1
ABQQ 0.0143 -5.61 1600927 2020-12-29 09:36:52 2020-12-29 6
Expected output 2: (ordered by time, keeping only the rows with the highest and lowest time values)
Name Price Percent Volume time date rn
AABB 0.015 42.55 25980719 2020-12-29 10:23:11 2020-12-29 1
AABB 0.012 14.29 14313969 2020-12-29 10:06:44 2020-12-29 10
ABQQ 0.025 74.83 6809380 2020-12-29 09:50:04 2020-12-29 1
ABQQ 0.0225 57.34 935554 2020-12-29 09:06:13 2020-12-29 3
【Answer 1】: You need a subquery:
SELECT Name, Price, Percent, Volume, time, date
FROM (SELECT t.*,
             -- rank each Name group in both directions
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Percent) AS seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Percent DESC) AS seqnum_desc
      FROM TABLE_NAME t
     ) t
-- keep a row if it ranks first in either direction (lowest or highest Percent)
WHERE 1 IN (seqnum_asc, seqnum_desc)
ORDER BY Name asc;
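If you want the time-based result as a separate query, just change the ORDER BY in the window clauses to time. If you want a single query, add two new "seqnum" columns based on time. Note that ROW_NUMBER() breaks ties arbitrarily, so when several rows share the lowest (or highest) value, which of them is returned is not guaranteed.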
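The time-based variant can be tried out with a small sketch like the one below. It is only an illustration under several assumptions: it uses Python's built-in sqlite3 module as a stand-in for Presto (SQLite 3.25 or newer is needed for window functions), the table name quotes is hypothetical, and the data is a hand-picked subset of the rows from the question with the Volume and date columns left out.

```python
import sqlite3

# Subset of the question's rows: (Name, Price, Percent, time).
rows = [
    ("AABB", 0.015, 42.55, "2020-12-29 10:23:11"),
    ("AABB", 0.0137, 30.0, "2020-12-29 10:19:20"),
    ("AABB", 0.012, 14.29, "2020-12-29 10:06:44"),
    ("AABB", 0.012, 14.29, "2020-12-29 10:15:14"),
    ("ABQQ", 0.025, 74.83, "2020-12-29 09:50:04"),
    ("ABQQ", 0.0225, 57.34, "2020-12-29 09:06:13"),
    ("ABQQ", 0.0143, -5.61, "2020-12-29 09:36:52"),
]

# "quotes" is a hypothetical table name; SQLite stands in for Presto here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE quotes (Name TEXT, Price REAL, Percent REAL, time TEXT)")
conn.executemany("INSERT INTO quotes VALUES (?, ?, ?, ?)", rows)

# Same shape as the answer's query, but both windows are ordered by time.
# ISO-formatted timestamps stored as text sort chronologically.
QUERY = """
SELECT Name, Price, Percent, time
FROM (SELECT t.*,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY time)      AS seqnum_asc,
             ROW_NUMBER() OVER (PARTITION BY Name ORDER BY time DESC) AS seqnum_desc
      FROM quotes t
     ) t
WHERE 1 IN (seqnum_asc, seqnum_desc)
ORDER BY Name, time DESC
"""

first_last_by_time = conn.execute(QUERY).fetchall()
for row in first_last_by_time:
    print(row)
```

On this sample, each Name should come back with exactly two rows, its latest and its earliest timestamp, which corresponds to Expected output 2. For the single-query version, the same subquery would carry four sequence columns (two ordered by Percent, two ordered by time) and the outer WHERE would test whichever pair is needed.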