PostgreSQL Select the r.* by MIN() with group-by on two columns
【Posted】: 2021-04-28 08:30:23 【Question】:
Example schema of a table named results
id | user_id | activity_id | activity_type_id | start_date_local | elapsed_time |
---|---|---|---|---|---|
1 | 100 | 11111 | 1 | 2014-01-07 04:34:38 | 4444 |
2 | 100 | 22222 | 1 | 2015-04-14 06:44:42 | 5555 |
3 | 100 | 33333 | 1 | 2015-04-14 06:44:42 | 7777 |
4 | 100 | 44444 | 2 | 2014-01-07 04:34:38 | 12345 |
5 | 200 | 55555 | 1 | 2015-12-22 16:32:56 | 5023 |
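To try the queries in this thread, the sample table above can be recreated with something like the following. This is only a sketch: the question shows sample values but no DDL, so the column types and constraints are assumptions.

```sql
-- Column types and constraints are assumptions; the question only shows sample values.
CREATE TABLE results (
    id               integer PRIMARY KEY,
    user_id          integer   NOT NULL,
    activity_id      integer   NOT NULL,
    activity_type_id integer   NOT NULL,
    start_date_local timestamp NOT NULL,
    elapsed_time     integer   NOT NULL
);

-- The five sample rows from the table above.
INSERT INTO results (id, user_id, activity_id, activity_type_id, start_date_local, elapsed_time)
VALUES
    (1, 100, 11111, 1, '2014-01-07 04:34:38', 4444),
    (2, 100, 22222, 1, '2015-04-14 06:44:42', 5555),
    (3, 100, 33333, 1, '2015-04-14 06:44:42', 7777),
    (4, 100, 44444, 2, '2014-01-07 04:34:38', 12345),
    (5, 200, 55555, 1, '2015-12-22 16:32:56', 5023);
```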
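Problem
Select each user's fastest activity (i.e. the row with the minimum elapsed_time) per activity_type_id and year.
(In this simplified example, the record with id=3 should be excluded from the selection, because the record with id=2 is the fastest one for user 100 with activity_type_id 1 in 2015.)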
What I've tried
SELECT user_id,
activity_type_id,
EXTRACT(year FROM start_date_local) AS year,
MIN(elapsed_time) AS fastest_time
FROM results
GROUP BY activity_type_id, user_id, year
ORDER BY activity_type_id, user_id, year;
Actual
This selects the correct set of rows, but it only contains the grouped-by columns:
user_id | activity_type_id | year | fastest_time |
---|---|---|---|
100 | 1 | 2014 | 4444 |
100 | 1 | 2015 | 5555 |
100 | 2 | 2014 | 12345 |
200 | 1 | 2015 | 5023 |
Goal
Get the actual full records with all columns, i.e. results.* + year
id | user_id | activity_id | activity_type_id | start_date_local | year | elapsed_time |
---|---|---|---|---|---|---|
1 | 100 | 11111 | 1 | 2014-01-07 04:34:38 | 2014 | 4444 |
2 | 100 | 22222 | 1 | 2015-04-14 06:44:42 | 2015 | 5555 |
4 | 100 | 44444 | 2 | 2014-01-07 04:34:38 | 2014 | 12345 |
5 | 200 | 55555 | 1 | 2015-12-22 16:32:56 | 2015 | 5023 |
【Answer 1】: You can use a window function for this:
select id, user_id, activity_id, activity_type_id, start_date_local, year, elapsed_time
from (
SELECT id,
user_id,
activity_id,
activity_type_id,
start_date_local,
EXTRACT(year FROM start_date_local) AS year,
elapsed_time,
min(elapsed_time) over (partition by user_id, activity_type_id, EXTRACT(year FROM start_date_local)) as fastest_time
FROM results
) t
where elapsed_time = fastest_time
order by activity_type_id, user_id, year;
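Note that the min() comparison above returns every row that ties for the fastest time within a group. If exactly one row per user, activity type and year is wanted, a row_number() variant along these lines might be used. It is a sketch only: breaking ties by the lowest id is an assumption, not something the question asks for.

```sql
SELECT id, user_id, activity_id, activity_type_id, start_date_local, year, elapsed_time
FROM (
    SELECT r.*,
           EXTRACT(year FROM r.start_date_local) AS year,
           -- Number the rows in each group from fastest to slowest.
           row_number() OVER (
               PARTITION BY r.user_id, r.activity_type_id, EXTRACT(year FROM r.start_date_local)
               ORDER BY r.elapsed_time, r.id  -- id as tie-breaker is an assumption
           ) AS rn
    FROM results r
) t
WHERE rn = 1  -- keep only the fastest row of each group
ORDER BY activity_type_id, user_id, year;
```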
Or use distinct on (), which keeps the first row of each group according to the order by:
select distinct on (activity_type_id, user_id, extract(year from start_date_local))
id,
user_id,
activity_id,
activity_type_id,
extract(year from start_date_local) as year,
elapsed_time
from results
order by activity_type_id, user_id, year, elapsed_time;
【Answer 2】: I think you want this:
SELECT DISTINCT ON (user_id, activity_type_id, EXTRACT(year FROM start_date_local))
*, EXTRACT(year FROM start_date_local) AS year
FROM results
ORDER BY user_id, activity_type_id, year, elapsed_time;
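With DISTINCT ON, which row survives when two rows in a group share the same elapsed_time is not defined unless the ORDER BY settles it. A possible variant that makes the choice deterministic is sketched below; using id as the final sort key is an assumption.

```sql
SELECT DISTINCT ON (user_id, activity_type_id, EXTRACT(year FROM start_date_local))
       *, EXTRACT(year FROM start_date_local) AS year
FROM results
-- The leading ORDER BY expressions must match the DISTINCT ON expressions;
-- elapsed_time then picks the fastest row, and id (an assumption) breaks ties.
ORDER BY user_id, activity_type_id, EXTRACT(year FROM start_date_local), elapsed_time, id;
```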