SQL Lag() with where clause
Posted: 2021-02-24 18:34:11
Question:
I have a table like this:
row_no | movie | movie_start_time | movie_end_time |
---|---|---|---|
1 | A | 2021-02-01 01:00:00 | 2021-02-01 02:00:00 |
2 | B | 2021-02-01 01:00:00 | 2021-02-01 02:00:00 |
3 | A | 2021-02-01 01:30:00 | 2021-02-01 02:30:00 |
4 | A | 2021-02-01 01:30:00 | 2021-02-01 02:30:00 |
5 | A | 2021-02-01 02:15:00 | 2021-02-01 03:15:00 |
6 | B | 2021-02-01 02:15:00 | 2021-02-01 03:15:00 |
7 | A | 2021-02-01 04:15:00 | 2021-02-01 05:15:00 |
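I want to add an extra column to the table that holds the difference between the current showing's start time and the end time of the previous showing of the same movie. There is one more condition: the previous showing and the current showing must not overlap. For the data above, the result should look like this:
row_no | movie | movie_start_time | movie_end_time | last_play |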
---|---|---|---|---|
1 | A | 2021-02-01 01:00:00 | 2021-02-01 02:00:00 | - |
2 | B | 2021-02-01 01:00:00 | 2021-02-01 02:00:00 | - |
3 | A | 2021-02-01 01:30:00 | 2021-02-01 02:30:00 | - |
4 | A | 2021-02-01 01:30:00 | 2021-02-01 02:30:00 | - |
5 | A | 2021-02-01 02:15:00 | 2021-02-01 03:15:00 | 15 minutes |
6 | B | 2021-02-01 02:15:00 | 2021-02-01 03:15:00 | 15 minutes |
7 | A | 2021-02-01 04:15:00 | 2021-02-01 05:15:00 | 60 minutes |
I tried the following query to get the previous showing's end time, so that I could compute the difference afterwards:
select movie, movie_start_time, movie_end_time, lag(movie_end_time) over (partition by movie order by movie_start_time) prev_end_time from table where prev_end_time <= movie_start_time
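But this does not work, because a WHERE clause cannot filter on the result of a window function. Is there another way to solve this?
Comments:
select * from (<that query of yours>) t where prev_end_time <= movie_start_time ?
Are you sure row 5 is correct? It overlaps the other showing of movie A in row 4.
@GSerg That won't work, because LAG only gives me the immediately preceding row, and I need the previous row that satisfies the prev_end_time condition
@ARAT I think the example I chose is a bit confusing. The 15 minutes in row 5 is the difference from the previous showing that does not overlap, which is row 1.
Answer 1:
I created the table in PostgreSQL: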
CREATE TABLE IF NOT EXISTS table1 (
row_no INT,
Movie CHAR,
movie_start_time timestamp,
movie_end_time timestamp
);
and inserted the records into it:
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(1, 'A', '2021-02-01 01:00:00', '2021-02-01 02:00:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(2, 'B', '2021-02-01 01:00:00', '2021-02-01 02:00:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(3, 'A', '2021-02-01 01:30:00', '2021-02-01 02:30:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(4, 'A', '2021-02-01 01:30:00', '2021-02-01 02:30:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(5, 'A', '2021-02-01 02:15:00', '2021-02-01 03:15:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(6, 'B', '2021-02-01 02:15:00', '2021-02-01 03:15:00');
INSERT INTO table1 (row_no, Movie, movie_start_time, movie_end_time) VALUES(7, 'A', '2021-02-01 04:15:00', '2021-02-01 05:15:00');
Then what you want is:
select row_no, movie, movie_start_time, movie_end_time, EXTRACT(EPOCH FROM (movie_start_time - prev_end_time)::INTERVAL)/60 AS last_play FROM
(select row_no,movie, movie_start_time, movie_end_time, lag(movie_end_time) over (partition by movie order by movie_start_time) AS "prev_end_time"
from table1) t
where prev_end_time <= movie_start_time
ORDER BY movie_start_time
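Note that LAG() only looks at the immediately preceding row, and the outer WHERE then removes every row whose previous showing overlaps, so row 5 (whose LAG value is the 02:30 end time of row 3 or 4) does not appear in the output. A minimal sketch of one way to pick the latest non-overlapping showing instead, assuming PostgreSQL and the table1 definition above. The aliases t, x and p and the column name last_play_minutes are made up for this illustration and are not part of the original answer:

```sql
-- Sketch only: for each showing, find the latest end time among earlier
-- showings of the same movie that finished at or before this showing starts.
SELECT t.row_no,
       t.movie,
       t.movie_start_time,
       t.movie_end_time,
       -- interval -> seconds -> minutes; stays NULL when no earlier showing qualifies
       EXTRACT(EPOCH FROM (t.movie_start_time - p.prev_end_time)) / 60 AS last_play_minutes
FROM table1 AS t
LEFT JOIN LATERAL (
    -- an aggregate without GROUP BY always returns one row (NULL if nothing matches)
    SELECT max(x.movie_end_time) AS prev_end_time
    FROM table1 AS x
    WHERE x.movie = t.movie
      AND x.movie_end_time <= t.movie_start_time
) AS p ON true
ORDER BY t.row_no;
```

With the sample rows inserted above, this should give NULL for rows 1 to 4, 15 minutes for rows 5 and 6, and 60 minutes for row 7. The subquery is evaluated once per row, so on a large table an index on (movie, movie_end_time) would probably help.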
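Comments:
Row 5 of the expected result is missing from your output. You are only checking the movie's immediately preceding record, but I need the movie's previous record where prev_end_time
Answer 2:
I was able to solve the problem with the following query: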
with a as (
  -- collect the end times of all earlier showings of the same movie
  select row_no, movie, movie_start_time, movie_end_time,
         array_agg(movie_end_time) over (
           partition by movie order by movie_start_time
           rows between unbounded preceding and 1 preceding) as prev_end_time
  from `table1`
),
b as (
  -- keep the latest earlier end time that does not overlap the current showing
  select row_no, movie, movie_start_time, movie_end_time,
         case
           when prev_end_time is null then null
           else (select max(i_prev_end_time)
                 from unnest(prev_end_time) i_prev_end_time
                 where i_prev_end_time <= movie_start_time)
         end as previous_end_time
  from a
)
select row_no, movie, movie_start_time, movie_end_time,
       unix_seconds(movie_start_time) - unix_seconds(previous_end_time) as last_run  -- gap in seconds
from b
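The query above returns the gap in seconds, while the question asks for minutes. A possible alternative sketch, assuming the dialect is BigQuery standard SQL (which the backticks and unix_seconds suggest), that both time columns are TIMESTAMP, and that row_no is unique. It replaces the array with a self-join and reports whole minutes via TIMESTAMP_DIFF. The aliases t and p and the column name last_play_minutes are hypothetical:

```sql
-- Sketch only: join each showing to every earlier, non-overlapping showing
-- of the same movie, then keep the latest end time per showing.
SELECT
  t.row_no,
  t.movie,
  t.movie_start_time,
  t.movie_end_time,
  -- NULL when no earlier showing ended at or before this start time
  TIMESTAMP_DIFF(t.movie_start_time, MAX(p.movie_end_time), MINUTE) AS last_play_minutes
FROM `table1` AS t
LEFT JOIN `table1` AS p
  ON p.movie = t.movie
 AND p.movie_end_time <= t.movie_start_time
GROUP BY t.row_no, t.movie, t.movie_start_time, t.movie_end_time
ORDER BY t.row_no;
```

The join multiplies rows before the GROUP BY collapses them, so for movies with many showings the array-based query may still be the cheaper option.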