Selecting records that have low numbers consecutively
Posted: 2021-01-04 06:08:12
Question: I have a table like the following (using BigQuery):
id | year | month | day | rating |
---|---|---|---|---|
111 | 2020 | 11 | 30 | 4 |
111 | 2020 | 12 | 01 | 4 |
112 | 2020 | 11 | 30 | 5 |
113 | 2020 | 11 | 30 | 5 |
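Is there a way to select the IDs whose ratings are low in two or more consecutive records, meaning each of those records has a rating below 4.5?
For example, my desired output is: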
id | year | month | day | rating |
---|---|---|---|---|
111 | 2020 | 11 | 30 | 4 |
111 | 2020 | 12 | 01 | 4 |
Answer 1: If you want all of the rows, you need to look at both the previous rating and the next rating:
SELECT t.*
FROM (SELECT t.*,
             LAG(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS prev_rating,
             LEAD(rating) OVER (PARTITION BY id ORDER BY year, month, day ASC) AS next_rating
      FROM dataset.table t
     ) t
WHERE (rating < 4.5 AND prev_rating < 4.5) OR
      (rating < 4.5 AND next_rating < 4.5)
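To check the logic without running BigQuery, here is a minimal Python sketch of the same LAG/LEAD idea. The function name `rows_with_low_neighbor` and the dictionary layout are illustrative assumptions, not part of the answer; the sample rows are the ones from the question.

```python
from itertools import groupby
from operator import itemgetter

THRESHOLD = 4.5  # ratings strictly below this count as "low"

def rows_with_low_neighbor(rows, threshold=THRESHOLD):
    """Keep rows that are low and whose previous or next row
    (same id, ordered by date) is also low, mimicking LAG/LEAD."""
    ordered = sorted(rows, key=itemgetter("id", "year", "month", "day"))
    kept = []
    for _, group in groupby(ordered, key=itemgetter("id")):
        group = list(group)
        for i, row in enumerate(group):
            # None plays the role of SQL NULL at the partition edges.
            prev_rating = group[i - 1]["rating"] if i > 0 else None
            next_rating = group[i + 1]["rating"] if i + 1 < len(group) else None
            prev_low = prev_rating is not None and prev_rating < threshold
            next_low = next_rating is not None and next_rating < threshold
            if row["rating"] < threshold and (prev_low or next_low):
                kept.append(row)
    return kept

# Sample data taken from the question.
sample = [
    {"id": 111, "year": 2020, "month": 11, "day": 30, "rating": 4},
    {"id": 111, "year": 2020, "month": 12, "day": 1, "rating": 4},
    {"id": 112, "year": 2020, "month": 11, "day": 30, "rating": 5},
    {"id": 113, "year": 2020, "month": 11, "day": 30, "rating": 5},
]

for row in rows_with_low_neighbor(sample):
    print(row)
```

Note that "consecutive" here means adjacent rows for the same id when ordered by date, not necessarily consecutive calendar days.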
Comments:
Thanks for your answer. I think this works better. It definitely returns all the rows.
Answer 2: The following is for BigQuery Standard SQL:
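select * except(grp, seq_len)
from (
  -- count the low-rated rows in each run; partition by id as well as grp,
  -- otherwise runs from different ids that share a grp value are counted together
  select *, sum(1) over(partition by id, grp) seq_len
  from (
    select *,
      -- running count of "high" ratings per id: it stays constant across a run of low ratings
      countif(rating >= 4.5) over(partition by id order by year, month, day) grp
    from `project.dataset.table`
  )
  where rating < 4.5
)
where seq_len > 1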
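The grouping idea in this answer (split each id's rows into runs, then keep runs of low ratings that are longer than one row) can be sketched in Python as follows. This is only an illustration under assumptions: `low_runs`, `min_len`, and the extra rows for ids 114 and 115 are hypothetical and were added to show that runs are evaluated per id.

```python
from itertools import groupby
from operator import itemgetter

def low_runs(rows, threshold=4.5, min_len=2):
    """Return rows that belong to a run of at least `min_len`
    consecutive low ratings within the same id."""
    ordered = sorted(rows, key=itemgetter("id", "year", "month", "day"))
    kept = []
    for _, id_rows in groupby(ordered, key=itemgetter("id")):
        # Split this id's rows into maximal runs of low / not-low ratings.
        for is_low, run in groupby(id_rows, key=lambda r: r["rating"] < threshold):
            run = list(run)
            if is_low and len(run) >= min_len:
                kept.extend(run)
    return kept

sample = [
    # Rows from the question.
    {"id": 111, "year": 2020, "month": 11, "day": 30, "rating": 4},
    {"id": 111, "year": 2020, "month": 12, "day": 1, "rating": 4},
    {"id": 112, "year": 2020, "month": 11, "day": 30, "rating": 5},
    {"id": 113, "year": 2020, "month": 11, "day": 30, "rating": 5},
    # Hypothetical rows: a single low rating, and two low ratings
    # separated by a high one. Neither should be selected.
    {"id": 114, "year": 2020, "month": 11, "day": 30, "rating": 4},
    {"id": 115, "year": 2020, "month": 11, "day": 29, "rating": 4},
    {"id": 115, "year": 2020, "month": 11, "day": 30, "rating": 5},
    {"id": 115, "year": 2020, "month": 12, "day": 1, "rating": 4},
]

for row in low_runs(sample):
    print(row)
```

Because the runs are built inside each id, single low ratings from different ids are never combined into one sequence.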